{"id":838,"date":"2024-03-03T10:00:00","date_gmt":"2024-03-03T10:00:00","guid":{"rendered":"https:\/\/jacar.es\/proxies-llm-litellm\/"},"modified":"2024-03-03T10:00:00","modified_gmt":"2024-03-03T10:00:00","slug":"proxies-llm-litellm","status":"publish","type":"post","link":"https:\/\/jacar.es\/en\/proxies-llm-litellm\/","title":{"rendered":"LiteLLM: A Proxy to Unify Model Providers"},"content":{"rendered":"<p>The first integration with an LLM is always easy: one key, one SDK, three lines and a prompt. The second, six months later, is no longer so. A second provider appears because Claude reasons better on long tasks, or because a self-hosted model is needed for data that cannot leave the perimeter, or because someone discovers that Cohere multilingual embeddings cost a fraction of what OpenAI charges for equivalent work. At that point the application code stops being clean. Each SDK has its own client, its own message format, its own streaming semantics, its own errors, its own rules for function calling. The team starts writing adapters, and every new cross-cutting requirement \u2014 rate limiting, observability, per-tenant budget, fallback when a provider is down \u2014 has to be implemented twice or three times over.<\/p>\n<p>The pattern that solves this is old and familiar in infrastructure: a proxy. Instead of each application talking directly to each provider, they all talk to a single internal service that talks to the outside world on their behalf. <strong><a href=\"https:\/\/github.com\/BerriAI\/litellm\">LiteLLM<\/a><\/strong> is, as of early 2024, the most serious open-source project for doing this in the LLM space. 
It offers an OpenAI-compatible API over more than a hundred providers, it can be deployed as a library or as an HTTP server, and it comes with most of the things you would eventually end up writing yourself.<\/p>\n<h2 id=\"why-proxy-at-all\">Why Proxy at All<\/h2>\n<p>The question is not trivial, because any proxy adds latency, another component to maintain, and another failure point. The justification has to be concrete. There are four reasons, and they usually arrive together.<\/p>\n<p>The first is homogeneity. A single OpenAI-compatible client across all applications, pointing at an internal endpoint, replaces half a dozen SDKs. Switching model becomes a configuration field, not a refactor. Migrating an entire app from GPT-4 to Claude 3 Opus is reduced to repointing an alias.<\/p>\n<p>The second is governance. As soon as more than one team uses LLMs, someone asks how much each is spending, and ideally wants to cap it before next month\u2019s invoice surprises finance. A central proxy issues virtual keys per team, per user or per service, with budget and expiry attached. The real provider keys live in exactly one place.<\/p>\n<p>The third is resilience. LLM providers go down, rate-limit, and serve degraded responses more often than one would expect from services at their price point. A proxy can declare fallbacks \u2014 if GPT-4 returns 429 or 5xx, retry on Claude 3 Sonnet; if Anthropic is saturated, fall back to the self-hosted Mistral \u2014 without the applications noticing. This turns provider incidents into silent degradations rather than product outages.<\/p>\n<p>The fourth is observability. Cost, latency and token metrics by model, tenant and route, emitted from a single point to Prometheus or Langfuse, avoid having to instrument every call in every application. 
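<\/p>
<p>All four benefits rest on the same client-side contract: the application speaks the OpenAI wire format to a single internal host and authenticates with a virtual key, nothing more. A minimal sketch using only the standard library; the host name and the key are hypothetical placeholders:<\/p>
<div class=\"sourceCode\"><pre class=\"sourceCode python\"><code class=\"sourceCode python\">import json
import urllib.request

PROXY = 'http://litellm.internal:4000'  # hypothetical internal endpoint
VIRTUAL_KEY = 'sk-team-a-0000'          # minted by the proxy, never a provider key

def build_request(model, prompt):
    # Same OpenAI-compatible body regardless of which backend serves `model`
    body = json.dumps({
        'model': model,
        'messages': [{'role': 'user', 'content': prompt}],
    }).encode()
    return urllib.request.Request(
        PROXY + '/chat/completions',
        data=body,
        headers={'Authorization': 'Bearer ' + VIRTUAL_KEY,
                 'Content-Type': 'application/json'},
    )

# Repointing an application to another provider is a string change, not a refactor:
req = build_request('claude-3-sonnet', 'Summarise this incident report')<\/code><\/pre><\/div>
<p>Dispatching it is one <code>urllib.request.urlopen(req)<\/code> away once a proxy is listening; the point is that no provider SDK appears anywhere in application code.<\/p>
<p>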
It is also the natural place to insert caching, PII redaction, auditing and compliance.<\/p>\n<h2 id=\"library-or-server\">Library or Server<\/h2>\n<p>LiteLLM can be used in two modes, and the choice shapes everything else. In library mode you import <code>litellm.completion<\/code> inside the application code and enjoy the unified API without deploying anything new. This is reasonable for monoliths, prototypes or one-off scripts, but it loses almost all the cross-cutting benefits: every instance of the app needs the keys, every team does its own rate limiting, every service emits metrics its own way.<\/p>\n<p>In proxy mode you deploy a separate binary \u2014 container, pod, systemd unit \u2014 and the applications talk to it as if it were OpenAI. This is the default configuration for any serious use. Its cost is an internal network hop of order 5-20 ms, negligible compared with the hundreds or thousands of milliseconds of an actual LLM call. Its benefit is concentrating all cross-cutting logic in one place.<\/p>\n<h2 id=\"what-the-proxy-declares\">What the Proxy Declares<\/h2>\n<p>A typical configuration is a YAML with three blocks. The first, <code>model_list<\/code>, maps logical names like <code>gpt-4<\/code>, <code>claude-3-sonnet<\/code> or <code>mistral-local<\/code> to concrete provider configurations: the prefix <code>openai\/<\/code>, <code>anthropic\/<\/code> or <code>ollama\/<\/code> identifies the backend, the key is read from an environment variable, and <code>api_base<\/code> can point at an internal Ollama. The second, <code>router_settings<\/code>, declares routing policy and fallbacks: an ordered list per logical model indicates which others to jump to when the first fails, and a global strategy such as <code>least-busy<\/code>, <code>lowest-cost<\/code> or <code>lowest-latency<\/code> decides the tie-breaker when several candidates qualify. 
The third, <code>general_settings<\/code>, sets the master key used by an administrator to mint virtual keys via API, points at a Postgres to persist budgets and usage, and optionally wires a Redis for caching of semantically equivalent responses.<\/p>\n<p>The minimum fragment \u2014 the only one worth the space here \u2014 captures the three pieces together:<\/p>\n<div class=\"sourceCode\" id=\"cb1\">\n<pre class=\"sourceCode yaml\"><code class=\"sourceCode yaml\"><span id=\"cb1-1\"><a href=\"#cb1-1\" aria-hidden=\"true\" tabindex=\"-1\"><\/a><span class=\"fu\">model_list<\/span><span class=\"kw\">:<\/span><\/span>\n<span id=\"cb1-2\"><a href=\"#cb1-2\" aria-hidden=\"true\" tabindex=\"-1\"><\/a><span class=\"at\">  <\/span><span class=\"kw\">-<\/span><span class=\"at\"> <\/span><span class=\"fu\">model_name<\/span><span class=\"kw\">:<\/span><span class=\"at\"> gpt-4<\/span><\/span>\n<span id=\"cb1-3\"><a href=\"#cb1-3\" aria-hidden=\"true\" tabindex=\"-1\"><\/a><span class=\"at\">    <\/span><span class=\"fu\">litellm_params<\/span><span class=\"kw\">:<\/span><\/span>\n<span id=\"cb1-4\"><a href=\"#cb1-4\" aria-hidden=\"true\" tabindex=\"-1\"><\/a><span class=\"at\">      <\/span><span class=\"fu\">model<\/span><span class=\"kw\">:<\/span><span class=\"at\"> openai\/gpt-4<\/span><\/span>\n<span id=\"cb1-5\"><a href=\"#cb1-5\" aria-hidden=\"true\" tabindex=\"-1\"><\/a><span class=\"at\">      <\/span><span class=\"fu\">api_key<\/span><span class=\"kw\">:<\/span><span class=\"at\"> os.environ\/OPENAI_API_KEY<\/span><\/span>\n<span id=\"cb1-6\"><a href=\"#cb1-6\" aria-hidden=\"true\" tabindex=\"-1\"><\/a><span class=\"at\">  <\/span><span class=\"kw\">-<\/span><span class=\"at\"> <\/span><span class=\"fu\">model_name<\/span><span class=\"kw\">:<\/span><span class=\"at\"> claude-3-sonnet<\/span><\/span>\n<span id=\"cb1-7\"><a href=\"#cb1-7\" aria-hidden=\"true\" tabindex=\"-1\"><\/a><span class=\"at\">    <\/span><span class=\"fu\">litellm_params<\/span><span 
class=\"kw\">:<\/span><\/span>\n<span id=\"cb1-8\"><a href=\"#cb1-8\" aria-hidden=\"true\" tabindex=\"-1\"><\/a><span class=\"at\">      <\/span><span class=\"fu\">model<\/span><span class=\"kw\">:<\/span><span class=\"at\"> anthropic\/claude-3-sonnet-20240229<\/span><\/span>\n<span id=\"cb1-9\"><a href=\"#cb1-9\" aria-hidden=\"true\" tabindex=\"-1\"><\/a><span class=\"at\">      <\/span><span class=\"fu\">api_key<\/span><span class=\"kw\">:<\/span><span class=\"at\"> os.environ\/ANTHROPIC_API_KEY<\/span><\/span>\n<span id=\"cb1-10\"><a href=\"#cb1-10\" aria-hidden=\"true\" tabindex=\"-1\"><\/a><\/span>\n<span id=\"cb1-11\"><a href=\"#cb1-11\" aria-hidden=\"true\" tabindex=\"-1\"><\/a><span class=\"fu\">router_settings<\/span><span class=\"kw\">:<\/span><\/span>\n<span id=\"cb1-12\"><a href=\"#cb1-12\" aria-hidden=\"true\" tabindex=\"-1\"><\/a><span class=\"at\">  <\/span><span class=\"fu\">fallbacks<\/span><span class=\"kw\">:<\/span><\/span>\n<span id=\"cb1-13\"><a href=\"#cb1-13\" aria-hidden=\"true\" tabindex=\"-1\"><\/a><span class=\"at\">    <\/span><span class=\"kw\">-<\/span><span class=\"at\"> <\/span><span class=\"fu\">gpt-4<\/span><span class=\"kw\">:<\/span><span class=\"at\"> <\/span><span class=\"kw\">[<\/span><span class=\"st\">&quot;claude-3-sonnet&quot;<\/span><span class=\"kw\">]<\/span><\/span>\n<span id=\"cb1-14\"><a href=\"#cb1-14\" aria-hidden=\"true\" tabindex=\"-1\"><\/a><span class=\"at\">  <\/span><span class=\"fu\">routing_strategy<\/span><span class=\"kw\">:<\/span><span class=\"at\"> least-busy<\/span><\/span>\n<span id=\"cb1-15\"><a href=\"#cb1-15\" aria-hidden=\"true\" tabindex=\"-1\"><\/a><\/span>\n<span id=\"cb1-16\"><a href=\"#cb1-16\" aria-hidden=\"true\" tabindex=\"-1\"><\/a><span class=\"fu\">general_settings<\/span><span class=\"kw\">:<\/span><\/span>\n<span id=\"cb1-17\"><a href=\"#cb1-17\" aria-hidden=\"true\" tabindex=\"-1\"><\/a><span class=\"at\">  <\/span><span class=\"fu\">master_key<\/span><span 
class=\"kw\">:<\/span><span class=\"at\"> os.environ\/LITELLM_MASTER_KEY<\/span><\/span>\n<span id=\"cb1-18\"><a href=\"#cb1-18\" aria-hidden=\"true\" tabindex=\"-1\"><\/a><span class=\"at\">  <\/span><span class=\"fu\">database_url<\/span><span class=\"kw\">:<\/span><span class=\"at\"> os.environ\/DATABASE_URL<\/span><\/span><\/code><\/pre>\n<\/div>\n<p>The rest of the surface \u2014 per-key budgets, Redis caching with TTL, tagging by environment, Langfuse or Helicone integration, Prometheus metrics \u2014 is described in the same file with analogous blocks and applied without touching application code.<\/p>\n<h2 id=\"what-not-to-expect\">What Not to Expect<\/h2>\n<p>LiteLLM translates between different APIs, and the translation is not always perfect. The most provider-specific features \u2014 structured output with complex schemas, OpenAI function calling versus Anthropic tool use, the reasoning modes of certain models \u2014 sometimes do not map 1:1. It is worth reading the changelog before trusting a critical flow to a non-trivial translation. Added latency is small but not zero, and for high-volume embedding workloads it can be more noticeable than expected. The proxy itself is yet another piece to maintain, with its own database, upgrades and metrics. And if real usage is a single provider with no plan to change, the complexity does not pay for itself: a local abstraction layer in the backend is enough.<\/p>\n<h2 id=\"a-pattern-that-works\">A Pattern That Works<\/h2>\n<p>The deployment I have seen stabilise in several teams is always similar. Two replicas of the proxy behind an internal service, a shared Postgres for keys and usage, a Redis for semantic cache, virtual keys per team or service with a monthly budget, fallbacks declared for the two or three critical models, a Prometheus scrape with <code>model<\/code>, <code>tenant<\/code> and <code>route<\/code> labels, and alerts on per-provider error rate. 
Applications see a single OpenAI-compatible endpoint and send their virtual key in a header; everything else happens inside the proxy.<\/p>\n<h2 id=\"conclusion\">Conclusion<\/h2>\n<p>An LLM proxy is not a revolutionary idea; it is the same indirection layer already placed between applications and databases, between applications and queues, between applications and identity. It earns its place for the same reasons: it isolates decisions that change often, it concentrates governance and observability, and it lets the application ignore the details of the provider. LiteLLM is today the most complete open-source implementation, stable enough for production and flexible enough to absorb the changes that will keep arriving in the model stack over the coming quarters. With a single provider and no foreseeable second one, the component is dispensable. From the second model onwards, giving up on hand-written adapters and delegating to a proxy stops being a matter of taste and becomes basic hygiene.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>When an application talks to two or more LLM providers, sooner or later a proxy appears in between. 
LiteLLM offers a concrete one, and this is an honest read of what it gains and what it costs.<\/p>\n","protected":false},"author":1,"featured_media":841,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[24,22],"tags":[261,379,382,381,51,380],"class_list":["post-838","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-herramientas","category-inteligencia-artificial","tag-anthropic","tag-litellm","tag-llm-routing","tag-multi-proveedor","tag-openai","tag-proxy"],"translation":{"provider":"WPGlobus","version":"3.0.2","language":"en","enabled_languages":["es","en"],"languages":{"es":{"title":true,"content":true,"excerpt":true},"en":{"title":true,"content":true,"excerpt":true}}},"gutentor_comment":0,"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>LiteLLM: A Proxy to Unify Model Providers - Jacar<\/title>\n<meta name=\"description\" content=\"LiteLLM as unified LLM proxy: multi-provider, caching, rate limiting, fallbacks, and observability with a single OpenAI-compatible API.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/jacar.es\/proxies-llm-litellm\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"LiteLLM: A Proxy to Unify Model Providers - Jacar\" \/>\n<meta property=\"og:description\" content=\"LiteLLM as unified LLM proxy: multi-provider, caching, rate limiting, fallbacks, and observability with a single OpenAI-compatible API.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/jacar.es\/proxies-llm-litellm\/\" \/>\n<meta property=\"og:site_name\" content=\"Jacar\" \/>\n<meta property=\"article:published_time\" 
content=\"2024-03-03T10:00:00+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/jcs-wp-jacar-es.fsn1.your-objectstorage.com\/wp-content\/uploads\/2020\/09\/favicon.png\" \/>\n\t<meta property=\"og:image:width\" content=\"252\" \/>\n\t<meta property=\"og:image:height\" content=\"229\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"javi\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"javi\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"13 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/jacar.es\\\/proxies-llm-litellm\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/jacar.es\\\/proxies-llm-litellm\\\/\"},\"author\":{\"name\":\"javi\",\"@id\":\"https:\\\/\\\/jacar.es\\\/#\\\/schema\\\/person\\\/54a7f7b4224b38fafc9866eb3e614208\"},\"headline\":\"LiteLLM: A Proxy to Unify Model Providers\",\"datePublished\":\"2024-03-03T10:00:00+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/jacar.es\\\/proxies-llm-litellm\\\/\"},\"wordCount\":2375,\"publisher\":{\"@id\":\"https:\\\/\\\/jacar.es\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/jacar.es\\\/proxies-llm-litellm\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/jcs-wp-jacar-es.fsn1.your-objectstorage.com\\\/wp-content\\\/uploads\\\/2024\\\/03\\\/20064502\\\/jwp-2258978-21239.jpg\",\"keywords\":[\"anthropic\",\"litellm\",\"llm routing\",\"multi-proveedor\",\"openai\",\"proxy\"],\"articleSection\":[\"Herramientas\",\"Inteligencia 
Artificial\"],\"inLanguage\":\"en-US\"},{\"@type\":[\"WebPage\",\"ItemPage\"],\"@id\":\"https:\\\/\\\/jacar.es\\\/proxies-llm-litellm\\\/\",\"url\":\"https:\\\/\\\/jacar.es\\\/proxies-llm-litellm\\\/\",\"name\":\"LiteLLM: A Proxy to Unify Model Providers - Jacar\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/jacar.es\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/jacar.es\\\/proxies-llm-litellm\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/jacar.es\\\/proxies-llm-litellm\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/jcs-wp-jacar-es.fsn1.your-objectstorage.com\\\/wp-content\\\/uploads\\\/2024\\\/03\\\/20064502\\\/jwp-2258978-21239.jpg\",\"datePublished\":\"2024-03-03T10:00:00+00:00\",\"description\":\"LiteLLM as unified LLM proxy: multi-provider, caching, rate limiting, fallbacks, and observability with a single OpenAI-compatible API.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/jacar.es\\\/proxies-llm-litellm\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/jacar.es\\\/proxies-llm-litellm\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/jacar.es\\\/proxies-llm-litellm\\\/#primaryimage\",\"url\":\"https:\\\/\\\/jcs-wp-jacar-es.fsn1.your-objectstorage.com\\\/wp-content\\\/uploads\\\/2024\\\/03\\\/20064502\\\/jwp-2258978-21239.jpg\",\"contentUrl\":\"https:\\\/\\\/jcs-wp-jacar-es.fsn1.your-objectstorage.com\\\/wp-content\\\/uploads\\\/2024\\\/03\\\/20064502\\\/jwp-2258978-21239.jpg\",\"width\":1200,\"height\":800,\"caption\":\"Router de fibra con cables conectados representando orquestaci\u00f3n de tr\u00e1fico entre proveedores\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/jacar.es\\\/proxies-llm-litellm\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Portada\",\"item\":\"https:\\\/\\\/jacar.es\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"LiteLLM: un proxy para unificar 
proveedores de modelos\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/jacar.es\\\/#website\",\"url\":\"https:\\\/\\\/jacar.es\\\/\",\"name\":\"Jacar\",\"description\":\"Passion for Technology\",\"publisher\":{\"@id\":\"https:\\\/\\\/jacar.es\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/jacar.es\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/jacar.es\\\/#organization\",\"name\":\"Jacar\",\"url\":\"https:\\\/\\\/jacar.es\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/jacar.es\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/jacar.es\\\/wp-content\\\/uploads\\\/2020\\\/09\\\/favicon.png\",\"contentUrl\":\"https:\\\/\\\/jacar.es\\\/wp-content\\\/uploads\\\/2020\\\/09\\\/favicon.png\",\"width\":252,\"height\":229,\"caption\":\"Jacar\"},\"image\":{\"@id\":\"https:\\\/\\\/jacar.es\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/www.linkedin.com\\\/in\\\/javiercanetearroyo\\\/\"]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/jacar.es\\\/#\\\/schema\\\/person\\\/54a7f7b4224b38fafc9866eb3e614208\",\"name\":\"javi\",\"sameAs\":[\"https:\\\/\\\/jacar.es\"],\"url\":\"https:\\\/\\\/jacar.es\\\/en\\\/author\\\/javi\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"LiteLLM: A Proxy to Unify Model Providers - Jacar","description":"LiteLLM as unified LLM proxy: multi-provider, caching, rate limiting, fallbacks, and observability with a single OpenAI-compatible API.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/jacar.es\/proxies-llm-litellm\/","og_locale":"en_US","og_type":"article","og_title":"LiteLLM: A Proxy to Unify Model Providers - Jacar","og_description":"LiteLLM as unified LLM proxy: multi-provider, caching, rate limiting, fallbacks, and observability with a single OpenAI-compatible API.","og_url":"https:\/\/jacar.es\/proxies-llm-litellm\/","og_site_name":"Jacar","article_published_time":"2024-03-03T10:00:00+00:00","og_image":[{"width":252,"height":229,"url":"https:\/\/jcs-wp-jacar-es.fsn1.your-objectstorage.com\/wp-content\/uploads\/2020\/09\/favicon.png","type":"image\/png"}],"author":"javi","twitter_card":"summary_large_image","twitter_misc":{"Written by":"javi","Est. 
reading time":"13 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/jacar.es\/proxies-llm-litellm\/#article","isPartOf":{"@id":"https:\/\/jacar.es\/proxies-llm-litellm\/"},"author":{"name":"javi","@id":"https:\/\/jacar.es\/#\/schema\/person\/54a7f7b4224b38fafc9866eb3e614208"},"headline":"LiteLLM: A Proxy to Unify Model Providers","datePublished":"2024-03-03T10:00:00+00:00","mainEntityOfPage":{"@id":"https:\/\/jacar.es\/proxies-llm-litellm\/"},"wordCount":2375,"publisher":{"@id":"https:\/\/jacar.es\/#organization"},"image":{"@id":"https:\/\/jacar.es\/proxies-llm-litellm\/#primaryimage"},"thumbnailUrl":"https:\/\/jcs-wp-jacar-es.fsn1.your-objectstorage.com\/wp-content\/uploads\/2024\/03\/20064502\/jwp-2258978-21239.jpg","keywords":["anthropic","litellm","llm routing","multi-proveedor","openai","proxy"],"articleSection":["Herramientas","Inteligencia Artificial"],"inLanguage":"en-US"},{"@type":["WebPage","ItemPage"],"@id":"https:\/\/jacar.es\/proxies-llm-litellm\/","url":"https:\/\/jacar.es\/proxies-llm-litellm\/","name":"LiteLLM: A Proxy to Unify Model Providers - Jacar","isPartOf":{"@id":"https:\/\/jacar.es\/#website"},"primaryImageOfPage":{"@id":"https:\/\/jacar.es\/proxies-llm-litellm\/#primaryimage"},"image":{"@id":"https:\/\/jacar.es\/proxies-llm-litellm\/#primaryimage"},"thumbnailUrl":"https:\/\/jcs-wp-jacar-es.fsn1.your-objectstorage.com\/wp-content\/uploads\/2024\/03\/20064502\/jwp-2258978-21239.jpg","datePublished":"2024-03-03T10:00:00+00:00","description":"LiteLLM as unified LLM proxy: multi-provider, caching, rate limiting, fallbacks, and observability with a single OpenAI-compatible 
API.","breadcrumb":{"@id":"https:\/\/jacar.es\/proxies-llm-litellm\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/jacar.es\/proxies-llm-litellm\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/jacar.es\/proxies-llm-litellm\/#primaryimage","url":"https:\/\/jcs-wp-jacar-es.fsn1.your-objectstorage.com\/wp-content\/uploads\/2024\/03\/20064502\/jwp-2258978-21239.jpg","contentUrl":"https:\/\/jcs-wp-jacar-es.fsn1.your-objectstorage.com\/wp-content\/uploads\/2024\/03\/20064502\/jwp-2258978-21239.jpg","width":1200,"height":800,"caption":"Router de fibra con cables conectados representando orquestaci\u00f3n de tr\u00e1fico entre proveedores"},{"@type":"BreadcrumbList","@id":"https:\/\/jacar.es\/proxies-llm-litellm\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Portada","item":"https:\/\/jacar.es\/"},{"@type":"ListItem","position":2,"name":"LiteLLM: un proxy para unificar proveedores de modelos"}]},{"@type":"WebSite","@id":"https:\/\/jacar.es\/#website","url":"https:\/\/jacar.es\/","name":"Jacar","description":"Passion for 
Technology","publisher":{"@id":"https:\/\/jacar.es\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/jacar.es\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/jacar.es\/#organization","name":"Jacar","url":"https:\/\/jacar.es\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/jacar.es\/#\/schema\/logo\/image\/","url":"https:\/\/jacar.es\/wp-content\/uploads\/2020\/09\/favicon.png","contentUrl":"https:\/\/jacar.es\/wp-content\/uploads\/2020\/09\/favicon.png","width":252,"height":229,"caption":"Jacar"},"image":{"@id":"https:\/\/jacar.es\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.linkedin.com\/in\/javiercanetearroyo\/"]},{"@type":"Person","@id":"https:\/\/jacar.es\/#\/schema\/person\/54a7f7b4224b38fafc9866eb3e614208","name":"javi","sameAs":["https:\/\/jacar.es"],"url":"https:\/\/jacar.es\/en\/author\/javi\/"}]}},"_links":{"self":[{"href":"https:\/\/jacar.es\/en\/wp-json\/wp\/v2\/posts\/838","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/jacar.es\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/jacar.es\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/jacar.es\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/jacar.es\/en\/wp-json\/wp\/v2\/comments?post=838"}],"version-history":[{"count":0,"href":"https:\/\/jacar.es\/en\/wp-json\/wp\/v2\/posts\/838\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/jacar.es\/en\/wp-json\/wp\/v2\/media\/841"}],"wp:attachment":[{"href":"https:\/\/jacar.es\/en\/wp-json\/wp\/v2\/media?parent=838"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/jacar.es\/en\/wp-json\/wp\/v2\/categories?post=838"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/jacar.es
\/en\/wp-json\/wp\/v2\/tags?post=838"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}