LLM wrappers: when they are a business and when they are not
Updated: 2026-05-03
Between 2022 and 2024, thousands of startups launched that were essentially an interface over the OpenAI API with some prompt engineering behind it. The term “LLM wrapper” became an insult on Twitter and an investment pitch at conferences. The reality is more interesting than that dichotomy suggests: several wrappers have become serious products with revenue and a moat, while most of the rest have died. This post analyzes what separates the two groups, with concrete cases and without sentimentality.
For context on the tooling ecosystem that wrappers typically integrate, the analyses of LangChain as an LLM framework and of LLM proxies with LiteLLM cover the underlying technical stack.
Key takeaways
- An LLM wrapper that survives has proprietary data, network effects, or integrated workflow that the model alone can’t provide.
- Margins are structurally weak: inference costs rise with usage, and changes in API pricing tend to benefit the provider, not you.
- Margin compression from above (OpenAI cuts prices) and from below (open source improves) is the main business risk.
- Defensibility comes from accumulated data, deep workflow integration, or distribution advantage, not from prompt quality.
- If you’re building a wrapper, the critical question isn’t “what does AI do for my product?” but “what does my product do to make customers unable to leave?”
The structural margin problem
The fundamental problem of an LLM wrapper is the economics of being an intermediary layer. Your main provider (OpenAI, Anthropic, Google) constantly improves its product and charges per token. You charge your customer more than you pay the provider and keep the margin. It sounds simple, but it has three structural problems:
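The provider cuts prices. OpenAI has significantly reduced per-token prices across the GPT-4 Turbo and GPT-4o generation. A lower API cost looks like good news, but competitors pass the savings on and customers come to expect lower prices. If the wrapper's price falls in step with its costs, the absolute margin per request shrinks, and holding the price is hard in competitive markets.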
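The arithmetic is easier to follow with numbers. Below is a minimal sketch, assuming entirely hypothetical figures (2,000 tokens per request, an API price that halves from $0.010 to $0.005 per 1k tokens, and a 3x markup); none of these are real provider or wrapper prices, and `RequestEconomics` and `price_at_markup` are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestEconomics:
    """Per-request economics of a hypothetical wrapper (illustrative figures)."""
    tokens_per_request: int          # input + output tokens, lumped together
    api_price_per_1k_tokens: float   # what the provider charges, in USD
    customer_price: float            # what the wrapper charges per request, in USD

    @property
    def api_cost(self) -> float:
        return self.tokens_per_request / 1000 * self.api_price_per_1k_tokens

    @property
    def margin(self) -> float:
        return self.customer_price - self.api_cost


def price_at_markup(api_cost: float, markup: float) -> float:
    """Customer price when competition pins the wrapper to a fixed markup over cost."""
    return api_cost * markup


# Before the price cut: cost $0.02 per request, sold at a 3x markup.
before = RequestEconomics(
    tokens_per_request=2000, api_price_per_1k_tokens=0.010, customer_price=0.06
)

# After the provider halves its price, the wrapper either holds its price...
held = RequestEconomics(
    tokens_per_request=2000, api_price_per_1k_tokens=0.005, customer_price=0.06
)

# ...or is forced by competitors to keep the same 3x markup on the new, lower cost.
matched = RequestEconomics(
    tokens_per_request=2000,
    api_price_per_1k_tokens=0.005,
    customer_price=price_at_markup(held.api_cost, markup=3.0),
)

for label, case in [("before", before), ("held price", held), ("matched", matched)]:
    print(
        f"{label:>10}: cost ${case.api_cost:.3f}, "
        f"price ${case.customer_price:.3f}, margin ${case.margin:.3f}"
    )
```

Under these assumptions, a wrapper that can hold its price sees its margin per request rise from $0.04 to $0.05, while one that has to match competitors sees it fall to $0.02. The percentage margin is unchanged in the second case, but the dollars available to pay for sales, support, and engineering are cut in half.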
The provider improves the model. If your value proposition is “access to GPT-4 with a better interface,” every improvement to the base model reduces the difference between your product and using the API directly. Wrappers that only add interface without accumulated data or workflow lose differentiation with every release.
Open source improves. LlamaIndex, LangChain, agent frameworks, RAG libraries: the entire technical stack that was hard to assemble in 2022 has been enormously simplified. A motivated technical team can replicate most of what a basic wrapper does in weeks. The technical moat of most 2022-2023 wrappers has disappeared.
Those that survived: what they have in common
The wrappers that have survived and grown have at least one of three things:
Accumulated proprietary data. Harvey (legal AI) has the contracts and legal documents it has processed, on which its models have improved. Cursor has code patterns from millions of developers, plus feedback on which suggestions were accepted or rejected. That data can’t be replicated with OpenAI API access alone. Accumulated data is the most defensible moat for a wrapper.
Network effects in the workflow. Notion AI, GitHub Copilot, and other assistants embedded in tools that already have network effects benefit from the user already being on the platform. It’s not the wrapper retaining the user; it’s the platform. But the AI inside that platform has a distribution advantage that a standalone wrapper can’t replicate.
Deep vertical integration with the customer’s workflow. The best cases are wrappers that not only connect with the LLM but integrate deeply into the customer’s work process: reading the customer’s existing data, writing to their systems, learning from specific user feedback. This integration creates real switching costs, not just preference.
Why most have died
Wrappers that have died share the opposite pattern:
Only interface, no data or workflow. A pretty interface over GPT-4 with some well-written prompts was defensible in 2022, when few people knew how to use the API directly. In 2025, any developer can build that in a weekend. Products that didn’t build proprietary data or workflow integration during their years of advantage ran out of differentiation as the market matured.
Too narrow a vertical without distribution. “AI for writing ecommerce product descriptions” may be useful, but if the addressable market is small and competitors can replicate the product quickly, the business doesn’t scale. Narrow verticals only work if integration with customer systems is deep or the vertical is large enough.
Direct competition from the providers. OpenAI with ChatGPT Enterprise, Anthropic with Claude for Work, Google with Workspace AI: all major providers are moving into the application layer. Wrappers competing directly in the same space without proprietary data or workflow integration have been squeezed out by that expansion.
What to do if you’re building one
If you’re building a wrapper and want it to survive more than two years, there are decisions to make from the start:
Design to accumulate proprietary data. Every user interaction with your product should produce data that improves the product: explicit feedback (good/bad), acceptance or rejection of suggestions, and data on how generated outputs are actually used. That data is the asset; the wrapper is the mechanism for generating it.
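One way to make this concrete is to treat every generated output as an event with a recorded outcome. The following is a minimal sketch, not a prescribed schema: `FeedbackEvent`, `Outcome`, and `acceptance_rate_by_prompt` are hypothetical names, and a real product would persist these events rather than hold them in memory.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    ACCEPTED = "accepted"   # user kept the output as-is
    EDITED = "edited"       # user kept it after modifying it
    REJECTED = "rejected"   # user discarded it


@dataclass(frozen=True)
class FeedbackEvent:
    user_id: str
    prompt_version: str     # which prompt or flow produced the output
    output_id: str
    outcome: Outcome


def acceptance_rate_by_prompt(events: list[FeedbackEvent]) -> dict[str, float]:
    """Share of outputs accepted without edits, per prompt version."""
    totals: dict[str, int] = defaultdict(int)
    accepted: dict[str, int] = defaultdict(int)
    for event in events:
        totals[event.prompt_version] += 1
        if event.outcome is Outcome.ACCEPTED:
            accepted[event.prompt_version] += 1
    return {version: accepted[version] / totals[version] for version in totals}


# Hypothetical events, for illustration only.
events = [
    FeedbackEvent("u1", "v1", "o1", Outcome.ACCEPTED),
    FeedbackEvent("u2", "v1", "o2", Outcome.REJECTED),
    FeedbackEvent("u1", "v2", "o3", Outcome.ACCEPTED),
    FeedbackEvent("u3", "v2", "o4", Outcome.ACCEPTED),
    FeedbackEvent("u2", "v2", "o5", Outcome.EDITED),
]
rates = acceptance_rate_by_prompt(events)
print(rates)
```

Even this small log answers a question that API access alone cannot: which prompt or flow actually produces outputs that users keep. The edited outputs are arguably the most valuable rows, since they pair what the model produced with what the user wanted.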
Integrate as deeply as possible with customer systems. A wrapper that reads and writes to customer systems creates real switching costs. A wrapper that only generates text the customer copies and pastes manually does not. Deep integration is more expensive to build but is what keeps the customer from leaving when OpenAI launches a similar feature.
Choose a vertical where you can build domain knowledge advantage. The LLM is generic; the knowledge of how claims processing works at an insurer or how an M&A contract is structured is not. Wrappers that translate that domain knowledge into prompts, flows, and validations have an advantage that’s hard to replicate.
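The "validations" part is where domain knowledge becomes code. As a rough sketch, assume the model has extracted structured fields from an insurance claim. The `ClaimDraft` schema and the three rules below are invented for illustration and are not real underwriting rules; the point is only that the checks come from the domain, not from the LLM.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ClaimDraft:
    """Fields a model extracted from a claim document (hypothetical schema)."""
    policy_start: date
    policy_end: date
    incident_date: date
    claimed_amount: float
    coverage_limit: float


def validate_claim(draft: ClaimDraft) -> list[str]:
    """Return human-readable problems; an empty list means the draft passed."""
    problems = []
    if not (draft.policy_start <= draft.incident_date <= draft.policy_end):
        problems.append("incident date falls outside the policy period")
    if draft.claimed_amount <= 0:
        problems.append("claimed amount must be positive")
    if draft.claimed_amount > draft.coverage_limit:
        problems.append("claimed amount exceeds the coverage limit")
    return problems


# A draft the model might plausibly produce: the incident predates the policy
# and the amount is above the limit, so two rules fire.
draft = ClaimDraft(
    policy_start=date(2025, 1, 1),
    policy_end=date(2025, 12, 31),
    incident_date=date(2024, 12, 20),
    claimed_amount=15000.0,
    coverage_limit=10000.0,
)
problems = validate_claim(draft)
for problem in problems:
    print("needs review:", problem)
```

A competitor with the same model can copy the prompt, but the rule set grows out of working with real claims and real adjusters, which is much slower to reproduce.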
Don’t compete on price with the model providers. If your value proposition is token pricing, you’re in the worst possible position: providers always have more scale than you. Price is only defensible if there’s a value component the model alone doesn’t provide.
B2B vs. B2C wrappers
There’s an important difference between B2C and B2B wrappers:
- B2C wrappers have high user acquisition costs and high churn if the product doesn’t engage quickly. Direct competition with ChatGPT is brutal.
- B2B wrappers have longer sales cycles but customers with higher switching costs. If the product integrates into the company’s workflow, churn is much lower.
The wrappers that have survived best in the B2C segment are those with a very specific use case for an audience that doesn’t use ChatGPT directly (non-technical users of a specific vertical) or those integrated in a platform with its own network effects.
My read
LLM wrappers are not a good business by default; they’re a good business when there’s something more than the wrapper. The LLM is the input/output commodity; the business is in the accumulated data, the integration built, and the domain knowledge encoded. The best wrappers are those that end up not looking like wrappers: they’re products where the LLM is a component, not the product.
The question to ask before building is not “what does AI do for my product?” but “what does my product do to make customers unable to leave when OpenAI adds that function to ChatGPT Enterprise?” The answer to that second question is the real business.