Claude 2, launched by Anthropic in July 2023, is the most solid commercial alternative to GPT-4. It isn't just "another model": its 100,000-token context window and distinct alignment approach make it the better tool for specific use cases. We cover how it differs, where it outperforms GPT-4, and where it falls short.
Who’s Behind It
Anthropic was founded in 2021 by Dario and Daniela Amodei along with several ex-OpenAI researchers. Their thesis: LLM safety and alignment shouldn’t be a layer stitched on after, but part of training from the start. Hence their proprietary technique, Constitutional AI.
The company has received substantial investment from Google and, in September 2023, an investment of up to $4 billion was announced from Amazon. This backing reshapes the competitive landscape: OpenAI is no longer the only serious option.
The Two Standout Features
100K Token Context
Claude 2 accepts up to 100,000 tokens of input — approximately 75,000 words or 200 pages of text. Standard GPT-4 launched with an 8K window; a 32K variant (GPT-4-32K) was offered with limited availability, and the 128K GPT-4 Turbo would not arrive until November 2023.
For 2023, Claude 2 clearly leads on long context. Practical implications:
- Upload an entire PDF (a technical book, several long articles) and ask questions about the whole.
- Analyse a complete medium-sized codebase without chunking.
- Summarise long documents without splitting.
- Extended conversations without losing memory of early messages.
Of course, total cost scales with input size: a query with 100K input + 1K output costs far more than one with 10K + 1K, since you pay per token. But the capability exists, and cases that used to require complex chunking engineering now simplify drastically.
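To make that trade-off concrete, here is a back-of-the-envelope cost sketch. The per-token prices below are placeholders, not Anthropic's actual rates, and the words-per-token ratio is a rough heuristic:

```python
# Rough cost estimate for a long-context query.
# Prices are HYPOTHETICAL placeholders, not Anthropic's real rates.
PRICE_PER_1K_INPUT = 0.008   # assumed USD per 1K input tokens
PRICE_PER_1K_OUTPUT = 0.024  # assumed USD per 1K output tokens

def estimate_tokens(text: str) -> int:
    """Crude heuristic: roughly 0.75 words per token for English text."""
    return int(len(text.split()) / 0.75)

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Total query cost in USD under the placeholder prices above."""
    return (input_tokens / 1000) * PRICE_PER_1K_INPUT \
         + (output_tokens / 1000) * PRICE_PER_1K_OUTPUT

# The input portion of a 100K query costs 10x that of a 10K query.
big = estimate_cost(100_000, 1_000)
small = estimate_cost(10_000, 1_000)
print(f"100K+1K: ${big:.3f}  vs  10K+1K: ${small:.3f}")
```

Swap in the current prices from Anthropic's price list before relying on the numbers.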
Constitutional AI
Anthropic’s safety approach is based on a “constitution” — a set of principles written in natural language the model uses to evaluate and refine its own responses during training. The idea is that the model learns to self-critique following explicit principles, rather than relying solely on human feedback.
In practice, Claude tends to:
- Refuse ambiguous requests more readily. Sometimes frustrating, sometimes correct.
- Reason explicitly about safety when in doubt.
- Give more careful responses on sensitive topics.
For some uses this is ideal (assistants in regulated domains); for others it’s unnecessary friction.
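The critique-and-revise loop at the heart of Constitutional AI can be caricatured in a few lines. This is a toy illustration, not Anthropic's training code: the principle, the check, and the revision text are all stand-ins, and the real technique applies this pattern with an LLM during training rather than with string matching.

```python
# Toy sketch of a Constitutional-AI-style self-critique loop.
# The principle list and checks are illustrative stand-ins only.
PRINCIPLES = [
    ("avoid giving medical dosage advice",
     lambda r: "take this drug" not in r),
]

def critique(draft: str):
    """Return the first principle the draft violates, or None."""
    for name, check in PRINCIPLES:
        if not check(draft):
            return name
    return None

def revise(draft: str, violated: str) -> str:
    """Stand-in revision; in the real method the model rewrites the draft."""
    return f"[revised to {violated}] I can't help with that directly."

def constitutional_step(draft: str) -> str:
    """One generate-critique-revise pass over a candidate response."""
    violated = critique(draft)
    return revise(draft, violated) if violated else draft

print(constitutional_step("Sure, take this drug twice a day."))
print(constitutional_step("Here is a summary of the contract."))
```

The point of the pattern is that the correction criteria are explicit, written principles rather than opaque human preference labels.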
Comparison With GPT-4
Official benchmarks place GPT-4 slightly ahead in most tests, but “slightly” hides important nuances:
- Complex reasoning and maths: GPT-4 remains superior, especially in multi-step problems.
- Coding: GPT-4 with Code Interpreter performs better on long tasks; Claude 2 is competitive on simple-to-moderate code generation.
- Creative writing and rewriting: very close; largely a matter of personal style.
- Long-document analysis: Claude 2 wins here thanks to its context window.
- Multilingual (Spanish): both good, GPT-4 slightly superior in nuance.
- Speed: Claude 2 tends to be somewhat slower in long responses, similar in short ones.
- Cost per token: comparable to standard GPT-4; both significantly more expensive than GPT-3.5.
Summary: if your case needs large context, Claude 2 wins. For nearly everything else, both are competitive and choice may depend on prompt details or preferred provider.
Cases Where Claude 2 Stands Out
- Legal contract analysis. Upload complete contract, ask specific questions without prior chunking.
- Scientific paper reading. Load the full PDF and dialogue about methodology, results, limitations.
- Code assistant needing context. Load several related files and ask for refactor or cross-file analysis.
- Long conversational systems. Assistants where the session can extend to hundreds of messages without losing memory.
- Compliance and documentary review. Verify a document meets certain written criteria.
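For all of these workflows, the first practical question is whether the document actually fits in the 100K window in one shot or still needs chunking. A minimal sketch of that check, using the crude four-characters-per-token heuristic (the real tokenizer will differ):

```python
# Decide whether a document fits Claude 2's window in a single request.
# The 4-chars-per-token ratio is a rough heuristic, not the real tokenizer,
# and the reserved-answer budget is an assumption of this sketch.
CONTEXT_WINDOW = 100_000
RESERVED_FOR_ANSWER = 2_000  # leave room for the model's reply

def fits_in_one_shot(document: str) -> bool:
    """True if the document should fit alongside a short question."""
    approx_tokens = len(document) // 4
    return approx_tokens <= CONTEXT_WINDOW - RESERVED_FOR_ANSWER

short_doc = "word " * 10_000    # ~50K chars -> ~12.5K tokens: fits
huge_doc = "word " * 150_000    # ~750K chars -> far over the window
print(fits_in_one_shot(short_doc), fits_in_one_shot(huge_doc))
```

Only when this check fails do you fall back to the chunk-and-merge pipelines that smaller windows always required.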
Cases Where GPT-4 Still Wins
- Mature plugins and function calling. OpenAI’s ecosystem is broader.
- Complex mathematical reasoning.
- Code Interpreter (sandboxed code execution); Claude has no direct equivalent.
- Fine-tuned model availability and variant diversity.
Access and Integration
Claude 2 is available via:
- claude.ai (web interface; limited free tier and paid plans).
- Anthropic API (programmatic access, similar to OpenAI).
- Amazon Bedrock — Claude 2 available as a model on AWS.
- Google Cloud Vertex AI — availability announced.
The API is conceptually very similar to OpenAI’s; migrating code between them is usually a couple of hours’ work.
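The main mechanical difference in 2023 was the prompt shape: OpenAI takes a list of role-tagged messages, while Claude 2's completions endpoint expects a single string of alternating turns using the tags the official `anthropic` SDK exposes as `HUMAN_PROMPT` and `AI_PROMPT`. A migration sketch; the converter function name is ours, not part of either SDK:

```python
# Convert OpenAI-style chat messages into Claude 2's single-string prompt.
# The constants mirror anthropic.HUMAN_PROMPT / anthropic.AI_PROMPT;
# messages_to_claude_prompt is an illustrative helper, not a library call.
HUMAN_PROMPT = "\n\nHuman:"
AI_PROMPT = "\n\nAssistant:"

def messages_to_claude_prompt(messages: list[dict]) -> str:
    """Flatten user/assistant messages into Claude 2's prompt format."""
    parts = []
    for m in messages:
        tag = HUMAN_PROMPT if m["role"] == "user" else AI_PROMPT
        parts.append(f"{tag} {m['content']}")
    parts.append(AI_PROMPT)  # Claude completes after a trailing Assistant tag
    return "".join(parts)

prompt = messages_to_claude_prompt([
    {"role": "user", "content": "Summarise this contract clause."},
])
print(prompt)
```

The resulting string then goes in the `prompt` field of the Anthropic completions call (with `model="claude-2"` and `max_tokens_to_sample`), which is the bulk of the "couple of hours" of migration work.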
Conclusion
Claude 2 is a real and mature alternative to GPT-4 in 2023. For cases where long context is valuable or Anthropic's safety approach fits your product, it's the best option. For many other cases the two models are interchangeable, and it's worth having access to both so you don't depend on a single provider. The diversity in the 2023 LLM market is good news for users — competition improves all products.
Follow us on jacar.es for more on commercial LLMs, comparisons, and building products with generative AI.