Llama 3: Meta’s New Open Standard
Updated: 2026-05-03
Meta released Llama 3 on April 18, 2024 in two sizes, 8B and 70B, both with Instruct variants for chat. It was trained on 15 trillion tokens — 7.5x more than Llama 2 — with a 128k-vocabulary tokenizer and Grouped Query Attention on both sizes. On many tasks it closes much of the gap that separated open models from closed frontier models.
Key takeaways
- 15T training tokens vs 2T for Llama 2: the data scale shows up most visibly in reasoning and instruction following.
- GQA on 8B and 70B: more efficient inference without sacrificing quality.
- Llama 3 70B competes with Claude 3 Sonnet on MMLU, HumanEval, and GSM8K.
- Llama 3 8B beats Llama 2 13B on almost all benchmarks with roughly 40% fewer parameters.
- The Llama 3 Community License allows commercial use up to 700M MAU at no additional cost.
Key differences from Llama 2
- 15T training tokens vs 2T: 7.5x more data.
- 8k context at launch, double Llama 2’s 4k (later extended to 128k in Llama 3.1).
- Improved tokenizer with 128k vocabulary vs 32k: more efficient tokenisation, especially for non-English languages.
- GQA on both sizes, where Llama 2 used it only on the 70B: better quality/inference-cost ratio.
- Significantly better instruction tuning — SFT, rejection sampling, PPO, and DPO — with less verbosity and better instruction adherence.
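The GQA idea from the list above can be sketched in a few lines: several query heads share one key/value head, so the KV cache shrinks by the ratio of query heads to KV heads. This is an illustrative NumPy toy, not Llama 3’s actual implementation; the head counts and dimensions here are made up for the example.

```python
import numpy as np

def gqa(q, k, v):
    """Grouped Query Attention sketch.
    q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d)."""
    n_q_heads, seq, d = q.shape
    group = n_q_heads // k.shape[0]       # query heads per KV head
    # Repeat each KV head so every query head has a matching K/V.
    k = np.repeat(k, group, axis=0)
    v = np.repeat(v, group, axis=0)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    # Numerically stable softmax over the key axis.
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v                          # (n_q_heads, seq, d)

rng = np.random.default_rng(0)
q = rng.normal(size=(8, 4, 16))   # 8 query heads
k = rng.normal(size=(2, 4, 16))   # only 2 KV heads: 4x smaller KV cache
v = rng.normal(size=(2, 4, 16))
out = gqa(q, k, v)
print(out.shape)
```

The output keeps one vector per query head, but the KV cache only ever stores the 2 shared heads — the source of the inference savings the bullet points describe.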
Benchmarks
| Benchmark | Llama 3 8B | Llama 3 70B | Claude 3 Sonnet | GPT-4 Turbo |
|---|---|---|---|---|
| MMLU | 68.4 | 79.5 | 79.0 | 86.4 |
| HumanEval | 62.2 | 81.7 | 73.0 | 85.4 |
| GSM8K | 79.6 | 93.0 | 92.3 | 92.0 |
Llama 3 70B is in Claude 3 Sonnet’s league on most tasks, and the 8B outperforms the larger Llama 2 13B across nearly all benchmarks.
Hardware requirements
8B Q4 fits in 16 GB Apple Silicon. 70B Q4 requires an A100 80 GB or two A100 40 GB. For serious production throughput, vLLM with tensor parallelism is the standard for the 70B.
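These sizing claims follow from back-of-envelope arithmetic on quantized weights. The sketch below is illustrative only: the 4.5 bits/weight figure is an assumed average for a typical Q4 quantization with overhead, and it ignores KV cache, activations, and runtime overhead, which is why real deployments need headroom beyond the weight footprint.

```python
def weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Weight memory in GiB: params x bits, over 8 bits/byte, over 1024^3."""
    return n_params * bits_per_weight / 8 / 1024**3

# Assumed sizes; 4.5 bits/weight approximates a Q4 scheme with overhead.
for name, n in [("Llama 3 8B", 8e9), ("Llama 3 70B", 70e9)]:
    print(f"{name}: Q4 ~ {weight_gib(n, 4.5):.1f} GiB, "
          f"FP16 ~ {weight_gib(n, 16):.1f} GiB")
```

The 70B at ~37 GiB of Q4 weights leaves room for cache and activations on one A100 80 GB, consistent with the requirement above; in vLLM, splitting it across two 40 GB cards is done with the `--tensor-parallel-size 2` engine argument.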
Licence
The Llama 3 Community License allows commercial use up to 700M MAU with a “Built with Meta Llama 3” display requirement. For the vast majority of organisations, the licence is permissive enough for production deployments.
Where it excels and where it doesn’t
Strong: code generation (HumanEval 62%/82%), maths reasoning (GSM8K 79–93%), instruction following. Relatively weak: multilingual (Mistral and Qwen are still better), long context (addressed in Llama 3.1), multimodal (addressed in Llama 3.2).
Conclusion
Llama 3 is a real leap over Llama 2 and sets the open reference standard. The 8B is the default option for modest self-hosting; the 70B competes with closed frontier models on most tasks. Combined with a massive ecosystem of fine-tunes, quantised variants, and tooling, it’s the safe choice for teams serious about open LLMs. For extreme multilingual needs or very long context, Mixtral or Gemini remain preferable; for everything else, Llama 3 is the sensible default.