vLLM has consolidated its position as the most widely adopted LLM serving engine in production. A review of recent improvements, what changes for operators, and what remains a weak point.
Tag: llm
Microsoft’s GraphRAG in enterprise: patterns that work
GraphRAG has been in real enterprise use for a year. An assessment of which question types it answers better than classic RAG, what it costs to operate, and when the extra complexity pays off.
Alignment evaluation: RLHF, DPO, and recent alternatives
Three years after RLHF became popular, the alignment landscape is richer. A review of RLHF, DPO, and recent methods such as KTO and ORPO, with criteria for choosing among them.
Gemma 2: Google’s open model one year later
Google released Gemma 2 in mid-2024 and it has been in real-world use for a while. An assessment of how it competes in the open-model ecosystem, which sizes make sense, and where adoption has taken hold.
o3 in public: the reasoning leap is confirmed
OpenAI’s o3 series is starting to become available and marks a real shift in complex reasoning. A look at where it shines, where it still fails, and what changes for those building products with LLMs.
Gemini 2.0: integrated tools and agent mode
Google has released Gemini 2.0 with a clear emphasis on tool use and agents. A look at what it brings, where it lags behind competitors, and in what kind of applications it fits best.
NPU in the PC: faster, cheaper local AI
Copilot+ PC processors from Qualcomm, Intel, and AMD have normalized NPUs in consumer PCs. What really changes for running local models, and when it’s worth it.
Mistral Large: European Contender Against GPT-4
Mistral Large 2, built in Europe, closes the gap with GPT-4 and Claude. EU data residency, pricing, and when to choose it over the alternatives.
GPT-4 Turbo: Long Context and More Reasonable Costs
GPT-4 Turbo extended GPT-4’s context window to 128K tokens and cut input prices 3x. Six months later, does it remain relevant, or has GPT-4o replaced everything?
Constrained Decoding for Structured LLM Outputs
Outlines, Guidance, and jsonformer constrain LLMs to generate output that is valid JSON or that matches a regex or grammar. How they work and when they beat prompting.