GraphRAG has been in real enterprise use for a year. An assessment of which question types it answers better than classic RAG, what it costs to operate, and when the extra complexity pays off.
Tag: rag
How to Evaluate a RAG System Without Fooling Yourself
Measuring RAG system quality is more subtle than it seems. Metrics, golden sets, and the most common evaluation mistakes.
Hybrid Search: Combining BM25 and Vectors Seriously
Vector-only and keyword-only search each miss cases. Hybrid search combines BM25 and semantic retrieval via reciprocal rank fusion (RRF). How and when to use it.
RAG in Production: Patterns That Work and Those That Don’t
After two years of RAG, clear patterns emerge: smart chunking, hybrid search, re-ranking, continuous evaluation. What to avoid.
OpenAI Assistants API: Stateful Agents Without Your Own Infrastructure
The Assistants API offers persistent threads, tool calling, and file search managed by OpenAI. We examine when it pays off versus Chat Completions with your own logic.
Re-Ranking in RAG: The Piece That Really Raises Quality
Embeddings alone aren’t enough. A re-ranker over the top 100 candidates typically lifts precision by 15-30%. When and how to integrate one without drama.
nomic-embed-text: Competitive Open Embeddings
Nomic released an open-source embedding model that rivals OpenAI’s. When to use it, how it compares, and how to integrate it into your RAG stack.
Gemini 1.5: Millions of Tokens of Context in Production
Gemini 1.5 Pro proved million-token context is real. What changes in RAG and architectures when the model can swallow an entire book.
OpenAI text-embedding-3: What Changes vs the Previous One
OpenAI released text-embedding-3 with higher quality and the variable-dimensions trick. How to leverage what’s new without rebuilding your RAG stack.
pgvector in 2024: HNSW Indexes and Real Scaling
pgvector 0.5 added HNSW and changed the conversation. When PostgreSQL with pgvector is enough, how to index well, and where it starts to hurt.