Redis 8.2 and its vector support: when it actually makes sense
Updated: 2026-05-03
Redis 8.2 has moved vector search from an optional extension to a first-class data type. This matters more than it looks because it shifts the usual question: no longer whether Redis can hold embeddings, but whether it can replace a dedicated engine like Qdrant, Weaviate, Milvus or the pgvector extension on PostgreSQL.
Key takeaways
- In Redis 8.2 the HNSW index is a native core type with an open license — no longer a separate fragile module.
- On workloads up to ~1 million vectors at 768 dimensions, Redis 8.2 delivers median latencies of 4 to 6 ms, comparable to Qdrant.
- Redis wins when the application already uses Redis for cache, sessions, or queues: consolidating systems has real operational value.
- Past a few million vectors, the disk-persisted indices of Qdrant or Milvus are a better fit than Redis’s fully in-RAM index.
- HNSW does not handle mass deletions well — if the weekly deletion rate exceeds 5%, a full index rebuild is needed.
What changed since the RediSearch extension
Until Redis 7.x, vector search lived inside the RediSearch module, with a separate license and the habit of behaving differently depending on the exact module version loaded. In 8.0 the feature was integrated into the core under an open license, and in 8.2 the API was finalized with HNSW index support, filtering on metadata, and per-query latency metrics.
The important shift is that the vector layer is no longer a fragile add-on — it is part of the engine the team already knows. Vectors are fields inside hash or JSON documents, the HNSW index is built in memory, and queries use the same RediSearch language already used for full-text search. A single query can combine boolean filters on tags with a KNN search over embeddings without crossing systems.
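As a rough illustration of that combined query, here is a minimal sketch that only builds the command arguments, so it runs without a server. It assumes the RediSearch-style FT.CREATE and FT.SEARCH syntax carries over unchanged. The index name docs_idx, the key prefix doc:, and the field names lang and embedding are hypothetical.

```python
import struct


def float32_blob(vector):
    """Pack a sequence of floats as little-endian float32 bytes."""
    return struct.pack(f"<{len(vector)}f", *vector)


def create_index_args(index, prefix, dim):
    """Arguments for FT.CREATE: one TAG field plus one HNSW vector field.

    Field names ("lang", "embedding") are hypothetical.
    """
    return [
        "FT.CREATE", index, "ON", "HASH", "PREFIX", "1", prefix,
        "SCHEMA",
        "lang", "TAG",
        # "6" is the number of attribute tokens that follow for the vector field.
        "embedding", "VECTOR", "HNSW", "6",
        "TYPE", "FLOAT32", "DIM", str(dim), "DISTANCE_METRIC", "COSINE",
    ]


def hybrid_knn_args(index, lang, query_vector, k=10):
    """Arguments for FT.SEARCH: TAG filter combined with a KNN clause."""
    query = f"(@lang:{{{lang}}})=>[KNN {k} @embedding $vec AS score]"
    return [
        "FT.SEARCH", index, query,
        "PARAMS", "2", "vec", float32_blob(query_vector),
        "SORTBY", "score",
        "DIALECT", "2",
    ]


if __name__ == "__main__":
    print(create_index_args("docs_idx", "doc:", 768))
    print(hybrid_knn_args("docs_idx", "en", [0.0] * 768)[2])
```

With a redis-py client and a running server, each list could be sent with client.execute_command(*args). Building the arguments separately keeps the filter and the KNN clause visible as one query string.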
Performance measured on small and medium workloads
The vendor numbers from Redis (formerly Redis Labs) are optimistic but not dishonest. In tests with a corpus of 800,000 vectors at 768 dimensions on a machine with 32 GB of RAM, Redis 8.2 answered KNN queries with k=10 at a median of 4 to 6 ms, with a p99 below 20 ms. For the same corpus, pgvector with a tuned HNSW index answered in 15 to 30 ms, and Qdrant sat in a range similar to Redis.
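To check figures like these against your own corpus, a small timing harness is enough. The sketch below is an assumption about how one might measure, not the method behind the numbers above. The run_query parameter is a hypothetical callable that issues one KNN query, and p99 uses the nearest-rank definition.

```python
import statistics
import time


def summarize(samples_ms):
    """Return (median, nearest-rank p99) for latencies in milliseconds."""
    ordered = sorted(samples_ms)
    n = len(ordered)
    # Nearest-rank percentile: ceil(0.99 * n), computed with integers.
    rank = (99 * n + 99) // 100
    return statistics.median(ordered), ordered[rank - 1]


def measure_latency_ms(run_query, repetitions=1000):
    """Time run_query() `repetitions` times and summarize the results."""
    samples = []
    for _ in range(repetitions):
        start = time.perf_counter()
        run_query()
        samples.append((time.perf_counter() - start) * 1000.0)
    return summarize(samples)


if __name__ == "__main__":
    # Placeholder workload; replace with a real query call.
    median_ms, p99_ms = measure_latency_ms(lambda: sum(range(1000)))
    print(f"median={median_ms:.3f} ms  p99={p99_ms:.3f} ms")
```

Measuring from the client side includes network and serialization time, which is usually what the application actually experiences.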
The gap is consistent but moderate — not an order of magnitude. Where Redis wins clearly is when the application already depends on Redis for cache, sessions, or queues. Avoiding an additional system has real operational value: fewer backups, less monitoring, fewer permissions, fewer pipes that can break.
Where a dedicated engine still wins
Past a few million vectors the story changes. Redis’s HNSW index lives entirely in memory, and although the underlying library is efficient, a corpus of 50 million vectors at 768 dimensions needs roughly 150 GB of RAM just for the index (50 million × 768 × 4 bytes ≈ 154 GB for the vectors alone, assuming float32 components). Qdrant and Milvus support disk-persisted indices with a hot cache layer, which lets them serve large corpora on reasonable hardware.
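The RAM figure can be reproduced with back-of-the-envelope arithmetic. This sketch assumes float32 components (4 bytes each). The graph overhead is an additional assumption: M=16 neighbours per node, up to 2*M links on the base layer, 4 bytes per link, with upper layers ignored. Real overhead depends on the index parameters actually configured.

```python
def raw_vector_bytes(num_vectors, dim, bytes_per_component=4):
    """Bytes needed to hold the vectors themselves (float32 by default)."""
    return num_vectors * dim * bytes_per_component


def hnsw_link_bytes(num_vectors, m=16, bytes_per_link=4):
    """Rough base-layer graph overhead, assuming up to 2*M links per node.

    M=16 and 4-byte links are illustrative assumptions, not Redis defaults.
    """
    return num_vectors * 2 * m * bytes_per_link


if __name__ == "__main__":
    n, dim = 50_000_000, 768
    vectors = raw_vector_bytes(n, dim)
    links = hnsw_link_bytes(n)
    print(f"vectors: {vectors / 1e9:.1f} GB")
    print(f"links:   {links / 1e9:.1f} GB")
    print(f"total:   {(vectors + links) / 1e9:.1f} GB")
```

The vectors dominate the total, so lowering dimensionality or component precision moves the number far more than tuning the graph does.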
The second area where a dedicated engine wins is advanced filtering. Redis supports boolean filters on tags and numeric ranges, but Qdrant and Weaviate offer geospatial filters, rich payloads, and adaptive pre-selection strategies that matter when filter cardinality is high. In a real multi-tenant multilingual search case with 200 tenants, Qdrant kept latencies stable where Redis started degrading because the planner had less information about filter selectivity.
The piece people often forget: ingestion
Talking about search is only half the work. Redis has the advantage of in-memory writes with very low latency — inserting 100,000 vectors at 1,024 dimensions takes 18 seconds on one fully loaded thread. For small corpora rebuilt daily, this is enough.
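The ingestion figure translates into a per-second rate, which helps estimate a daily rebuild. This is a sketch with one loud assumption: it extrapolates linearly, while HNSW insert cost tends to grow with graph size, and the measured run used 1,024 dimensions. Treat the result as an optimistic lower bound.

```python
def vectors_per_second(num_vectors, seconds):
    """Observed single-thread ingestion rate."""
    return num_vectors / seconds


def estimated_rebuild_seconds(corpus_size, rate):
    """Naive linear estimate of a full re-ingest at a given rate."""
    return corpus_size / rate


if __name__ == "__main__":
    rate = vectors_per_second(100_000, 18)
    print(f"rate: {rate:.0f} vectors/s")
    print(f"800k corpus: {estimated_rebuild_seconds(800_000, rate):.0f} s")
```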
The downside is that HNSW does not handle mass deletions well. Removing 10% of the corpus leaves the graph fragmented and hurts recall. Redis does not yet expose an incremental rebuild operation, so the working pattern is to rebuild the full index nightly whenever the weekly deletion rate exceeds 5%. Dedicated engines like Qdrant offer asynchronous compaction that avoids this workaround.
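The rebuild rule is easy to encode as a check in a nightly job. This is a minimal sketch that assumes the deletion rate is measured against the corpus size at the start of the week. The function names and the way deletions are counted are hypothetical.

```python
def weekly_deletion_rate(deleted_this_week, corpus_size_at_week_start):
    """Fraction of the corpus deleted during the week."""
    if corpus_size_at_week_start <= 0:
        raise ValueError("corpus size must be positive")
    return deleted_this_week / corpus_size_at_week_start


def needs_full_rebuild(deleted_this_week, corpus_size_at_week_start,
                       threshold=0.05):
    """True when the deletion rate exceeds the 5% rule of thumb."""
    rate = weekly_deletion_rate(deleted_this_week, corpus_size_at_week_start)
    return rate > threshold


if __name__ == "__main__":
    print(needs_full_rebuild(60_000, 800_000))  # 7.5% deleted
    print(needs_full_rebuild(20_000, 800_000))  # 2.5% deleted
```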
Coherence with the rest of Redis
The detail that makes Redis 8.2 interesting for many cases is not so much raw performance as operational coherence with the rest of the system. If the team already handles replication, backups with RDB and AOF, failover with Sentinel or Cluster, and eviction policies, that experience applies directly to vector indices. There is no new system to learn and no different consistency model to explain to the on-call rotation.
This value grows the smaller the team. In an organization with two people running infrastructure, adding Qdrant or Weaviate means a new binary, a new management protocol, a new backup pattern, and new alerts. By my rough estimate, Redis 8.2 reuses about 80% of what already works.
How to decide
My practical rule is simple:
- Redis 8.2 by default if: the vector corpus fits in RAM of a reasonable machine, the application already uses Redis, and filters are basic. The operational saving outweighs any specific advantage of a dedicated engine.
- Dedicated engine if: the corpus exceeds 10 million vectors, filters are complex, or there are long retention requirements for data that is rarely accessed.
- Middle zone (1–10 M vectors): depends on the query pattern. If queries are highly concurrent and latency-sensitive, Redis wins. If they are sparse but hit large and rotating corpora, Qdrant or pgvector win.
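The rule above can be written down as a small function, mostly to make the order of the checks explicit. This is one possible reading, not a definitive encoding: the Workload fields, the returned labels, and the choice to test the dedicated-engine conditions first are my assumptions.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    num_vectors: int
    fits_in_ram: bool
    already_uses_redis: bool
    complex_filters: bool
    concurrent_latency_sensitive: bool
    cold_long_retention: bool = False


def recommend(w):
    """Apply the three-way rule of thumb to a workload description."""
    # Any dedicated-engine condition wins outright.
    if w.num_vectors > 10_000_000 or w.complex_filters or w.cold_long_retention:
        return "dedicated engine"
    # Middle zone: the query pattern decides.
    if w.num_vectors >= 1_000_000:
        if w.concurrent_latency_sensitive:
            return "redis"
        return "qdrant or pgvector"
    # Small corpora: Redis by default when the preconditions hold.
    if w.fits_in_ram and w.already_uses_redis:
        return "redis"
    return "no clear default"


if __name__ == "__main__":
    print(recommend(Workload(800_000, True, True, False, True)))
    print(recommend(Workload(5_000_000, True, True, False, False)))
    print(recommend(Workload(50_000_000, False, True, False, True)))
```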
Redis 8.2 closes a gap Redis had been carrying since 2022. With the native index and the open license, Redis starts competing seriously with pgvector in the medium corpus range, which is a category shift for the platform. I do not think it replaces dedicated engines in the high segment, but for most RAG applications in mid-sized companies, where the corpus is at most a few million vectors and filter complexity is low, Redis 8.2 is a sober and sufficient option. Every extra system in production has a hidden cost in staff, monitoring, and security. If the case fits, using Redis for both cache and embeddings is one of those decisions that reduce complexity without losing capability.