pgvector is a PostgreSQL extension that adds a vector column type and distance operators. With CREATE EXTENSION vector; and a couple of indexes, you turn your existing Postgres into a reasonable vector database. In 2023 it’s the option more and more teams choose over deploying a dedicated service, especially for projects that already have Postgres in production.
## Why pgvector Makes Sense
The key question: do I need a dedicated vector DB (Qdrant, Pinecone, Milvus) or is a Postgres extension enough?
Arguments for pgvector:
- One single database. Relational data (users, posts, metadata) and embeddings in the same place. Real joins, not parallel calls.
- Existing backups and operations. Your Postgres backup, monitoring, alerting, and replication pipelines already cover embeddings without extra work.
- Existing team skill. Your team already knows Postgres. No new system to learn.
- Simplified compliance. One DB to audit, one security policy.
- Cost. If you already pay for Postgres, this is a free extension.
When pgvector isn’t enough:
- Massive volume (hundreds of millions of vectors with sustained high QPS). Dedicated DBs scale better.
- Very high dimensions (>2000). pgvector works but more optimised alternatives exist.
- Need for very advanced hybrid search with multiple filters and vector-specific aggregations.
For 80% of LLM projects using embeddings with under 10M vectors, pgvector is enough and greatly simplifies the architecture.
## Installation and Basic Usage
On Postgres 16 (or any supported version >= 11):
```sql
CREATE EXTENSION vector;

CREATE TABLE documents (
  id SERIAL PRIMARY KEY,
  content TEXT,
  embedding vector(1536) -- model dimension (e.g. OpenAI ada-002)
);
```
```sql
INSERT INTO documents (content, embedding)
VALUES ('Example text', '[0.1, 0.2, ..., -0.05]'::vector);
```
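On the client side, the embedding has to arrive as that `[...]` text literal. A minimal Python sketch of the formatting (no driver required; `to_pgvector_literal` is a hypothetical helper name, not part of any library):

```python
def to_pgvector_literal(embedding):
    """Format a list of floats as a pgvector text literal, e.g. '[0.1,0.2,-0.05]'.

    Hypothetical helper: pass the result as a query parameter and cast
    it with ::vector rather than interpolating it into the SQL string.
    """
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"

print(to_pgvector_literal([0.1, 0.2, -0.05]))  # → [0.1,0.2,-0.05]
```

In practice, driver integrations such as the `pgvector` Python package register adapters (e.g. for psycopg) so you can pass lists or NumPy arrays directly; the literal above is just what goes over the wire.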
```sql
-- Cosine similarity search (top 5)
SELECT id, content, 1 - (embedding <=> '[...]'::vector) AS score
FROM documents
ORDER BY embedding <=> '[...]'::vector
LIMIT 5;
```
Three main operators:

| Operator | Meaning |
|---|---|
| `<=>` | cosine distance |
| `<->` | Euclidean distance (L2) |
| `<#>` | negative inner product |
The expression `1 - (a <=> b)` converts cosine distance into cosine similarity. Strictly the result ranges from -1 to 1, but for typical text embeddings it usually lands between 0 and 1.
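As a sanity check outside the database, a pure-Python sketch of the same math `<=>` computes and how the conversion behaves (not pgvector code, just the definition):

```python
import math

def cosine_distance(a, b):
    # Same math as pgvector's <=> operator: 1 minus the cosine of the angle
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

a = [1.0, 0.0, 0.0]
b = [0.6, 0.8, 0.0]
d = cosine_distance(a, b)   # cos = 0.6, so distance = 0.4
print(1.0 - d)              # similarity back out: ~0.6
```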
## The Indexes: IVFFlat and HNSW
Without an index, search is sequential — works but scales poorly. pgvector offers two index types:
### IVFFlat
```sql
CREATE INDEX ON documents
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
```
Partitions the vectors into `lists` clusters. At search time, it scans only the clusters nearest the query (`probes`, configurable per session). A recall vs. speed trade-off.

A good rule of thumb: `lists = sqrt(N)`, where N is the vector count. For 1M vectors, `lists = 1000`. And for 95-98% recall, run `SET ivfflat.probes = 10;` before the query.
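A sketch of how that knob is used in practice (the `SET LOCAL` scoping is standard Postgres; the query assumes the `documents` table above):

```sql
BEGIN;
-- Scope the setting to this transaction only (the default is 1 probe)
SET LOCAL ivfflat.probes = 10;
SELECT id, content
FROM documents
ORDER BY embedding <=> '[...]'::vector
LIMIT 5;
COMMIT;
```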
### HNSW (since pgvector 0.5)
```sql
CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops);
```
Hierarchical Navigable Small World — the same algorithm Qdrant and Pinecone use. Better quality and speed than IVFFlat in most cases. Cost: more memory and slower build.
For new projects on pgvector 0.5+, HNSW is the default choice.
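Build and query behaviour can be tuned. A sketch using pgvector's documented parameters; the build-time values shown are the 0.5 defaults, so this index is equivalent to the bare `CREATE INDEX` above:

```sql
-- Build-time knobs: m = links per node, ef_construction = candidate list
-- size during build. Larger values: better recall, slower build, more RAM.
CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

-- Query-time knob (default 40): candidates examined per search
SET hnsw.ef_search = 100;
```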
## Patterns That Matter in Production
Three operational details that change the experience:
- Non-blocking index build. `CREATE INDEX CONCURRENTLY` avoids locking the table. Important for tables with frequent writes.
- Periodic reindexing. If you insert thousands of new vectors per day, an IVFFlat index can degrade (its clusters are fixed at build time). HNSW is more stable. For IVFFlat, an occasional `REINDEX`.
- Filter by metadata before vector search. The right syntax:

```sql
WHERE category = 'tech' ORDER BY embedding <=> '[...]' LIMIT 10
```

With an index on `category`, Postgres can apply the filter first and then rank vectors only within the matching subset. Much faster than fetching neighbours and filtering afterwards.
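When a filter value is fixed and highly selective, one option is a standard Postgres partial index over the vector column (a sketch; it assumes a `category` column on `documents` as in the example above):

```sql
-- HNSW index covering only the 'tech' rows, so searches filtered on
-- category = 'tech' stay entirely on this smaller index
CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops)
WHERE category = 'tech';
```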
## Practical Comparison
| Aspect | pgvector | Qdrant | Pinecone |
|---|---|---|---|
| Setup | Extension on PG | Docker service | Managed account |
| Fixed cost | Your PG cost | Additional server | Monthly subscription |
| Joins with relational data | Native | Custom layer needed | Custom layer needed |
| Scale (vectors) | Up to ~10M comfortable | Hundreds of millions | No practical limit |
| Query latency | 5-50ms typical | 1-10ms | 10-50ms |
| Daily operations | Zero (it’s PG) | Additional service | Zero (managed) |
| Privacy | Local | Local | External SaaS |
## When Moving to Something Else Pays Off
Signs you’ve outgrown pgvector:
- Sustained query latency >100ms with HNSW correctly configured.
- More than 50M vectors with high growth.
- Insufficient RAM to keep HNSW indexes in cache.
- Need for vector quantization (compression) pgvector doesn’t yet implement.
At those points, Qdrant is probably the best migration: open source, scales well, retains the “you can run it yourself” property.
## Conclusion
pgvector is the pragmatic default option in 2023 if you already use Postgres. It reduces the stack’s operational surface at the cost of slightly less performance than dedicated vector DBs — but that difference is rarely the bottleneck in typical projects. Starting here and migrating later if needed is almost always the right decision.
Follow us on jacar.es for more on PostgreSQL, vector databases, and semantic search architectures.