Nomic AI released nomic-embed-text-v1 in February 2024 as a fully open embeddings model: Apache 2.0 weights, disclosed training data, and public training code. Quality is close to OpenAI's text-embedding-3-small. For teams that want open embeddings without compromising quality, it is a concrete, mature option.
What It Offers
- 768 dimensions (vs 1536 for OpenAI's text-embedding-3-small).
- Context length: 8192 tokens.
- MTEB: ~62.4 (similar to text-embedding-3-small ~62.3).
- Apache 2.0 license: no commercial restrictions.
- Reproducibility: training data and code published.
- Size: ~500MB on disk.
Available on Hugging Face: nomic-ai/nomic-embed-text-v1.
Installation and Usage
With sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("nomic-ai/nomic-embed-text-v1", trust_remote_code=True)
# Important: prefixes for query vs document
query_emb = model.encode("search_query: what is RAG?")
doc_emb = model.encode("search_document: RAG combines retrieval with generation...")
Prefixes (search_query, search_document, classification, clustering) tell the model what the embedding will be used for, similar to Cohere Embed v3's input_type parameter.
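Once query and document are embedded with matching prefixes, relevance is scored with cosine similarity. A minimal sketch with NumPy, using small dummy vectors in place of real 768-dimensional model output:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Dummy 4-d vectors standing in for 768-d model output
query_emb = [0.1, 0.3, 0.5, 0.1]
doc_emb = [0.2, 0.25, 0.55, 0.05]
score = cosine_similarity(query_emb, doc_emb)
```

With real model output you would pass `model.encode(...)` results directly; `encode` returns NumPy arrays, so no conversion is needed.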
With Ollama
Ollama has nomic-embed available:
ollama pull nomic-embed-text
And OpenAI-compatible API:
from openai import OpenAI
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
response = client.embeddings.create(
    model="nomic-embed-text",
    input="Text to embed"
)
emb = response.data[0].embedding
Ideal for local dev or light self-hosting.
Benchmarks
Against other open embeddings and OpenAI:
| Model | MTEB avg | Dims | License |
|---|---|---|---|
| text-embedding-3-large | 64.6 | 3072 | OpenAI |
| text-embedding-3-small | 62.3 | 1536 | OpenAI |
| nomic-embed-text-v1 | 62.4 | 768 | Apache 2.0 |
| BGE-large-en-v1.5 | 64.2 | 1024 | MIT |
| e5-large-v2 | 63.4 | 1024 | MIT |
| mxbai-embed-large-v1 | 64.7 | 1024 | Apache 2.0 |
Nomic is competitive with text-embedding-3-small while being fully open source. mxbai-embed-large-v1 scores slightly higher, but at 1024 dimensions.
Inference Performance
On typical hardware:
- CPU (16-core server): ~100 embeddings/s.
- GPU (RTX 4090): ~3000 embeddings/s.
- Apple Silicon (M2 Pro): ~500 embeddings/s via mps.
Throughput varies with batch size and sequence length, but these rates are plenty for substantial batch processing.
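Those rates translate directly into batch-job estimates. A quick back-of-envelope calculation, using the approximate figures from the list above:

```python
# Approximate throughput from the list above (embeddings/second)
rates = {"cpu_16_core": 100, "rtx_4090": 3000, "m2_pro_mps": 500}

def hours_to_embed(n_docs: int, rate: float) -> float:
    """Wall-clock hours to embed n_docs at a given sustained rate."""
    return n_docs / rate / 3600

# Embedding 1 million chunks:
estimates = {hw: round(hours_to_embed(1_000_000, r), 2) for hw, r in rates.items()}
# CPU ≈ 2.78 h, RTX 4090 ≈ 0.09 h, M2 Pro ≈ 0.56 h
```

In other words, a single consumer GPU turns a million-document corpus into minutes of work, not hours.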
pgvector Integration
Nomic with pgvector:
import psycopg
from pgvector.psycopg import register_vector
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("nomic-ai/nomic-embed-text-v1", trust_remote_code=True)
con = psycopg.connect("postgres://...", autocommit=True)
con.execute("CREATE EXTENSION IF NOT EXISTS vector")
register_vector(con)  # adapt NumPy arrays to the vector type
# Schema: 768 dimensions to match the model
con.execute("""
CREATE TABLE docs (
    id bigserial PRIMARY KEY,
    content text,
    embedding vector(768)
)
""")
con.execute("CREATE INDEX ON docs USING hnsw (embedding vector_cosine_ops)")
# Insert: note the search_document prefix
text = "RAG combines retrieval with generation..."
emb = model.encode(f"search_document: {text}")
con.execute("INSERT INTO docs (content, embedding) VALUES (%s, %s)", (text, emb))
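To search, embed the query with the search_query prefix and order by pgvector's cosine distance operator, `<=>`. pgvector also accepts a bracketed text literal cast to `::vector`, which a small helper can produce; a sketch, with the `model` and `con` objects assumed from the snippet above:

```python
def to_vector_literal(emb) -> str:
    """Format a sequence of floats as a pgvector text literal, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(str(float(x)) for x in emb) + "]"

# query_emb = model.encode("search_query: what is RAG?")
# rows = con.execute(
#     "SELECT content FROM docs ORDER BY embedding <=> %s::vector LIMIT 5",
#     (to_vector_literal(query_emb),),
# ).fetchall()
```

If `register_vector` has been called on the connection, passing the NumPy array directly also works; the literal form is useful for plain SQL clients.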
When to Choose Nomic
Yes if:
- You want a fully open model with no external API dependency.
- English is your main language.
- You need data residency / on-prem deployment.
- API cost is a factor.
- You value training transparency.
No if:
- You need serious multilingual support (Cohere Embed v3 wins there).
- You need absolute frontier quality (OpenAI text-embedding-3-large or mxbai-embed-large).
- Your documents exceed the 8192-token context.
nomic-embed-text-v1.5
A variant trained with Matryoshka Representation Learning: embeddings can be truncated to fewer dimensions without retraining the model.
- 768 → 512 → 256 → 128 → 64 dimensions, with gradual quality degradation.
- Similar to the dimensions parameter of OpenAI's text-embedding-3 models.
Useful for RAG deployments where embedding storage is tight.
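Matryoshka truncation is just slicing followed by re-normalisation. A sketch with NumPy on a dummy vector (with the real v1.5 model you would encode first, then truncate):

```python
import numpy as np

def truncate_embedding(emb, dim: int):
    """Keep the first `dim` Matryoshka dimensions and re-normalise to unit length."""
    v = np.asarray(emb, dtype=float)[:dim]
    return v / np.linalg.norm(v)

full = np.random.default_rng(0).normal(size=768)  # stand-in for a v1.5 embedding
short = truncate_embedding(full, 256)             # 3x less storage per vector
```

At 256 dimensions you store a third of the floats per vector, which compounds quickly across millions of rows in pgvector.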
Multilingual: nomic-embed-text-v1-multilingual
Multilingual variant covering ~100 languages, including Spanish, French, and German.
- Same training infrastructure, with multilingual training data.
- A good option for EU-focused RAG.
LangChain / LlamaIndex Integration
Both support Nomic directly:
from langchain_community.embeddings import HuggingFaceEmbeddings
embeddings = HuggingFaceEmbeddings(
model_name="nomic-ai/nomic-embed-text-v1",
model_kwargs={"trust_remote_code": True}
)
Atlas: Nomic’s Platform
Nomic Atlas is Nomic’s commercial platform:
- Visualise your embeddings in interactive 2D.
- Explore semantic structure of large data.
- Search + clustering + filtering.
Useful for exploring datasets and debugging your own embeddings. It has a generous free tier.
Operational Considerations
- Model caching: download model once, local cache.
- ONNX export: for better multi-platform inference.
- Quantisation: INT8 available, minimal loss.
- Batch size: tune per GPU memory.
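On the batch-size point, a simple chunking helper keeps GPU memory bounded during bulk ingestion; a sketch (batch size 64 is an arbitrary starting point, tune it for your hardware):

```python
from typing import Iterator

def batched(items: list, batch_size: int = 64) -> Iterator[list]:
    """Yield successive fixed-size slices of a document list."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

# for batch in batched(docs, 64):
#     embs = model.encode([f"search_document: {d}" for d in batch])
batches = list(batched(list(range(10)), 4))  # → [[0,1,2,3], [4,5,6,7], [8,9]]
```

Note that sentence-transformers' `encode` also accepts a `batch_size` argument; an explicit loop like this is mainly useful when each batch is written to the database as it is produced.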
Conclusion
nomic-embed-text is the sensible open-source choice for English embeddings in 2024: quality close to text-embedding-3-small, with full freedom, transparency, and on-prem availability. For serious RAG without an OpenAI dependency, it is the default choice. For multilingual work, use the multilingual variant; where absolute frontier quality matters, reach for OpenAI's text-embedding-3-large or mxbai-embed-large. The democratisation of quality open-source embeddings is good news for the ecosystem.
Follow us on jacar.es for more on embeddings, RAG, and open models.