Raw text can’t be compared mathematically. Embeddings translate meaning into geometry: two semantically similar sentences produce vectors that are close together in high-dimensional space, even when they share no words.

Semantica uses embeddings for:
Semantic search: find knowledge graph nodes by meaning, not just keywords
Entity resolution: detect that “Apple Inc.” and “Apple Computer” refer to the same entity
Deduplication: semantic_v2 strategy measures entity similarity via embedding distance
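The geometric idea can be illustrated without loading any model. The following is a minimal sketch that uses invented three-dimensional vectors in place of real embeddings (real models output hundreds of dimensions); `cosine_similarity` is a hypothetical helper written for this example, not a Semantica function.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embeddings (the values are invented).
apple_inc = [0.9, 0.1, 0.2]
apple_computer = [0.8, 0.2, 0.25]
banana_bread = [0.1, 0.9, 0.3]

same_entity = cosine_similarity(apple_inc, apple_computer)
different = cosine_similarity(apple_inc, banana_bread)

# Vectors pointing in nearly the same direction score higher.
print(same_entity > different)
```

Entity resolution and deduplication both reduce to this comparison plus a threshold chosen for the data at hand.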
ONNX-accelerated local embeddings. No GPU required, no API key. Best starting point.
pip install "semantica[fastembed]"
from semantica.embeddings import EmbeddingGenerator

# FastEmbed is the default: no config needed
generator = EmbeddingGenerator()
embedding = generator.generate_embeddings("Text about AI")
Default model is BAAI/bge-small-en-v1.5. Zero cost, zero GPU, works on any machine.
FastEmbed ignores the device parameter. FastEmbed uses ONNX Runtime and manages its own execution providers: passing device="cuda" has no effect. Switch to method="sentence_transformers" if you need explicit GPU control.
Broad model selection via HuggingFace. Runs locally, no API key.
pip install semantica # sentence-transformers included
Popular models: all-MiniLM-L6-v2 (fast, small), all-mpnet-base-v2 (balanced), BAAI/bge-large-en-v1.5 (high accuracy).
Sequence length limits. Most sentence-transformers models have a 512-token limit. Text beyond that is silently truncated. Use TextSplitter(method="hierarchical") + HierarchicalPooling for long documents.
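To see what avoiding silent truncation involves, here is a rough sketch of overlapping windowing. It approximates tokens with whitespace-separated words, which is an assumption: real tokenizers split into subwords, so a 512-word window can exceed 512 tokens and a smaller `max_tokens` is safer in practice. `split_into_windows` is a hypothetical helper, not part of Semantica's `TextSplitter`.

```python
def split_into_windows(text, max_tokens=512, overlap=50):
    """Split text into overlapping windows of whitespace-delimited words."""
    if overlap >= max_tokens:
        raise ValueError("overlap must be smaller than max_tokens")
    words = text.split()
    step = max_tokens - overlap
    windows = []
    for start in range(0, len(words), step):
        windows.append(" ".join(words[start:start + max_tokens]))
        # Stop once the current window already reaches the end of the text.
        if start + max_tokens >= len(words):
            break
    return windows

long_text = " ".join(f"word{i}" for i in range(1200))
chunks = split_into_windows(long_text, max_tokens=512, overlap=50)
print(len(chunks), [len(c.split()) for c in chunks])
```

Each window can then be embedded separately and the per-window vectors pooled into one document vector.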
BAAI/bge models via sentence-transformers. State-of-the-art retrieval performance, runs locally.
pip install semantica
from semantica.embeddings import BGEStore, EmbeddingGenerator

# BGEStore takes model_name=, not model=
store = BGEStore(model_name="BAAI/bge-large-en-v1.5")
embedding = store.embed("Text about AI")

# Or switch model on an existing EmbeddingGenerator
generator = EmbeddingGenerator()
generator.set_text_model("sentence_transformers", "BAAI/bge-large-en-v1.5")
Cloud embeddings via OpenAI API. Highest quality, requires API key.
EmbeddingGenerator is the fastest path to embeddings: the default method is FastEmbed (ONNX, no GPU needed):
from semantica.embeddings import EmbeddingGenerator

# Default: FastEmbed with BAAI/bge-small-en-v1.5
generator = EmbeddingGenerator()

# Embed a single text
embedding = generator.generate_embeddings("Text about AI")

# Embed a batch
embeddings = generator.generate_embeddings(["Text about AI", "Machine learning concepts"])

# Compare two embeddings (cosine similarity: 0.0 to 1.0)
score = generator.compare_embeddings(embeddings[0], embeddings[1], method="cosine")
print(f"Similarity: {score:.3f}")
Always use the same model for indexing and querying. Vectors from different models are not comparable: they live in different vector spaces. Switching models requires re-embedding your entire corpus.
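One way to enforce this rule is to store the model name next to the vectors and reject anything produced by a different model. The sketch below is a toy in-memory index written for illustration; `ModelTaggedIndex` is a hypothetical class, not something Semantica provides.

```python
class ModelTaggedIndex:
    """Toy in-memory index that remembers which model produced its vectors."""

    def __init__(self, model_name):
        self.model_name = model_name
        self.vectors = []

    def _check(self, model_name):
        if model_name != self.model_name:
            raise ValueError(
                f"index built with {self.model_name!r}, got {model_name!r}; "
                "re-embed the corpus before switching models"
            )

    def add(self, vector, model_name):
        self._check(model_name)
        self.vectors.append(vector)

    def query(self, vector, model_name):
        """Return stored vector positions, best match first."""
        self._check(model_name)
        # Dot product ranking; equals cosine ranking for normalized vectors.
        scores = [sum(x * y for x, y in zip(vector, v)) for v in self.vectors]
        return sorted(range(len(scores)), key=scores.__getitem__, reverse=True)

index = ModelTaggedIndex("BAAI/bge-small-en-v1.5")
index.add([1.0, 0.0], "BAAI/bge-small-en-v1.5")
index.add([0.0, 1.0], "BAAI/bge-small-en-v1.5")
ranking = index.query([0.9, 0.1], "BAAI/bge-small-en-v1.5")
print(ranking)
```

Querying the same index with a vector tagged `"all-mpnet-base-v2"` raises `ValueError` instead of returning meaningless scores.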
To switch provider after construction:
# Switch to a sentence-transformers model
generator.set_text_model("sentence_transformers", "all-MiniLM-L6-v2")

# Switch to BGE large
generator.set_text_model("sentence_transformers", "BAAI/bge-large-en-v1.5")
Best for: highest quality (text-embedding-3-large), or matching an existing OpenAI pipeline.
from semantica.embeddings import EmbeddingGenerator

# Use CUDA via sentence-transformers
generator = EmbeddingGenerator(config={"text": {"method": "sentence_transformers", "device": "cuda"}})

# Apple Silicon (M1/M2/M3)
generator = EmbeddingGenerator(config={"text": {"method": "sentence_transformers", "device": "mps"}})
GPU is only applicable with sentence-transformers. FastEmbed uses ONNX and does not use device.
Embedding method: "fastembed" or "sentence_transformers"
device | str | "cpu" | Device for sentence-transformers: "cpu", "cuda", "mps". Ignored for FastEmbed.
normalize | bool | True | L2-normalize output vectors
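L2 normalization scales each vector to unit length, after which a plain dot product equals cosine similarity. A minimal sketch of the operation, using a hypothetical `l2_normalize` helper rather than Semantica's internal code:

```python
import math

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)  # the zero vector has no direction; leave it as is
    return [x / norm for x in vector]

a = l2_normalize([3.0, 4.0])
b = l2_normalize([4.0, 3.0])

# For unit vectors the dot product is the cosine of the angle between them.
dot = sum(x * y for x, y in zip(a, b))
length_a = math.sqrt(sum(x * x for x in a))
print(round(dot, 2), round(length_a, 2))
```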
Key behaviours:
If FastEmbed or sentence-transformers is unavailable, falls back to a 128-dimensional hash-based embedding. Hash embeddings are deterministic but not semantic: do not use in production.
Large batches are chunked internally by the underlying library to avoid OOM.
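Internal chunking protects the model call, but the full list of input texts still has to fit in memory. If inputs are streamed from disk or a database, explicit batching on the caller's side may help. The following is a sketch with a hypothetical `iter_batches` helper; the batch size of 3 is arbitrary.

```python
def iter_batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

texts = [f"document {i}" for i in range(7)]
batches = list(iter_batches(texts, batch_size=3))

# Each batch would be passed to the embedder in turn.
print([len(batch) for batch in batches])
```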
Dimension mismatch. The dimension you pass to your vector store must exactly match your embedding model’s output. BAAI/bge-small-en-v1.5 → 384, all-MiniLM-L6-v2 → 384, all-mpnet-base-v2 → 768, BAAI/bge-large-en-v1.5 → 1024. Check with embedder.get_embedding_dimension() before creating the store.
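A guard like the following can catch a mismatch before any vectors are written. It is a sketch only: the lookup table repeats the four dimensions listed above, and `check_store_dimension` is a hypothetical helper. For models outside the table, query the embedder itself as described above.

```python
# Output dimensions for the models named in this guide.
KNOWN_DIMENSIONS = {
    "BAAI/bge-small-en-v1.5": 384,
    "all-MiniLM-L6-v2": 384,
    "all-mpnet-base-v2": 768,
    "BAAI/bge-large-en-v1.5": 1024,
}

def check_store_dimension(model_name, store_dimension):
    """Raise if the vector store dimension does not match the model output."""
    if model_name not in KNOWN_DIMENSIONS:
        raise KeyError(f"unknown model {model_name!r}; check its dimension directly")
    expected = KNOWN_DIMENSIONS[model_name]
    if store_dimension != expected:
        raise ValueError(
            f"{model_name} outputs {expected} dimensions, "
            f"but the store expects {store_dimension}"
        )
    return expected

dimension = check_store_dimension("all-mpnet-base-v2", 768)
print(dimension)
```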
Fallback embeddings are not semantic. If neither FastEmbed nor sentence-transformers loads successfully, TextEmbedder silently falls back to 128-dimensional SHA-256 hash embeddings. These are deterministic but carry no semantic meaning. Check embedder.get_method(): if it returns "fallback", install your intended provider.
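To make "deterministic but not semantic" concrete, here is a sketch of a hash-based embedding. The exact scheme Semantica uses is not described here, so the byte-to-float mapping below is an invented stand-in that only shares the SHA-256 basis and the 128-dimension output.

```python
import hashlib

def hash_embedding(text, dimension=128):
    """Derive a fixed-length pseudo-embedding from SHA-256 digests of the text."""
    values = []
    counter = 0
    while len(values) < dimension:
        # Each digest yields 32 bytes; the counter varies the input per round.
        digest = hashlib.sha256(f"{counter}:{text}".encode("utf-8")).digest()
        # Map each byte (0..255) to a float in [-1.0, 1.0].
        values.extend(byte / 127.5 - 1.0 for byte in digest)
        counter += 1
    return values[:dimension]

first = hash_embedding("cat")
second = hash_embedding("cat")
near_synonym = hash_embedding("kitten")

# Same text, same vector; related text, unrelated vector.
print(first == second, first == near_synonym)
```

Because any change to the input produces an unrelated digest, similar meanings do not end up near each other, which is why search results built on such vectors are effectively random.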
Use provider stores directly when you need fine-grained control over a single backend:
from semantica.embeddings import (
    OpenAIStore,
    BGEStore,
    FastEmbedStore,
    ProviderStoreFactory,
)
import os

# OpenAI
store = OpenAIStore(api_key=os.getenv("OPENAI_API_KEY"), model="text-embedding-3-small")
embedding = store.embed("Hello world")

# BGE (Sentence-Transformers wrapper): pass model_name= not model=
store = BGEStore(model_name="BAAI/bge-large-en-v1.5")
embedding = store.embed("Hello world")

# FastEmbed: ONNX runtime, no CUDA required
store = FastEmbedStore(model_name="BAAI/bge-small-en-v1.5")
embedding = store.embed("Hello world")

# FastEmbedStore also has an efficient batch method
embeddings = store.embed_batch(["text1", "text2", "text3"])

# Auto-select from a name string: useful in config-driven pipelines
# Supported providers: "openai", "bge", "fastembed"
store = ProviderStoreFactory.create(provider="bge", model_name="BAAI/bge-large-en-v1.5")
LlamaStore is not functional. LlamaStore exists in the module but is a placeholder: it does not connect to Ollama and always raises ProcessingError at embed time. Do not use it in production. Use FastEmbedStore for local ONNX-based embeddings or BGEStore for sentence-transformers-based local embeddings instead.
Best for: retrieval, semantic search, and clustering: averages all contributions.
from semantica.embeddings import MaxPooling

pooler = MaxPooling()
pooled = pooler.pool(token_embeddings)
Best for: capturing the presence of any feature: takes the max activation per dimension.
from semantica.embeddings import CLSPooling

pooler = CLSPooling()
pooled = pooler.pool(token_embeddings)
Best for: classification-style tasks; models explicitly trained with CLS pooling (BERT).
from semantica.embeddings import HierarchicalPooling

pooler = HierarchicalPooling()

# chunk_size is passed at pool time, not at construction
pooled = pooler.pool(token_embeddings, chunk_size=10)
Best for: long documents: chunk-level mean pooling, then global mean pooling across chunks.
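The two-stage averaging described above can be written out in a few lines. This is a sketch based only on that description, not the library's implementation; `mean_pool` and `hierarchical_mean_pool` are hypothetical helpers operating on plain lists.

```python
def mean_pool(vectors):
    """Average a list of equal-length vectors dimension by dimension."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]

def hierarchical_mean_pool(token_embeddings, chunk_size):
    """Mean-pool within fixed-size chunks, then mean-pool the chunk vectors."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    chunk_means = [
        mean_pool(token_embeddings[start:start + chunk_size])
        for start in range(0, len(token_embeddings), chunk_size)
    ]
    return mean_pool(chunk_means)

tokens = [[1.0, 0.0], [3.0, 0.0], [5.0, 6.0]]
pooled = hierarchical_mean_pool(tokens, chunk_size=2)
flat = mean_pool(tokens)
print(pooled, flat)
```

When the last chunk is shorter than the others, its tokens carry more weight than in a flat mean, so the two results differ, as they do in this example.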
Strategy | When to Use
mean | Default for retrieval, semantic search, and clustering
max | When you want to capture the presence of any feature, not average presence
cls | Classification-style tasks; models explicitly trained with CLS pooling (BERT)
attention | When token importance varies significantly; slower but more accurate
hierarchical | Long documents with many chunks; combines chunk-level then global pooling
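The difference between the three simplest strategies is easiest to see on a tiny input. The sketch below uses plain lists and hypothetical helper functions rather than Semantica's pooling classes, and it assumes the first row is the [CLS] token, as in BERT-style models.

```python
def mean_pooling(tokens):
    """Average every dimension across all tokens."""
    return [sum(column) / len(tokens) for column in zip(*tokens)]

def max_pooling(tokens):
    """Keep the largest activation seen in each dimension."""
    return [max(column) for column in zip(*tokens)]

def cls_pooling(tokens):
    """Use the first token's vector as the sequence representation."""
    return list(tokens[0])

# Three tokens, two dimensions; the values are invented.
tokens = [[1.0, -1.0], [2.0, 5.0], [3.0, 2.0]]

mean_vec = mean_pooling(tokens)
max_vec = max_pooling(tokens)
cls_vec = cls_pooling(tokens)
print(mean_vec, max_vec, cls_vec)
```

Attention pooling is omitted because it needs learned weights, which a self-contained example cannot supply.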
from semantica.embeddings import PoolingStrategyFactory

pooler = PoolingStrategyFactory.create(strategy="mean")
from semantica.embeddings import TextEmbedder

embedder = TextEmbedder()  # default: FastEmbed

texts = [
    "Apple Inc. was founded by Steve Jobs.",
    "Microsoft was co-founded by Bill Gates.",
    "Amazon was started by Jeff Bezos.",
]

# All at once: more efficient than calling embed_text() per item
embeddings = embedder.embed_batch(texts)
print(f"Shape: {embeddings.shape}")  # (3, 384)
from semantica.embeddings import check_available_providers, EmbeddingGenerator

# Check what's installed
available = check_available_providers()
# → {"sentence_transformers": True, "fastembed": True, "openai": False}

# Use the fastest available provider
generator = EmbeddingGenerator()
if available["fastembed"]:
    generator.set_text_model("fastembed", "BAAI/bge-small-en-v1.5")
elif available["sentence_transformers"]:
    generator.set_text_model("sentence_transformers", "all-MiniLM-L6-v2")

texts = ["Text about AI", "Machine learning concepts"]
embeddings = generator.generate_embeddings(texts)
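The selection logic above can be separated from Semantica and tested on its own, including the case where no provider is installed. This is a sketch under assumptions: `detect_providers` probes for importable packages with the standard library and may not match what `check_available_providers` actually checks, and `pick_provider` is a hypothetical helper.

```python
import importlib.util

def detect_providers(module_names=("fastembed", "sentence_transformers", "openai")):
    """Report which top-level packages can be imported in this environment."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in module_names
    }

def pick_provider(available, preference=("fastembed", "sentence_transformers")):
    """Return the first preferred provider that is available, or None."""
    for name in preference:
        if available.get(name):
            return name
    return None

installed = detect_providers()
choice = pick_provider(installed)
if choice is None:
    print("No local embedding provider installed")
else:
    print(f"Using {choice}")
```

Handling the `None` case explicitly avoids silently ending up on the non-semantic hash fallback.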
from semantica.embeddings import calculate_similarity

# embedding_a and embedding_b must come from the same model

# Cosine similarity: direction only, not magnitude; most common for text
score = calculate_similarity(embedding_a, embedding_b, method="cosine")
# → 0.0 (orthogonal / unrelated) to 1.0 (identical direction)

# Euclidean distance converted to similarity
score = calculate_similarity(embedding_a, embedding_b, method="euclidean")
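Converting a distance to a similarity needs a formula that maps 0 to the highest score and shrinks as distance grows. Which formula `calculate_similarity` uses is not stated here; the sketch below assumes the common choice 1 / (1 + d), with hypothetical helper names.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def euclidean_similarity(a, b):
    """Map distance d in [0, inf) to a similarity in (0, 1] via 1 / (1 + d)."""
    return 1.0 / (1.0 + euclidean_distance(a, b))

identical = euclidean_similarity([1.0, 2.0], [1.0, 2.0])
far_apart = euclidean_similarity([0.0, 0.0], [3.0, 4.0])
print(identical, far_apart)
```

Unlike cosine similarity, this score is sensitive to vector magnitude, so it behaves most predictably on L2-normalized embeddings.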