Quick Answers
| Question | Answer |
|---|---|
| License? | MIT: free forever, no paywalled features |
| Python version? | 3.8+ (3.11+ recommended) |
| API key required? | Optional: pattern extraction works with no keys |
| Works with LangChain / LlamaIndex? | Yes: Semantica is a layer on top, not a replacement |
| Production-ready? | Yes: 1,000+ tests, v0.5.0 ships with 12 security fixes |
| Latest version? | v0.5.0 (May 2026) |
| Local LLMs? | Yes: Ollama via LiteLLM, HuggingFaceLLM for air-gapped |
General
What is Semantica?
What can I build with Semantica?
- Knowledge graphs from documents and multi-source data
- GraphRAG systems with graph-grounded retrieval and source attribution
- AI agents with structured decision history and semantic memory
- Compliance-ready pipelines with W3C PROV-O lineage (HIPAA, SOX, GDPR, FDA 21 CFR Part 11)
- Temporal graphs that track how facts change over time
- Ontology-driven knowledge bases with SHACL validation
What makes Semantica different from LangChain or LlamaIndex?
Is Semantica free?
What's the latest version?
Installation
How do I install Semantica?
[gpu], [all], provider-specific), and platform-specific troubleshooting.
What Python version do I need?
The [all] extra fails on Windows
pip install "semantica[core]", then add [llm-openai], [gpu], etc.
What are the system requirements?
| Requirement | Minimum | Recommended |
|---|---|---|
| Python | 3.8 | 3.11+ |
| RAM | 4 GB | 16 GB+ |
| Storage | 2 GB | 20 GB+ |
| GPU | Optional | CUDA for embeddings and ML models |
Data & Features
What data sources does Semantica support?
| Category | Sources |
|---|---|
| Files | PDF, DOCX, HTML, JSON, CSV, Excel, PPTX, Parquet (v0.5.0), XML (v0.5.0), archives |
| Web | WebIngestor crawl, RSS feeds, sitemaps |
| Databases | PostgreSQL, MySQL, Snowflake via DBIngestor / SnowflakeIngestor |
| NoSQL | MongoDB via MongoIngestor, DuckDB via DuckDBIngestor |
| Streams | Kafka, real-time ingestion via StreamIngestor |
| Protocols | MCP (Model Context Protocol) via MCPIngestor |
| Cloud | Google Drive via GDriveIngestor, HuggingFace datasets |
Can I use my own models?
- Custom NER and extraction models: register via method_registry
- Custom embedding models: any model with a .encode() interface
- Custom LLM providers: via LiteLLM (100+ models) or direct provider integration
- Custom pipeline processors: register via PluginRegistry
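To illustrate the ".encode() interface" bullet above, here is a minimal sketch of a toy embedding class. The class name `HashingEmbedder`, its `dim` parameter, and the assumption that `.encode()` takes a list of strings and returns one vector per string (the convention used by sentence-transformers) are illustrative and not taken from the Semantica documentation. Check the signature Semantica actually expects before relying on this shape.

```python
import hashlib
import math
from typing import List


class HashingEmbedder:
    """Toy embedder (hypothetical): hashes tokens into a fixed-size vector."""

    def __init__(self, dim: int = 16) -> None:
        self.dim = dim

    def encode(self, texts: List[str]) -> List[List[float]]:
        """Return one L2-normalized vector per input string."""
        vectors = []
        for text in texts:
            vec = [0.0] * self.dim
            for token in text.lower().split():
                # Stable hash so the same token always maps to the same slot.
                digest = hashlib.sha256(token.encode("utf-8")).digest()
                vec[digest[0] % self.dim] += 1.0
            # Guard against empty input, which would give a zero norm.
            norm = math.sqrt(sum(x * x for x in vec)) or 1.0
            vectors.append([x / norm for x in vec])
        return vectors


embedder = HashingEmbedder(dim=16)
vectors = embedder.encode(["knowledge graph", "graph knowledge"])
```

Because the toy model is a bag of words, both inputs map to the same vector. A real model would be a drop-in replacement as long as it exposes the same method.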
Does Semantica support GPUs?
How does Semantica handle large datasets?
- Batching: process documents in configurable chunks to control memory usage
- Parallel processing: Pipeline(workers=N) runs extraction steps concurrently
- Delta processing: update graphs incrementally without full recompute on new data
- Persistent backends: swap in-memory NetworkX for Neo4j, FalkorDB, or Apache AGE for large-scale production graphs
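The batching idea in the list above can be sketched in plain Python. This is a generic helper, not Semantica's own batching API: the function name `batched`, the `batch_size` parameter, and the placeholder file names are assumptions for illustration.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        # islice pulls only the next batch, so the full input is never
        # materialized in memory at once.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


documents = [f"doc-{i}.pdf" for i in range(5)]  # placeholder file names
batches = list(batched(documents, batch_size=2))
```

Each batch could then be passed to an ingestion or extraction step, so peak memory depends on the batch size and not on the size of the corpus.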
What is Temporal Intelligence?
TemporalKnowledgeGraph attaches valid_from / valid_until windows to nodes and edges, enabling point-in-time queries and historical analysis. It supports all 13 Allen interval algebra relations and OWL-Time export.
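The following sketch shows what a validity window and a point-in-time query mean, using plain Python and no Semantica classes. The `Fact` dataclass, the `facts_at` and `allen_relation` helpers, the half-open window convention (`valid_from <= t < valid_until`), and the sample data are all assumptions for illustration. TemporalKnowledgeGraph's real API may differ.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Fact:
    subject: str
    predicate: str
    obj: str
    valid_from: date
    valid_until: Optional[date] = None  # None means "still valid"


def facts_at(facts: List[Fact], when: date) -> List[Fact]:
    """Return facts whose validity window contains `when` (half-open window)."""
    return [
        f for f in facts
        if f.valid_from <= when and (f.valid_until is None or when < f.valid_until)
    ]


def allen_relation(a_start: date, a_end: date, b_start: date, b_end: date) -> str:
    """Name the Allen relation of interval A to interval B (start < end assumed)."""
    if a_end < b_start:
        return "before"
    if a_end == b_start:
        return "meets"
    if b_end < a_start:
        return "after"
    if b_end == a_start:
        return "met-by"
    # From here on the two intervals share some interior.
    if a_start == b_start and a_end == b_end:
        return "equals"
    if a_start == b_start:
        return "starts" if a_end < b_end else "started-by"
    if a_end == b_end:
        return "finishes" if a_start > b_start else "finished-by"
    if b_start < a_start and a_end < b_end:
        return "during"
    if a_start < b_start and b_end < a_end:
        return "contains"
    return "overlaps" if a_start < b_start else "overlapped-by"


facts = [
    Fact("alice", "works_at", "Acme", date(2019, 1, 1), date(2022, 6, 1)),
    Fact("alice", "works_at", "Globex", date(2022, 6, 1)),
]
employer_2020 = [f.obj for f in facts_at(facts, date(2020, 3, 15))]
relation = allen_relation(
    date(2019, 1, 1), date(2022, 6, 1), date(2022, 6, 1), date(2024, 1, 1)
)
```

With the half-open convention, a fact that ends on the day its successor begins never overlaps with it, so a point-in-time query returns one employer per date.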
What is the Ontology Hub?
semantica.explorer. Includes:
- Visual editor: create and edit classes, properties, and relationships
- SHACL Studio: author, validate, and export SHACL shapes
- Alignment authoring: map concepts across ontologies
- Health dashboard: coverage, consistency, and constraint violation metrics
- Version control: diff and history for ontology changes
What is Distance Intelligence?
- N×N distance matrices across a set of entities
- Ego-mode visualization centered on a single node
- Distance bands: near/mid/far based on embedding thresholds
- Embedding cache optimization for repeated queries
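A rough sketch of the N×N matrix and the near/mid/far bands described above, written in plain Python. It is not Semantica's implementation: the choice of cosine distance, the threshold values 0.3 and 0.7, the function names, and the two-dimensional sample embeddings are all assumptions for illustration.

```python
import math
from typing import Dict, List


def cosine_distance(u: List[float], v: List[float]) -> float:
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def distance_matrix(embeddings: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Build an N x N matrix of pairwise distances keyed by entity name."""
    names = list(embeddings)
    return {
        a: {b: cosine_distance(embeddings[a], embeddings[b]) for b in names}
        for a in names
    }


def band(distance: float, near: float = 0.3, far: float = 0.7) -> str:
    """Map a distance to a band using hypothetical thresholds."""
    if distance < near:
        return "near"
    if distance < far:
        return "mid"
    return "far"


embeddings = {  # made-up 2-D vectors, for illustration only
    "a": [1.0, 0.0],
    "b": [1.0, 1.0],
    "c": [1.0, 2.0],
    "d": [0.0, 1.0],
}
matrix = distance_matrix(embeddings)
# Ego-mode view: band every other entity relative to the single node "a".
ego_bands = {name: band(d) for name, d in matrix["a"].items() if name != "a"}
```

The ego-mode dictionary is one row of the full matrix, which is why caching embeddings pays off when the same node is queried repeatedly.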
My NER extractor silently falls back to pattern mode on a custom gateway
response_format=json_object parameter is now conditionally omitted for incompatible gateways, with a plain generate() plus JSON parsing fallback applied automatically. Upgrade to fix:
Technical
What graph databases are supported?
- Neo4j: industry standard, Cypher query language
- FalkorDB: Redis-protocol, ultra-low latency
- Apache AGE: PostgreSQL extension, OpenCypher
- Amazon Neptune: managed AWS, SPARQL and Gremlin
- NetworkX: in-memory, for development and small graphs
What export formats are available?
What vector stores are supported?
VectorStore API: swap with a one-line change.
What LLM providers are supported?
Is Semantica production-ready?
- 1,000+ passing tests across Python 3.8–3.12
- PipelineValidator and FailureHandler with exponential backoff and configurable retry policies
- W3C PROV-O provenance tracking across all modules
- Change management with SHA-256 checksums and full audit trails
- 12 security vulnerability fixes: eval injection, pickle deserialization, SQL injection, XXE, SSRF, ReDoS, path traversal, and more
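For readers unfamiliar with the retry behaviour mentioned in the list above, here is a generic sketch of retrying with exponential backoff. It does not use FailureHandler and does not reflect its API. The function name `retry_with_backoff`, all of its parameters, and the `flaky_step` example are hypothetical.

```python
import time
from typing import Callable, List, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    factor: float = 2.0,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `func`, waiting base_delay * factor**attempt between failures."""
    for attempt in range(max_attempts):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * factor ** attempt)
    raise ValueError("max_attempts must be at least 1")


calls = {"count": 0}
delays: List[float] = []


def flaky_step() -> str:
    """Hypothetical pipeline step that fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


# Record the delays instead of sleeping so the example runs instantly.
result = retry_with_backoff(flaky_step, sleep=delays.append)
```

Passing the sleep function as a parameter keeps the policy testable. Production code would typically also cap the delay and add jitter.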
Troubleshooting
ModuleNotFoundError: No module named 'semantica'
Installation fails with dependency errors
[all] fails on Windows, install extras individually instead.
Memory errors during processing
Slow embedding or inference
Unicode / cp1252 crash on Windows
