v0.5.0 is live: Ontology Hub, Distance Intelligence, SHACL Studio, Parquet & XML ingestion, 12 security fixes. What’s new →
1,000+ Tests
Production-hardened with a full regression suite
25+ Modules
Every capability independently importable
12 LLM Providers
OpenAI, Anthropic, Ollama, Groq, and more
MIT Licensed
Open source, no vendor lock-in, fully forkable
The Problem Every Production AI Team Hits
Powerful agents aren’t automatically trustworthy ones. Five structural blind spots make modern AI systems impossible to deploy in regulated environments:
No memory structure
Agents store embeddings, not meaning.
- No way to ask why a fact was recalled
- No link from a recalled fact back to its source document
- Context is a black box that resets on every run
No decision trail
Agents act continuously but record nothing.
- No history to hand to a regulator or auditor
- No way to replay or reproduce a past decision
- Debugging means re-running, not reviewing
No provenance
Outputs can’t be traced to source facts.
- In healthcare, finance, and legal, this is a hard compliance blocker
- No lineage from inference back to the original document
- Impossible to demonstrate what the agent actually relied on
No reasoning transparency
Black-box answers with no explanation.
- Impossible to validate the reasoning path
- Impossible to contest a specific conclusion
- No basis for improving or correcting future behavior
No conflict detection
Contradictory facts silently coexist in vector stores.
- No detection when two sources disagree
- Outputs become inconsistent and unpredictable over time
- Silent failures compound as the knowledge base grows
These aren’t edge cases. They’re why enterprise AI pilots stall, and why your compliance team keeps saying “not yet.”
What Semantica Adds to Your Stack
Semantica gives every agent the infrastructure it needs to be accountable. Drop it into your existing setup in minutes:
Context Graphs
A structured, queryable graph of everything your agent knows, decides, and reasons about.
- Persistent across agent runs: no context loss between sessions
- Queryable with SPARQL and full graph algorithms
- Temporal model with valid_from/valid_until on nodes and edges
- Point-in-time snapshots of the full knowledge state
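To make the temporal model concrete, here is a minimal standard-library sketch of a point-in-time snapshot over edges that carry valid_from/valid_until. It is not Semantica's API: the Edge class, the is_valid_at and snapshot helpers, and the half-open interval convention are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Edge:
    """Hypothetical edge that holds during [valid_from, valid_until)."""
    subject: str
    predicate: str
    obj: str
    valid_from: date
    valid_until: Optional[date] = None  # None means "still valid"


def is_valid_at(edge: Edge, when: date) -> bool:
    """True if the edge holds at `when` (assumed half-open interval)."""
    if when < edge.valid_from:
        return False
    return edge.valid_until is None or when < edge.valid_until


def snapshot(edges: list[Edge], when: date) -> list[Edge]:
    """Return only the edges that were valid at the given point in time."""
    return [e for e in edges if is_valid_at(e, when)]


edges = [
    Edge("alice", "works_for", "acme", date(2020, 1, 1), date(2023, 6, 1)),
    Edge("alice", "works_for", "globex", date(2023, 6, 1)),
]

print([e.obj for e in snapshot(edges, date(2022, 3, 15))])  # ['acme']
print([e.obj for e in snapshot(edges, date(2024, 1, 1))])   # ['globex']
```

The half-open convention means an edge that ends on the day its successor starts never overlaps it, so a snapshot taken on the changeover date returns exactly one employer.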
Decision Intelligence
Every decision is a first-class object in your system.
- record_decision() captures full lifecycle and causal chain
- Hybrid precedent search over past decisions for consistency
- analyze_decision_impact() shows downstream consequences
- Causal chain visualization from trigger to outcome
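The idea of a causal chain with downstream consequences can be sketched in plain Python. The Decision class and downstream_of function below are hypothetical stand-ins, not the signatures of record_decision() or analyze_decision_impact(); they only show one way to walk from a trigger decision to everything it caused.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Decision:
    """Hypothetical decision record; caused_by holds upstream decision ids."""
    decision_id: str
    summary: str
    caused_by: list[str] = field(default_factory=list)


def downstream_of(decisions: dict[str, Decision], root_id: str) -> list[str]:
    """Breadth-first walk from root_id to every decision it led to."""
    # Invert the caused_by links so we can walk from cause to effect.
    children: dict[str, list[str]] = {}
    for decision in decisions.values():
        for parent in decision.caused_by:
            children.setdefault(parent, []).append(decision.decision_id)

    seen, order, queue = {root_id}, [], deque([root_id])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


log = {
    "d1": Decision("d1", "Flag transaction as suspicious"),
    "d2": Decision("d2", "Freeze account", caused_by=["d1"]),
    "d3": Decision("d3", "Notify compliance officer", caused_by=["d1"]),
    "d4": Decision("d4", "File regulatory report", caused_by=["d2", "d3"]),
}

print(downstream_of(log, "d1"))  # ['d2', 'd3', 'd4']
```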
Full Provenance
Every fact links to its source document and ingestion event.
- W3C PROV-O compliant lineage across all modules
- Full traceability from raw input to final inference
- recorded_at stamping with OWL-Time export
- Audit-ready for HIPAA, SOX, GDPR, FDA 21 CFR Part 11
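As a rough picture of what PROV-O style lineage looks like, the sketch below stores a few triples using standard PROV-O terms (prov:Entity, prov:Activity, prov:used, prov:wasGeneratedBy, prov:wasDerivedFrom) and traces an inference back to its source document. The ex: identifiers and the sources_of helper are made up for illustration, and plain tuples stand in for whatever storage Semantica actually uses.

```python
# (subject, predicate, object) triples; the ex: names are illustrative only.
triples = [
    ("ex:report.pdf", "rdf:type", "prov:Entity"),
    ("ex:ingest-001", "rdf:type", "prov:Activity"),
    ("ex:ingest-001", "prov:used", "ex:report.pdf"),
    ("ex:fact-42", "rdf:type", "prov:Entity"),
    ("ex:fact-42", "prov:wasGeneratedBy", "ex:ingest-001"),
    ("ex:fact-42", "prov:wasDerivedFrom", "ex:report.pdf"),
    ("ex:inference-7", "rdf:type", "prov:Entity"),
    ("ex:inference-7", "prov:wasDerivedFrom", "ex:fact-42"),
]


def sources_of(entity, triples):
    """Follow prov:wasDerivedFrom links back to the original sources.

    Assumes the derivation graph is acyclic, as PROV lineage normally is.
    """
    parents = [o for s, p, o in triples
               if s == entity and p == "prov:wasDerivedFrom"]
    if not parents:
        return [entity]  # nothing upstream: this is a root source
    roots = []
    for parent in parents:
        for root in sources_of(parent, triples):
            if root not in roots:
                roots.append(root)
    return roots


print(sources_of("ex:inference-7", triples))  # ['ex:report.pdf']
```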
Reasoning Engines
Explainable reasoning paths, not black boxes.
- Forward chaining, Rete, deductive, abductive
- SPARQL query-based inference over RDF graphs
- Datalog with recursive Horn clause rules
- Every conclusion backed by a traceable derivation path
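A traceable derivation path is easiest to see in a toy forward chainer. The sketch below is a generic, standard-library illustration over ground (variable-free) facts, not Semantica's reasoning engine; the function names, rule format, and example rules are assumptions.

```python
def forward_chain(facts, rules):
    """Apply rules until nothing new is derived, recording each derivation.

    rules: list of (rule_name, premises, conclusion) over ground atoms.
    Returns {fact: None} for given facts and {fact: (rule_name, premises)}
    for derived ones.
    """
    derivation = {fact: None for fact in facts}
    changed = True
    while changed:
        changed = False
        for name, premises, conclusion in rules:
            if conclusion not in derivation and all(p in derivation for p in premises):
                derivation[conclusion] = (name, premises)
                changed = True
    return derivation


def explain(fact, derivation, depth=0):
    """Return the derivation of `fact` as a list of indented lines."""
    pad = "  " * depth
    step = derivation[fact]
    if step is None:
        return [f"{pad}{fact} (given)"]
    name, premises = step
    lines = [f"{pad}{fact} (by {name})"]
    for premise in premises:
        lines.extend(explain(premise, derivation, depth + 1))
    return lines


# Illustrative rules only.
rules = [
    ("r1", ("on_warfarin", "prescribed_aspirin"), "bleeding_risk"),
    ("r2", ("bleeding_risk",), "needs_review"),
]
derivation = forward_chain({"on_warfarin", "prescribed_aspirin"}, rules)
print("\n".join(explain("needs_review", derivation)))
```

Because every derived fact stores the rule and premises that produced it, the explanation is a lookup rather than a re-run of the reasoning.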
Temporal Intelligence
Your graph knows not just what, but when.
- Allen interval algebra: all 13 temporal relations
- Point-in-time queries over historical graph states
- Temporal provenance stamping on every fact
- OWL-Time export for standards-compliant archiving
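For readers new to Allen interval algebra, the 13 relations can be classified with a handful of comparisons. This is a generic sketch, not Semantica's implementation; the allen_relation name and the snake_case relation labels are chosen here for illustration.

```python
def allen_relation(a, b):
    """Return which of Allen's 13 relations interval a has to interval b.

    Intervals are (start, end) pairs of comparable values with start < end.
    """
    (s1, e1), (s2, e2) = a, b
    if s1 >= e1 or s2 >= e2:
        raise ValueError("intervals must satisfy start < end")

    if e1 < s2:
        return "before"
    if e2 < s1:
        return "after"
    if e1 == s2:
        return "meets"
    if e2 == s1:
        return "met_by"
    if s1 == s2 and e1 == e2:
        return "equals"
    if s1 == s2:
        return "starts" if e1 < e2 else "started_by"
    if e1 == e2:
        return "finishes" if s1 > s2 else "finished_by"
    if s2 < s1 and e1 < e2:
        return "during"
    if s1 < s2 and e2 < e1:
        return "contains"
    # Only partial overlap remains at this point.
    return "overlaps" if s1 < s2 else "overlapped_by"


print(allen_relation((1, 3), (3, 5)))  # meets
print(allen_relation((2, 3), (1, 4)))  # during
print(allen_relation((1, 3), (2, 4)))  # overlaps
```

The same function works for dates or datetimes, since it relies only on ordering comparisons.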
Ontology Hub
Full ontology lifecycle in the browser.
- Visual editor for schema design and editing
- SHACL Studio for constraint authoring and validation
- Alignment authoring across multiple ontologies
- Health dashboard and version control built in
See It In Action
One pip install. A few lines to connect your agent. Everything else becomes traceable.
Full Quickstart
Step-by-step pipeline walkthrough
Cookbook
40+ real-world Jupyter notebooks
Join Discord
Community chat and support
Built for Where Mistakes Have Consequences
Semantica was designed for domains where every decision must be explainable and every fact must be traceable:
Healthcare & Life Sciences
- Clinical decision support with full audit trails
- Drug interaction and contraindication graphs
- Patient safety event tracking and root-cause analysis
- HIPAA-compliant provenance chains out of the box
Finance & Risk
- Fraud detection knowledge graphs
- Risk assessment trails built to survive an audit
- SOX, GDPR, and MiFID II compliance infrastructure
- Model decision lineage for regulatory reporting
Legal & Compliance
- Evidence-backed research with every cited fact provenance-linked
- Contract analysis with traceable clause extraction
- Regulatory change tracking across jurisdictions
- Full reasoning paths ready for court-admissible documentation
Cybersecurity
- Threat attribution graphs linking actors, TTPs, and indicators
- Incident response timelines with full event provenance
- Security audit trails across the complete kill chain
- MITRE ATT&CK-aligned knowledge graph integration
Government & Defense
- Policy decision trails from brief to outcome
- Classified information handling with provenance chains
- Chain-of-custody scrutiny for intelligence reporting
- Air-gapped deployment with local LLM support
Critical Infrastructure
- Power grid state tracking with temporal intelligence
- Transportation safety event graphs
- Emergency response coordination with decision audit trails
- Consequence modeling for high-stakes operational decisions
Start Here
Install Semantica
[all], [neo4j], [pinecone]) and environment setup.
Run the Quickstart
Build a complete knowledge graph pipeline in 5 minutes:
- Ingest documents from any source
- Extract entities and relationships
- Build and query the graph
- Record and trace a decision
Learn the mental model
Core Concepts covers:
- Knowledge graphs vs. vector stores: when to use each
- What GraphRAG is and how Semantica implements it
- How provenance and decision tracking work together
- The accountability layer architecture
Go deep on any module
Every module has a dedicated reference page with:
- Full class and method documentation
- Parameter tables with types and defaults
- Runnable code examples for each feature
Installation
Get Semantica installed in under a minute
Quickstart
Build a complete knowledge graph pipeline in 5 minutes
Core Concepts
The mental model behind the API
API Reference
Exact module, class, and method details
Cookbook
Domain notebooks for real-world use cases
What’s New
v0.5.0: Ontology Hub & Distance Intelligence
Released May 11, 2026
| Area | Highlights |
|---|---|
| Ontology Hub | Visual editor, SHACL Studio, alignment authoring, health dashboard, version control: full ontology lifecycle in the browser |
| Distance Intelligence | Semantic neighborhoods, N×N distance matrices, ego-mode visualization, distance band classification, embedding cache optimization |
| Parquet Ingestion | ParquetIngestor with PyArrow: single file, partitioned directories, Hive-style discovery, selective column reading |
| XML Ingestion | XMLIngestor with XXE-safe lxml backend, XSD/DTD validation, namespace handling, directory scanning |
| Graph Explorer | Landing page redesign, bidirectional path finding, indexed search (0.004ms on 118k nodes) |
| Security | 12 vulnerability fixes: eval injection, pickle deserialization, SQL injection, XXE, SSRF, ReDoS, path traversal |
| Bug Fixes | NER LLM silent fallback on enterprise gateways, ConflictDetector duplicate definition, Windows [all] install, cp1252 crash |
v0.4.0: Temporal Intelligence & Knowledge Explorer
| Area | Highlights |
|---|---|
| Temporal Intelligence | 6-PR system: temporal data model, point-in-time queries, Allen interval algebra (all 13 relations), OWL-Time export |
| Knowledge Explorer API | Full FastAPI backend: 99 tests, 12 export formats, WebSocket progress, thread-safe sessions, audit trail |
| Ontology Foundations | SHACL generation/validation, SKOS vocabulary, ontology alignment API, diff & migration tooling |
| Datalog Reasoning | Pure-Python bottom-up semi-naive fixpoint, recursive Horn clause rules, guaranteed termination |
| Agno Integration | 5 components: graph-backed memory, multi-hop GraphRAG, decision toolkit, KG toolkit, shared team context; 110 tests |
Full Capabilities
Context & Decision Intelligence
Context Graphs
- Structured, persistent graph of entities, relationships, and decisions
- Temporal model with valid_from/valid_until on every node and edge
- Point-in-time queries across historical graph states
- Distance Intelligence: semantic neighborhoods and N×N distance matrices
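An N×N distance matrix and a semantic neighborhood can be illustrated with a few lines of standard-library Python. Everything here is an assumption for the sake of the example: the toy two-dimensional vectors, the choice of cosine distance, the 0.3 radius, and the helper names. The source does not say which metric or thresholds Distance Intelligence uses.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; 0 means same direction, 1 means orthogonal."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    if norm == 0:
        raise ValueError("zero vector has no direction")
    return 1.0 - dot / norm


def distance_matrix(vectors):
    """Symmetric N x N matrix of pairwise cosine distances."""
    n = len(vectors)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = cosine_distance(vectors[i], vectors[j])
    return matrix


def neighborhood(matrix, index, radius):
    """Indices of all other items within `radius` of the item at `index`."""
    return [j for j, d in enumerate(matrix[index]) if j != index and d <= radius]


# Toy embeddings, invented for the example.
embeddings = {
    "aspirin": [1.0, 0.0],
    "ibuprofen": [0.8, 0.6],
    "invoice": [0.0, 1.0],
}
names = list(embeddings)
matrix = distance_matrix([embeddings[name] for name in names])

near_aspirin = [names[j] for j in neighborhood(matrix, 0, radius=0.3)]
print(near_aspirin)  # ['ibuprofen']
```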
Decision Tracking
- record_decision() with full lifecycle management and causal chains
- Hybrid similarity search over past decisions for consistency enforcement
- analyze_decision_impact() and analyze_decision_influence() for consequence modeling
- Ego-mode exploration for targeted neighborhood investigation
Knowledge Engineering
Entity & Relation Extraction
- Named entity recognition: pattern, ML, or LLM methods
- Typed triplet extraction via LLM or rule-based pipelines
- Event extraction with temporal and causal linking
Ontology & Schema
- Ontology Hub: visual editor, SHACL Studio, alignments, health dashboard
- Deduplication v2: blocking_v2, hybrid_v2, semantic_v2, up to 7x faster
- Datalog reasoning: recursive Horn clause rules with fixpoint semantics
- SPARQL reasoning: query-based inference over RDF graphs
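Fixpoint semantics for recursive Horn clauses can be shown with the classic transitive-closure program. The sketch below hand-codes semi-naive evaluation for two fixed rules in plain Python; it is not Semantica's Datalog engine, and the transitive_closure name is an assumption.

```python
def transitive_closure(edges):
    """Semi-naive evaluation of the Datalog program:

        path(X, Y) :- edge(X, Y).
        path(X, Z) :- path(X, Y), edge(Y, Z).
    """
    edges = set(edges)
    path = set(edges)   # every path fact derived so far
    delta = set(edges)  # facts that are new since the previous round
    while delta:
        # Join only the new facts against edge, not the whole path relation.
        derived = {(x, z) for (x, y) in delta for (y2, z) in edges if y == y2}
        delta = derived - path
        path |= delta
    return path


closure = transitive_closure({("a", "b"), ("b", "c"), ("c", "d")})
print(sorted(closure))
# [('a', 'b'), ('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd'), ('c', 'd')]
```

The loop stops when a round derives nothing new. Since only finitely many pairs exist over a finite set of nodes, that point is always reached, even on cyclic graphs.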
Provenance & Auditability
Lineage Tracking
- W3C PROV-O lineage across all modules: every fact has a source
- recorded_at stamping with full OWL-Time export
- Change management with SHA-256 checksums and version control
- Full audit trails from ingestion event to final inference
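Checksum-based change detection is simple to sketch with hashlib. The canonical-JSON encoding, the checksum helper, and the record fields below are assumptions for illustration; the source does not describe how Semantica serializes records before hashing.

```python
import hashlib
import json


def checksum(record: dict) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, no extra spaces)."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Illustrative records only.
v1 = {"subject": "aspirin", "predicate": "interacts_with", "object": "warfarin"}
v1_reordered = {"object": "warfarin", "subject": "aspirin", "predicate": "interacts_with"}
v2 = dict(v1, severity="major")

print(checksum(v1) == checksum(v1_reordered))  # True: key order does not matter
print(checksum(v1) == checksum(v2))            # False: content changed
```

Canonicalizing before hashing matters: without sorted keys, two logically identical records could produce different digests and register as a spurious change.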
Compliance Infrastructure
- HIPAA: patient data handling with audit-ready provenance chains
- SOX / MiFID II: financial decision records with full traceability
- GDPR: data lineage for subject access and right-to-erasure workflows
- FDA 21 CFR Part 11: electronic records and signature compliance
Data Ingestion & Export
Ingestion Formats
- Documents: PDF, DOCX, HTML, PPTX, Docling layout analysis
- Structured data: JSON, CSV, Excel, Parquet, XML
- Sources: web crawl, SQL, Snowflake, feeds, email, code repositories, MCP
Vector Stores
- FAISS, Pinecone, Weaviate, Qdrant, Milvus, PgVector, in-memory
Graph Stores
- Neo4j, FalkorDB, Apache AGE, Amazon Neptune
Export Formats
- RDF: Turtle, JSON-LD, N-Triples, RDF/XML
- Tabular: Parquet, CSV, Arrow
- Graph: GraphML, GEXF, DOT, ArangoDB AQL
- Ontology: OWL, SKOS, SHACL
Module Reference
| Module | What it provides |
|---|---|
| semantica.context | Context graphs, agent memory, decision tracking, causal analysis, precedent search |
| semantica.kg | KG construction, graph algorithms, temporal model, Allen interval algebra |
| semantica.semantic_extract | NER, relation extraction, event extraction, triplet generation |
| semantica.reasoning | Forward chaining, Rete, deductive, abductive, SPARQL, Datalog |
| semantica.ontology | SHACL, SKOS, alignments, diff/migration, auto-generation, OWL/RDF |
| semantica.explorer | FastAPI Knowledge Explorer, Ontology Hub, Distance Intelligence, SHACL Studio |
| semantica.mcp_server | MCP stdio server: 12 tools for Claude Desktop, VS Code, Cursor, Windsurf, Cline |
| semantica.vector_store | FAISS, Pinecone, Weaviate, Qdrant, Milvus, PgVector |
| semantica.graph_store | Neo4j, FalkorDB, Apache AGE, Amazon Neptune |
| semantica.triplet_store | In-memory and persistent RDF triple store with SPARQL |
| semantica.ingest | Files, web, feeds, databases, Snowflake, Parquet, XML, MCP |
| semantica.parse | Document parsing: PDF, DOCX, HTML, PPTX, Docling layout analysis |
| semantica.split | Text chunking: sentence, paragraph, token, semantic boundary strategies |
| semantica.normalize | Text normalization, entity canonicalization, whitespace and encoding cleanup |
| semantica.embeddings | Sentence-Transformers, FastEmbed, OpenAI, BGE, Ollama local embeddings |
| semantica.pipeline | Pipeline DSL, parallel workers, retry policies, failure handling |
| semantica.export | RDF, Parquet, ArangoDB AQL, CSV, OWL, Arrow, GraphML, GEXF, DOT |
| semantica.visualization | Programmatic graph rendering: force, hierarchical, circular, spring layouts |
| semantica.deduplication | Entity deduplication v1/v2, similarity scoring, blocking, merging |
| semantica.conflicts | Conflict detection and resolution across overlapping knowledge sources |
| semantica.provenance | W3C PROV-O lineage tracking, source attribution, audit trails |
| semantica.change_management | Version control with SHA-256 checksums, diff, rollback |
| semantica.llms | Groq, OpenAI, Anthropic, Gemini, Ollama, DeepSeek, Novita AI, LiteLLM, HuggingFace |
| semantica.seed | Foundation graph seeding from CSV, JSON, SQL, API, and RDF sources |
| semantica.evals | Evaluation harness: KG quality, extraction F1, pipeline benchmarking, regression tracking |
| semantica.core | Orchestration, ConfigManager, LifecycleManager, PluginRegistry, MethodRegistry |
| semantica.utils | Logging, validation, progress tracking, hash utilities, nested dict helpers |
Why Semantica?
Open Source, MIT
No vendor lock-in. No paywalled features.
- Full source available on GitHub
- Every line auditable by your security team
- Fork, extend, and self-host with no restrictions
- No telemetry, no usage reporting
Production Ready
Built for teams that can’t afford surprises.
- 1,000+ passing tests with full regression coverage
- PipelineValidator catches configuration errors at startup
- FailureHandler with exponential backoff and dead-letter queues
- 12 security vulnerabilities fixed in v0.5.0
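The general pattern behind exponential backoff with a dead-letter queue looks roughly like the sketch below. It does not reproduce FailureHandler: the function names, parameters, and default delays are hypothetical, and a plain list stands in for the dead-letter queue.

```python
import time


def run_with_retry(task, item, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call task(item); after each failure wait base_delay * 2**attempt and retry."""
    for attempt in range(max_attempts):
        try:
            return task(item)
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller decide what to do
            sleep(base_delay * 2 ** attempt)


def process_all(task, items, dead_letters, **retry_options):
    """Process every item; items that exhaust their retries are dead-lettered."""
    results = []
    for item in items:
        try:
            results.append(run_with_retry(task, item, **retry_options))
        except Exception as exc:
            dead_letters.append((item, repr(exc)))
    return results


def parse_positive(value):
    """Example task that always fails for non-positive input."""
    if value <= 0:
        raise ValueError("not positive")
    return value * 2


waits = []          # record requested delays instead of actually sleeping
dead_letters = []
results = process_all(parse_positive, [1, -1, 3], dead_letters,
                      max_attempts=3, sleep=waits.append)

print(results)       # [2, 6]
print(waits)         # [0.5, 1.0]
print(dead_letters)  # [(-1, "ValueError('not positive')")]
```

Passing sleep in as a parameter keeps the example fast and makes the backoff schedule easy to inspect. A failed item ends up in dead_letters with its error instead of halting the rest of the batch.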
Modular by Design
Import only what you need.
- Use NERExtractor without a graph store
- Use ContextGraph without vector storage
- Every component independently swappable and testable
- No framework lock-in: works with any agent stack
