Where to start:
Prerequisites: Python 3.8+, Jupyter, and an API key for your preferred LLM provider.

Your First Knowledge Graph

Go from raw text to a queryable knowledge graph in 20 minutes.
Topics: Extraction, Graph Construction, Visualization · Difficulty: Beginner

GraphRAG Complete

Build a production-ready Graph Retrieval Augmented Generation system with hybrid retrieval and logical inference.
Topics: RAG, LLMs, Vector Search, Graph Traversal · Difficulty: Advanced

RAG vs. GraphRAG Comparison

Side-by-side benchmark of standard RAG vs. GraphRAG on real-world data.
Topics: RAG, GraphRAG, Benchmarking · Difficulty: Intermediate

Real-Time Anomaly Detection

Detect anomalies in streaming data using dynamic knowledge graphs.
Topics: Streaming, Security, Dynamic Graphs · Difficulty: Advanced

Core Tutorials

Essential guides to master the Semantica framework.

Welcome to Semantica

An interactive introduction to the framework’s core philosophy and all modules.
Topics: Framework Overview, Architecture · Difficulty: Beginner

Data Ingestion

Loading data from files, web, databases, streams, feeds, repositories, email, and MCP.
Topics: FileIngestor, WebIngestor, DBIngestor, Streams · Difficulty: Beginner

Document Parsing

Extracting clean text from complex formats like PDF, DOCX, and HTML.
Topics: OCR, PDF Parsing, Text Extraction · Difficulty: Beginner

Data Normalization

Pipelines for cleaning, normalizing, and preparing text.
Topics: Text Cleaning, Unicode, Formatting · Difficulty: Beginner

Entity Extraction

Using NER to identify people, organizations, and custom entities.
Topics: NER, spaCy, LLM Extraction · Difficulty: Beginner

Relation Extraction

Discovering and classifying relationships between entities.
Topics: Relation Classification, Dependency Parsing · Difficulty: Beginner

Embedding Generation

Creating and managing vector embeddings for semantic search.
Topics: Embeddings, OpenAI, HuggingFace · Difficulty: Intermediate

Vector Store

Setting up vector stores for similarity search and retrieval.
Difficulty: Intermediate

Graph Store

Persisting knowledge graphs in Neo4j or FalkorDB.
Topics: Neo4j, Cypher, Persistence · Difficulty: Intermediate

Ontology

Defining domain schemas and ontologies to structure your data.
Topics: OWL, RDF, Schema Design · Difficulty: Intermediate

Advanced Concepts

Deep dive into advanced features, customization, and complex workflows.

Advanced Extraction

Custom extractors, LLM-based extraction, and complex pattern matching.
Topics: Custom Models, Regex, LLMs · Difficulty: Advanced

Advanced Graph Analytics

Centrality, community detection, and pathfinding algorithms.
Topics: PageRank, Louvain, Shortest Path · Difficulty: Advanced

Advanced Context Engineering

Production-grade memory system for AI agents using FAISS and Neo4j.
Topics: Agent Memory, GraphRAG, Entity Injection · Difficulty: Advanced

Complete Visualization Suite

Interactive, publication-ready visualizations of your graphs.
Topics: PyVis, NetworkX, D3.js · Difficulty: Intermediate

Conflict Resolution

Strategies for handling contradictory information from multiple sources.
Topics: Truth Discovery, Voting, Confidence · Difficulty: Advanced

Multi-Format Export

Exporting to RDF, OWL, JSON-LD, and NetworkX formats.
Topics: Serialization, Interoperability · Difficulty: Intermediate

Multi-Source Integration

Merging data from disparate sources into a unified graph.
Topics: Entity Resolution, Merging, Fusion · Difficulty: Advanced

Pipeline Orchestration

Building robust, automated data processing pipelines.
Topics: Workflows, Automation, Error Handling · Difficulty: Advanced

Reasoning and Inference

Using logical reasoning to infer new knowledge from existing facts.
Topics: Logic Rules, Inference Engines · Difficulty: Advanced

Temporal Knowledge Graphs

Modeling and querying data that changes over time.
Topics: Time Series, Temporal Logic, Allen Algebra · Difficulty: Advanced

Industry Use Cases

Biomedical

Drug Discovery Pipeline

Accelerating drug discovery by connecting genes, proteins, and drugs using PubMed RSS feeds, entity-aware chunking, GraphRAG, and vector similarity search.
Topics: Bioinformatics, KG Construction, GraphRAG · Difficulty: Advanced

Genomic Variant Analysis

Analyzing genomic variants and their implications using bioRxiv RSS feeds, temporal KGs, deduplication, and pathway analysis.
Topics: Genomics, Temporal KGs, Graph Analytics · Difficulty: Advanced

Finance

Financial Data Integration (MCP)

Merging financial data from the Alpha Vantage API, MCP servers, RSS feeds, and market feeds.
Topics: Finance, Data Fusion, MCP Integration · Difficulty: Intermediate

Fraud Detection

Identifying fraudulent activities in transaction networks using temporal KGs, conflict detection, and pattern recognition.
Topics: Anomaly Detection, Graph Mining, Temporal Analysis · Difficulty: Advanced

Blockchain

DeFi Protocol Intelligence

Analyzing decentralized finance protocols and transaction flows using CoinDesk RSS feeds, ontology-aware chunking, and conflict detection.
Topics: Blockchain, DeFi, Smart Contracts, Ontology · Difficulty: Advanced

Transaction Network Analysis

Mapping and analyzing blockchain transaction networks using deduplication and network pattern detection.
Topics: Blockchain Analytics, Network Analysis · Difficulty: Advanced

Cybersecurity

Real-Time Anomaly Detection

Detecting anomalies in real-time network traffic streams using CVE RSS feeds, Kafka streams, and temporal KGs.
Topics: Network Security, Streaming, Temporal KGs · Difficulty: Advanced

Threat Intelligence Hybrid RAG

Combining enhanced GraphRAG with threat intelligence for security insights.
Topics: Threat Intelligence, GraphRAG, Hybrid Retrieval · Difficulty: Advanced

Intelligence

Criminal Network Analysis

Analyzing criminal networks and detecting key players using OSINT RSS feeds and network centrality analysis.
Topics: Forensics, Social Network Analysis · Difficulty: Advanced

Intelligence Analysis Orchestrator

Comprehensive intelligence analysis using the pipeline orchestrator with multiple RSS feeds and multi-source integration.
Topics: Intelligence Analysis, Pipeline Orchestration · Difficulty: Advanced

Renewable Energy & Supply Chain

Energy Market Analysis

Analyzing trends and pricing in the renewable energy market using the EIA API, temporal KGs, and TemporalPatternDetector.
Topics: Energy, Time Series, Temporal Analysis · Difficulty: Intermediate

Supply Chain Data Integration

Integrating supply chain data to optimize logistics and reduce risk.
Topics: Logistics, Risk Management, Deduplication · Difficulty: Advanced

How to Run

1. Install Semantica

    pip install "semantica[all]"
    pip install jupyter

2. Clone the repository (optional, for a source install instead of step 1)

    git clone https://github.com/semantica-agi/semantica.git
    cd semantica
    pip install -e ".[all]"
    pip install jupyter

3. Launch Jupyter

    jupyter notebook

You can also run the cookbook using Docker:

    docker run -p 8888:8888 hawksight/semantica-cookbook
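
Before opening a notebook, it can help to confirm that the prerequisites listed at the top (Python 3.8+, Jupyter, and an LLM API key) are in place. The following is a minimal sketch, not part of Semantica itself. It assumes the package imports as `semantica`, and the environment variable names in `CANDIDATE_KEY_VARS` are illustrative guesses; replace them with whatever your provider expects.

```python
import importlib.util
import os
import shutil
import sys

# Assumption: illustrative variable names only; adjust for your LLM provider.
CANDIDATE_KEY_VARS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")


def check_prerequisites(env=None, min_python=(3, 8)):
    """Return a dict mapping each prerequisite name to True or False."""
    env = os.environ if env is None else env
    return {
        # Interpreter version meets the stated minimum.
        "python": sys.version_info >= min_python,
        # The `jupyter` command used in step 3 is on PATH.
        "jupyter": shutil.which("jupyter") is not None,
        # Assumption: the installed package is importable as `semantica`.
        "semantica": importlib.util.find_spec("semantica") is not None,
        # At least one candidate API key variable is set and non-empty.
        "api_key": any(env.get(name) for name in CANDIDATE_KEY_VARS),
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name:10s} {'ok' if ok else 'MISSING'}")
```

Any entry reported as MISSING points back to the matching install step or to the API key setup for your provider.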