GraphRAG combines vector similarity with knowledge graph traversal so retrieval finds structurally connected facts, not just text that sounds related. When a ContextGraph is attached to AgentContext, every retrieval call automatically blends semantic search with multi-hop graph expansion — and query_with_reasoning() returns an auditable reasoning path alongside the LLM answer.
There is no separate mode to switch on: passing knowledge_graph= to AgentContext is what activates GraphRAG. The hybrid_alpha constructor parameter and the per-call proximity_weight argument control how much influence graph structure has relative to vector similarity.

Building the graph and loading your intelligence

Before you can query the graph, you need to build it. The setup is three objects: a vector store for embedding-based retrieval, a ContextGraph for structural traversal, and an AgentContext that wires them together.
from semantica.context import AgentContext, ContextGraph
from semantica.vector_store import VectorStore

# FAISS runs locally with no external dependencies
vs = VectorStore(backend="faiss", dimension=768, index_path="intel.faiss")
graph = ContextGraph(advanced_analytics=True)

context = AgentContext(
    vector_store=vs,
    knowledge_graph=graph,
    graph_expansion=True,     # enable multi-hop traversal from seed nodes
    max_expansion_hops=3,     # APT29 → infrastructure → victim → sector is 3 hops
    hybrid_alpha=0.6,         # 60% graph influence, 40% vector similarity
    decision_tracking=True,   # record analyst queries as auditable decisions
)
Now ingest your documents. store() with extract_entities=True runs the full extraction pipeline internally — NER, relation extraction, and entity linking — and populates both the vector index and the graph simultaneously:
intel_documents = [
    {
        "content": "APT29 deployed HAMMERTOSS malware against NATO logistics networks in Jan–Mar 2025. "
                   "C2 infrastructure used Tor exit nodes in AS59796.",
        "metadata": {"source": "FINTEL_2025_0192", "classification": "SECRET//NOFORN"},
    },
    {
        "content": "HAMMERTOSS was subsequently observed on hosts in the LifeCare hospital network "
                   "(AS64496), suggesting lateral movement beyond the initial NATO targets.",
        "metadata": {"source": "FINTEL_2025_0211"},
    },
    {
        "content": "LifeCare operates 47 acute-care hospitals and is classified as Tier-1 "
                   "healthcare critical infrastructure under CISA Sector 6.",
        "metadata": {"source": "CISA_CI_REGISTRY_2025"},
    },
    {
        "content": "Healthcare critical infrastructure has been a high-priority targeting class "
                   "for Russian state-sponsored threat actors since 2022.",
        "metadata": {"source": "NCSC_ADVISORY_2024_12"},
    },
]

stats = context.store(
    intel_documents,
    extract_entities=True,
    extract_relationships=True,
    link_entities=True,    # merge duplicate entity mentions across documents
)

print("Graph built: {} nodes, {} edges".format(
    stats["graph_nodes"], stats["graph_edges"]
))
# Graph built: 18 nodes, 14 edges
# Nodes: APT29, HAMMERTOSS, NATO, LifeCare, AS59796, CISA Sector 6, ...
# Edges: deployed, observed_on, classified_as, targets, operates_in, ...
The graph now contains a connected subgraph linking APT29 to healthcare infrastructure across four separate documents. No single document states that connection, so it would be invisible to a pure vector search.
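To make the cross-document linking concrete, here is a minimal standalone sketch that does not use the semantica API. It assumes a hypothetical list of already-extracted triples per document and a toy alias table standing in for entity linking; the names `ALIASES`, `canonical`, `build_graph`, and `find_path` are illustrative only, and the real extraction pipeline is likely far more involved.

```python
from collections import defaultdict

# Hypothetical extracted triples, one list per source document.
# Entity names and relation labels are illustrative, not real extractor output.
triples_by_doc = {
    "FINTEL_2025_0192": [("APT29", "deployed", "HAMMERTOSS")],
    "FINTEL_2025_0211": [("Hammertoss", "observed_on", "LifeCare")],
    "CISA_CI_REGISTRY_2025": [("LifeCare Inc.", "classified_as", "Healthcare CI")],
}

# Toy alias table standing in for entity linking (an assumption for this sketch).
ALIASES = {"hammertoss": "HAMMERTOSS", "lifecare inc.": "LifeCare"}


def canonical(name: str) -> str:
    """Map a surface mention to a single canonical node id."""
    return ALIASES.get(name.lower(), name)


def build_graph(docs):
    """Return adjacency: node -> list of (relation, neighbour, source_doc)."""
    adjacency = defaultdict(list)
    for doc_id, triples in docs.items():
        for subj, rel, obj in triples:
            adjacency[canonical(subj)].append((rel, canonical(obj), doc_id))
    return adjacency


def find_path(adjacency, start, goal):
    """Depth-first search returning the first edge path from start to goal."""
    stack = [(start, [])]
    seen = {start}
    while stack:
        node, path = stack.pop()
        if node == goal:
            return path
        for rel, nxt, doc_id in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append((nxt, path + [(node, rel, nxt, doc_id)]))
    return None


toy_graph = build_graph(triples_by_doc)
path = find_path(toy_graph, "APT29", "Healthcare CI")
for subj, rel, obj, doc_id in path:
    print(f"{subj} -[{rel}]-> {obj}   (from {doc_id})")
```

Each edge in the printed path comes from a different document. The path only exists because the mentions "Hammertoss" and "LifeCare Inc." were merged into the same nodes as "HAMMERTOSS" and "LifeCare".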

Retrieving the relevant subgraph

With the graph populated, a plain retrieve() call already does more than vector search. When use_graph=True, the retriever seeds the graph traversal from the top-k vector matches and expands outward by following edges, collecting connected facts within max_hops:
results = context.retrieve(
    "APT29 tactics against healthcare",
    use_graph=True,
    proximity_weight=0.5,   # blend structural proximity into the final score
    max_results=10,
    expand_graph=True,
    max_hops=3,
)

for r in results:
    print("[combined={:.3f}  vec={:.3f}  prox={:.3f}]  {}".format(
        r.get("combined_score", r["score"]),
        r["score"],
        r.get("proximity_score", 0.0),
        r["content"][:90],
    ))

# [combined=0.921  vec=0.884  prox=0.957]  APT29 deployed HAMMERTOSS malware against NATO...
# [combined=0.887  vec=0.701  prox=0.972]  HAMMERTOSS was subsequently observed on hosts in the LifeCare...
# [combined=0.841  vec=0.623  prox=0.961]  LifeCare operates 47 acute-care hospitals...
# [combined=0.798  vec=0.590  prox=0.907]  Healthcare critical infrastructure has been a high-priority...
Notice the third and fourth results: their vector scores are modest (0.623 and 0.590) because neither document mentions APT29 or TTPs. But their proximity scores are high because they are structurally adjacent to the seed nodes in the graph. Pure vector retrieval would have ranked them much lower or excluded them entirely. GraphRAG surfaces them because the graph knows they are connected.
When you know specifically which entity you want to anchor the traversal to, pass anchor_node:
# Anchor on APT29 explicitly — proximity scores are calculated from this node
apt29_intel = context.retrieve(
    "C2 infrastructure beaconing patterns",
    use_graph=True,
    anchor_node="APT29",
    proximity_weight=0.7,   # strongly favour nodes close to APT29
    max_hops=3,
    max_results=8,
)

Getting a grounded LLM answer with a reasoning path

retrieve() gives you the grounded context. query_with_reasoning() goes one step further: it passes that subgraph context to an LLM and returns the answer together with the multi-hop path the retrieval system traced through the graph. That path is your audit trail.
from semantica.llms import LiteLLM

llm = LiteLLM(model="anthropic/claude-sonnet-4-20250514")

result = context.query_with_reasoning(
    "What are APT29's known TTPs against healthcare infrastructure, "
    "and what is the evidence chain connecting them?",
    llm_provider=llm,
    max_results=12,
    max_hops=3,
)

# The LLM answer — grounded in graph-retrieved context, not training memory
print(result["response"])

# The multi-hop trace: APT29 → deployed → HAMMERTOSS → observed_on → LifeCare → ...
print("\n--- Reasoning Path ---")
print(result["reasoning_path"])

# Confidence reflects how well the retrieved context supports the answer
print("\nConfidence: {:.1%}".format(result["confidence"]))

# Inspect every source the LLM was given
print("\nSources ({} total):".format(result["num_sources"]))
for src in result["sources"]:
    print("  [{:.3f}] {}".format(src["score"], src["content"][:80]))
The reasoning_path field is what separates GraphRAG from a black-box LLM call. When an analyst asks “how do you know APT29 targeted healthcare?”, you can show them the exact traversal the system made across your own documents rather than a claim the model generated from training data.
The full return structure from query_with_reasoning():
{
    "response":             str,   # LLM-generated answer, grounded in retrieved subgraph
    "reasoning_path":       str,   # multi-hop traversal narrative
    "sources":              list,  # list of retrieved context dicts with scores
    "confidence":           float, # 0–1 aggregate confidence
    "num_sources":          int,
    "num_reasoning_paths":  int,
}
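One way to use this structure downstream is to turn it into a plain-text audit record. The sketch below is an illustration, not part of the library: `ReasoningResult` simply mirrors the fields listed above, `audit_record` and its `min_confidence` threshold are hypothetical, and `example` is a hand-written stand-in for a real query_with_reasoning() result.

```python
from typing import TypedDict


class ReasoningResult(TypedDict):
    """Mirror of the documented return fields (assumed shape)."""
    response: str
    reasoning_path: str
    sources: list
    confidence: float
    num_sources: int
    num_reasoning_paths: int


def audit_record(question: str, result: ReasoningResult,
                 min_confidence: float = 0.5) -> str:
    """Format a result as an audit record; the threshold is a hypothetical default."""
    missing = [key for key in ReasoningResult.__annotations__ if key not in result]
    if missing:
        raise KeyError(f"result is missing fields: {missing}")
    flag = "OK" if result["confidence"] >= min_confidence else "LOW CONFIDENCE"
    lines = [
        f"Question:   {question}",
        f"Answer:     {result['response']}",
        f"Path:       {result['reasoning_path']}",
        f"Confidence: {result['confidence']:.1%} [{flag}]",
        f"Sources:    {result['num_sources']}",
    ]
    for src in result["sources"]:
        lines.append(f"  [{src['score']:.3f}] {src['content'][:80]}")
    return "\n".join(lines)


# Hand-written stand-in for a real result; all values are made up.
example: ReasoningResult = {
    "response": "APT29 deployed HAMMERTOSS, later seen on LifeCare hosts.",
    "reasoning_path": "APT29 -> deployed -> HAMMERTOSS -> observed_on -> LifeCare",
    "sources": [
        {"score": 0.884, "content": "APT29 deployed HAMMERTOSS malware against NATO..."},
        {"score": 0.701, "content": "HAMMERTOSS was subsequently observed on hosts..."},
    ],
    "confidence": 0.82,
    "num_sources": 2,
    "num_reasoning_paths": 1,
}

print(audit_record("How do you know APT29 targeted healthcare?", example))
```

Checking for missing keys up front means a change in the returned structure raises an error immediately instead of producing an incomplete record.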

Domain examples

Multi-INT intelligence fusion: OSINT threat feeds, NVD CVE data, and HUMINT summaries are ingested into a single graph, then queried with multi-hop reasoning to trace C2 infrastructure chains and attribute campaigns to specific actors.
In classified environments the graph can be partitioned by data handling caveat, so that each AgentContext operates over the subset of documents cleared for the querying user. The reasoning_path output doubles as a sanitisable audit trail for downgraded reporting.
from semantica.context import AgentContext, ContextGraph
from semantica.vector_store import VectorStore
from semantica.llms import LiteLLM

vs    = VectorStore(backend="faiss", dimension=768)
graph = ContextGraph()

context = AgentContext(
    vector_store=vs,
    knowledge_graph=graph,
    graph_expansion=True,
    max_expansion_hops=3,  # actor → infra → victim → attribution chain
    hybrid_alpha=0.6,      # graph-heavy: structured intel benefits from topology
    decision_tracking=True,
)

# Ingest multi-INT corpus
humint_summary = """
HUMINT-2025-Q1-007: Source BRAVO-9 confirms APT29 operating from
infrastructure in AS59796. C2 beacons use Tor exit nodes in DE/NL.
Targets: ITAR-controlled defense contractors in aerospace sector.
"""
cti_report_text = "APT29 exploited CVE-2025-3400 in PAN-OS GlobalProtect to gain initial access..."

context.store(
    [
        {"content": humint_summary,    "metadata": {"source": "HUMINT-2025-Q1-007"}},
        {"content": cti_report_text,   "metadata": {"source": "CTI_RPT_APT29_2025"}},
    ],
    extract_entities=True,
    extract_relationships=True,
    link_entities=True,
)

llm    = LiteLLM(model="anthropic/claude-sonnet-4-20250514")
result = context.query_with_reasoning(
    "Trace the C2 infrastructure chain for APT29 operations targeting "
    "ITAR-controlled contractors in 2025. Include IP ranges, ASNs, and TTPs.",
    llm_provider=llm,
    max_results=15,
    max_hops=3,
)

print(result["response"])
print("\n--- Reasoning Path ---")
print(result["reasoning_path"])
print("Confidence: {:.1%}".format(result["confidence"]))

# Anchor retrieval on APT29 for a proximity-weighted follow-up
proximate = context.retrieve(
    "C2 beaconing patterns Tor exit nodes",
    use_graph=True,
    anchor_node="APT29",
    proximity_weight=0.7,
    max_hops=3,
    max_results=10,
)

Tuning the vector-graph balance

The hybrid_alpha parameter set in the AgentContext constructor establishes a default blend between vector similarity and graph influence. 0.0 is pure vector retrieval; 1.0 is pure graph traversal. The recommended starting point is 0.5. You can override this per call using proximity_weight in retrieve() without changing the constructor default:
# Exploratory query — let semantics lead, graph confirms
results = context.retrieve(query, use_graph=True, proximity_weight=0.2)

# Known-entity tracing — topology drives the retrieval
results = context.retrieve(
    query, use_graph=True, anchor_node="APT29", proximity_weight=0.8
)
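The effect of proximity_weight on ranking can be illustrated with a simple linear blend. This is a sketch under an assumption: the library's actual fusion formula is not stated here and may differ. The function names `fuse` and `rank` and all scores are made up for illustration.

```python
def fuse(vec_score: float, prox_score: float, proximity_weight: float) -> float:
    """Linear blend (assumed): 0.0 is pure vector, 1.0 is pure graph proximity."""
    if not 0.0 <= proximity_weight <= 1.0:
        raise ValueError("proximity_weight must be in [0, 1]")
    return (1.0 - proximity_weight) * vec_score + proximity_weight * prox_score


# Toy candidates: (label, vector score, proximity score). Values are made up.
candidates = [
    ("semantically close, graph-distant", 0.90, 0.20),
    ("semantically modest, graph-adjacent", 0.60, 0.95),
]


def rank(items, proximity_weight):
    """Return labels ordered by fused score, best first."""
    scored = [(fuse(v, p, proximity_weight), label) for label, v, p in items]
    return [label for _, label in sorted(scored, reverse=True)]


for weight in (0.2, 0.8):
    print(weight, rank(candidates, weight))
```

With a weight of 0.2 the semantically close candidate ranks first. With 0.8 the graph-adjacent candidate overtakes it, which is the behaviour the two calls above rely on.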
Subgraph size can grow roughly exponentially with max_hops, because each additional hop multiplies the frontier by about the average node degree. Practical defaults by domain:
Domain                  Default               Rationale
General Q&A             max_expansion_hops=2  95% of useful facts within 2 hops
Threat intel (APT)      max_expansion_hops=3  actor → infra → victim → attribution
Drug interactions       max_expansion_hops=3  drug → enzyme → metabolite → interaction
Regulatory cross-ref    max_expansion_hops=2  rule → article → article
Set max_expansion_hops globally in the constructor; override it per call with the max_hops argument to retrieve().
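A quick back-of-the-envelope calculation shows why the hop limit matters. The sketch assumes an idealised graph in which every node contributes the same number of new neighbours, so it gives an upper bound rather than a measurement of any real graph. The function name and the branching factor of 5 are illustrative.

```python
def max_subgraph_size(branching: int, hops: int) -> int:
    """Upper bound on nodes within `hops` of one seed when every node adds
    `branching` new neighbours: 1 + b + b^2 + ... + b^hops."""
    return sum(branching ** h for h in range(hops + 1))


for hops in (1, 2, 3, 4):
    print(hops, max_subgraph_size(branching=5, hops=hops))
# 1 6
# 2 31
# 3 156
# 4 781
```

Going from 2 to 3 hops multiplies the candidate set roughly fivefold in this toy setting. Real graphs share neighbours and so grow more slowly, but the trend is the same.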

How GraphRAG works internally

Query text
    |
    v
Vector embedding  ─────────────────────────────────────┐
    |                                                   |
    v                                                   v
Semantic search                         Graph traversal (BFS)
(FAISS / Qdrant)                        from anchor / top-k seeds
    |                                                   |
    └──────────┐                ┌──────────────────────┘
               v                v
         Score fusion (proximity_weight blend)
               |
               v
          Ranked subgraph
               |
               v
         LLM grounding  <── query_with_reasoning()
               |
               v
     {response, reasoning_path, sources, confidence}
Vector search and graph traversal each produce their own score, and the two scores are then fused. The stages are not fully independent: traversal expands breadth-first from the top-k seed nodes that vector search identifies, or from anchor_node when one is supplied. Seeded this way, the graph component stays tied to the query rather than exploring the entire graph blindly.
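The seeding step can be sketched in plain Python, independent of the library. In the sketch below, the toy graph, the two-dimensional "embeddings", and the `1 / (1 + hops)` proximity decay are all assumptions chosen for illustration; the library's internal data structures and scoring are not documented here.

```python
import math
from collections import deque


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_seeds(query_vec, node_vecs, k):
    """Pick the k nodes whose vectors are most similar to the query."""
    ranked = sorted(node_vecs, key=lambda n: cosine(query_vec, node_vecs[n]),
                    reverse=True)
    return ranked[:k]


def expand(adjacency, seeds, max_hops):
    """Multi-source BFS: node -> hop distance from the nearest seed."""
    dist = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # hop budget exhausted on this branch
        for nxt in adjacency.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


# Toy graph and toy 2-d vectors (all values are made up).
adjacency = {
    "APT29": ["HAMMERTOSS"],
    "HAMMERTOSS": ["APT29", "LifeCare"],
    "LifeCare": ["HAMMERTOSS", "Healthcare CI"],
    "Healthcare CI": ["LifeCare"],
    "Unrelated": [],
}
node_vecs = {
    "APT29": [1.0, 0.0],
    "HAMMERTOSS": [0.8, 0.2],
    "LifeCare": [0.2, 0.8],
    "Healthcare CI": [0.0, 1.0],
    "Unrelated": [-1.0, 0.0],
}

seeds = top_k_seeds([1.0, 0.1], node_vecs, k=1)
hops = expand(adjacency, seeds, max_hops=2)
# Assumed decay: closer nodes get higher proximity scores.
proximity = {node: 1.0 / (1 + d) for node, d in hops.items()}
print(seeds, hops, proximity)
```

"LifeCare" is retrieved even though its vector is far from the query, because it lies within two hops of the seed. "Healthcare CI" is three hops away and is left out under max_hops=2, and "Unrelated" is never visited.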