AgentContext maintains a persistent memory layer for LLM agents — storing observations as vector embeddings, retrieving them by semantic similarity, and optionally blending graph proximity into the ranking. Use it when your agent needs to recall past findings across sessions without re-reading source material on every restart.
This guide covers the memory layer. For graph-enriched traversal and entity linking, see Context Graphs. For decision accountability — recording, auditing, and causally tracing what the agent chose — see Decision Intelligence.

Setting Up a Persistent Memory Context

Configure the vector store, knowledge graph, and AgentContext together at startup. The FAISS index persists to its index_path on its own; the full context, including the graph, is written to a single directory when you call save().
from semantica.context import AgentContext, ContextGraph
from semantica.vector_store import VectorStore

# The FAISS index persists to disk at index_path — restart-safe
ti_vs = VectorStore(
    backend="faiss",
    dimension=768,
    index_path="ti_agent/memory.faiss",
)

# The ContextGraph holds entity nodes and their relationships
ti_graph = ContextGraph(advanced_analytics=True)

# AgentContext orchestrates everything
ti_agent = AgentContext(
    vector_store=ti_vs,
    knowledge_graph=ti_graph,
    retention_days=365,          # CTI reports stay relevant for a year
    max_memories=50000,          # ring buffer ceiling — oldest evicted first
    graph_expansion=True,        # enable multi-hop graph traversal in retrieve()
    max_expansion_hops=2,
    hybrid_alpha=0.5,            # 50% semantic score / 50% graph-structural score
    decision_tracking=True,      # enable record_decision() and find_precedents()
    kg_algorithms=True,          # Node2Vec embeddings, centrality, link prediction
)
The hybrid_alpha parameter controls how retrieval blends semantic similarity (pure vector search) with graph-structural similarity (topology of the knowledge graph). At 0.5 the agent treats both signals equally. For a freshly ingested corpus with a sparse graph, you might start closer to 0.0 and increase as the graph fills in.
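To make the blend concrete, here is a minimal sketch of a linear mix of two scores. It assumes hybrid_alpha is the weight on the graph-structural score (so 0.0 means pure vector search), which is consistent with the advice above but is not confirmed by the library. blend_scores is a hypothetical helper, not part of AgentContext.

```python
def blend_scores(semantic: float, graph: float, alpha: float) -> float:
    """Linear blend: alpha weights the graph score, (1 - alpha) the semantic score."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0.0 and 1.0")
    return (1.0 - alpha) * semantic + alpha * graph


# A memory that matches the query text well but sits in a sparse graph region.
# Both scores are made-up illustration values.
semantic_score, graph_score = 0.9, 0.1

for alpha in (0.0, 0.5, 1.0):
    print(alpha, round(blend_scores(semantic_score, graph_score, alpha), 3))
```

Under this assumption, a low alpha keeps a strong text match near the top even when the graph has little to say about it yet.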

Storing What the Agent Learns

Single observations

Every piece of intelligence the agent processes can be stored with a single call. The string is embedded and indexed immediately; the optional metadata travels with it and is available in every retrieval result.
# Store a single finding from an OSINT feed
memory_id = ti_agent.store(
    "APT29 uses HAMMERTOSS for C2 communication over Twitter and GitHub",
    metadata={
        "source": "mandiant_apt29_report",
        "actor": "APT29",
        "technique": "T1102",   # Web Service
        "tlp": "WHITE",
    },
)
# memory_id is a UUID string — use it to retrieve or forget this item later

# Tag observations to an active incident so they can be retrieved as a group
ti_agent.store(
    "New C2 indicator: c2-upd4te[.]ru resolves to 185.220.101.47, cert hash a3f4b8c1...",
    metadata={"type": "ioc", "confidence": 0.92, "source": "internal_hunt"},
    conversation_id="incident_ir2025_0847",
    user_id="analyst_zhang",
)
The conversation_id acts as a namespace. Memories tagged with incident_ir2025_0847 can be retrieved as a group later — useful for building a per-incident context window without polluting the global search index.
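The namespace idea can be pictured as a simple group-by over stored items. The sketch below uses plain dictionaries as a hypothetical stand-in for stored memories; it does not reflect the library's internal data model.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for stored memories.
memories = [
    {"content": "New C2 indicator: c2-upd4te[.]ru", "conversation_id": "incident_ir2025_0847"},
    {"content": "APT29 uses HAMMERTOSS for C2", "conversation_id": None},
    {"content": "Lateral movement from WKSTN-047", "conversation_id": "incident_ir2025_0847"},
]

# Group contents by conversation_id; untagged items land under the None key.
by_conversation = defaultdict(list)
for item in memories:
    by_conversation[item["conversation_id"]].append(item["content"])

incident_thread = by_conversation["incident_ir2025_0847"]
print(len(incident_thread))
```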

Ingesting document corpora

When store() receives a list, it treats each element as a document, builds a graph of entities and relationships extracted from the text, and returns statistics about what was created.
stats = ti_agent.store(
    [
        {
            "content": "APT29 infrastructure cluster: 185.220.101.0/24, AS200651",
            "metadata": {"source": "shadowserver", "actor": "APT29", "ioc_type": "network"},
        },
        {
            "content": "SolarWinds supply chain compromise attributed to APT29, 2020",
            "metadata": {"source": "us_cert_aa20-352a", "actor": "APT29", "campaign": "SUNBURST"},
        },
        {
            "content": "NOBELIUM (APT29) leverages OAuth token theft against cloud workloads",
            "metadata": {"source": "msft_blog_2023", "actor": "APT29", "technique": "T1528"},
        },
    ],
    extract_entities=True,       # extract actor, IP, CVE, technique nodes
    extract_relationships=True,  # link actor → campaign → technique → infrastructure
    link_entities=True,          # merge duplicate entity mentions across docs
)

print("Stored: {}, Graph nodes: {}, Graph edges: {}".format(
    stats["stored_count"],   # 3 — one per document
    stats["graph_nodes"],    # entities extracted and upserted into the graph
    stats["graph_edges"],    # relationships between those entities
))
After this call the knowledge graph should contain nodes for the entities in these three documents, such as APT29, the infrastructure subnet, the SUNBURST campaign, and OAuth token theft, all linked to each other. Those graph links are what enable multi-hop retrieval: ask about “cloud OAuth attacks” and the agent can follow the graph from the technique node back to APT29 and then forward to the infrastructure indicators.
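The multi-hop walk described above can be sketched as a breadth-first search over an adjacency map. The edges below are hypothetical and only loosely modeled on the ingested documents; the graph the library actually extracts may differ, and hops_from is an illustrative helper rather than a library function.

```python
from collections import deque

# Hypothetical undirected edges, loosely based on the documents ingested above.
edges = [
    ("OAuth token theft", "APT29"),
    ("APT29", "SUNBURST"),
    ("APT29", "185.220.101.0/24"),
]

graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)


def hops_from(start, max_hops):
    """Breadth-first search returning {node: hop distance} within max_hops."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


reachable = hops_from("OAuth token theft", max_hops=2)
print(sorted(reachable.items(), key=lambda kv: kv[1]))
```

With a two-hop limit, the technique node reaches the actor in one hop and the infrastructure subnet in two, which is the path the paragraph describes.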

Retrieving Relevant Memory

Semantic retrieval

The most direct retrieval call searches by semantic similarity — no keyword match required. The embedding of your query is compared against all stored memory embeddings, and the top matches are returned with scores.
results = ti_agent.retrieve(
    "cloud OAuth token theft campaigns",
    max_results=8,
    min_score=0.2,
)

for r in results:
    actor = r.get("metadata", {}).get("actor", "unknown")
    print("[{:.3f}]  [{}]  {}".format(r["score"], actor, r["content"][:80]))

# [0.912]  [APT29]  NOBELIUM (APT29) leverages OAuth token theft against cloud workloads
# [0.741]  [APT29]  SolarWinds supply chain compromise attributed to APT29, 2020
# [0.683]  [APT29]  APT29 infrastructure cluster: 185.220.101.0/24, AS200651
The agent found the OAuth finding at the top — not because the query contained the exact phrase, but because the embedding space places “cloud OAuth token theft campaigns” close to “NOBELIUM leverages OAuth token theft against cloud workloads.”
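For intuition about how similarity ranking with a score floor works, the following sketch ranks toy vectors by cosine similarity. The three-dimensional vectors are invented for illustration, and cosine and top_matches are hypothetical helpers; the library's own scoring function is not shown in this guide and may differ.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_matches(query_vec, items, max_results=8, min_score=0.2):
    """Score every item, drop those below min_score, return the best first."""
    scored = [(cosine(query_vec, vec), text) for text, vec in items]
    kept = [pair for pair in scored if pair[0] >= min_score]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)[:max_results]


# Toy 3-dimensional vectors; real embeddings have hundreds of dimensions.
items = [
    ("OAuth token theft against cloud workloads", [0.9, 0.1, 0.0]),
    ("SolarWinds supply chain compromise", [0.5, 0.5, 0.1]),
    ("Unrelated phishing awareness memo", [0.0, 0.1, 0.9]),
]
query = [1.0, 0.0, 0.0]

for score, text in top_matches(query, items):
    print("[{:.3f}]  {}".format(score, text))
```

The unrelated item is orthogonal to the query, so it falls below min_score and is dropped rather than ranked last.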

Graph-anchored retrieval with proximity scoring

When you have a specific entity as the center of your investigation, anchor the retrieval to that node and blend semantic score with graph-proximity score.
results = ti_agent.retrieve(
    "cloud OAuth token theft campaigns",
    max_results=10,
    use_graph=True,
    anchor_node="APT29",      # BFS starts from this node in the knowledge graph
    max_hops=3,
    proximity_weight=0.35,    # 65% semantic + 35% proximity — tune to your graph density
    min_score=0.1,
)

for r in results:
    # combined_score blends semantic score and graph proximity
    score = r.get("combined_score", r["score"])
    hop  = r.get("hop_distance", "-")
    band = r.get("distance_band", "-")  # "direct", "near", "mid-range", "distant"
    print("[{:.3f}]  hop={}  band={}  {}".format(score, hop, band, r["content"][:70]))
The proximity_weight parameter is a per-call override — you can use heavy proximity weighting when pivoting on a specific actor and drop back to pure semantic search when exploring broadly.
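As a rough sketch of how such a weight could reorder results, the code below blends a semantic score with a proximity score derived from hop distance. The decay function 1 / (1 + hops) is an assumption chosen for illustration, and both helpers are hypothetical; the library's actual proximity formula is not documented here.

```python
def proximity(hop_distance):
    """Map hop distance to (0, 1]: the anchor scores 1.0, farther nodes decay."""
    return 1.0 / (1.0 + hop_distance)


def combined_score(semantic, hop_distance, proximity_weight=0.35):
    """Blend semantic similarity with graph proximity to the anchor node."""
    return (1.0 - proximity_weight) * semantic + proximity_weight * proximity(hop_distance)


# Two hypothetical candidates: a close semantic match far from the anchor,
# and a weaker semantic match that sits one hop away.
far_but_relevant = combined_score(semantic=0.80, hop_distance=3)
near_but_weaker = combined_score(semantic=0.70, hop_distance=1)

print(round(far_but_relevant, 4), round(near_but_weaker, 4))
```

With these made-up numbers, the one-hop candidate overtakes the stronger text match, which is the effect you would want when pivoting on a specific actor.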

Graph-grounded reasoning

When you need a natural-language answer that synthesizes multiple memory items, use query_with_reasoning() to retrieve context from the graph and ask the LLM to ground its answer in those sources.
from semantica.llms import Groq

llm = Groq(model="llama-3.1-8b-instant", api_key="YOUR_GROQ_KEY")

result = ti_agent.query_with_reasoning(
    "Which threat actors are associated with SMB lateral movement in EMEA "
    "and what infrastructure do they share with cloud OAuth campaigns?",
    llm_provider=llm,
    max_results=15,
    max_hops=3,
)

print(result["response"])       # grounded natural-language answer
print(result["confidence"])     # aggregated retrieval confidence score

# Inspect the sources the answer is grounded in
for src in result["sources"]:
    print("  -", src["content"][:60])
The result includes a reasoning_path field that traces exactly which graph edges were traversed to reach the answer — useful for analyst review and audit.

Building a Working Memory Window

Use the conversation_id filter to scope retrieval to the active session and combine incident-scoped history with global semantic search.
incident_id = "incident_ir2025_0847"

# Store each new alert as it arrives, tagged to the incident
ti_agent.store(
    "Alert: lateral movement detected from WKSTN-047 to DC01 via SMB (PsExec artifact)",
    metadata={"type": "alert", "severity": "critical", "technique": "T1021.002"},
    conversation_id=incident_id,
    user_id="analyst_zhang",
)

ti_agent.store(
    "Analyst note: WKSTN-047 user jsmith flagged for suspicious login from 10.2.5.40 at 03:14 UTC",
    metadata={"type": "analyst_note"},
    conversation_id=incident_id,
    user_id="analyst_zhang",
)

# Retrieve the full incident thread
incident_history = ti_agent.conversation(
    incident_id,
    max_items=50,
    reverse=False,          # chronological order
    include_metadata=True,
)

for item in incident_history:
    role = item["metadata"].get("type", "note")
    print("[{}] {}".format(role, item["content"][:80]))

# Semantic search scoped to this incident; omit conversation_id to search global memory
context_items = ti_agent.retrieve(
    "SMB lateral movement PsExec domain controller",
    max_results=5,
    use_graph=True,
    conversation_id=incident_id,  # filter to this incident's memories only
)
This pattern lets the agent build a focused working memory window for each incident while the global vector index accumulates knowledge across all incidents over time.
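One way to assemble the window is to place the incident thread first and append global hits that are not already present. The sketch below works on plain result dictionaries and assumes only a "content" key; build_window is a hypothetical helper. The two input lists could come, for example, from conversation() and from a retrieve() call without a conversation_id.

```python
def build_window(incident_items, global_items, limit=10):
    """Incident items first (in order), then global hits not already present."""
    seen = set()
    window = []
    for item in list(incident_items) + list(global_items):
        key = item["content"]
        if key not in seen:
            seen.add(key)
            window.append(item)
    return window[:limit]


# Hypothetical result dicts; only the "content" key is assumed here.
incident_items = [
    {"content": "Alert: lateral movement from WKSTN-047 to DC01 via SMB"},
    {"content": "Analyst note: suspicious login for jsmith at 03:14 UTC"},
]
global_items = [
    {"content": "Alert: lateral movement from WKSTN-047 to DC01 via SMB"},
    {"content": "APT29 infrastructure cluster: 185.220.101.0/24, AS200651"},
]

window = build_window(incident_items, global_items)
print(len(window))
```

Deduplicating on content keeps the prompt from repeating an alert that appears in both the incident thread and the global search.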

Domain Examples

A threat-intelligence fusion cell ingests OSINT feeds, MISP events, and internal hunt findings continuously. The agent must correlate new indicators against known actor profiles and produce attribution assessments grounded in accumulated intelligence — not just the latest report.
from semantica.context import AgentContext, ContextGraph
from semantica.vector_store import VectorStore
from semantica.llms import Groq

ti_graph = ContextGraph(advanced_analytics=True, node_embeddings=True)
ti_agent = AgentContext(
    vector_store=VectorStore(backend="faiss", dimension=768, index_path="ti_memory.faiss"),
    knowledge_graph=ti_graph,
    retention_days=365,
    max_memories=50000,
    hybrid_alpha=0.6,
    decision_tracking=True,
)

# Ingest a fresh CTI report — entities and infrastructure flow into the graph
ti_agent.store(
    [
        {"content": "APT29 infrastructure cluster: 185.220.101.0/24, AS200651",
         "metadata": {"source": "shadowserver", "actor": "APT29", "tlp": "WHITE"}},
        {"content": "SolarWinds supply chain compromise attributed to APT29, campaign SUNBURST",
         "metadata": {"source": "us_cert_aa20-352a", "actor": "APT29", "campaign": "SUNBURST"}},
        {"content": "NOBELIUM (APT29) leverages OAuth token theft against cloud workloads",
         "metadata": {"source": "msft_blog_2023", "actor": "APT29", "technique": "T1528"}},
    ],
    extract_entities=True,
    extract_relationships=True,
)

# New hunt finding — is this C2 domain connected to APT29?
ti_agent.store(
    "New C2 indicator: c2-upd4te[.]ru resolves to 185.220.101.47, cert hash a3f4b8...",
    metadata={"type": "ioc", "confidence": 0.92, "source": "internal_hunt"},
    conversation_id="hunt_2025_q3",
)

# Graph-grounded attribution query: traverse up to 3 hops of the knowledge graph
llm = Groq(model="llama-3.1-8b-instant", api_key="YOUR_GROQ_KEY")
attribution = ti_agent.query_with_reasoning(
    "Is c2-upd4te[.]ru connected to APT29 based on infrastructure overlap?",
    llm_provider=llm,
    max_results=10,
    max_hops=3,
)
print(attribution["response"])
print("Confidence: {:.0%}".format(attribution["confidence"]))

# Persist intelligence base across analyst shifts
ti_agent.save("ti_state/")

Persisting and Restoring State

At the end of an analyst shift — or before a process restart — call save() to write the full context to disk. On next startup, call load() to restore it completely.
# save() writes memory JSON plus backend-specific vector-store artifacts
# under agent_state/vector_store/ and the graph export at knowledge_graph.json.
# With the default VectorStore implementation this includes:
#   agent_state/agent_memory.json
#   agent_state/vector_store/store_data.pkl
#   agent_state/vector_store/index.bin
#   agent_state/knowledge_graph.json
ti_agent.save("agent_state/")
When a new process starts — or a new analyst logs in — restore from that checkpoint:
from semantica.context import AgentContext, ContextGraph
from semantica.vector_store import VectorStore

# Create a fresh context with matching configuration
ti_agent_restored = AgentContext(
    vector_store=VectorStore(backend="faiss", dimension=768, index_path="ti_memory.faiss"),
    knowledge_graph=ContextGraph(advanced_analytics=True),
    retention_days=365,
    decision_tracking=True,
)

# load() restores all three components from disk
ti_agent_restored.load("agent_state/")

# Every memory, graph edge, and decision precedent is now available
results = ti_agent_restored.retrieve("APT29 OAuth token theft cloud infrastructure")
AgentMemory itself is saved as JSON, but the vector store persists its own index and vector payload separately. load() restores those backend artifacts rather than re-embedding memories on demand, so keep the same vector-store backend, dimension, and scoring setup across sessions.
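A lightweight guard against configuration drift is to write the vector-store settings next to the saved state and compare them before loading. This is a sketch of a convention you could add yourself, not a feature of save() or load(); the vector_config.json file name and both helper functions are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def write_manifest(state_dir, config):
    """Record the vector-store settings alongside the saved state."""
    path = Path(state_dir) / "vector_config.json"  # hypothetical file name
    path.write_text(json.dumps(config, sort_keys=True))
    return path


def check_manifest(state_dir, config):
    """Return the keys whose saved value differs from the current config."""
    saved = json.loads((Path(state_dir) / "vector_config.json").read_text())
    return sorted(k for k in set(saved) | set(config) if saved.get(k) != config.get(k))


# A temporary directory stands in for the real state directory.
with tempfile.TemporaryDirectory() as state_dir:
    write_manifest(state_dir, {"backend": "faiss", "dimension": 768})
    same = check_manifest(state_dir, {"backend": "faiss", "dimension": 768})
    changed = check_manifest(state_dir, {"backend": "faiss", "dimension": 1024})

print(same, changed)
```

An empty list means the settings match; any returned key is a reason to stop before calling load().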

Taking Checkpoints During Analysis

For long-running analysis loops, take named snapshots before and after key steps so you can diff what the agent added during each phase.
# Snapshot before the analysis loop starts
ti_agent.checkpoint("pre_enrichment")

# ... store new evidence, extract entities, record decisions ...

# Snapshot after enrichment completes
ti_agent.checkpoint("post_enrichment")

# See exactly what changed
diff = ti_agent.diff_checkpoints("pre_enrichment", "post_enrichment")
print("Decisions added:     {}".format(len(diff["decisions_added"])))
print("Relationships added: {}".format(len(diff["relationships_added"])))

# Optionally persist via TemporalVersionManager (requires temporal_version_manager= at init)
# ti_agent.flush_checkpoint("post_enrichment")
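Conceptually, a checkpoint diff reports what exists in the later snapshot but not in the earlier one. The sketch below shows that idea with set differences over hypothetical snapshot dictionaries; the real checkpoint format is not shown in this guide, and diff_snapshots is an illustrative helper only.

```python
def diff_snapshots(before, after):
    """Report items present in `after` but not in `before`, per category."""
    return {
        key + "_added": sorted(set(after.get(key, ())) - set(before.get(key, ())))
        for key in after
    }


# Hypothetical snapshots keyed by category.
pre = {"decisions": ["d1"], "relationships": ["APT29->SUNBURST"]}
post = {
    "decisions": ["d1", "d2"],
    "relationships": ["APT29->SUNBURST", "APT29->185.220.101.0/24"],
}

changes = diff_snapshots(pre, post)
print(changes)
```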

Memory Lifecycle and Housekeeping

Retention is applied automatically on every store() call — items older than retention_days are pruned without any manual intervention. You can also remove specific memories or clear a full conversation namespace.
# Forget a specific memory by ID
ti_agent.forget(memory_id="some-uuid-string")

# Clear all memories tagged to a specific incident
cleared = ti_agent.forget(conversation_id="incident_ir2025_0847")
print("Cleared {} items".format(cleared))

# Clear everything older than 90 days
old_cleared = ti_agent.clear(days_old=90)

# Get current memory statistics
s = ti_agent.stats()
print("Total memories: {}".format(s.get("total_items", 0)))
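The two housekeeping rules described in this guide, age-based retention and a fixed memory ceiling, can be sketched with the standard library. This is only an illustration of the behavior described; prune_expired is a hypothetical helper and the library's internal pruning logic may differ.

```python
from collections import deque
from datetime import datetime, timedelta, timezone


def prune_expired(items, retention_days, now):
    """Keep only items whose timestamp falls inside the retention window."""
    cutoff = now - timedelta(days=retention_days)
    return [item for item in items if item["timestamp"] >= cutoff]


# A fixed "now" keeps the example deterministic.
now = datetime(2025, 9, 1, tzinfo=timezone.utc)
items = [
    {"content": "old report", "timestamp": now - timedelta(days=400)},
    {"content": "recent alert", "timestamp": now - timedelta(days=3)},
]
kept = prune_expired(items, retention_days=365, now=now)

# A bounded deque drops the oldest entry once maxlen is reached,
# which mirrors the "oldest evicted first" ceiling described for max_memories.
buffer = deque(maxlen=2)
for name in ("m1", "m2", "m3"):
    buffer.append(name)

print([item["content"] for item in kept], list(buffer))
```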

See Also

  • Context Graphs — How the underlying ContextGraph stores entity nodes and decision nodes; temporal interval reasoning; deduplication before node insertion; ontology from graph.
  • Decision Intelligence — Recording decisions as graph nodes with causal chains and policy gating.
  • Multi-Agent Systems — Coordinating multiple agents through a shared AgentContext and save/load handoffs.
  • LLM Integrations — Configuring the LLM provider passed to query_with_reasoning().
  • Deduplication Guide — Full reference for DuplicateDetector, EntityMerger, similarity methods, and cluster strategies.
  • Ontology Management — Generate and validate OWL ontologies from the knowledge graph; export to Turtle, OWL/XML, JSON-LD.
  • Context Module Reference — Full API: AgentContext, AgentMemory, MemoryItem, ContextRetriever.
  • Vector Store Reference — FAISS, Qdrant, pgvector, Pinecone backends.