AgentContext maintains a persistent memory layer for LLM agents — storing observations as vector embeddings, retrieving them by semantic similarity, and optionally blending graph proximity into the ranking. Use it when your agent needs to recall past findings across sessions without re-reading source material on every restart.
This guide covers the memory layer. For graph-enriched traversal and entity linking, see Context Graphs. For decision accountability — recording, auditing, and causally tracing what the agent chose — see Decision Intelligence.
Setting Up a Persistent Memory Context
Configure the vector store, knowledge graph, and AgentContext together at startup so all three components persist to disk at the same path.
The hybrid_alpha parameter controls how retrieval blends semantic similarity (pure vector search) with graph-structural similarity (the topology of the knowledge graph). At 0.5 the agent weights both signals equally. For a freshly ingested corpus with a sparse graph, you might start closer to 0.0, which favors vector search, and increase the value as the graph fills in.
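To make the effect of hybrid_alpha concrete, here is a minimal sketch that assumes the blend is a plain linear interpolation between the two scores. That formula is an assumption for illustration; this guide does not state how the library combines them, and blend_scores is a hypothetical helper, not part of the AgentContext API.

```python
def blend_scores(semantic: float, graph: float, hybrid_alpha: float) -> float:
    """Linearly interpolate between a semantic score and a graph score.

    Assumption: hybrid_alpha = 0.0 is pure semantic, 1.0 is pure graph.
    """
    if not 0.0 <= hybrid_alpha <= 1.0:
        raise ValueError("hybrid_alpha must be in [0.0, 1.0]")
    return (1.0 - hybrid_alpha) * semantic + hybrid_alpha * graph


# Invented scores for a single memory item, for illustration only.
semantic_score, graph_score = 0.9, 0.3

for alpha in (0.0, 0.5, 1.0):
    print(alpha, round(blend_scores(semantic_score, graph_score, alpha), 3))
```

Under this assumption, an item that matches the query text well but sits far from anything in a sparse graph loses rank as hybrid_alpha grows, which is why a low starting value suits a newly ingested corpus.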
Storing What the Agent Learns
Single observations
Every piece of intelligence the agent processes can be stored with a single call. The string is embedded and indexed immediately; the optional metadata travels with it and is available in every retrieval result.
conversation_id acts as a namespace. Memories tagged with incident_ir2025_0847 can be retrieved as a group later, which is useful for building a per-incident context window without polluting the global search index.
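The namespace behaviour can be pictured with a small in-memory stand-in. ToyMemory, ToyItem, and by_conversation are hypothetical names invented for this sketch, and the stored strings are made-up examples; the sketch shows only the grouping idea and performs no embedding or indexing.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ToyItem:
    content: str
    metadata: Dict[str, str] = field(default_factory=dict)
    conversation_id: Optional[str] = None


class ToyMemory:
    """Hypothetical in-memory stand-in; not the AgentContext implementation."""

    def __init__(self) -> None:
        self.items: List[ToyItem] = []

    def store(self, content, metadata=None, conversation_id=None) -> ToyItem:
        # Metadata is copied so later edits by the caller do not leak in.
        item = ToyItem(content, dict(metadata or {}), conversation_id)
        self.items.append(item)
        return item

    def by_conversation(self, conversation_id: str) -> List[ToyItem]:
        # Return only the items tagged with the given namespace.
        return [i for i in self.items if i.conversation_id == conversation_id]


memory = ToyMemory()
memory.store("Beacon traffic observed to an unfamiliar host",
             metadata={"source": "edr"},
             conversation_id="incident_ir2025_0847")
memory.store("Quarterly patch cycle completed", metadata={"source": "itops"})

incident_items = memory.by_conversation("incident_ir2025_0847")
print(len(incident_items), incident_items[0].metadata)
```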
Ingesting document corpora
When store() receives a list, it treats each element as a document, builds a graph of entities and relationships extracted from the text, and returns statistics about what was created.
Retrieving Relevant Memory
Semantic retrieval
The most direct retrieval call searches by semantic similarity; no keyword match is required. The embedding of your query is compared against all stored memory embeddings, and the top matches are returned with scores.
Graph-anchored retrieval with proximity scoring
When you have a specific entity as the center of your investigation, anchor the retrieval to that node and blend semantic score with graph-proximity score.
The proximity_weight parameter is a per-call override: you can use heavy proximity weighting when pivoting on a specific actor and drop back to pure semantic search when exploring broadly.
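The sketch below illustrates one way such a blend could work, using cosine similarity for the semantic part and 1 / (1 + hops) from the anchor node for the proximity part. Both formulas, the retrieve function, and the toy vectors and graph are assumptions made for illustration; the library's actual scoring is not described in this guide.

```python
import math
from collections import deque


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hop_distances(graph, anchor):
    """Breadth-first search: hops from the anchor to each reachable node."""
    dist = {anchor: 0}
    queue = deque([anchor])
    while queue:
        node = queue.popleft()
        for nbr in graph.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def retrieve(query_vec, memories, graph, anchor, proximity_weight=0.0, top_k=2):
    """Rank memories by a weighted mix of semantic and proximity scores."""
    dist = hop_distances(graph, anchor)
    scored = []
    for mem in memories:
        semantic = cosine(query_vec, mem["vector"])
        hops = dist.get(mem["node"])
        # Unreachable nodes get zero proximity (an assumption of this sketch).
        proximity = 1.0 / (1.0 + hops) if hops is not None else 0.0
        score = (1.0 - proximity_weight) * semantic + proximity_weight * proximity
        scored.append((score, mem["text"]))
    return sorted(scored, reverse=True)[:top_k]


# Toy undirected graph and two-dimensional "embeddings".
graph = {"actor": ["tool"], "tool": ["actor", "infra"], "infra": ["tool"]}
memories = [
    {"text": "tool note", "node": "tool", "vector": [0.2, 0.8]},
    {"text": "infra note", "node": "infra", "vector": [0.9, 0.1]},
    {"text": "unrelated note", "node": "other", "vector": [1.0, 0.0]},
]
query = [1.0, 0.0]

broad = retrieve(query, memories, graph, "actor", proximity_weight=0.0)
pivot = retrieve(query, memories, graph, "actor", proximity_weight=0.9)
print([text for _, text in broad])
print([text for _, text in pivot])
```

With the weight at 0.0 the ranking follows the embeddings alone; at 0.9 the memory attached to the anchor's direct neighbour moves to the top even though its text is a weaker match.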
Graph-grounded reasoning
When you need a natural-language answer that synthesizes multiple memory items, use query_with_reasoning() to retrieve context from the graph and ask the LLM to ground its answer in those sources.
The result includes a reasoning_path field that traces exactly which graph edges were traversed to reach the answer, which is useful for analyst review and audit.
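To show what an edge-by-edge trace can look like, here is a minimal sketch that finds a shortest path between two entities and returns the edges it followed. The trace_path helper, the (source, relation, target) tuple shape, and the entity names are hypothetical; the actual structure of reasoning_path is not specified in this guide.

```python
from collections import deque


def trace_path(edges, start, goal):
    """Return the (source, relation, target) edges on a shortest path."""
    adjacency = {}
    for src, rel, dst in edges:
        adjacency.setdefault(src, []).append((rel, dst))

    parents = {start: None}  # node -> edge that first reached it
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            break
        for rel, dst in adjacency.get(node, ()):
            if dst not in parents:
                parents[dst] = (node, rel, dst)
                queue.append(dst)

    if goal not in parents:
        return []  # goal is not reachable from start

    # Walk back from the goal to the start, then reverse.
    path = []
    edge = parents[goal]
    while edge is not None:
        path.append(edge)
        edge = parents[edge[0]]
    return list(reversed(path))


# Invented directed edges, for illustration only.
edges = [
    ("ActorA", "uses", "ToolX"),
    ("ToolX", "contacts", "HostY"),
    ("ActorA", "targets", "SectorZ"),
]

for step in trace_path(edges, "ActorA", "HostY"):
    print(" -> ".join(step))
```

A trace in this form lets a reviewer check each hop against its source memory instead of trusting the synthesized answer as a whole.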
Building a Working Memory Window
Use the conversation_id filter to scope retrieval to the active session and combine incident-scoped history with global semantic search.
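One possible way to assemble such a window is sketched below: session-scoped items go first in their original order, then the remaining slots are filled with the best-scoring global hits, skipping duplicates. The build_window function and the dictionary keys id, text, and score are assumptions for this sketch, not the library's result format.

```python
def build_window(session_items, global_hits, max_items=5):
    """Merge session-scoped items with global hits, without duplicates."""
    window, seen = [], set()

    # Session history first, preserving its original order.
    for item in session_items:
        if item["id"] not in seen:
            seen.add(item["id"])
            window.append(item)

    # Fill remaining slots with the highest-scoring global hits.
    for item in sorted(global_hits, key=lambda h: h["score"], reverse=True):
        if len(window) >= max_items:
            break
        if item["id"] not in seen:
            seen.add(item["id"])
            window.append(item)

    return window[:max_items]


# Invented items, for illustration only.
session_items = [
    {"id": "m1", "text": "first finding in this incident"},
    {"id": "m2", "text": "second finding in this incident"},
]
global_hits = [
    {"id": "m9", "text": "weakly related note", "score": 0.4},
    {"id": "m2", "text": "second finding in this incident", "score": 0.9},
    {"id": "m7", "text": "related finding from an older case", "score": 0.8},
]

window = build_window(session_items, global_hits, max_items=3)
print([item["id"] for item in window])
```

Putting the session items first keeps the active incident's history intact when the size cap is tight; a different policy, such as interleaving by score, may suit other workloads.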
Domain Examples
- Defense — CTI/Threat
- Security — SOC/Incident
- Life Science — Clinical/Pharma
- Banking — Risk/Compliance
A threat-intelligence fusion cell ingests OSINT feeds, MISP events, and internal hunt findings continuously. The agent must correlate new indicators against known actor profiles and produce attribution assessments grounded in accumulated intelligence — not just the latest report.
Persisting and Restoring State
At the end of an analyst shift, or before a process restart, call save() to write the full context to disk. On next startup, call load() to restore it completely.
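The round trip can be pictured with a small stand-in that writes memory items to a JSON file and reads them back. The file name memory.json, the helper names, and the item fields are hypothetical; the sketch covers only a JSON payload and does not model a vector index.

```python
import json
import tempfile
from pathlib import Path


def save_items(items, directory):
    """Write the items to a JSON file inside the given directory."""
    path = Path(directory) / "memory.json"  # hypothetical file name
    path.write_text(json.dumps(items, indent=2), encoding="utf-8")
    return path


def load_items(directory):
    """Read the items back from the same JSON file."""
    path = Path(directory) / "memory.json"
    return json.loads(path.read_text(encoding="utf-8"))


items = [
    {"id": "m1", "content": "example observation", "metadata": {"source": "demo"}},
]

# A temporary directory stands in for the persistence path.
with tempfile.TemporaryDirectory() as tmp:
    save_items(items, tmp)
    restored = load_items(tmp)

print(restored == items)
```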
AgentMemory itself is saved as JSON, but the vector store persists its own index and vector payload separately. load() restores those backend artifacts rather than re-embedding memories on demand, so keep the same vector-store backend, dimension, and scoring setup across sessions.
Taking Checkpoints During Analysis
For long-running analysis loops, take named snapshots before and after key steps so you can diff what the agent added during each phase.
Memory Lifecycle and Housekeeping
Retention is applied automatically on every store() call: items older than retention_days are pruned without any manual intervention. You can also remove specific memories or clear a full conversation namespace.
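The pruning rule can be sketched as a simple cutoff comparison. The helpers prune_expired and forget_conversation and the created_at field are hypothetical names chosen for this illustration; they show the logic described above, not the library's internals.

```python
from datetime import datetime, timedelta, timezone


def prune_expired(items, retention_days, now=None):
    """Keep only items created within the last retention_days."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    return [item for item in items if item["created_at"] >= cutoff]


def forget_conversation(items, conversation_id):
    """Drop every item tagged with the given conversation namespace."""
    return [item for item in items if item.get("conversation_id") != conversation_id]


# A fixed "now" keeps the example deterministic.
now = datetime(2025, 6, 1, tzinfo=timezone.utc)
items = [
    {"id": "recent", "created_at": datetime(2025, 5, 25, tzinfo=timezone.utc),
     "conversation_id": "incident_ir2025_0847"},
    {"id": "stale", "created_at": datetime(2025, 3, 1, tzinfo=timezone.utc)},
]

kept = prune_expired(items, retention_days=30, now=now)
remaining = forget_conversation(kept, "incident_ir2025_0847")
print([item["id"] for item in kept], [item["id"] for item in remaining])
```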
Related Guides
- Context Graphs: How the underlying ContextGraph stores entity nodes and decision nodes; temporal interval reasoning; deduplication before node insertion; ontology from graph.
- Decision Intelligence: Recording decisions as graph nodes with causal chains and policy gating.
- Multi-Agent Systems: Coordinating multiple agents through a shared AgentContext and save/load handoffs.
- LLM Integrations: Configuring the LLM provider passed to query_with_reasoning().
- Deduplication Guide: Full reference for DuplicateDetector, EntityMerger, similarity methods, and cluster strategies.
- Ontology Management: Generate and validate OWL ontologies from the knowledge graph; export to Turtle, OWL/XML, JSON-LD.
- Context Module Reference: Full API for AgentContext, AgentMemory, MemoryItem, and ContextRetriever.
- Vector Store Reference: FAISS, Qdrant, pgvector, and Pinecone backends.
