ContextGraph distance intelligence answers the structural question that pure semantic similarity cannot: given two nodes, what is their precise relationship in terms of graph topology, path weight, and inferential confidence? Use it to annotate attribution chains with hop counts and confidence decay, rank retrieval results by structural proximity to an anchor node, and surface implied connections for analyst review.
Distance intelligence feeds into proximity-blended retrieval (proximity_weight on retrieve()), causal chain analysis (trace_decision_causality()), and advanced precedent search (find_precedents_hybrid()). Enable it by passing include_distance_metadata=True on neighbor queries or proximity_weight > 0 on retrieval calls.

Distance Bands: Turning Hop Counts into Meaning

The first tool in distance intelligence is classify_path_distance — it maps any BFS depth to a human-readable band that carries semantic meaning.
from semantica.utils.helpers import classify_path_distance

print(classify_path_distance(0))   # "direct"  — same node
print(classify_path_distance(1))   # "direct"  — single edge
print(classify_path_distance(2))   # "near"    — two-hop neighbourhood
print(classify_path_distance(3))   # "near"
print(classify_path_distance(5))   # "mid-range"
print(classify_path_distance(9))   # "distant" — treat with caution
| Band | Hop range | What it means in practice |
| --- | --- | --- |
| "direct" | 0–1 | Direct relationship; high-confidence inferences |
| "near" | 2–3 | Two-hop neighbourhood; closely related, reliable |
| "mid-range" | 4–6 | Reachable but semantically separated |
| "distant" | 7+ | Weakly coupled; treat inferences with caution |
These bands appear automatically on every result that uses include_distance_metadata=True, proximity_weight > 0, or trace_decision_causality(). You do not compute them manually — they are attached to the result.
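To make the band boundaries concrete, here is a minimal sketch of the mapping described in the table. The function name band_for_hops is hypothetical and this is not the library's implementation of classify_path_distance; it only mirrors the documented hop ranges, and the rejection of negative input is an assumption.

```python
def band_for_hops(hops: int) -> str:
    """Map a BFS depth to a distance band, following the documented hop ranges."""
    if hops < 0:
        # Assumption: a negative depth is treated as invalid input.
        raise ValueError("hop count must be non-negative")
    if hops <= 1:
        return "direct"      # 0-1 hops
    if hops <= 3:
        return "near"        # 2-3 hops
    if hops <= 6:
        return "mid-range"   # 4-6 hops
    return "distant"         # 7 or more hops


# Boundary values are where a band changes.
for depth in (1, 2, 3, 4, 6, 7):
    print(depth, band_for_hops(depth))
```

The boundaries at 1→2, 3→4, and 6→7 hops are the values worth checking when you set alert thresholds on band names rather than on raw hop counts.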

Confidence Decay: How Trust Erodes Along a Path

Each hop along a path multiplies the accumulated confidence by the edge weight. The product — confidence_decay — is the single most useful signal for deciding whether a multi-hop inference is trustworthy.
from semantica.context import ContextGraph

graph = ContextGraph()
graph.add_node("apt29",       "ThreatActor",   "APT29 / NOBELIUM")
graph.add_node("hammertoss",  "Malware",       "HAMMERTOSS C2 tool")
graph.add_node("twitter_c2",  "Infrastructure","APT29 Twitter C2 channel")
graph.add_node("nato_target", "Target",        "NATO defense contractor")

graph.add_edge("apt29",      "hammertoss",  "deploys", weight=0.95)
graph.add_edge("hammertoss", "twitter_c2",  "uses",    weight=0.88)
graph.add_edge("twitter_c2", "nato_target", "reaches", weight=0.80)

# Request neighbors with full distance metadata attached
neighbors = graph.get_neighbors(
    "apt29",
    hops=3,
    include_distance_metadata=True,
)

for n in neighbors:
    print("{:20s}  band={:10s}  decay={:.3f}  hop={}".format(
        n["id"],
        n["distance_band"],
        n["confidence_decay"],
        n["hop"],
    ))
Output:
hammertoss            band=direct     decay=0.950  hop=1
twitter_c2            band=near       decay=0.836  hop=2
nato_target           band=near       decay=0.669  hop=3
The nato_target node is reachable, but its confidence_decay is only 0.669 (0.95 × 0.88 × 0.80). Any inference drawn from the connection between APT29 and the NATO contractor therefore carries roughly 33% uncertainty, accumulated across three hops. In the “near” band the inference is still usable; in the “distant” band with similar decay, you would flag it for human review.
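The decay values in the output above are running products of the edge weights along the path. The short sketch below reproduces them with the standard library only; the variable names are illustrative and nothing here calls the ContextGraph API.

```python
from itertools import accumulate
from operator import mul

# Edge weights along apt29 -> hammertoss -> twitter_c2 -> nato_target
weights = [0.95, 0.88, 0.80]

# Running product: the confidence remaining after each hop.
decays = list(accumulate(weights, mul))
rounded = [round(d, 3) for d in decays]

for hop, decay in enumerate(rounded, start=1):
    print("hop={}  decay={:.3f}".format(hop, decay))
# hop=1  decay=0.950
# hop=2  decay=0.836
# hop=3  decay=0.669
```

Because every weight is at most 1.0, the product can only shrink as hops are added, which is why a long path of individually strong edges can still end up below a review threshold.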

Getting All Neighbors with Distance Metadata

get_neighbor_distances returns every reachable node up to a given hop depth, filtered by a minimum confidence threshold, ordered by nearest hops first and strongest decay first within each hop.
neighbors = graph.get_neighbor_distances(
    "apt29",
    hops=4,
    relationship_types=["deploys", "uses", "reaches"],
    min_confidence=0.60,   # drop nodes where confidence_decay < 0.60
)

# Each result dict contains:
# "id", "type", "content"    — node identity
# "relationship"             — edge type of the last hop
# "weight"                   — edge weight of the last hop
# "hop"                      — BFS depth from anchor
# "distance_band"            — "direct" / "near" / "mid-range" / "distant"
# "confidence_decay"         — product of all edge weights along the path
# "path_to_anchor"           — full node ID list from anchor to this node

for n in neighbors:
    print("[{:10s}] {:20s}  decay={:.3f}  path={}".format(
        n["distance_band"],
        n["id"],
        n["confidence_decay"],
        " → ".join(n["path_to_anchor"]),
    ))
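The filtering and ordering described above can be sketched on a plain adjacency dict. This is an illustration under stated assumptions, not the library's implementation: the function neighbor_distances and the edges layout are hypothetical, the first path found by BFS is kept for each node, and "strongest decay first" is read as highest confidence_decay first.

```python
from collections import deque


def neighbor_distances(edges, anchor, hops, min_confidence=0.0):
    """BFS from anchor. edges is {node: [(neighbor, relationship, weight), ...]}."""
    results = []
    seen = {anchor}
    queue = deque([(anchor, 0, 1.0, [anchor])])
    while queue:
        node, depth, decay, path = queue.popleft()
        if depth == hops:
            continue  # do not expand beyond the requested depth
        for nxt, relationship, weight in edges.get(node, []):
            if nxt in seen:
                continue  # keep only the first (fewest-hop) path to each node
            seen.add(nxt)
            new_decay = decay * weight
            new_path = path + [nxt]
            results.append({
                "id": nxt,
                "relationship": relationship,
                "weight": weight,
                "hop": depth + 1,
                "confidence_decay": new_decay,
                "path_to_anchor": new_path,
            })
            queue.append((nxt, depth + 1, new_decay, new_path))
    kept = [r for r in results if r["confidence_decay"] >= min_confidence]
    # Nearest hops first, then highest confidence_decay within a hop (assumption).
    return sorted(kept, key=lambda r: (r["hop"], -r["confidence_decay"]))


edges = {
    "apt29": [("hammertoss", "deploys", 0.95)],
    "hammertoss": [("twitter_c2", "uses", 0.88)],
    "twitter_c2": [("nato_target", "reaches", 0.80)],
}
strict = neighbor_distances(edges, "apt29", hops=4, min_confidence=0.70)
print([r["id"] for r in strict])
```

With the threshold raised to 0.70, nato_target drops out because its decay is about 0.669, while the two nearer nodes remain.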

Finding the Shortest Path Between Two Nodes

PathFinder exposes five path algorithms. The right one depends on whether you need the single cheapest path, multiple alternative paths, or all paths from a source.
from semantica.kg import PathFinder

pf = PathFinder()
Dijkstra — weighted shortest path. Use this as the default. It finds the path where the sum of edge weights is minimised.
path = pf.dijkstra_shortest_path(
    graph  = graph,
    source = "apt29",
    target = "nato_target",
)
length = pf.path_length(graph, path)
print("Shortest path:", " → ".join(path))
print("Path length  :", round(length, 3))
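One point to keep in mind: the edge weights in this guide are confidences, where higher is better, so the path with the smallest weight sum is not necessarily the path with the highest confidence_decay. Whether PathFinder transforms weights internally is not stated here. If you need the most confident path explicitly, a common technique is to run Dijkstra over -log(weight) costs, since minimising a sum of negative logs maximises the product. The sketch below is standalone Python on a hypothetical toy graph; most_confident_path is not part of the library.

```python
import heapq
import math


def most_confident_path(edges, source, target):
    """Dijkstra over -log(confidence). edges is {node: [(neighbor, confidence), ...]}."""
    best = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            break
        if cost > best.get(node, math.inf):
            continue  # stale heap entry
        for nxt, confidence in edges.get(node, []):
            # confidence is assumed to lie in (0, 1], so each cost is >= 0
            new_cost = cost - math.log(confidence)
            if new_cost < best.get(nxt, math.inf):
                best[nxt] = new_cost
                prev[nxt] = node
                heapq.heappush(heap, (new_cost, nxt))
    if target not in best:
        return None, 0.0
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path, math.exp(-best[target])


# Toy graph: the route via B has the larger weight sum (1.8 vs 0.6)
# but by far the higher product (0.81 vs 0.09).
toy = {
    "A": [("B", 0.9), ("C", 0.3)],
    "B": [("D", 0.9)],
    "C": [("D", 0.3)],
}
path, confidence = most_confident_path(toy, "A", "D")
print(" -> ".join(path), round(confidence, 2))  # A -> B -> D 0.81
```

A sum-minimising search over the raw weights would pick the route through C here, which is the weaker chain; comparing both results is a quick way to check how a given path finder interprets your weights.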
BFS — unweighted shortest path. Use when you want fewest hops regardless of edge weights.
path = pf.bfs_shortest_path(graph, "apt29", "nato_target")
print("Hop count:", len(path) - 1)
K-shortest paths (Yen’s algorithm). Use when you need alternative attribution chains, redundancy analysis, or corroboration routes. Several distinct paths that reach the same target are stronger evidence than a single path. If the graph contains fewer than k distinct routes between the two nodes, expect fewer than k results.
k_paths = pf.find_k_shortest_paths(graph, "apt29", "nato_target", k=3)

for i, path in enumerate(k_paths, 1):
    length = pf.path_length(graph, path)
    band   = classify_path_distance(len(path) - 1)
    print("Path {} [{}] length={:.3f}: {}".format(
        i, band, length, " → ".join(path)
    ))
All shortest paths from a source. Use when you want to map everything reachable from an anchor node and understand the structural layout.
all_paths = pf.all_shortest_paths(graph, source="apt29")

for target, paths in all_paths.items():
    path = paths[0]
    print("{:20s}  hops={}  path={}".format(
        target, len(path) - 1, " → ".join(path)
    ))

Proximity-Blended Retrieval

Standard semantic retrieval ranks results by text similarity to the query. Proximity-blended retrieval adds a second signal: how structurally close is each result to an anchor node in the graph? The proximity_weight parameter controls the blend.
from semantica.context import AgentContext, ContextGraph
from semantica.vector_store import VectorStore

graph   = ContextGraph(advanced_analytics=True)
context = AgentContext(
    vector_store=VectorStore(backend="faiss", dimension=768),
    knowledge_graph=graph,
    hybrid_alpha=0.5,
)

context.store([
    "APT29 exploited CVE-2024-3400 in PAN-OS targeting NATO governments.",
    "HAMMERTOSS is APT29's C2 tool using Twitter as a covert channel.",
    "SUNBURST was a supply chain implant targeting SolarWinds Orion.",
], extract_entities=True, extract_relationships=True)

# 70% semantic + 30% graph proximity, anchored at APT29
results = context.retrieve(
    "nation-state C2 infrastructure",
    max_results      = 8,
    use_graph        = True,
    anchor_node      = "APT29",
    max_hops         = 3,
    proximity_weight = 0.30,
    min_score        = 0.20,
)

for r in results:
    print("[{:.3f}]  band={:10s}  hop={}  decay={:.3f}  {}".format(
        r.get("combined_score", r["score"]),
        r.get("distance_band",   "-"),
        r.get("hop_distance",    "-"),
        r.get("confidence_decay", 0),
        r["content"][:70],
    ))
When proximity_weight > 0, each result gains proximity_score, combined_score, hop_distance, distance_band, confidence_decay, and path_to_anchor — giving you a complete picture of why each result ranked where it did.
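The "70% semantic + 30% graph proximity" comment suggests a linear blend of the two signals. The sketch below shows that arithmetic under this assumption; blend_scores is a hypothetical helper, the sample scores are made up for illustration, and how the library derives proximity_score from hops and decay is not covered here.

```python
def blend_scores(semantic_score, proximity_score, proximity_weight):
    """Linear blend: (1 - w) * semantic + w * proximity, with w in [0, 1]."""
    if not 0.0 <= proximity_weight <= 1.0:
        raise ValueError("proximity_weight must be between 0 and 1")
    return (1.0 - proximity_weight) * semantic_score + proximity_weight * proximity_score


# Illustrative scores only: result_a matches the query text better,
# result_b sits much closer to the anchor node.
result_a = blend_scores(semantic_score=0.80, proximity_score=0.20, proximity_weight=0.30)
result_b = blend_scores(semantic_score=0.70, proximity_score=0.95, proximity_weight=0.30)

print("a={:.3f}  b={:.3f}".format(result_a, result_b))  # a=0.620  b=0.775
```

With proximity_weight=0.30 the structurally closer result overtakes the better text match; with proximity_weight=0 the ranking falls back to semantic score alone.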

Finding Structurally Similar Nodes

When you want to know which other nodes in the graph behave like a given node — same connectivity pattern or similar text — find_similar_nodes exposes the modes implemented on ContextGraph today.
# Content similarity — text overlap on node content fields
content_similar = graph.find_similar_nodes(
    "CVE-2024-3400",
    similarity_type = "content",
    top_k           = 5,
)

# Structural similarity — nodes with similar neighbourhood topology
struct_similar = graph.find_similar_nodes(
    "CVE-2024-3400",
    similarity_type = "structural",
    top_k           = 5,
)

for n in content_similar:
    print("[{:.3f}] {}  {}".format(n["score"], n["type"], n["id"]))
similarity_type="content" compares node text/content, while similarity_type="structural" compares neighbourhood topology. Other values currently fall back to content similarity, so reserve "embedding" for lower-level KG APIs rather than ContextGraph.find_similar_nodes().
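To give an intuition for what comparing neighbourhood topology can mean, the sketch below scores nodes by the Jaccard overlap of their neighbour sets. This is one common structural measure chosen for illustration; it is an assumption, not a description of how find_similar_nodes computes its scores, and the node IDs and helper names are hypothetical.

```python
def jaccard(a, b):
    """Size of the intersection divided by size of the union; 0.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def structurally_similar(neighbors, node_id, top_k=5):
    """Rank other nodes by neighbour-set overlap with node_id."""
    target = neighbors[node_id]
    scored = [
        (jaccard(target, others), other)
        for other, others in neighbors.items()
        if other != node_id
    ]
    # Highest score first; ties broken by node ID for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:top_k]


# Hypothetical neighbour sets for three vulnerability nodes.
neighbors = {
    "cve_a": {"apt29", "pan_os"},
    "cve_b": {"apt29", "pan_os", "nato_gov"},
    "cve_c": {"solarwinds"},
}
ranked = structurally_similar(neighbors, "cve_a")
for score, node in ranked:
    print("[{:.3f}] {}".format(score, node))
```

Two nodes can score highly here without sharing any text, which is the practical difference between the "structural" and "content" modes.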

Domain Examples

This example finds the primary attribution path from a threat actor, through its C2 infrastructure, to a target. It then looks for alternative corroboration paths that strengthen the attribution case before it goes into an intelligence product.
from semantica.context import ContextGraph
from semantica.kg import PathFinder
from semantica.utils.helpers import classify_path_distance

graph = ContextGraph(advanced_analytics=True)

for node_id, ntype, content in [
    ("apt29",       "ThreatActor",   "APT29 / NOBELIUM / Cozy Bear"),
    ("hammertoss",  "Malware",       "HAMMERTOSS C2 backdoor"),
    ("twitter_c2",  "Infrastructure","APT29 Twitter C2 (steganography)"),
    ("github_c2",   "Infrastructure","APT29 GitHub dead-drop resolver"),
    ("as200651",    "Network",       "APT29 hosting cluster AS200651"),
    ("nato_gov",    "Target",        "NATO government agency"),
]:
    graph.add_node(node_id, ntype, content)

graph.add_edge("apt29",     "hammertoss",  "deploys",   weight=0.95)
graph.add_edge("hammertoss","twitter_c2",  "c2_via",    weight=0.88)
graph.add_edge("hammertoss","github_c2",   "c2_via",    weight=0.82)
graph.add_edge("twitter_c2","as200651",    "hosted_on", weight=0.90)
graph.add_edge("github_c2", "as200651",    "resolves",  weight=0.76)
graph.add_edge("as200651",  "nato_gov",    "targets",   weight=0.85)

pf = PathFinder()

# Primary attribution path
primary = pf.dijkstra_shortest_path(graph, "apt29", "nato_gov")
length  = pf.path_length(graph, primary)
band    = classify_path_distance(len(primary) - 1)
print("Primary [{}] length={:.3f}: {}".format(band, length, " → ".join(primary)))

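# Request up to three shortest paths. This graph has two distinct routes
# (via twitter_c2 and via github_c2), so at most two can be found.
k_paths = pf.find_k_shortest_paths(graph, "apt29", "nato_gov", k=3)
for i, path in enumerate(k_paths, 1):
    alt_length = pf.path_length(graph, path)
    alt_band   = classify_path_distance(len(path) - 1)
    print("Alt {}: {} [{}, length={:.3f}]".format(i, " → ".join(path), alt_band, alt_length))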

# All reachable from APT29 with confidence >= 60%
neighbors = graph.get_neighbor_distances("apt29", hops=4, min_confidence=0.60)
print("\nReachable from APT29 (confidence >= 60%):")
for n in neighbors:
    print("  [{:10s}]  decay={:.3f}  {}".format(
        n["distance_band"], n["confidence_decay"], n["id"]
    ))
  • Context Graphs: ContextGraph node and edge model; add_edge(weight=...) feeds confidence decay
  • Graph Analytics: centrality, community detection, Node2Vec embeddings, link prediction
  • Agent Memory: proximity-blended retrieval (proximity_weight) integrates distance intelligence into memory search
  • Decision Intelligence: trace_decision_causality() for causal chains with distance annotations
  • Reasoning & Rules: TemporalReasoningEngine for Allen interval algebra over time-bounded graph nodes