semantica.conflicts detects and resolves contradictions when multiple sources disagree on the same fact:
  • Five conflict types: value, type, temporal, logical, and relationship
  • Seven resolution strategies: voting, credibility-weighted, most-recent, first-seen, highest-confidence, manual review, expert review
  • InvestigationGuideGenerator produces step-by-step investigation instructions for manual resolution
  • SourceTracker maps each property value to its contributing source for full attribution
  • Conflicts are surfaced explicitly rather than silently corrupting the knowledge graph

Why Detect Conflicts?

When you ingest data from multiple sources, contradictions are inevitable. One annual report says Apple’s revenue was $391B; a financial newswire says $383B. Without conflict detection, both values land in your graph and queries silently return inconsistent answers. Semantica’s conflict detection makes disagreements explicit and actionable:
  • Value conflicts: SEC says revenue is $391B; Reuters says $383B
  • Type conflicts: “Python” is a ProgrammingLanguage in one source, a Snake species in another
  • Temporal conflicts: a CEO had two different employers during overlapping date ranges
  • Logical conflicts: an entity simultaneously holds two mutually exclusive properties
  • Relationship conflicts: the same relationship has inconsistent cardinality or properties across sources
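
To make the idea of a value conflict concrete, here is a minimal sketch in plain Python. It does not use or reproduce Semantica's implementation; the record layout (`id`, `source`, and a property key), the function name `find_value_conflicts`, and the `acme` records are assumptions made for illustration only.

```python
from collections import defaultdict
from typing import Any, Dict, List


def find_value_conflicts(records: List[Dict[str, Any]], property_name: str) -> List[Dict[str, Any]]:
    """Group records by entity id and report entities whose sources disagree on a property."""
    # entity id -> value -> list of sources that reported that value
    seen: Dict[str, Dict[Any, List[str]]] = defaultdict(lambda: defaultdict(list))
    for record in records:
        if property_name not in record:
            continue
        seen[record["id"]][record[property_name]].append(record.get("source", "unknown"))

    conflicts = []
    for entity_id, by_value in seen.items():
        if len(by_value) > 1:  # more than one distinct value means the sources disagree
            conflicts.append({
                "entity_id": entity_id,
                "property_name": property_name,
                "conflicting_values": list(by_value.keys()),
                "sources": dict(by_value),
            })
    return conflicts


# Hypothetical input records (illustrative values only)
records = [
    {"id": "apple_inc", "revenue": "$391B", "source": "sec_filings"},
    {"id": "apple_inc", "revenue": "$383B", "source": "news_articles"},
    {"id": "acme", "revenue": "$10B", "source": "sec_filings"},
    {"id": "acme", "revenue": "$10B", "source": "wikipedia"},
]
conflicts = find_value_conflicts(records, "revenue")
print(conflicts[0]["entity_id"], conflicts[0]["conflicting_values"])
```

Only `apple_inc` is reported, because both `acme` records agree. Keeping the reporting sources next to each value is what later makes credibility-based resolution possible.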

Exported Classes

| Class | Role |
| --- | --- |
| ConflictDetector | Detects value, type, and relationship conflicts across entity lists |
| ConflictResolver | Resolves conflicts with configurable strategy: voting, credibility_weighted, most_recent, first_seen, highest_confidence, manual_review, expert_review |
| ConflictType | Enum: VALUE_CONFLICT, TYPE_CONFLICT, TEMPORAL_CONFLICT, LOGICAL_CONFLICT, RELATIONSHIP_CONFLICT |
| ResolutionStrategy | Enum of available resolution strategies passed to ConflictResolver |
| ResolutionResult | Dataclass returned by resolve_conflict / resolve_conflicts |
| SourceTracker | Tracks which source contributed each property value on each entity |
| SourceReference | Source document reference with document, page, section, confidence |
| PropertySource | Aggregated property-level provenance: value + list of SourceReference objects |
| ConflictAnalyzer | Analyzes conflict patterns, severity distribution, and per-source statistics |
| ConflictPattern | Dataclass describing a detected conflict pattern |
| InvestigationGuideGenerator | Generates step-by-step investigation guides for conflicts requiring manual review |
| InvestigationGuide | Guide dataclass: conflict_id, conflict_summary, severity, investigation_steps, recommended_actions |
| InvestigationStep | Step dataclass: step_number, description, action, expected_outcome |

What You Get

ConflictDetector

Value, type, and relationship conflict detection across entity and relationship lists.

ConflictResolver

7 resolution strategies including voting, credibility-weighted, and temporal preference.

SourceTracker

Track which source each conflicting fact came from, with per-source credibility scores.

ConflictAnalyzer

Pattern analysis, severity grouping, source-level statistics, and trend identification.

InvestigationGuideGenerator

Auto-generate step-by-step investigation checklists for human and expert review.

Convenience Functions

detect_conflicts() and resolve_conflicts() for one-call workflows.

Quick Start

1. Set credibility scores before ingestion

from semantica.conflicts import SourceTracker

tracker = SourceTracker()
tracker.set_source_credibility("sec_filings",   0.95)
tracker.set_source_credibility("pubmed",        0.92)
tracker.set_source_credibility("wikipedia",     0.80)
tracker.set_source_credibility("news_articles", 0.65)

2. Detect conflicts after building the graph

from semantica.conflicts import ConflictDetector

detector = ConflictDetector()

# Detect value conflicts on a specific property
conflicts = detector.detect_value_conflicts(entities, "revenue")
print("Found %d conflicts" % len(conflicts))

for conflict in conflicts:
    print("[%s] entity='%s'  attr='%s'" % (
        conflict.conflict_type, conflict.entity_id, conflict.property_name))
    print("  Values: %s  Severity: %s" % (
        conflict.conflicting_values, conflict.severity))

3. Triage by severity

from semantica.conflicts import ConflictAnalyzer

analyzer  = ConflictAnalyzer()
analysis  = analyzer.analyze_conflicts(conflicts)
severity_counts = analysis["by_severity"]["counts"]
severity_details = analysis["by_severity"]["details"]
print("Critical: %d" % severity_counts.get("critical", 0))
print("High:     %d" % severity_counts.get("high", 0))
print("Low:      %d" % severity_counts.get("low", 0))

4. Auto-resolve low-severity, escalate critical

from semantica.conflicts import ConflictResolver, InvestigationGuideGenerator, ResolutionStrategy

resolver = ConflictResolver(source_tracker=tracker)

# Auto-resolve only the low-severity conflicts.
# severity_details contains dicts, so map them back to the full Conflict objects.
low_ids = {d["conflict_id"] for d in severity_details.get("low", [])}
low_conflicts = [c for c in conflicts if c.conflict_id in low_ids]
auto_resolved = resolver.resolve_conflicts(
    low_conflicts,
    strategy=ResolutionStrategy.CREDIBILITY_WEIGHTED,
)

# Generate investigation guides for critical conflicts
critical_ids = {d["conflict_id"] for d in severity_details.get("critical", [])}
critical_conflicts = [c for c in conflicts if c.conflict_id in critical_ids]

generator = InvestigationGuideGenerator()
for conflict in critical_conflicts:
    guide = generator.generate_guide(conflict)
    print("\n%s" % guide.title)
    for step in guide.investigation_steps:
        print("  [%d] %s" % (step.step_number, step.description))
        print("       Action: %s" % step.action)
Detect before you merge, not after. Run conflict detection on raw entity data before deduplication and graph construction. Detecting conflicts in a live graph that already contains merged entities is harder because the original source attribution is already lost.

ConflictDetector

from semantica.conflicts import ConflictDetector

detector = ConflictDetector()

# Detect value conflicts on a specific property
conflicts = detector.detect_value_conflicts(entities, "revenue")

Detection Types

| Type | What It Detects | Example |
| --- | --- | --- |
| VALUE | Same entity, same property, different values across sources | Revenue $391B vs $383B |
| TYPE | Same entity classified as different types | “Python” as Language vs Snake |
| TEMPORAL | Conflicting timestamps or validity windows | CEO at two companies simultaneously |
| LOGICAL | Logically inconsistent property combinations | is_alive=True but death_date set |
| RELATIONSHIP | Inconsistent relationship properties across sources | Edge weight 0.9 vs 0.3 from two sources |
TEMPORAL and LOGICAL conflict detection is not implemented on ConflictDetector directly. The ConflictType enum includes these types for use in custom pipelines, but the detector class only implements detect_value_conflicts, detect_type_conflicts, detect_relationship_conflicts, and detect_entity_conflicts.
Run targeted detection by type:
# Detect value conflicts for a specific property
value_conflicts = detector.detect_value_conflicts(entities, "revenue")

# Detect type classification conflicts
type_conflicts = detector.detect_type_conflicts(entities)

# Detect relationship property conflicts (takes a list of relationship dicts)
relation_conflicts = detector.detect_relationship_conflicts(relationships)

# Detect conflicts across all properties of a set of entities
all_conflicts = detector.detect_entity_conflicts(entities)
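
Because TEMPORAL and LOGICAL detection is not implemented on ConflictDetector, a custom pipeline has to supply those checks itself. The sketch below shows one way such checks might look in plain Python. Everything in it is hypothetical: the role and entity dict layouts, the function names, and the single hand-written logical rule are assumptions for illustration, not part of Semantica.

```python
from datetime import date
from itertools import combinations
from typing import Any, Dict, List, Optional


def overlaps(start_a: date, end_a: Optional[date], start_b: date, end_b: Optional[date]) -> bool:
    """True if two inclusive date ranges share at least one day; end=None means ongoing."""
    end_a = end_a or date.max
    end_b = end_b or date.max
    return start_a <= end_b and start_b <= end_a


def find_temporal_conflicts(roles: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Flag pairs of roles held by one person at different employers during overlapping dates."""
    found = []
    for a, b in combinations(roles, 2):
        if a["person"] != b["person"] or a["employer"] == b["employer"]:
            continue
        if overlaps(a["start"], a.get("end"), b["start"], b.get("end")):
            found.append({
                "conflict_type": "TEMPORAL_CONFLICT",  # plain string label, not the library enum
                "entity_id": a["person"],
                "conflicting_values": [a["employer"], b["employer"]],
            })
    return found


def find_logical_conflicts(entity: Dict[str, Any]) -> List[str]:
    """Apply one hand-written rule: an entity marked alive should not have a death date."""
    problems = []
    if entity.get("is_alive") is True and entity.get("death_date") is not None:
        problems.append("is_alive=True but death_date is set")
    return problems


# Hypothetical data: the first two roles overlap during 2020 and the first half of 2021
roles = [
    {"person": "jane_doe", "employer": "acme", "start": date(2018, 1, 1), "end": date(2021, 6, 30)},
    {"person": "jane_doe", "employer": "globex", "start": date(2020, 1, 1), "end": None},
    {"person": "jane_doe", "employer": "initech", "start": date(2010, 1, 1), "end": date(2015, 12, 31)},
]
temporal = find_temporal_conflicts(roles)
logical = find_logical_conflicts({"is_alive": True, "death_date": date(2020, 1, 1)})
print(temporal[0]["conflicting_values"], logical)
```

The pairwise comparison is quadratic in the number of roles per person, which is usually acceptable for small groups; a larger pipeline would likely group by person first.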

ConflictDetector Methods

| Method | Returns | Description |
| --- | --- | --- |
| detect_value_conflicts(entities, property_name, entity_type=None) | List[Conflict] | Detect value disagreements on a specific property across entity instances |
| detect_type_conflicts(entities) | List[Conflict] | Detect type classification conflicts |
| detect_relationship_conflicts(relationships) | List[Conflict] | Detect relationship property conflicts (takes a list of relationship dicts) |
| detect_entity_conflicts(entities, entity_type=None) | List[Conflict] | Detect conflicts across all monitored properties for a set of entities |
| get_conflict_report() | Dict[str, Any] | Generate a summary report of all detected conflicts |

ConflictResolver

from semantica.conflicts import ConflictResolver, ResolutionStrategy

resolver = ConflictResolver()
results  = resolver.resolve_conflicts(conflicts, strategy=ResolutionStrategy.VOTING)

for result in results:
    print("Resolved '%s' -> %s" % (result.conflict_id, result.resolved_value))
    print("  Strategy: %s  Confidence: %.2f" % (result.resolution_strategy, result.confidence))
Don’t auto-resolve everything. Use MANUAL_REVIEW for conflicts with severity == "critical" or severity == "high": high severity means the disagreement is large and the stakes of getting it wrong are high.
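
The routing rule above can be sketched as a small partition step that runs before any resolver call. This is an illustrative sketch only: `ConflictStub` is a hypothetical stand-in that carries just the `conflict_id` and `severity` fields, and `route_by_severity` is not a Semantica function.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

# Severities that should go to a human instead of an automatic strategy
REVIEW_SEVERITIES = {"critical", "high"}


@dataclass
class ConflictStub:
    """Hypothetical stand-in for a conflict object with a string severity label."""
    conflict_id: str
    severity: str


def route_by_severity(conflicts: Iterable[ConflictStub]) -> Tuple[List[ConflictStub], List[ConflictStub]]:
    """Split conflicts into (auto-resolvable, needs manual review) queues."""
    auto_queue: List[ConflictStub] = []
    review_queue: List[ConflictStub] = []
    for conflict in conflicts:
        if conflict.severity in REVIEW_SEVERITIES:
            review_queue.append(conflict)
        else:
            auto_queue.append(conflict)
    return auto_queue, review_queue


stubs = [
    ConflictStub("c1", "low"),
    ConflictStub("c2", "critical"),
    ConflictStub("c3", "medium"),
    ConflictStub("c4", "high"),
]
auto_queue, review_queue = route_by_severity(stubs)
print([c.conflict_id for c in auto_queue])    # ['c1', 'c3']
print([c.conflict_id for c in review_queue])  # ['c2', 'c4']
```

In a real pipeline, the first queue would presumably go to an automatic strategy and the second to MANUAL_REVIEW, each through its own resolve_conflicts call.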

Choosing a Resolution Strategy

Use the convenience aliases for shorter code:
from semantica.conflicts import voting, credibility_weighted, most_recent, highest_confidence

results = resolver.resolve_conflicts(conflicts, strategy=voting)
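
To give a feel for how the automatic strategies differ, the following sketch implements plausible readings of voting, most-recent, first-seen, and highest-confidence over a list of candidate values. These are assumptions about the general techniques, not Semantica's actual logic; the candidate dict layout and the timestamps are invented for illustration.

```python
from collections import Counter
from typing import Any, Dict, List

Candidate = Dict[str, Any]  # hypothetical shape: value, source, timestamp, confidence


def resolve_voting(candidates: List[Candidate]) -> Any:
    """Pick the value reported by the largest number of sources."""
    counts = Counter(c["value"] for c in candidates)
    value, _ = counts.most_common(1)[0]
    return value


def resolve_most_recent(candidates: List[Candidate]) -> Any:
    """Pick the value with the latest timestamp (ISO date strings sort chronologically)."""
    return max(candidates, key=lambda c: c["timestamp"])["value"]


def resolve_first_seen(candidates: List[Candidate]) -> Any:
    """Pick the value with the earliest timestamp."""
    return min(candidates, key=lambda c: c["timestamp"])["value"]


def resolve_highest_confidence(candidates: List[Candidate]) -> Any:
    """Pick the value whose extraction confidence is highest."""
    return max(candidates, key=lambda c: c["confidence"])["value"]


# Illustrative candidates; timestamps and confidences are made up
candidates = [
    {"value": "$391B", "source": "sec_filings", "timestamp": "2024-11-01", "confidence": 0.95},
    {"value": "$383B", "source": "news_articles", "timestamp": "2023-11-03", "confidence": 0.60},
    {"value": "$383B", "source": "wikipedia", "timestamp": "2024-01-15", "confidence": 0.70},
]
print(resolve_voting(candidates))              # $383B
print(resolve_most_recent(candidates))         # $391B
print(resolve_first_seen(candidates))          # $383B
print(resolve_highest_confidence(candidates))  # $391B
```

The same three candidates yield different winners depending on the strategy, which is why the choice should follow from what makes a source trustworthy in your domain: agreement, freshness, or extraction quality.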

SourceTracker

from semantica.conflicts import SourceTracker, SourceReference

tracker = SourceTracker()
tracker.set_source_credibility("sec_10k_2023", 0.92)  # key matches SourceReference.document below
tracker.set_source_credibility("wikipedia", 0.80)

source_ref = SourceReference(
    document="sec_10k_2023",
    page=12,
    confidence=0.95,
)
tracker.track_property_source(
    entity_id="apple_inc",
    property_name="revenue",
    value="$391B",
    source=source_ref,
)

# Returns a PropertySource object with .value and .sources (List[SourceReference])
prop_source = tracker.get_property_sources("apple_inc", "revenue")
if prop_source:
    print("Value: %s" % prop_source.value)
    for s in prop_source.sources:
        credibility = tracker.get_source_credibility(s.document)
        print("  %s (confidence: %.2f, credibility: %.2f)" % (
            s.document, s.confidence, credibility))

chain = tracker.get_traceability_chain("apple_inc")
Key behaviours:
  • Credibility scores default to 0.50 for any source not explicitly set
  • SourceTracker stores property-level provenance, so you can trace exactly which source contributed each value
Always set credibility scores. The default credibility is 0.50 for all sources. Without explicit scores, CREDIBILITY_WEIGHTED behaves identically to VOTING. The strategy only adds value when the scores differ between sources.
Combine with provenance. The SourceTracker feeds directly into the Provenance module’s audit trail. If you need to explain how a resolved value was chosen, provenance records give you the full chain.
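
The point about default credibility can be checked with a small sketch of credibility-weighted voting. It is a plain-Python illustration under stated assumptions, not the library's implementation: each value's score is taken to be the sum of the credibility of the sources reporting it, and `blog_a` / `blog_b` are hypothetical source names.

```python
from collections import defaultdict
from typing import Any, Dict, List

DEFAULT_CREDIBILITY = 0.50  # default for sources without an explicit score, as stated above


def resolve_credibility_weighted(candidates: List[Dict[str, Any]], credibility: Dict[str, float]) -> Any:
    """Sum source credibility per value and return the value with the highest total."""
    scores: Dict[Any, float] = defaultdict(float)
    for candidate in candidates:
        scores[candidate["value"]] += credibility.get(candidate["source"], DEFAULT_CREDIBILITY)
    return max(scores, key=scores.get)


candidates = [
    {"value": "$391B", "source": "sec_filings"},
    {"value": "$383B", "source": "blog_a"},
    {"value": "$383B", "source": "blog_b"},
]

# No explicit scores: every source weighs 0.50, so the majority value wins, as in voting
unweighted = resolve_credibility_weighted(candidates, {})

# Explicit scores: one highly credible source outweighs two weak ones (0.95 vs 0.40 + 0.40)
weighted = resolve_credibility_weighted(
    candidates, {"sec_filings": 0.95, "blog_a": 0.40, "blog_b": 0.40}
)
print(unweighted, weighted)  # $383B $391B
```

With uniform weights the sum is just a scaled vote count, which is why the two strategies coincide until scores are differentiated.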

ConflictAnalyzer

from semantica.conflicts import ConflictAnalyzer

analyzer = ConflictAnalyzer()

analysis     = analyzer.analyze_conflicts(conflicts)
patterns     = analysis["patterns"]
severity_counts = analysis["by_severity"]["counts"]
source_stats = analysis["by_source"]
trends       = analyzer.analyze_trends(conflicts)

# analyze_trends returns a list of dicts, one per time period
for t in trends:
    print("Period: %s  Count: %d  Trend: %s" % (
        t["period"], t["conflict_count"], t["trend"]))
Key behaviours:
  • analyze_conflicts()["patterns"] returns a list of ConflictPattern objects: use pattern.pattern_type and pattern.frequency to find systemic data quality issues
  • analyze_conflicts()["by_source"] includes counts and top_sources: sources appearing in many conflicts may have upstream data quality problems
  • analyze_trends() returns a list of per-period dicts (period, conflict_count, trend, trend_direction): trend is "increasing", "decreasing", or "stable"
Use analyze_conflicts()["by_source"]["top_sources"] to identify bad data feeds. A single source appearing in many conflicts is a data quality problem upstream, not a conflict to resolve record by record. Flag it and investigate the source pipeline.
Severity is a string label, not a score. ConflictDetector assigns "critical", "high", or "medium" based on property importance and value differences. Critical fields (id, name, type, revenue) always yield "critical". Domain context determines what to prioritize.
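
The source-level statistics described above amount to counting how often each source shows up across conflicts. The sketch below shows that counting step in isolation. It assumes each conflict is a dict whose `sources` entries carry a `document` key; that key, the feed names, and `top_conflict_sources` are assumptions for illustration and may not match the structures ConflictAnalyzer actually uses.

```python
from collections import Counter
from typing import Any, Dict, List, Tuple


def top_conflict_sources(conflicts: List[Dict[str, Any]], n: int = 3) -> List[Tuple[str, int]]:
    """Count in how many conflicts each source appears and return the n most frequent."""
    counts: Counter = Counter()
    for conflict in conflicts:
        # A set ensures a source is counted once per conflict, even if listed twice
        for name in {source["document"] for source in conflict["sources"]}:
            counts[name] += 1
    return counts.most_common(n)


# Hypothetical conflicts: feed_x is involved in all three
conflicts = [
    {"conflict_id": "c1", "sources": [{"document": "feed_x"}, {"document": "sec_filings"}]},
    {"conflict_id": "c2", "sources": [{"document": "feed_x"}, {"document": "wikipedia"}]},
    {"conflict_id": "c3", "sources": [{"document": "feed_x"}, {"document": "feed_x"}]},
]
print(top_conflict_sources(conflicts, n=1))  # [('feed_x', 3)]
```

A source that dominates this ranking is a candidate for investigation at the pipeline level rather than for record-by-record resolution.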

InvestigationGuideGenerator

Auto-generate human-readable investigation checklists for conflicts requiring manual or expert review:
from semantica.conflicts import InvestigationGuideGenerator

generator = InvestigationGuideGenerator()
guide     = generator.generate_guide(conflict)

print("Title:   %s" % guide.title)
print("Summary: %s" % guide.conflict_summary)

for step in guide.investigation_steps:
    print("  [%d] %s" % (step.step_number, step.description))
    print("       Action: %s" % step.action)
    if step.expected_outcome:
        print("       Expected: %s" % step.expected_outcome)
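
If reviewers work from tickets or plain-text notes, a guide can be flattened into a checklist. The sketch below uses local stand-in dataclasses that mirror only the field names listed for InvestigationGuide and InvestigationStep; the stand-ins, the `render_checklist` helper, and the sample content are hypothetical and not part of Semantica.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StepStandIn:
    """Hypothetical stand-in mirroring the InvestigationStep field names."""
    step_number: int
    description: str
    action: str
    expected_outcome: Optional[str] = None


@dataclass
class GuideStandIn:
    """Hypothetical stand-in mirroring a subset of the InvestigationGuide field names."""
    conflict_id: str
    conflict_summary: str
    severity: str
    investigation_steps: List[StepStandIn] = field(default_factory=list)
    recommended_actions: List[str] = field(default_factory=list)


def render_checklist(guide: GuideStandIn) -> str:
    """Render a guide as a plain-text checklist, with steps ordered by step_number."""
    lines = [f"[{guide.severity.upper()}] {guide.conflict_id}: {guide.conflict_summary}"]
    for step in sorted(guide.investigation_steps, key=lambda s: s.step_number):
        lines.append(f"[ ] {step.step_number}. {step.description} -> {step.action}")
        if step.expected_outcome:
            lines.append(f"      expected: {step.expected_outcome}")
    for action in guide.recommended_actions:
        lines.append(f"Recommended: {action}")
    return "\n".join(lines)


guide = GuideStandIn(
    conflict_id="c-001",
    conflict_summary="revenue differs between two sources",
    severity="critical",
    investigation_steps=[
        StepStandIn(2, "Compare reporting periods", "Check fiscal year of each figure"),
        StepStandIn(1, "Locate primary filing", "Open the cited document", "Figure found on cited page"),
    ],
    recommended_actions=["Prefer the primary filing if periods match"],
)
checklist = render_checklist(guide)
print(checklist)
```

Because the helper only reads attributes, the same function should work on any object exposing those field names.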

Schemas

@dataclass
class Conflict:
    conflict_id:        str
    conflict_type:      ConflictType        # VALUE_CONFLICT | TYPE_CONFLICT | ...
    entity_id:          Optional[str]       # entity involved (None for relationship conflicts)
    property_name:      Optional[str]       # the conflicting property name
    relationship_id:    Optional[str]       # relationship involved (for RELATIONSHIP_CONFLICT)
    conflicting_values: List[Any]           # conflicting values (one per source)
    sources:            List[Dict[str, Any]]  # source dicts for each value
    confidence:         float               # detection confidence 0–1 (default: 1.0)
    severity:           str                 # "low" | "medium" | "high" | "critical"
    recommended_action: Optional[str]
    metadata:           Dict[str, Any]

@dataclass
class ResolutionResult:
    conflict_id:        str
    resolved:           bool
    resolved_value:     Any                 # None if unresolved or flagged for review
    resolution_strategy: Optional[str]      # e.g. "voting", "credibility_weighted"
    confidence:         float               # 0.0–1.0
    sources_used:       List[str]           # document IDs that contributed
    resolution_notes:   Optional[str]
    metadata:           Dict[str, Any]

from semantica.conflicts import ConflictType

ConflictType.VALUE_CONFLICT         # revenue is $391B in source A, $383B in source B
ConflictType.TYPE_CONFLICT          # "Apple" is ORGANIZATION in one source, PRODUCT in another
ConflictType.TEMPORAL_CONFLICT      # overlapping validity windows with contradictory states
ConflictType.LOGICAL_CONFLICT       # fact violates an ontology axiom or SHACL constraint
ConflictType.RELATIONSHIP_CONFLICT  # inconsistent relationship properties across sources

@dataclass
class InvestigationGuide:
    conflict_id:         str
    conflict_summary:    str                      # generated summary of the disagreement
    severity:            str                      # "low" | "medium" | "high" | "critical"
    conflicting_sources: List[Dict[str, Any]]
    investigation_steps: List[InvestigationStep]
    recommended_actions: List[str]
    context:             Dict[str, Any]
    generated_at:        str                      # ISO timestamp
    # title is a @property: "Investigation: <conflict_id>"

@dataclass
class InvestigationStep:
    step_number:      int
    description:      str   # what to do
    action:           str   # specific action to take
    expected_outcome: Optional[str]

Deduplication

Resolve duplicate entities before conflict detection.

Ontology

Logical conflicts use SHACL shapes and ontology axioms.

Provenance

Track which source each conflicting fact came from.

Knowledge Graph

The graph being checked for conflicts.