semantica.evals is planned as a comprehensive evaluation framework for measuring extraction accuracy, graph quality, and pipeline performance.
Planned Features
When released, semantica.evals will provide:
| Planned Class | Role |
|---|---|
| KGEvaluator | Completeness, consistency, schema compliance, coverage, and orphan node detection |
| ExtractionEvaluator | NER precision / recall / F1 and relation extraction metrics against gold datasets |
| PipelineBenchmark | Throughput (docs/sec), per-step latency, peak memory, and error rate |
| RegressionTracker | Record runs and compare metrics across commits or config changes |
| EvalReport | Structured report: {scores, regressions, recommendations} |
| DeduplicationEvaluator | Merge precision, false positive / false negative rates |
| ReasoningEvaluator | Inference accuracy, rule coverage, and derivation depth |
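Because these classes are not available yet, extraction metrics have to be computed by hand in the meantime. The sketch below shows one common way to compute NER precision, recall, and F1 against a gold dataset, using exact match on (start, end, label) spans. It does not use any semantica API; the `Span` alias and the `precision_recall_f1` helper are hypothetical names introduced for illustration, and the planned ExtractionEvaluator may define matching differently.

```python
from typing import Iterable, Tuple

# Hypothetical span representation: (start offset, end offset, entity label).
Span = Tuple[int, int, str]


def precision_recall_f1(
    predicted: Iterable[Span], gold: Iterable[Span]
) -> Tuple[float, float, float]:
    """Exact-match precision, recall, and F1 over entity spans."""
    pred_set, gold_set = set(predicted), set(gold)
    true_positives = len(pred_set & gold_set)

    # Guard against empty inputs to avoid division by zero.
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0

    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Illustrative data only: one span matches exactly, one has the wrong label,
# and one gold span is missed entirely.
gold = [(0, 5, "PERSON"), (10, 16, "ORG"), (20, 26, "GPE")]
predicted = [(0, 5, "PERSON"), (10, 16, "GPE")]

precision, recall, f1 = precision_recall_f1(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Exact matching is the strictest choice: a span with the right boundaries but the wrong label counts as both a false positive and a false negative. Partial-overlap or label-agnostic matching would give higher scores on the same data.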
Current Workaround
Until semantica.evals ships, use semantica.ontology.OntologyEvaluator for ontology quality metrics.
EvaluationResult fields returned by evaluate_ontology():
| Field | Type | Description |
|---|---|---|
| coverage_score | float | Fraction of competency questions answerable by the ontology |
| completeness_score | float | Average of class and property completeness scores |
| gaps | List[str] | Identified gaps in coverage |
| suggestions | List[str] | Improvement suggestions |
| metrics | dict | Detailed sub-metrics |
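As a rough illustration of how these fields might be consumed, the sketch below applies a simple quality gate to a result object. It does not import semantica: `EvaluationResultStub` is a hypothetical stand-in that only mirrors the field names and types in the table, and `passes_quality_gate`, `summarize`, and the threshold values are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class EvaluationResultStub:
    """Hypothetical stand-in mirroring the documented fields; not the library class."""

    coverage_score: float
    completeness_score: float
    gaps: List[str] = field(default_factory=list)
    suggestions: List[str] = field(default_factory=list)
    metrics: Dict[str, Any] = field(default_factory=dict)


def passes_quality_gate(
    result: EvaluationResultStub,
    min_coverage: float = 0.8,      # assumed threshold, tune per project
    min_completeness: float = 0.7,  # assumed threshold, tune per project
) -> bool:
    """Return True only if both scores meet their minimum thresholds."""
    return (
        result.coverage_score >= min_coverage
        and result.completeness_score >= min_completeness
    )


def summarize(result: EvaluationResultStub) -> str:
    """Build a one-line summary suitable for a CI log."""
    parts = [
        f"coverage={result.coverage_score:.2f}",
        f"completeness={result.completeness_score:.2f}",
        f"gaps={len(result.gaps)}",
    ]
    return ", ".join(parts)


# Illustrative values only; a real result would come from evaluate_ontology().
result = EvaluationResultStub(
    coverage_score=0.75,
    completeness_score=0.9,
    gaps=["No property links Author to Publication"],
    suggestions=["Add an authorship property"],
)

print(summarize(result))
print("gate passed:", passes_quality_gate(result))
```

Both helpers read attributes only, so they should work on any object that exposes the same field names. That would include the real result object, if its attributes match the table.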
Semantic Extract
Extraction module.
Knowledge Graph
Graph quality assessment.
Pipeline
Pipeline performance metrics.
Ontology Evaluator
Available now for ontology quality metrics.
