RDF triple storage with SPARQL queries and bulk loading: Blazegraph, Apache Jena, and RDF4J.
semantica.triplet_store provides W3C-standard RDF storage with SPARQL 1.1 query support. Use it when you need semantic web compatibility, OWL-style reasoning, SPARQL-based queries, or standards-compliant RDF serialization.
TripletStore wraps the backend of your choice. Construct a Triplet object (from semantica.semantic_extract.types) and call add_triplet():
    from semantica.triplet_store import TripletStore
    from semantica.semantic_extract.types import Triplet

    store = TripletStore(
        backend="blazegraph",
        endpoint="http://localhost:9999/blazegraph/sparql",
    )

    # Create and store a single triplet
    t = Triplet(
        subject="http://example.org/apple_inc",
        predicate="http://example.org/founded_by",
        object="http://example.org/steve_jobs",
    )
    store.add_triplet(t)

    # Query with SPARQL: returns a QueryResult with a .bindings list.
    # The pattern uses the same predicate (ex:founded_by) that was stored above.
    result = store.execute_query("""
        PREFIX ex: <http://example.org/>
        SELECT ?person ?company WHERE {
            ?company ex:founded_by ?person .
        }
    """)

    for row in result.bindings:
        person = row.get("person", {}).get("value")
        company = row.get("company", {}).get("value")
        print(person, company)
    from semantica.triplet_store import TripletStore

    store = TripletStore(
        backend="blazegraph",
        endpoint="http://localhost:9999/blazegraph/sparql",
    )
Add triplets
    from semantica.semantic_extract.types import Triplet

    # Add a single triplet
    store.add_triplet(Triplet(
        subject="http://example.org/apple_inc",
        predicate="http://example.org/founded_by",
        object="http://example.org/steve_jobs",
    ))

    # Bulk-add a list of Triplet objects
    store.add_triplets(triplets, batch_size=500)
    # store(knowledge_graph, ontology) converts entities/relationships
    # to RDF triples and bulk-loads them in one call
    store.store(knowledge_graph=kg_dict, ontology=ontology_dict)
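The batch_size argument to add_triplets() suggests that a long list is sent in fixed-size chunks. How the library splits the list internally is not documented here, so the following is only a minimal sketch of the general idea. The chunked() helper is hypothetical, and plain tuples stand in for Triplet objects so the example runs without semantica installed.

```python
from typing import Iterator, List, Sequence, Tuple

# Plain (subject, predicate, object) tuples stand in for Triplet objects.
Triple = Tuple[str, str, str]


def chunked(items: Sequence[Triple], batch_size: int) -> Iterator[List[Triple]]:
    """Yield consecutive slices of at most batch_size items (hypothetical helper)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


triples = [
    (f"http://example.org/s{i}", "http://example.org/p", f"http://example.org/o{i}")
    for i in range(1200)
]

# 1200 triples with batch_size=500 give two full batches and one partial batch.
batch_sizes = [len(batch) for batch in chunked(triples, batch_size=500)]
print(batch_sizes)  # [500, 500, 200]
```

A smaller batch size keeps each request short at the cost of more round trips; the right value depends on the backend and the network.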
    from semantica.triplet_store import TripletStore

    store = TripletStore(
        backend="blazegraph",
        endpoint="http://localhost:9999/blazegraph/sparql",
        namespace="kb",  # default: "kb"
        timeout=30,      # request timeout in seconds
    )
Best for: Wikidata-style workloads, high triple counts, named graph support, SPARQL 1.1 Update.
pip install rdflib
    store = TripletStore(
        backend="jena",
        endpoint="http://localhost:3030/ds",  # SPARQL read endpoint for rdflib SPARQLStore
    )
Best for: local development with rdflib, SPARQL read queries against a Fuseki endpoint.
OWL inference with backend="jena" is a placeholder. enable_inference=True is accepted, but the inference call returns 0 inferred triples. For production OWL reasoning, use Jena Fuseki directly with its built-in reasoner configuration.
pip install requests
    store = TripletStore(
        backend="rdf4j",
        endpoint="http://localhost:8080/rdf4j-server",
        repository_id="semantica",  # passed through **config
    )
Best for: Eclipse Foundation deployments, transaction-based loading via REST API.
| Backend | License | Named Graphs | Write via | Best For |
|---|---|---|---|---|
| Blazegraph | Open source | Yes | SPARQL Update REST | High triple count, SPARQL 1.1 |
| Apache Jena | Apache 2.0 | No (rdflib backend) | rdflib in-process | Local dev, read queries |
| RDF4J | Eclipse 1.0 | Yes | REST API N-Triples | Enterprise Java, transactions |
Use Apache Jena for development and Blazegraph for production. The Jena backend initializes with an in-memory rdflib graph, so local testing needs no server. For high-throughput persistent workloads, switch to Blazegraph by changing the backend= argument.
All store operations use the Triplet dataclass from semantica.semantic_extract.types:
    from semantica.semantic_extract.types import Triplet

    t = Triplet(
        subject="http://example.org/apple_inc",      # required: full URI string
        predicate="http://example.org/founded_by",   # required: full URI string
        object="http://example.org/steve_jobs",      # required: URI or literal string
        confidence=0.95,                             # optional: float 0.0–1.0, default 1.0
        metadata={"source": "wikipedia"},            # optional: dict
    )
| Field | Type | Default | Description |
|---|---|---|---|
| subject | str | required | Subject URI |
| predicate | str | required | Predicate URI |
| object | str | required | Object URI or literal |
| confidence | float | 1.0 | Confidence score (0–1) |
| metadata | dict | {} | Arbitrary metadata |
add_triplet() takes a Triplet object, not keyword arguments. Build a Triplet(subject=..., predicate=..., object=...) from semantica.semantic_extract.types and pass that object; do not pass subject=, predicate=, or obj= directly to add_triplet().
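Because the object field accepts either a URI or a literal string, the two cases have to be told apart when a triple is written out, for example as N-Triples. The sketch below shows one way this could look. The is_iri() prefix check and both helper names are assumptions made for illustration, not the library's actual logic; the escaping rules follow the N-Triples grammar, which requires backslash, double quote, line feed, and carriage return to be escaped inside a quoted literal.

```python
def is_iri(value: str) -> bool:
    """Heuristic (an assumption): treat common absolute-URI prefixes as IRIs."""
    return value.startswith(("http://", "https://", "urn:"))


def escape_literal(value: str) -> str:
    """Escape the characters that N-Triples forbids inside a quoted literal."""
    return (
        value.replace("\\", "\\\\")  # backslash first, so later escapes are not doubled
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples_line(subject: str, predicate: str, obj: str) -> str:
    """Serialize one triple as a single N-Triples line (hypothetical helper)."""
    obj_term = f"<{obj}>" if is_iri(obj) else f'"{escape_literal(obj)}"'
    return f"<{subject}> <{predicate}> {obj_term} ."


uri_line = to_ntriples_line(
    "http://example.org/apple_inc",
    "http://example.org/founded_by",
    "http://example.org/steve_jobs",
)
literal_line = to_ntriples_line(
    "http://example.org/apple_inc",
    "http://example.org/name",
    'Apple "Inc"',
)
print(uri_line)
print(literal_line)
```

A prefix check like this misclassifies a literal that happens to start with "http://", so a production pipeline would normally carry an explicit URI-or-literal flag instead of guessing.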
| Field | Type | Description |
|---|---|---|
| bindings | List[dict] | Each dict maps variable name → {"value": ..., "type": ...} |
| variables | List[str] | SPARQL result variable names |
| execution_time | float | Seconds elapsed |
| metadata | dict | Query, graph scope, cache hit flag |
execute_query() returns QueryResult, not a list. Iterate result.bindings, not result directly. Each binding is a dict mapping variable name → {"value": ..., "type": ...}.
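When only the values matter, the nested binding dicts can be flattened into one plain dict per row. The following is a minimal sketch that works on data shaped like result.bindings as described above; flatten_bindings() is a hypothetical helper, and the sample rows are made up so the example runs without a store.

```python
from typing import Any, Dict, List, Optional

# One binding: variable name -> {"value": ..., "type": ...}
Binding = Dict[str, Dict[str, Any]]


def flatten_bindings(
    bindings: List[Binding], variables: List[str]
) -> List[Dict[str, Optional[str]]]:
    """Return one {variable: value} dict per row; unbound variables map to None."""
    return [
        {var: row.get(var, {}).get("value") for var in variables}
        for row in bindings
    ]


# Sample data in the documented shape (illustrative values only).
sample_bindings = [
    {
        "person": {"value": "http://example.org/steve_jobs", "type": "uri"},
        "company": {"value": "http://example.org/apple_inc", "type": "uri"},
    },
    {
        # ?company is unbound in this row, as can happen with OPTIONAL patterns
        "person": {"value": "http://example.org/ada", "type": "uri"},
    },
]

rows = flatten_bindings(sample_bindings, ["person", "company"])
for row in rows:
    print(row["person"], row["company"])
```

With a real result, the call would presumably be flatten_bindings(result.bindings, result.variables), assuming those attributes behave as the table describes.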
For large result sets, paginate with LIMIT and OFFSET:
    page_size = 1000
    offset = 0

    while True:
        result = store.execute_query(f"""
            SELECT ?s ?p ?o WHERE {{ ?s ?p ?o . }}
            ORDER BY ?s
            LIMIT {page_size} OFFSET {offset}
        """)
        if not result.bindings:
            break
        process_batch(result.bindings)
        offset += page_size
Paginate large SPARQL result sets. A SELECT * WHERE { ?s ?p ?o } against a large store returns all triples. Always include LIMIT and OFFSET in exploratory queries. QueryEngine adds LIMIT 1000 automatically unless you specify one.
Blazegraph and RDF4J support named graphs. Scope execute_query() to a named graph with the graph= parameter:
    from semantica.semantic_extract.types import Triplet

    # Add a triplet: named graph stored in metadata or backend-specific API
    t = Triplet(
        subject="http://example.org/a",
        predicate="http://example.org/p",
        object="http://example.org/b",
    )
    store.add_triplet(t)  # named graph targeting requires backend-specific API

    # Query a named graph via FROM clause in SPARQL
    result = store.execute_query("""
        SELECT ?s ?p ?o WHERE { ?s ?p ?o . }
    """, graph="http://example.org/graph1")  # injects FROM <graph> before WHERE

    # Or scope inline using FROM in the query string
    result = store.execute_query("""
        SELECT ?s ?p ?o
        FROM <http://example.org/graph1>
        WHERE { ?s ?p ?o . }
    """)
Named graph support is only available for Blazegraph and RDF4J backends. The graph= parameter is silently ignored for the Jena backend.
Use named graphs to isolate sources. Pass graph="http://example.org/source_A" to execute_query() to restrict a query to the triples loaded from that source.
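The graph= parameter is described as injecting a FROM clause before WHERE. To make that rewrite concrete, here is a rough sketch of how such an injection could be done with a regular expression. inject_from_clause() is a hypothetical function, not the library's implementation, and it is deliberately naive: it targets the first WHERE keyword and does not account for subqueries, comments, or string literals that contain the word.

```python
import re

# Match the WHERE keyword as a whole word, in any letter case.
_WHERE = re.compile(r"\bWHERE\b", re.IGNORECASE)


def inject_from_clause(query: str, graph_uri: str) -> str:
    """Insert FROM <graph_uri> immediately before the first WHERE keyword."""
    match = _WHERE.search(query)
    if match is None:
        raise ValueError("query has no WHERE keyword")
    head, tail = query[:match.start()], query[match.start():]
    return f"{head}FROM <{graph_uri}> {tail}"


scoped = inject_from_clause(
    "SELECT ?s ?p ?o WHERE { ?s ?p ?o . }",
    "http://example.org/graph1",
)
print(scoped)
# SELECT ?s ?p ?o FROM <http://example.org/graph1> WHERE { ?s ?p ?o . }
```

Writing the FROM clause directly in the query string avoids any dependence on how the rewrite is performed.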
    store.add_skos_concept(
        concept_uri="http://example.org/skos/MachineLearning",
        scheme_uri="http://example.org/skos/AIScheme",
        pref_label="Machine Learning",
        alt_labels=["ML", "Statistical Learning"],
        broader=["http://example.org/skos/AI"],
        definition="A field of artificial intelligence...",
    )

    # Retrieve all concepts in a scheme
    concepts = store.get_skos_concepts(scheme_uri="http://example.org/skos/AIScheme")
    for c in concepts:
        print(c["uri"], c["pref_label"], c["alt_labels"])
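For readers unfamiliar with SKOS, it may help to see which RDF triples a concept like the one above typically corresponds to. The sketch below expands the same arguments into triples using the standard SKOS vocabulary (skos:Concept, skos:inScheme, skos:prefLabel, skos:altLabel, skos:broader, skos:definition). The skos_concept_triples() function is hypothetical, and the exact triples that add_skos_concept() writes may differ, for instance in language tags or additional scheme triples.

```python
from typing import List, Optional, Sequence, Tuple

SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

Triple = Tuple[str, str, str]


def skos_concept_triples(
    concept_uri: str,
    scheme_uri: str,
    pref_label: str,
    alt_labels: Sequence[str] = (),
    broader: Sequence[str] = (),
    definition: Optional[str] = None,
) -> List[Triple]:
    """Expand one SKOS concept into (subject, predicate, object) tuples."""
    triples: List[Triple] = [
        (concept_uri, RDF_TYPE, SKOS + "Concept"),
        (concept_uri, SKOS + "inScheme", scheme_uri),
        (concept_uri, SKOS + "prefLabel", pref_label),
    ]
    # One triple per alternative label and per broader concept.
    triples += [(concept_uri, SKOS + "altLabel", label) for label in alt_labels]
    triples += [(concept_uri, SKOS + "broader", parent) for parent in broader]
    if definition is not None:
        triples.append((concept_uri, SKOS + "definition", definition))
    return triples


concept_triples = skos_concept_triples(
    concept_uri="http://example.org/skos/MachineLearning",
    scheme_uri="http://example.org/skos/AIScheme",
    pref_label="Machine Learning",
    alt_labels=["ML", "Statistical Learning"],
    broader=["http://example.org/skos/AI"],
    definition="A field of artificial intelligence...",
)
for s, p, o in concept_triples:
    print(s, p, o)
```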
    # Compare two named graph snapshots and return added/removed triples
    delta = store.compute_delta(
        old_graph_uri="http://example.org/graph/v1",
        new_graph_uri="http://example.org/graph/v2",
    )

    print(f"Added: {delta['added_count']} triples")
    print(f"Removed: {delta['removed_count']} triples")

    for t in delta["added_triples"]:
        print(f"+ {t.subject} {t.predicate} {t.object}")
    for t in delta["removed_triples"]:
        print(f"- {t.subject} {t.predicate} {t.object}")
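Conceptually, a delta between two graph snapshots is a pair of set differences. The sketch below illustrates that idea on plain tuples. compute_triple_delta() is a hypothetical stand-in that only mirrors the key names used in the example above; it returns tuples rather than Triplet objects and says nothing about how compute_delta() is actually implemented.

```python
from typing import Dict, Iterable, List, Tuple, Union

Triple = Tuple[str, str, str]


def compute_triple_delta(
    old: Iterable[Triple], new: Iterable[Triple]
) -> Dict[str, Union[int, List[Triple]]]:
    """Return the triples present only in new (added) or only in old (removed)."""
    old_set, new_set = set(old), set(new)
    added = sorted(new_set - old_set)      # in v2 but not in v1
    removed = sorted(old_set - new_set)    # in v1 but not in v2
    return {
        "added_triples": added,
        "removed_triples": removed,
        "added_count": len(added),
        "removed_count": len(removed),
    }


v1 = [
    ("http://example.org/a", "http://example.org/p", "http://example.org/b"),
    ("http://example.org/a", "http://example.org/p", "http://example.org/c"),
]
v2 = [
    ("http://example.org/a", "http://example.org/p", "http://example.org/b"),
    ("http://example.org/a", "http://example.org/p", "http://example.org/d"),
]

sketch_delta = compute_triple_delta(v1, v2)
print(sketch_delta["added_count"], sketch_delta["removed_count"])  # 1 1
```

Set difference treats triples as exact matches, so a changed literal shows up as one removal plus one addition.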
The Export module writes RDF that the triplet store can then receive via add_triplets():
    import rdflib

    from semantica.export import RDFExporter
    from semantica.triplet_store import TripletStore
    from semantica.semantic_extract.types import Triplet

    # Export KG to Turtle
    exporter = RDFExporter()
    exporter.export_to_file(kg, "output.ttl", format="turtle")

    # Parse the file and load triplets into the store
    # (TripletStore does not have a built-in import_file() method;
    # parse with rdflib and convert to Triplet objects)
    g = rdflib.Graph()
    g.parse("output.ttl", format="turtle")

    store = TripletStore(backend="jena", endpoint="http://localhost:3030/ds")
    triplets = [
        Triplet(subject=str(s), predicate=str(p), object=str(o))
        for s, p, o in g
    ]
    store.add_triplets(triplets)

    # Query with SPARQL
    result = store.execute_query("SELECT * WHERE { ?s ?p ?o } LIMIT 10")
    for row in result.bindings:
        print(row)