export_to_rdf() returns a string: it does not write a file. Call export() or export_knowledge_graph() to write directly to disk.
Use export_to_rdf() for inspection and export() for production. In notebooks or debug sessions, export_to_rdf() hands back the serialized string for a quick look. In CI and other pipelines that write files, export() does the job in a single call.
Use turtle for human readability, ntriples for streaming. Turtle is compact and readable, which suits debugging and sharing. N-Triples (.nt) is line-oriented, with one triple per line, making it safe to stream, concatenate, and process with standard Unix tools.
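Because every N-Triples line is a complete statement, merging and filtering can be done with plain string handling. The sketch below uses only the standard library; the sample triples and the helper names merge_ntriples and filter_by_predicate are hypothetical and are not part of semantica.

```python
# Two small N-Triples documents (hypothetical sample data).
part_a = (
    '<http://example.org/alice> <http://example.org/knows> <http://example.org/bob> .\n'
    '<http://example.org/alice> <http://example.org/name> "Alice" .\n'
)
part_b = '<http://example.org/bob> <http://example.org/name> "Bob" .\n'


def merge_ntriples(*documents: str) -> str:
    """Concatenate N-Triples documents, dropping blank lines and comments."""
    lines = []
    for doc in documents:
        for line in doc.splitlines():
            stripped = line.strip()
            if stripped and not stripped.startswith("#"):
                lines.append(stripped)
    return "\n".join(lines) + "\n"


def filter_by_predicate(ntriples: str, predicate_iri: str) -> list[str]:
    """Keep lines that mention the IRI between spaces, much like grep would."""
    needle = f" <{predicate_iri}> "
    return [line for line in ntriples.splitlines() if needle in line]


merged = merge_ntriples(part_a, part_b)
names = filter_by_predicate(merged, "http://example.org/name")
print(len(merged.splitlines()))  # 3
print(len(names))                # 2
```

The filter is a naive substring match, so it would also catch the IRI in object position. For anything beyond quick inspection, a real RDF parser is the safer choice.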
Namespace management:
    from semantica.export import NamespaceManager, RDFExporter

    ns_manager = NamespaceManager()
    # ns_manager.namespaces contains the built-in prefix dict (rdf, rdfs, owl, xsd, semantica)

    # Add custom namespaces by updating the dict directly
    ns_manager.namespaces["ex"] = "http://example.org/"
    ns_manager.namespaces["schema"] = "https://schema.org/"

    # Generate Turtle prefix declarations
    decls = ns_manager.generate_namespace_declarations(
        ns_manager.namespaces, format="turtle"
    )
    print(decls)  # @prefix ex: <http://example.org/> . etc.
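For readers unfamiliar with Turtle, the following standalone sketch shows what prefix declarations look like when rendered from a prefix-to-IRI dict. The helper turtle_prefixes is hypothetical and only illustrates the syntax; it is not how NamespaceManager is implemented.

```python
def turtle_prefixes(namespaces: dict[str, str]) -> str:
    """Render a prefix -> IRI mapping as Turtle @prefix lines (hypothetical helper)."""
    return "\n".join(
        f"@prefix {prefix}: <{iri}> ." for prefix, iri in namespaces.items()
    )


custom = {"ex": "http://example.org/", "schema": "https://schema.org/"}
print(turtle_prefixes(custom))
# @prefix ex: <http://example.org/> .
# @prefix schema: <https://schema.org/> .
```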
    from semantica.export import ParquetExporter

    exporter = ParquetExporter(compression="snappy")
    # compression: snappy | gzip | brotli | zstd | lz4 | none

    # Export entities and relationships as separate Parquet files
    exporter.export_entities(entities, "nodes.parquet")
    exporter.export_relationships(relationships, "edges.parquet")

    # Export full knowledge graph (writes entities.parquet and relationships.parquet)
    exporter.export_knowledge_graph(graph, "output_base")
    # → output_base_entities.parquet, output_base_relationships.parquet

    # Generic export from list or dict
    exporter.export(entities, "entities.parquet")
    exporter.export(graph, "output_base")
ParquetExporter and ArrowExporter require pyarrow. Both fall back to a no-op stub class if pyarrow is not installed. Install with pip install pyarrow before using these exporters.
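Since the fallback is a silent stub, it can help to check for pyarrow up front and fail loudly. This is a minimal sketch using the standard library's importlib; the helper name module_available is hypothetical.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


if not module_available("pyarrow"):
    print("pyarrow is missing; install it with: pip install pyarrow")
```

Running a check like this at startup surfaces the missing dependency before any export call is made.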
Use ParquetExporter for downstream analytics. Parquet preserves column types (int, float, datetime) that CSV loses and is natively supported by Spark, BigQuery, Databricks, and Snowflake. Use compression="snappy" for a good balance of speed and compression.
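The type loss in CSV is easy to see with a round trip through the standard library's csv module. The record below is hypothetical sample data; the sketch shows only the CSV side and does not use semantica or pyarrow.

```python
import csv
import io
from datetime import datetime

# One typed record: int, float, and datetime values.
rows = [{"id": 1, "score": 0.87, "created": datetime(2024, 1, 15, 9, 30)}]

# Write the record to an in-memory CSV file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["id", "score", "created"])
writer.writeheader()
writer.writerows(rows)

# Read it back: every value is now a plain string.
buffer.seek(0)
restored = list(csv.DictReader(buffer))

print({key: type(value).__name__ for key, value in rows[0].items()})
# {'id': 'int', 'score': 'float', 'created': 'datetime'}
print({key: type(value).__name__ for key, value in restored[0].items()})
# {'id': 'str', 'score': 'str', 'created': 'str'}
```

A typed columnar format stores the schema with the data, so downstream readers do not need to re-infer or re-cast each column.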
Requires pyarrow: pip install pyarrow. Schema is explicitly typed.
    from semantica.export import CSVExporter

    exporter = CSVExporter(delimiter=",")
    exporter.export_entities(entities, "nodes.csv")
    exporter.export_relationships(relationships, "edges.csv")
    exporter.export_knowledge_graph(graph, "output_base")
    from semantica.export import SemanticNetworkYAMLExporter

    exporter = SemanticNetworkYAMLExporter()
    exporter.export(graph, "graph.yaml")
LPGExporter writes Cypher CREATE statements for Neo4j and Memgraph:
    from semantica.export import LPGExporter

    exporter = LPGExporter()

    # Write Cypher CREATE statements to file
    exporter.export(graph, "import.cypher")

    # Also available
    exporter.export_knowledge_graph(graph, "import.cypher")
ArangoAQLExporter writes INSERT statements for ArangoDB:
    from semantica.export import ArangoAQLExporter

    exporter = ArangoAQLExporter(
        vertex_collection="entities",
        edge_collection="relationships"
    )

    # Write AQL INSERT statements to file
    exporter.export(graph, "import.aql")
    exporter.export_knowledge_graph(graph, "import.aql")
ArangoAQLExporter.export() and LPGExporter.export() write to a file and return None. They do not return the AQL/Cypher string. Write to a file and read it back if you need the string.
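One way to get the string is to export into a temporary file and read it back. The sketch below wraps that pattern in a hypothetical helper, export_to_string, which accepts any callable that writes to a path. The fake_export function is a stand-in so the example runs without semantica, and UTF-8 output is an assumption about the exporters.

```python
import tempfile
from pathlib import Path
from typing import Callable


def export_to_string(write_to_path: Callable[[str], None], suffix: str = ".cypher") -> str:
    """Run a file-writing export into a temp file and return its text."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        target = Path(tmp_dir) / f"export{suffix}"
        write_to_path(str(target))
        # Assumes the exporter writes UTF-8 text.
        return target.read_text(encoding="utf-8")


# Stand-in writer for demonstration only. With a real exporter the callable
# would be something like: lambda path: exporter.export(graph, path)
def fake_export(path: str) -> None:
    Path(path).write_text("CREATE (n:Person {name: 'Ada'});\n", encoding="utf-8")


text = export_to_string(fake_export)
print(text)
```

The temporary directory is removed when the with block exits, so nothing is left on disk.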
    from semantica.export import VectorExporter

    exporter = VectorExporter()

    # vectors: list of dicts with 'id', 'vector', 'text', 'metadata' keys
    exporter.export(vectors, "vectors.json", format="json")
    exporter.export(vectors, "vectors.npz", format="numpy")  # NumPy .npz
    exporter.export(vectors, "vectors.bin", format="binary")
    exporter.export(vectors, "vectors.faiss", format="faiss")
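It can be worth checking the record shape before exporting. The sketch below assumes all four keys are required and that every vector should have the same length; neither rule is confirmed by the library, and the helper find_invalid_records is hypothetical.

```python
REQUIRED_KEYS = {"id", "vector", "text", "metadata"}


def find_invalid_records(vectors: list[dict]) -> list[int]:
    """Return indexes of records missing a key or with an inconsistent vector length."""
    bad = []
    expected_dim = None
    for index, record in enumerate(vectors):
        if not REQUIRED_KEYS.issubset(record):
            bad.append(index)
            continue
        dim = len(record["vector"])
        if expected_dim is None:
            expected_dim = dim  # the first valid record sets the dimension
        elif dim != expected_dim:
            bad.append(index)
    return bad


# Hypothetical sample data: record 1 has the wrong dimension, record 2 lacks 'metadata'.
vectors = [
    {"id": "a", "vector": [0.1, 0.2], "text": "alpha", "metadata": {}},
    {"id": "b", "vector": [0.3], "text": "beta", "metadata": {}},
    {"id": "c", "vector": [0.5, 0.6], "text": "gamma"},
]
print(find_invalid_records(vectors))  # [1, 2]
```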
ArrowExporter requires pyarrow:
    from semantica.export import ArrowExporter

    exporter = ArrowExporter()
    exporter.export(graph, "graph.arrow")
DistanceExporter takes a graph argument at construction:
    from semantica.export import DistanceExporter

    exporter = DistanceExporter(graph)  # graph is required

    # Compute all pairwise distances and write to file
    exporter.to_csv("distances.csv")
    exporter.to_jsonl("distances.jsonl")

    # Compute with column selection and optional node subset
    exporter.to_csv(
        "distances.csv",
        include=["source_id", "target_id", "hop_count", "distance_band"],
        node_subset=["node_a", "node_b", "node_c"],
    )

    # Return as pandas DataFrame (requires pandas)
    df = exporter.to_dataframe(include=["hop_count", "semantic_similarity"])

    # Return as string (for API responses)
    csv_str = exporter.to_csv_string(node_subset=["node_a", "node_b"])
    jsonl_str = exporter.to_jsonl_string()
Available include columns: source_id, source_type, target_id, target_type, hop_count, weighted_distance, semantic_similarity, distance_band, source_betweenness, target_betweenness.
DistanceExporter requires a graph at construction. Instantiate as DistanceExporter(graph), not DistanceExporter(). The semantic_similarity column requires the graph nodes to have embeddings in their properties.
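A JSONL string can be consumed one line at a time with the standard json module. The sketch below uses a hypothetical sample in place of real exporter output and assumes the JSON field names match the column names listed above.

```python
import json

# Hypothetical sample standing in for a JSONL export with three columns.
jsonl_str = (
    '{"source_id": "node_a", "target_id": "node_b", "hop_count": 1}\n'
    '{"source_id": "node_a", "target_id": "node_c", "hop_count": 3}\n'
)


def parse_jsonl(text: str) -> list[dict]:
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


rows = parse_jsonl(jsonl_str)

# Keep only pairs that are at most two hops apart.
close_pairs = [
    (row["source_id"], row["target_id"]) for row in rows if row["hop_count"] <= 2
]
print(close_pairs)  # [('node_a', 'node_b')]
```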
The export_csv convenience function delegates to CSVExporter.export(). For per-type exports use the class directly (exporter.export_entities(), exporter.export_relationships()).
Match your export format to your consumer. Neo4j → cypher; ArangoDB → aql; Gephi/yEd → graphml or gexf; semantic web tools → turtle or json-ld; analytics pipelines → parquet; zero-copy IPC → arrow.
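The mapping above can be kept as a small lookup table so pipeline code does not hard-code format strings in several places. This is a sketch only: the consumer keys and the helper formats_for are hypothetical names, and the format strings are copied from the recommendation above.

```python
# Consumer -> recommended export formats, taken from the guidance above.
FORMAT_BY_CONSUMER = {
    "neo4j": ["cypher"],
    "arangodb": ["aql"],
    "gephi": ["graphml", "gexf"],
    "yed": ["graphml", "gexf"],
    "semantic_web": ["turtle", "json-ld"],
    "analytics": ["parquet"],
    "ipc": ["arrow"],
}


def formats_for(consumer: str) -> list[str]:
    """Look up recommended formats, ignoring case and surrounding whitespace."""
    key = consumer.strip().lower()
    if key not in FORMAT_BY_CONSUMER:
        raise ValueError(f"No recommended export format for consumer: {consumer!r}")
    return FORMAT_BY_CONSUMER[key]


print(formats_for("Neo4j"))  # ['cypher']
print(formats_for("Gephi"))  # ['graphml', 'gexf']
```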