Parsing N-Triples via streaming
I was confused about this for a while, but I've finally worked out how to parse a large RDF N-Triples (.nt) file using Raptor and the Redland Python extensions.
The usual example looks like this:
import RDF

parser = RDF.Parser(name="ntriples")
model = RDF.Model()
# Parse the whole file into the model
parser.parse_into_model(model, "file:./mybigfile.nt")
# Iterate over every triple now held by the model
for triple in model:
    print(triple.subject, triple.predicate, triple.object)
parse_into_model() loads everything into memory by default, so for a large file you can back the model with a HashStorage and persist the triples that way.
But what if you just want to read the file and, say, add each triple to MongoDB without loading it into a model at all?
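To make the goal concrete, here is a rough sketch of the kind of streaming loop I have in mind. It assumes that RDF.Parser.parse_as_stream() returns an iterable of statements with subject, predicate and object attributes, as in the loop above. The helper names (batched, statement_to_doc, store_stream, load_file), the document layout and the batch size are all made up for illustration, and the Redland part is untested.

```python
from itertools import islice


def batched(items, size):
    """Lazily yield lists of at most `size` items from any iterable."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def statement_to_doc(statement):
    """Turn one statement into a plain dict (hypothetical document layout)."""
    return {
        "subject": str(statement.subject),
        "predicate": str(statement.predicate),
        "object": str(statement.object),
    }


def store_stream(statements, sink, batch_size=1000):
    """Send statements to `sink` in batches; return how many were sent."""
    total = 0
    for chunk in batched(statements, batch_size):
        sink([statement_to_doc(s) for s in chunk])
        total += len(chunk)
    return total


def load_file(uri, sink, batch_size=1000):
    """Stream an N-Triples file into `sink` without building an RDF.Model."""
    import RDF  # Redland bindings; imported here so the helpers work without them

    parser = RDF.Parser(name="ntriples")
    # Assumption: parse_as_stream() yields statements one at a time.
    return store_stream(parser.parse_as_stream(uri), sink, batch_size)
```

Here sink is any callable that accepts a list of dicts. With pymongo it could presumably be a collection's insert_many method, e.g. load_file("file:./mybigfile.nt", collection.insert_many), so only one batch of triples would be held in memory at a time.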