Parsing N-Triples via streaming

This confused me for a while, but I've finally worked out how to parse a large RDF N-Triples (.nt) file using Raptor and the Redland Python bindings.

A common example is the following:

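import RDF

# Parse the whole file into a model (held in memory by default)
parser = RDF.Parser(name="ntriples")
model = RDF.Model()
parser.parse_into_model(model, "file:./mybigfile.nt")

# Iterate over the statements now stored in the model
for triple in model:
    print("%s %s %s" % (triple.subject, triple.predicate, triple.object))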

      

parse_into_model() loads the triples into an in-memory model by default, so if you are parsing a large file, you can back the model with a HashStorage and persist it that way.

But what if you just want to read the file and, say, add each triple to MongoDB without loading everything into a model first?


1 answer


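Use parse_as_stream instead of parse_into_model. It hands back the statements as a stream, so nothing is accumulated in a model:

import RDF

# NTriplesParser is a Parser preconfigured for the N-Triples syntax
parser = RDF.NTriplesParser()

# Iterate over the statements as they are parsed, without building a model
for triple in parser.parse_as_stream("file:./mybigNTfile.nt"):
    print("%s %s %s" % (triple.subject, triple.predicate, triple.object))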
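
For the MongoDB part of the question, one possible approach is to turn the streamed triples into plain dictionaries and hand them over in batches, so only one batch is in memory at a time. The following is a minimal sketch, not something taken from the Redland documentation: the helper name triple_documents, the field names, and the batch size are all assumptions made for illustration, and a namedtuple stands in for the statements the parser would produce so the snippet runs without Redland or a database.

```python
from collections import namedtuple


def triple_documents(triples, batch_size=1000):
    """Yield lists of dicts (at most batch_size long) built from a triple stream.

    `triples` can be any iterable whose items expose .subject, .predicate
    and .object; only one batch is held in memory at a time.
    """
    batch = []
    for triple in triples:
        batch.append({
            "subject": str(triple.subject),
            "predicate": str(triple.predicate),
            "object": str(triple.object),
        })
        if len(batch) >= batch_size:
            yield batch
            batch = []
    if batch:
        # Flush the final, possibly shorter, batch
        yield batch


# Hypothetical stand-in for the statements yielded by parse_as_stream
Triple = namedtuple("Triple", "subject predicate object")
sample = [Triple("s%d" % i, "p", "o%d" % i) for i in range(5)]

batches = list(triple_documents(sample, batch_size=2))
batch_sizes = [len(b) for b in batches]  # [2, 2, 1]
```

With the real parser you would pass parser.parse_as_stream("file:./mybigNTfile.nt") as the first argument and, assuming a pymongo collection object is at hand, call its insert_many method on each batch inside the loop.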

      


