Loading a large RDF (.ttl) file into Sesame

I am trying to set up a Sesame-based SPARQL endpoint. I have installed Tomcat and PostgreSQL and deployed the Sesame web application, and I created a repository backed by the PostgreSQL RDF store. Now I need to upload a large Turtle file (540 million triples, several GB on disk) to the repository. Uploading such a large file through the Workbench is not a good solution - it would take days. What is the best non-programming way to load this data? Is there a tool like a "console" for bulk loading? For example, Virtuoso has the isql tool for bulk uploads ...





1 answer


There is no out-of-the-box bulk loading tool for Sesame that I am aware of, although vendors of Sesame-compatible triplestores often provide such tools as part of their specific database. Programming a bulk upload solution isn't particularly difficult, but somehow such a tool never made it into the main Sesame distribution.
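
To give an idea of what that involves, below is a rough sketch of such a loader using the Sesame 2.x OpenRDF API (assuming Sesame 2.7 or later for the begin()/commit() transaction methods). It streams the Turtle file through a Rio parser and commits in batches so no single transaction grows too large. The server URL, repository ID, file path, base URI and batch size are placeholders you would need to adjust for your own setup.

```java
import java.io.FileInputStream;
import java.io.InputStream;

import org.openrdf.model.Statement;
import org.openrdf.repository.Repository;
import org.openrdf.repository.RepositoryConnection;
import org.openrdf.repository.RepositoryException;
import org.openrdf.repository.http.HTTPRepository;
import org.openrdf.rio.RDFFormat;
import org.openrdf.rio.RDFHandlerException;
import org.openrdf.rio.RDFParser;
import org.openrdf.rio.Rio;
import org.openrdf.rio.helpers.RDFHandlerBase;

public class BulkLoader {

    public static void main(String[] args) throws Exception {
        // Placeholder server URL and repository ID - adjust to your deployment.
        Repository repo = new HTTPRepository("http://localhost:8080/openrdf-sesame", "myRepo");
        repo.initialize();
        final RepositoryConnection conn = repo.getConnection();
        try {
            conn.begin();
            RDFParser parser = Rio.createParser(RDFFormat.TURTLE);
            parser.setRDFHandler(new RDFHandlerBase() {
                private long count = 0;

                @Override
                public void handleStatement(Statement st) throws RDFHandlerException {
                    try {
                        conn.add(st);
                        // Commit every 500,000 statements to keep transactions bounded.
                        if (++count % 500000 == 0) {
                            conn.commit();
                            conn.begin();
                        }
                    } catch (RepositoryException e) {
                        throw new RDFHandlerException(e);
                    }
                }
            });
            // Placeholder file path and base URI.
            try (InputStream in = new FileInputStream("/data/dump.ttl")) {
                parser.parse(in, "http://example.org/");
            }
            conn.commit();
        } finally {
            conn.close();
            repo.shutDown();
        }
    }
}
```

This is only a sketch; with an HTTP repository the parsing happens client-side and each batch is sent to the server on commit, so tuning the batch size (and running the loader on the same machine as the server) can make a noticeable difference.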



540 million triples, by the way, is probably too big for any of Sesame's default stores - the Native Store only scales to roughly 150 million triples, and loading such a large dataset into the memory store is impractical (even if you have enough RAM available). So you will probably need to look at a Sesame-compatible database provided by a third party. There are many options, both commercial and free/open source; see the overview on the Sesame website for a list of suggestions.









