How to overcome execution limits on Linkedmdb
I tried to extract all movies from Linkedmdb. I used OFFSET so that I would not hit the maximum number of results per query. I used the following query in a Python script:
"""
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX movie: <http://data.linkedmdb.org/resource/movie/>
SELECT distinct ?film
WHERE {
?film a movie:film .
} LIMIT 1000 OFFSET %s """ %i
I looped 5 times with offsets of 0, 1000, 2000, 3000, and 4000, recording the number of results each time. The counts were (1000, 1000, 500, 0, 0). I already knew the limit was 2500, but I thought that using OFFSET would let me get past it. Is that not true? Is there no way to get all the data, even with some kind of loop?
Your current query is legal, but it specifies no ordering, so the offset does not take you to a predictable place in the results. (A lazy implementation could return the same results over and over again.) When you use LIMIT and OFFSET, you also need to use ORDER BY. The SPARQL 1.1 spec says:
15.4 OFFSET
OFFSET causes the solutions generated to start after the specified number of solutions. An OFFSET of zero has no effect.
Using LIMIT and OFFSET to select different subsets of the query solutions will not be useful unless the order is made predictable by using ORDER BY.
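As a rough sketch of what that looks like for the query in the question, the loop below adds ORDER BY ?film and keeps requesting pages until one comes back short. The names build_query, fetch_all, and run_query are made up for illustration; run_query stands for whatever function you already use to send a query string to the endpoint and return a list of results. Whether the endpoint's own result cap still applies once the ordering is fixed is something to check against the counts you get back.

```python
# Same query as in the question, plus ORDER BY so that each
# LIMIT/OFFSET window is taken from a stable ordering.
QUERY_TEMPLATE = """
PREFIX movie: <http://data.linkedmdb.org/resource/movie/>
SELECT DISTINCT ?film
WHERE {
  ?film a movie:film .
}
ORDER BY ?film
LIMIT %d OFFSET %d
"""


def build_query(offset, page_size=1000):
    """Return the paged query text for one window of results."""
    return QUERY_TEMPLATE % (page_size, offset)


def fetch_all(run_query, page_size=1000):
    """Collect every page until a page shorter than page_size is returned.

    run_query is a hypothetical callable (an assumption of this sketch):
    it takes a query string and returns a list of results.
    """
    results = []
    offset = 0
    while True:
        page = run_query(build_query(offset, page_size))
        results.extend(page)
        if len(page) < page_size:
            # A short (or empty) page means there is nothing left.
            break
        offset += page_size
    return results
```

Calling fetch_all(my_run_query) with your own query function returns one combined list; a smaller page_size can be passed if the endpoint rejects large pages.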