How to resolve the execution limits in Linkedmdb

Posted 2019-01-28 11:01

I was trying to extract all movies from Linkedmdb. I used OFFSET to make sure I won't hit the maximum number of results per query. I used the following script in Python:

"""
 PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
 PREFIX movie: <http://data.linkedmdb.org/resource/movie/>
 SELECT distinct ?film
 WHERE {
 ?film a movie:film .
 } LIMIT 1000 OFFSET %s """ %i

I looped 5 times, with offsets 0, 1000, 2000, 3000 and 4000, and recorded the number of results: (1000, 1000, 500, 0, 0). I already knew the limit was 2500, but I thought that by using OFFSET we could get around it. Is that not true? Is there no way to get all the data (even when we use a loop of some sort)?

Tags: sparql linkedmdb

1 answer

Melony?

#2 · 2019-01-28 11:54

Your current query is legal, but there's no specified ordering, so the offset doesn't bring you to a predictable place in the results. (A lazy implementation could just return the same results over and over again.) When you use LIMIT and OFFSET, you also need to use ORDER BY. The SPARQL 1.1 specification says (emphasis added):

15.4 OFFSET

OFFSET causes the solutions generated to start after the specified number of solutions. An OFFSET of zero has no effect.

Using LIMIT and OFFSET to select different subsets of the query solutions will not be useful unless the order is made predictable by using ORDER BY.
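
As a rough sketch of what that looks like in Python: the query below is the one from the question with ORDER BY ?film added, and the loop keeps paging until a page comes back short. The names build_page_query, fetch_all and run_query are made up for illustration; run_query stands for whatever function you already use to send a query string to the endpoint and get back the ?film values as a list. Note that this only makes the paging order predictable. Whether it gets you past the 2500 cap depends on how the server enforces that limit.

```python
from typing import Callable, Iterator, List

# Same query as in the question, plus ORDER BY so that each
# LIMIT/OFFSET window is cut from a predictable ordering.
QUERY_TEMPLATE = """
PREFIX movie: <http://data.linkedmdb.org/resource/movie/>
SELECT DISTINCT ?film
WHERE {
  ?film a movie:film .
}
ORDER BY ?film
LIMIT %d OFFSET %d
"""


def build_page_query(offset: int, page_size: int = 1000) -> str:
    """Return the query text for the page starting at `offset`."""
    return QUERY_TEMPLATE % (page_size, offset)


def fetch_all(
    run_query: Callable[[str], List[str]], page_size: int = 1000
) -> Iterator[str]:
    """Yield every ?film value, one page at a time.

    `run_query` is a hypothetical caller-supplied function: it takes the
    query text, sends it to the endpoint, and returns the ?film values.
    """
    offset = 0
    while True:
        page = run_query(build_page_query(offset, page_size))
        yield from page
        if len(page) < page_size:  # a short or empty page means we are done
            break
        offset += page_size
```

Stopping on the first short page avoids the two empty requests at offsets 3000 and 4000 that a fixed five-iteration loop makes.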
