What are brackets in SPARQL and why is the linked

Posted 2019-08-05 05:20

Question:

The following SPARQL query returns only 2,500 records of actors and films, and I don't know why it is limited to 2,500:

PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX movie: <http://data.linkedmdb.org/resource/movie/>

SELECT ?id ?filmTitle ?actorName WHERE { 
SERVICE <http://data.linkedmdb.org/sparql> {
?film a movie:film ;
      movie:filmid ?id ;
      dcterms:title ?filmTitle ;
      movie:actor [ a movie:actor ;
                    movie:actor_name ?actorName ].
  }
}

The query is from an answer to the question: Querying the Linked Movie Database (LMDB) with SPARQL

What does the a keyword mean? What do the square brackets [] stand for?

I understood that the a keyword is a substitute for rdf:type and I rewrote a portion of the SPARQL query without the actors. But I still can't figure out the meaning of the square brackets [].

PREFIX movie: <http://data.linkedmdb.org/resource/movie/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT  ?film ?id ?filmTitle WHERE { 
#VALUES ?filmTitle { "The Matrix" }
SERVICE <http://data.linkedmdb.org/sparql> {
    ?film rdf:type movie:film.
    ?film movie:filmid ?id.
    ?film rdfs:label ?filmTitle.

  }
}

Thanks for your responses, but the code misses some actors for some movies. For example, the movie "A Bridge Too Far" has 18 actors, but the result of this query lists only 2:

PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX movie: <http://data.linkedmdb.org/resource/movie/>
SELECT ?id ?filmTitle ?actorName
WHERE {
  SERVICE <http://data.linkedmdb.org/sparql> {
    ?film a movie:film ;
          movie:filmid ?id ;
          dcterms:title ?filmTitle ;
          movie:actor [ a movie:actor ;
                        movie:actor_name ?actorName ] .
  }
}
ORDER BY ASC(?filmTitle)

My edited code still gives the same result: 2 actors instead of 18.

filmlist.rq

PREFIX movie: <http://data.linkedmdb.org/resource/movie/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT  ?film ?id ?filmTitle ?actorName WHERE { 
  #VALUES ?filmTitle { "The Matrix" }
  SERVICE <http://data.linkedmdb.org/sparql> {
        ?film rdf:type movie:film.
        ?film movie:filmid ?id.
        ?film rdfs:label ?filmTitle.
        ?film movie:actor ?actorID.
        ?actorID movie:actor_name ?actorName.

  }
}
ORDER BY ASC(?filmTitle)

Answer 1:

[ … ] is a blank node

The square brackets are described in the SPARQL 1.1 Query Language specification. In particular, see section 4.1.4, Syntax for Blank Nodes:

4.1.4 Syntax for Blank Nodes

Blank nodes in graph patterns act as variables, not as references to specific blank nodes in the data being queried.

Blank nodes are indicated by either the label form, such as "_:abc", or the abbreviated form "[]". A blank node that is used in only one place in the query syntax can be indicated with []. A unique blank node will be used to form the triple pattern. Blank node labels are written as "_:abc" for a blank node with label "abc". The same blank node label cannot be used in two different basic graph patterns in the same query.

The [:p :v] construct can be used in triple patterns. It creates a blank node label which is used as the subject of all contained predicate-object pairs. The created blank node can also be used in further triple patterns in the subject and object positions.

The following two forms

[ :p "v" ] .
[] :p "v" .

allocate a unique blank node label (here "b57") and are equivalent to writing:

_:b57 :p "v" .

This allocated blank node label can be used as the subject or object of further triple patterns. For example, as a subject:

[ :p "v" ] :q "w" .

which is equivalent to the two triples:

_:b57 :p "v" .
_:b57 :q "w" .

and as an object:

:x :q [ :p "v" ] .

which is equivalent to the two triples:

:x  :q _:b57 .
_:b57 :p "v" .

a is shorthand for rdf:type

What does the a keyword mean? What do the square brackets [] stand for?

I understood that the a keyword is a substitute for rdf:type

There's not really much more to say than that. You can use a instead of rdf:type:

4.2.4 rdf:type

The keyword "a" can be used as a predicate in a triple pattern and is an alternative for the IRI http://www.w3.org/1999/02/22-rdf-syntax-ns#type. This keyword is case-sensitive.

?x  a  :Class1 .
[ a :appClass ] :p "v" .

is syntactic sugar for:

?x    rdf:type  :Class1 .
_:b0  rdf:type  :appClass .
_:b0  :p        "v" .

LinkedMDB imposes some odd limits

The LinkedMDB endpoint imposes some odd limits on query results. Some other questions and answers have touched on this in the past, including:

  • Can't retrieve movies with high IDs from LinkedMDB with SPARQL
  • LinkedMDB SPARQL Query

If you need specific results that fall outside the default range of what's returned, you'll probably want to include an ORDER BY and then a LIMIT. Even so, this endpoint has some odd behavior, and for specific problems you're probably best off contacting its maintainers directly; some of these oddities don't indicate a problem with your query, but are just a problem with the endpoint.
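As a rough sketch of that suggestion, the query below asks for the films with the highest IDs first. It assumes the query is sent directly to the LinkedMDB endpoint (no SERVICE wrapper) and that the movie:filmid values sort numerically; the limit of 100 is an arbitrary choice for illustration.

```sparql
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX movie: <http://data.linkedmdb.org/resource/movie/>

# Assumption: ?id values are numeric literals, so DESC(?id) puts the
# highest film IDs first. The LIMIT value is arbitrary.
SELECT ?id ?filmTitle WHERE {
  ?film a movie:film ;
        movie:filmid ?id ;
        dcterms:title ?filmTitle .
}
ORDER BY DESC(?id)
LIMIT 100
```

Whether the endpoint honors the ordering before applying its own cap is something you would have to check against the live service.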



Answer 2:

The square brackets represent blank nodes in SPARQL, see: http://www.w3.org/TR/sparql11-query/#QSynBlankNodes

It is like using a new variable. So instead of:

?film movie:actor [ a movie:actor ;
                    movie:actor_name ?actorName ].

you could write:

?film movie:actor ?actor .
?actor a movie:actor .
?actor movie:actor_name ?actorName .

where ?actor is a new variable that is not used anywhere else. Each pair of brackets is a separate blank node, so different pairs of brackets would correspond to different variables.
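To illustrate the point about separate bracket pairs, here is a small sketch that uses two of them in one pattern. It only reuses the properties from the question; the FILTER, the STR() calls, and the LIMIT are additions for illustration, and it assumes the query is sent directly to the endpoint.

```sparql
PREFIX movie: <http://data.linkedmdb.org/resource/movie/>

# Each [ ... ] is its own blank node, so the two pairs can match two
# different actors of the same film. Written with explicit variables,
# the pattern would read:
#   ?film movie:actor ?a . ?a movie:actor_name ?nameA .
#   ?film movie:actor ?b . ?b movie:actor_name ?nameB .
SELECT ?film ?nameA ?nameB WHERE {
  ?film movie:actor [ movie:actor_name ?nameA ] ;
        movie:actor [ movie:actor_name ?nameB ] .
  # Keep each pair of names once and drop rows where both names are equal.
  FILTER (STR(?nameA) < STR(?nameB))
}
LIMIT 20
```

If both bracket pairs were a single shared variable instead, the pattern would force ?nameA and ?nameB to come from the same actor node.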

As to the limit, I don't know. The server is currently down so I cannot check. It could be some limits they have configured on their side.

In any case, to retrieve all the results, you should "paginate" through the results using SPARQL LIMIT and OFFSET.
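A minimal sketch of such pagination, assuming the query is sent directly to the LinkedMDB endpoint and that a page size of 1000 stays under whatever cap the server applies (both are assumptions, not documented values):

```sparql
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX movie: <http://data.linkedmdb.org/resource/movie/>

# Page 1. For the next pages, rerun with OFFSET 1000, 2000, ... until
# fewer than 1000 rows come back.
SELECT ?id ?filmTitle ?actorName WHERE {
  ?film a movie:film ;
        movie:filmid ?id ;
        dcterms:title ?filmTitle ;
        movie:actor [ a movie:actor ;
                      movie:actor_name ?actorName ] .
}
ORDER BY ?id ?actorName
LIMIT 1000
OFFSET 0
```

The ORDER BY is there so that consecutive pages are cut from a stable ordering. Note that if the pattern stays inside a SERVICE block, a LIMIT or OFFSET written outside that block is applied locally to whatever the remote endpoint has already returned, so it would not get past a server-side cap; in that case the modifiers would need to go into a subquery inside the SERVICE block, or the query would need to be sent to the endpoint directly as above.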