tdbloader vs SPARQL INSERT - Why different behaviour

Posted 2019-02-23 06:57

Question:

The ARQ and TDB command-line tools behave strangely with named graphs. Data imported into a named graph with tdbloader cannot be queried with a GRAPH clause in a SPARQL SELECT query. The same query works when the data is inserted into the same graph with SPARQL INSERT.

I have the following assembler description file, tdb.ttl:

@prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ja:     <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix tdb:     <http://jena.hpl.hp.com/2008/tdb#> .


[] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .
tdb:DatasetTDB  rdfs:subClassOf  ja:RDFDataset .
tdb:GraphTDB    rdfs:subClassOf  ja:Model .

[] rdf:type         tdb:DatasetTDB ;
    tdb:location "DB" ;
.

The file data.ttl contains a single triple:

<a> <b> <c>.

Now I load this data with tdbloader, then insert another triple with SPARQL INSERT, both into the named graph data:

tdbloader --desc tdb.ttl --graph data data.ttl
update --desc tdb.ttl "INSERT DATA {GRAPH <data> {<d> <e> <f>.}}"

Now, the data can be queried with SPARQL via:

$ arq --desc tdb.ttl "SELECT *  WHERE{ GRAPH ?g {?s ?p ?o.}}"
----------------------------
| s   | p   | o   | g      |
============================
| <a> | <b> | <c> | <data> |
| <d> | <e> | <f> | <data> |
----------------------------

Everything seems perfect. But now I want to query only this specific named graph, data:

$ arq --desc tdb.ttl "SELECT *  WHERE{ GRAPH <data> {?s ?p ?o.}}"
-------------------
| s   | p   | o   |
===================
| <d> | <e> | <f> |
-------------------

Why is the data imported from tdbloader missing? What is wrong with this query? How can I get results back from both imports?

Answer 1:

Try this query:

PREFIX : <data>
SELECT * { { ?s ?p ?o } UNION { GRAPH ?g { ?s ?p ?o } } }

and the output is

----------------------------
| s   | p   | o   | g      |
============================
| <a> | <b> | <c> | <data> |
| <d> | <e> | <f> | :      |
----------------------------

or try:

 tdbquery --loc DB --file Q.rq -results srj

to get the results in a different form.

The text output makes things look nice, but two different things end up displayed as <data>.

What you are seeing is that

tdbloader --desc tdb.ttl --graph data data.ttl

used the string data exactly as given to name the graph. But

INSERT DATA {GRAPH <data> {<d> <e> <f>.}}

does a full SPARQL parse, and resolves the relative IRI <data> against the base URI, which probably looks like file://*currentdirectory*.
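
To see why the two graph names differ, here is a minimal sketch of relative IRI resolution. It uses Python's urllib.parse.urljoin as a stand-in for the SPARQL parser's resolver (Jena has its own implementation), and it assumes the commands were run from /home/afs/tmp, the directory that shows up in the output at the end of this answer.

```python
from urllib.parse import urljoin

# Assumption: the base URI is the working directory the commands ran in.
BASE = "file:///home/afs/tmp/"

# A SPARQL parser resolves the relative IRI <data> against the base URI.
resolved = urljoin(BASE, "data")

# tdbloader --graph data, by contrast, keeps the argument unchanged.
literal = "data"

print("INSERT DATA graph name:", resolved)
print("tdbloader graph name:  ", literal)
print("same graph?", resolved == literal)
```

The two strings are different, so the store holds two distinct named graphs, and GRAPH <data> in a query (which is resolved the same way as in the update) matches only the one created by INSERT DATA.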

In text output, URIs are abbreviated, including relative to the base URI. So both the original data (from tdbloader) and file:///path/data appear as <data>.

PREFIX : <data>

gives the text output a different way to write the resolved URI, namely as :.

Finally try:

BASE <http://example/>
SELECT * { { ?s ?p ?o } UNION { GRAPH ?g { ?s ?p ?o } } }

which sets the base URI to something nowhere near your data URIs, which switches off the nice formatting by base URI:

----------------------------------------------------------------------------------------------------------------
| s                        | p                        | o                        | g                           |
================================================================================================================
| <file:///home/afs/tmp/a> | <file:///home/afs/tmp/b> | <file:///home/afs/tmp/c> | <data>                      |
| <file:///home/afs/tmp/d> | <file:///home/afs/tmp/e> | <file:///home/afs/tmp/f> | <file:///home/afs/tmp/data> |
----------------------------------------------------------------------------------------------------------------