How do I construct get the whole sub graph from a

Posted 2020-02-14 04:00

Question:

Suppose this is the RDF graph. Given the resource A, I need to construct all the triples connected to A, all the way to the end. Here, I need to get the graph that includes B, C, D, and E.

After that, suppose I've got this graph and want to go only from the starting point (:A) and get the subgraph produced by following the paths that end with an edge on property :d. For instance, if A1 is given as the starting point, and d as the property, we'd construct:

:A1 :a :B1, 
:B1 :b :S1,
:B1 :b :S2,
:S1 :d :D1,
:S2 :d :D2,

Answer 1:

The first case

To get the whole connected graph, you need to use a wildcard property path to follow most of the path, and then grab the last link with an actual variable. I usually use the empty relative path in constructing wildcards, so as to use <>|!<> as the wildcard, but since you mentioned that your endpoint doesn't like it, you can use any absolute IRI that you like, too. E.g.,

prefix x: <urn:ex:>
prefix : <urn:ex:>       #-- example namespace; use whatever your data needs

construct { ?s ?p ?o }
where { :A (x:|!x:)* ?s . ?s ?p ?o . }

This works because every property is either x: or not, so x:|!x: matches every property, and then (x:|!x:)* is an arbitrary-length path, including paths of length zero, which means that ?s will be bound to everything reachable from :A, including :A itself. Then you're grabbing the triples where ?s is the subject. When you construct the graph of all those triples, you get the subgraph that you're looking for.

Here's an example based on the graph you showed. I used different properties for different edges to show that it works, but this will work if they're all the same, too.

@prefix : <urn:ex:> .

:A :p :B, :C .
:B :q :D .
:C :r :E .

:F :s :G .
:G :t :H .

prefix x: <urn:ex:>
prefix : <urn:ex:>

construct {
  ?s ?p ?o
}
where {
  :A (x:|!x:)* ?s .
  ?s ?p ?o .
}

Since this is a construct query, the result is a graph, not a "table". It contains the triples we'd expect:

@prefix :      <urn:ex:> .

:C      :r      :E .

:B      :q      :D .

:A      :p      :B , :C .
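
To make the semantics of that query concrete, here is a rough plain-Python sketch of what it computes on the example data. This is not how a SPARQL engine evaluates property paths; the tuple representation and the function names reachable and subgraph are made up for illustration.

```python
from collections import deque

# Example data from above, as (subject, predicate, object) tuples.
TRIPLES = [
    (":A", ":p", ":B"), (":A", ":p", ":C"),
    (":B", ":q", ":D"), (":C", ":r", ":E"),
    (":F", ":s", ":G"), (":G", ":t", ":H"),
]

def reachable(triples, start):
    """Nodes reachable from start over any property, including start
    itself (the zero-length path), like `start (x:|!x:)* ?s`."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for s, _p, o in triples:
            if s == node and o not in seen:
                seen.add(o)
                queue.append(o)
    return seen

def subgraph(triples, start):
    """All triples whose subject is reachable from start (`?s ?p ?o`)."""
    nodes = reachable(triples, start)
    return {t for t in triples if t[0] in nodes}

result = subgraph(TRIPLES, ":A")
for triple in sorted(result):
    print(*triple)
```

The triples about :F and :G are left out because neither node is reachable from :A.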

The second case

If you want to ensure that the paths end in a particular kind of edge, you can do that too. If you only want the paths from the starting node (:A in the query below; A1 in your example) that end with an edge on :d, you can do:

prefix x: <urn:ex:>      #-- arbitrary, used for the property path.
prefix : <...>           #-- whatever you need for your data.

construct {
  ?s1 ?p ?o1 .           #-- internal edge in the path
  ?s2 :d ?o2 .           #-- final edge in the path
}
where {
  :A (x:|!x:)* ?s1 .     #-- start at :A and go any length into the path
  ?s1 ?p ?o1 .           #-- get the triple from within the path, but make
  ?o1 (x:|!x:)* ?s2 .    #-- sure that from ?o1 you can get to some other
  ?s2 :d ?o2 .           #-- ?s2 that's related to an ?o2 by property :d .
}
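
As a sanity check on what this query selects, here is a small plain-Python sketch that mirrors its four triple patterns on the data from the question. The dead-end triple :B1 :c :X1 is an extra I added to show filtering, and the function names are hypothetical; treat it as an illustration of the logic, not of how an endpoint executes it.

```python
from collections import deque

# Data from the question, plus one made-up dead-end edge (:B1 :c :X1).
TRIPLES = [
    (":A1", ":a", ":B1"),
    (":B1", ":b", ":S1"), (":B1", ":b", ":S2"),
    (":S1", ":d", ":D1"), (":S2", ":d", ":D2"),
    (":B1", ":c", ":X1"),
]

def reachable(triples, start):
    """Nodes reachable from start over any property, including start."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for s, _p, o in triples:
            if s == node and o not in seen:
                seen.add(o)
                queue.append(o)
    return seen

def paths_ending_in(triples, start, prop):
    """Mirror the WHERE clause: keep an edge (s1, p, o1) reachable from
    start only if o1 can reach some s2 that has an outgoing prop edge,
    and keep those final prop edges as well."""
    out = set()
    from_start = reachable(triples, start)
    for s1, p, o1 in triples:
        if s1 not in from_start:
            continue
        from_o1 = reachable(triples, o1)
        finals = {t for t in triples if t[0] in from_o1 and t[1] == prop}
        if finals:
            out.add((s1, p, o1))
            out |= finals
    return out

result = paths_ending_in(TRIPLES, ":A1", ":d")
for triple in sorted(result):
    print(*triple)
```

The :B1 :c :X1 edge is dropped because nothing reachable from :X1 has a :d edge.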


Answer 2:

The most important part of an RDF graph is the properties. Since your diagram does not define the properties, the question is rather ambiguous, but comes down to a couple of scenarios.

If every edge in the graph uses the same property, then a transitive property path can be used:

CONSTRUCT {:A :prop ?rsc }
WHERE {
   :A :prop* ?rsc .
}

If there are multiple types of properties in the graph, it is more complicated to get the transitive closure. For example, the following will get all properties in the example graph:

CONSTRUCT {
   :A ?p ?rsc1 .
   ?rsc1 ?p1 ?rsc2 .
}
WHERE {
   :A ?p ?rsc1 .
   OPTIONAL {?rsc1 ?p1 ?rsc2 .}
}

Note this goes two levels deep. For arbitrary levels, it may be best to call the following query until no new triples are created:

CONSTRUCT {
   ?rsc ?p ?o .
}
WHERE {
   ?rsc ?p ?o .
}

...where ?rsc is bound to :A initially, and to the ?o values returned by the previous round for subsequent iterations.
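
The repeat-until-nothing-new loop could look roughly like the following Python sketch. It runs against an in-memory list of tuples; run_construct is a hypothetical stand-in for sending the CONSTRUCT query above to your endpoint with ?rsc bound, and the sample data (including the cycle back to :A) is made up.

```python
# Made-up sample data; the :D -> :A edge creates a cycle on purpose.
TRIPLES = [
    (":A", ":p", ":B"), (":A", ":p", ":C"),
    (":B", ":q", ":D"), (":C", ":r", ":E"),
    (":D", ":u", ":A"),
    (":F", ":s", ":G"),
]

def run_construct(triples, rsc):
    """Hypothetical stand-in for running the CONSTRUCT query with ?rsc
    bound to rsc; here it just filters the in-memory list."""
    return {t for t in triples if t[0] == rsc}

def expand(triples, start):
    """Query each newly found resource once until no new ones appear."""
    collected = set()   # triples gathered so far
    visited = set()     # resources already used as ?rsc
    frontier = {start}  # resources to query in this round
    while frontier:
        next_frontier = set()
        for rsc in frontier:
            visited.add(rsc)
            for triple in run_construct(triples, rsc):
                collected.add(triple)
                next_frontier.add(triple[2])
        # Skip anything already queried, so cycles terminate.
        frontier = next_frontier - visited
    return collected

result = expand(TRIPLES, ":A")
for triple in sorted(result):
    print(*triple)
```

Against a real endpoint you would probably also skip literal objects, since they cannot be subjects, and batch each round's resources into one query instead of issuing one query per resource.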



Answer 3:

It's not possible to get the whole connected graph with one CONSTRUCT query. You'll need to run multiple queries. Even with multiple queries, it gets a bit tricky when blank nodes are involved, as you will have to gradually expand their context and return this context in every subsequent query.

For an example of some code that does exactly this, see: https://github.com/apache/clerezza-rdf-core/tree/master/impl.sparql/src/main/java/org/apache/clerezza/commons/rdf/impl/sparql.



Tags: sparql rdf