I would like to know if there are some functions to manipulate RDF Collections in SPARQL.
A motivating problem is the following.
Suppose you have:
@prefix : <http://example.org#> .
:x1 :value 3 .
:x2 :value 5 .
:x3 :value 6 .
:x4 :value 8 .
:list :values (:x1 :x2 :x3 :x4) .
And you want to calculate the following formula: ((Xn - Xn-1) + ... (X2 - X1)) / (N - 1)
Is there some general way to calculate it?
Up until now, I was only able to calculate it for a fixed set of values. For example, for 4 values, I can use the following query:
prefix : <http://example.org#>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?r {
?list :values ?ls .
?ls rdf:first ?x1 .
?ls rdf:rest/rdf:first ?x2 .
?ls rdf:rest/rdf:rest/rdf:first ?x3 .
?ls rdf:rest/rdf:rest/rdf:rest/rdf:first ?x4 .
?x1 :value ?v1 .
?x2 :value ?v2 .
?x3 :value ?v3 .
?x4 :value ?v4 .
BIND ( ((?v4 - ?v3) + (?v3 - ?v2) + (?v2 - ?v1)) / 3 as ?r)
}
What I would like is some way to access the Nth value and to define some kind of recursive function to calculate that expression. I think it is not possible, but maybe someone has a nice solution.
No built-ins that make formulas easier…
SPARQL does include some mathematical functions for arithmetic and aggregate computations. However, I don't know of any particularly convenient ways of concisely representing mathematical expressions in SPARQL. I've been looking at a paper lately that discusses an ontology for representing mathematical objects like expressions and definitions. They implemented a system to evaluate these, but I don't think it used SPARQL (or at least, it wasn't just a simple extension of SPARQL).
…but we can still do this case.
That said, this particular case isn't too hard to do, since it's not too hard to work with RDF lists in SPARQL, and SPARQL includes the mathematical functions needed for this expression. First, a bit about RDF list representation, which will make the solution easier to understand. (If you're already familiar with this, you can skip the next paragraph or two.)
RDF lists are linked lists: each list is related to its first element by the rdf:first property, and to the rest of the list by rdf:rest. So the convenient notation (:x1 :x2 :x3 :x4) is actually shorthand for a chain of four blank nodes, each with an rdf:first pointing at one element and an rdf:rest pointing at the next node, with the last rdf:rest pointing at rdf:nil. Representing blank nodes with [], we can make this a bit clearer: the list is [ rdf:first :x1 ; rdf:rest [ rdf:first :x2 ; rdf:rest [ rdf:first :x3 ; rdf:rest [ rdf:first :x4 ; rdf:rest rdf:nil ] ] ] ].

Once the head of the list has been identified, that is, the node with rdf:first :x1, then any list l reachable from it by an even number of repetitions (including 0) of rdf:rest, that is, by any number of repetitions of rdf:rest/rdf:rest, is a list whose rdf:first is an odd-numbered element of the list (since you started indexing at 1). Starting at l and going forward one rdf:rest, we're at an l' whose rdf:first is an even-numbered element of the list.

Since SPARQL 1.1 property paths let us write (rdf:rest/rdf:rest)* to denote any even number of repetitions of rdf:rest, we can write a query that binds the :value of odd-numbered elements to ?n and the value of the following even-numbered elements to ?nPlusOne. The math in the SELECT form is straightforward, although to get N-1, we actually use 2*COUNT(*)-1, because the number of rows (each of which binds elements n and n+1) is N/2.

The result (using Jena's command line ARQ) is 4/3, about 1.33, which is what is expected since ((5 - 3) + (8 - 6)) / (4 - 1) = 4/3.
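The odd/even pairing described above can be sketched roughly as follows. This is a reconstruction from the description, not a verbatim query; the variable names ?head and ?odd are my own, and it assumes the list has an even number of elements (otherwise 2*COUNT(*)-1 is no longer N-1).

```sparql
PREFIX :    <http://example.org#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

# One row per (odd element, following even element) pair.
SELECT (SUM(?nPlusOne - ?n) / (2 * COUNT(*) - 1) AS ?result)
WHERE {
  # ?head is the first cell of the list (hypothetical variable name).
  ?list :values ?head .
  # Zero or more double steps land on cells 1, 3, 5, ...
  ?head (rdf:rest/rdf:rest)* ?odd .
  # Value of the odd-numbered element, and of the element right after it.
  # Cells without a following element (such as rdf:nil) simply do not match.
  ?odd rdf:first/:value ?n ;
       rdf:rest/rdf:first/:value ?nPlusOne .
}
```

With the four-element example, the pattern matches the pairs (3, 5) and (6, 8), so the sum is 4 and the divisor is 2*2-1 = 3.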
Update
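I just realized that what is implemented above was based on my comment on the question about whether the summation was correct, because it simplified very easily. That is, the above implements ((Xn - Xn-1) + ... + (X4 - X3) + (X2 - X1)) / (N - 1), which only uses every other pair, whereas the original question asked for ((Xn - Xn-1) + ... + (X3 - X2) + (X2 - X1)) / (N - 1).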
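The original is even simpler, since the pairs are identified by each rdf:rest of the original list, not just even numbers of repetitions. Using the same approach as above, this can be written as a query that follows rdf:rest* instead of (rdf:rest/rdf:rest)*; each row is then one adjacent pair, so the number of rows is already N - 1. For the example data the result is ((5 - 3) + (6 - 5) + (8 - 6)) / (4 - 1) = 5/3, about 1.67.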
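A minimal sketch of the every-adjacent-pair version, again reconstructed from the description rather than quoted; ?head and ?cell are names I picked for illustration.

```sparql
PREFIX :    <http://example.org#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

# One row per adjacent pair (Xi, Xi+1), so COUNT(*) is N - 1.
SELECT (SUM(?nPlusOne - ?n) / COUNT(*) AS ?result)
WHERE {
  ?list :values ?head .
  # Every cell of the list, including the head itself.
  ?head rdf:rest* ?cell .
  # The last cell has no following element, so it produces no row.
  ?cell rdf:first/:value ?n ;
        rdf:rest/rdf:first/:value ?nPlusOne .
}
```

For the example data the rows are (3, 5), (5, 6) and (6, 8), which sum to 5 over 3 rows.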
Of course, since the sum telescopes and the expression can be simplified to (Xn - X1) / (N - 1), we can also just use a query which binds ?x1 to the first element of the list, ?xn to the last element, and ?xi to each element of the list (so that COUNT(?xi) (and also COUNT(*)) is the number of items in the list). For the example data this again gives (8 - 3) / (4 - 1) = 5/3.
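The simplified form might look something like the sketch below. It is one way to realize the description, not necessarily the only one: I bind the elements to ?x1 and ?xn as described, and introduce ?v1, ?vn, ?head and ?last (my own names) for their values and for the list cells.

```sparql
PREFIX :    <http://example.org#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT ((?vn - ?v1) / (COUNT(?xi) - 1) AS ?result)
WHERE {
  ?list :values ?head .
  # First element and its value.
  ?head rdf:first ?x1 .
  ?x1 :value ?v1 .
  # Last element: the cell whose rdf:rest is rdf:nil.
  ?head rdf:rest* ?last .
  ?last rdf:rest rdf:nil ;
        rdf:first ?xn .
  ?xn :value ?vn .
  # Every element of the list, one row each, so COUNT(?xi) is N.
  ?head rdf:rest*/rdf:first ?xi .
}
# ?v1 and ?vn are used outside an aggregate, so they must be grouped.
GROUP BY ?v1 ?vn
```

The GROUP BY is needed only because the SELECT mixes an aggregate with plain variables; for a single list it produces a single group.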
You may also have a look at alternative ways of describing/representing lists in RDF, e.g., with the help of the Ordered List Ontology. I think with this model you can more easily query what you want ;)
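As a rough illustration only: assuming the data were re-modelled with the Ordered List Ontology so that the list carries an olo:length and each olo:slot has a 1-based olo:index and an olo:item (this is my reading of that vocabulary, and the example data in the question is not in this shape), the simplified formula could be queried by index directly. The variable names are hypothetical.

```sparql
PREFIX :    <http://example.org#>
PREFIX olo: <http://purl.org/ontology/olo/core#>

# Assumes an olo:OrderedList with indexed slots; not the rdf:List from the question.
SELECT ((?vn - ?v1) / (?len - 1) AS ?result)
WHERE {
  ?list olo:length ?len ;
        olo:slot ?firstSlot, ?lastSlot .
  ?firstSlot olo:index ?iFirst ;
             olo:item/:value ?v1 .
  ?lastSlot  olo:index ?iLast ;
             olo:item/:value ?vn .
  # Compare by value so the datatype of the index literals does not matter.
  FILTER (?iFirst = 1 && ?iLast = ?len)
}
```

The point of the sketch is that the Nth element becomes a simple index lookup, with no property-path traversal or counting needed.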