I have a specific instance and a SPARQL query that retrieves other instances similar to that one. By similar, I mean that the other instances have at least one common property and value in common with the specific instance, or have a class in common with the specific instance.
Now, I'd like to extend the query such that if the specific instance has a value for a "critical" property, then the only instances that are considered similar are those that also have that critical property (as opposed to just having at least one property and value in common).
For instance, here is some sample data in which instance1
has a value for predicate2
which is a subproperty of isCriticalPredicate
.
@prefix : <http://example.org/rs#>
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
:instance1 a :class1.
:instance1 :predicate1 :instance3.
:instance1 :predicate2 :instance3. # (@) critical property
:instance2 a :class1.
:instance2 :predicate1 :instance3.
:instance4 :predicate2 :instance3.
:predicate1 :hasSimilarityValue 0.6.
:predicate2 rdfs:subPropertyOf :isCriticalPredicate.
:predicate2 :hasSimilarityValue 0.6.
:class1 :hasSimilarityValue 0.4.
Here is a query in which ?x
is the specific instance, instance1
. The query retrieves just instance4
, which is correct. However, if I remove the critical property from instance1
(line labeled with @), I get no results, but should get instance2
, since it has a property in common with instance1
. How can I fix this?
PREFIX : <http://example.org/rs#>
select ?item (SUM(?similarity * ?importance * ?levelImportance) as ?summedSimilarity)
(group_concat(distinct ?becauseOf ; separator = " , ") as ?reason) where
{
values ?x {:instance1}
bind (4/7 as ?levelImportance)
{
values ?instanceImportance {1}
?x ?p ?instance.
?item ?p ?instance.
?p :hasSimilarityValue ?similarity
bind (?p as ?becauseOf)
bind (?instanceImportance as ?importance)
}
union
{
values ?classImportance {1}
?x a ?class.
?item a ?class.
?class :hasSimilarityValue ?similarity
bind (?class as ?becauseOf)
bind (?classImportance as ?importance)
}
filter (?x != ?item)
?x :isCriticalPredicate ?y.
?item :isCriticalPredicate ?y.
}
group by ?item
I've said it before and I'll say it again: minimal data is very helpful. From your past questions, I know that your working project has similarity values on properties and the like, but none of that really matters for the problem at hand. Here's some data that just has a few instances, property values, and one property designated as critical:
@prefix : <urn:ex:>
:p a :criticalProperty .
:a :p :u ; #-- :a has a critical property, so
:q :v . #-- only :d can be similar to it.
:c :q :v ; #-- :c has no critical properties, so
:r :w . #-- both :a and :d can be similar to it.
:d :p :u ;
:q :v .
The trick in a query like this is to filter out the results that have the problem, not to try to select the ones that don't. Logically, those mean the same thing, but in writing the query, it's easier to think about constraint violation, and to try to filter out the results that violate the constraint. In this case, you want to filter out any results where the instance has a critical property and value but the similar instance doesn't.
prefix : <urn:ex:>
select ?i (group_concat(distinct ?j) as ?js) where {
#-- :a has a critical property, but
#-- :c does not, so these are useful
#-- starting points
values ?i { :a :c }
#-- get ?j instances that have a value
#-- in common with ?i.
?i ?property ?value .
?j ?property ?value .
#-- make sure that ?i and ?j aren't
#-- the same instance
filter (?i != ?j)
#-- make sure that there is no critical
#-- property value that ?i has that
#-- ?j does not also have
filter not exists {
?i ?criticalProperty ?criticalValue .
?criticalProperty a :criticalProperty .
filter not exists {
?j ?criticalProperty ?criticalValue .
}
}
}
group by ?i
----------------------------
| i | js |
============================
| :a | "urn:ex:d" |
| :c | "urn:ex:d urn:ex:a" |
----------------------------
Related
There are some other questions that also touch on constaint satisfaction/violation that might be useful reading. While not all of these use nested filter not exists, most of them do have a pattern of filter not exists { … filter <negative condition> }.
- Find lists containing ALL values in a set? (This actually uses a nested filter not exists.)
- Is it possible to express a recursive definition in SPARQL?
- Count values that satisfy a constraint and return 0 for those that don't satisfy it in SPARQL
- SPARQL: return all intersections fulfilled by specified or equivalent classes (There's another nested filter not exists, in the last query in the answer.)