Why does my SPARQL query duplicate results?

Posted 2019-06-03 18:04

Question:

I have been searching and learning more about SPARQL, but it is not as easy as SQL. I just want to know why my query returns duplicate results and how to fix it. This is my SPARQL query:

PREFIX OQ:<http://www.owl-ontologies.com/Ontology1364995044.owl#>

SELECT ?x ?ys ?z ?Souhaite
WHERE {
  ?y OQ:hasnameactivite ?x.
  ?y OQ:AttenduActivite ?Souhaite.
  ?y OQ:SavoirDeActivite ?z.
  ?y OQ:hasnamephase ?ys.
  ?y OQ:Activitepour ?v.
  ?ro OQ:hasnamerole ?nr.
  ?y OQ:avoirrole ?ro.
  FILTER regex (?nr ,"Concepteur").
  FILTER regex (?v,"Voiture").
}

This gives me these results:

Expected result is:

Answer 1:

On first reading your question, I was going to suggest changing SELECT in your query to SELECT DISTINCT (using the DISTINCT modifier) to remove duplicate results. However, looking at your result set, I don't actually see any duplicated rows. Each row is unique: the values of ?xs and ?ys happen to be the same in every row, but the combinations of ?z and ?Souhaite make the rows distinct. Your results are essentially the product { xs1 } × { ys1 } × { z1, z2, z3 } × { S1, S2, S3 }, which contains no duplicates.
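To make the product argument above concrete, here is a minimal Python sketch. It is only an illustration of the set arithmetic, not a SPARQL engine: the values xs1, ys1, z1..z3 and S1..S3 are the placeholder names used in this answer, not values from the asker's data.

```python
from itertools import product

# Placeholder bindings named as in the answer; the real values are in the asker's data.
xs = ["xs1"]
ys = ["ys1"]
zs = ["z1", "z2", "z3"]
souhaites = ["S1", "S2", "S3"]

# Independent triple patterns on the same subject yield every combination of their values.
rows = list(product(xs, ys, zs, souhaites))

# DISTINCT only drops rows that are identical in every column.
distinct_rows = set(rows)

print(len(rows))           # 9 rows: 1 * 1 * 3 * 3
print(len(distinct_rows))  # still 9, because no two rows are identical
```

Since every row differs in ?z or ?Souhaite, a DISTINCT-style deduplication leaves the row count unchanged, which is why SELECT DISTINCT would not shrink this result set.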

I just looked a bit more closely at the query and the results you are showing, and there are some discrepancies. For instance, your results have a variable named ?xs but your query does not use such a variable. I will assume that ?x is supposed to be ?xs. Also, the variable names ?xs, ?ys, ?z, and ?Souhaite are not very descriptive at all. It's hard to talk about these when we don't know what role they play in the result.

Regarding the results that you are expecting, ?xs and ?ys really should be bound in every row. The second row of your desired results, for instance, has a ?z and a ?Souhaite but no ?xs or ?ys, and those values probably do not make sense without a corresponding ?xs and ?ys, correct? As such, I will not try to address the issue of those columns being blank in your second and third rows; they should not be blank.

In your expected results, you have removed the rows that included many ?z/?Souhaite combinations, such as "Besoins …" "Schemas …" and "Volume …" "Fourchette …". These appeared in the results because they are in your data. If you want help cleaning your data so that these are not present, we will need to see your data and know something about where it came from.



Tags: rdf sparql owl