Find the two nearest neighbors of points

2019-05-16 12:26发布

I want to find the two nearest neighbors of each point

dataset :

:p1 :has_position 1 .
:p2 :has_position 2 .
:p3 :has_position 3 .
:p4 :has_position 4 .

expected results :

?POINT  ?NEIGHBORS
"p1"    "p2; p3"
"p2"    "p1; p3"
"p3"    "p2; p4"
"p4"    "p2; p3"

I try something like this :

SELECT ?POINT ?POS (group_concat(?idPointN;separator='; ' )as ?NEIGHBORS)
WHERE{
    ?idPoint :has_position ?POS .
    ?idPointN :has_position ?POSN . FILTER (?idPoint != ?idPointN)
}
GROUP BY ?POINT ?POS

this return all the neighbors of points. I want to do somthing like ORDER BY(?POS-?POSN) and limit 2 in the group_concat but i don't know how.

EDIT :

I write this query

SELECT ?POINT ?NEIGHBOR
WHERE{
    ?idPoint rdfs:label ?POINT . FILTER(?idN != ?idPoint)
    ?idPoint :has_position ?POS .


    ?idN rdfs:label ?NEIGHBOR .
    ?idN :has_position ?POSN .
}
ORDER BY ?POINT abs(?POS-?POSN)

It give me for each point all the neighbors order by the closest.

How can i have only the 2 closest ? and on the same line ?

标签: sparql
1条回答
Melony?
2楼-- · 2019-05-16 12:44

Queries that get the top n per something are really tricky in SPARQL, and there's really no great way to do it yet. It almost always comes down to some weird hack. First, the data with a prefix declaration:

@prefix : <urn:ex:>

:p1 :has_position 1 .
:p2 :has_position 2 .
:p3 :has_position 3 .
:p4 :has_position 4 .

Then the query. The select line has a long string concatenation in it, but that's just to strip off the prefixes like you described in the question. The "hack" in this case is recognizing that the two closest points q and r will minimize the quantity |p − q| + |p − r|, so we can compute that quantity and take the values of q and r that gave it to us. You'll also need to make sure that you impose some ordering on q and r, or else you'll get duplicated results (since you could just swap q and r).

prefix : <urn:ex:>

select ?p (concat(strafter(str(?q),str(:)),", ",strafter(str(?r),str(:))) as ?neighbors) {
  ?p :has_position ?pos1 .
  ?q :has_position ?pos2 .
  ?r :has_position ?pos3 .
  filter(?p != ?q && ?p != ?r)
  filter(str(?q) < str(?r))

  filter not exists {
    ?qq :has_position ?pos22 .
    ?rr :has_position ?pos33 .
    filter(?p != ?qq && ?p != ?rr)
    filter(str(?qq) < str(?rr))
    filter((abs(?pos1 - ?pos22) + abs(?pos1 - ?pos33)) < 
           (abs(?pos1 - ?pos2)  + abs(?pos1 - ?pos3)))
  }
}
-------------------
| p   | neighbors |
===================
| :p1 | "p2, p3"  |
| :p2 | "p1, p3"  |
| :p3 | "p2, p4"  |
| :p4 | "p2, p3"  |
-------------------

Now, you could also do this with a subquery that finds the minimal quantity for each p, and then, in the outer query, finds the q and r values that produce it:

prefix : <urn:ex:>

select ?p (concat(strafter(str(?q), str(:)), ", ", strafter(str(?r), str(:))) as ?neighbors) {
  { select ?p (min(abs(?pos1 - ?pos2) + abs(?pos1 - ?pos3)) as ?d) {
      ?p :has_position ?pos1 .
      ?q :has_position ?pos2 .
      ?r :has_position ?pos3 .
      filter(?p != ?q && ?p != ?r)
      filter(str(?q) < str(?r))
    }
    group by ?p
  }

  ?p :has_position ?pos1 .
  ?q :has_position ?pos2 .
  ?r :has_position ?pos3 .
  filter(?p != ?q && ?p != ?r)
  filter(str(?q) < str(?r))
  filter(abs(?pos1 - ?pos2) + abs(?pos1 - ?pos3) = ?d)
}
-------------------
| p   | neighbors |
===================
| :p1 | "p2, p3"  |
| :p2 | "p1, p3"  |
| :p3 | "p2, p4"  |
| :p4 | "p2, p3"  |
-------------------
查看更多
登录 后发表回答