Comparing two strings with SPARQL

Posted 2019-08-03 05:37

Question:

I'm using the regex function in SPARQL. Is there a function that finds the string with the minimum distance from another one? I mean, I need a function that gives me the word most similar to a given one. Currently I pass two variables (which take their values from two different datasets) and compare them case-insensitively. So I need a function that can compare two variables. Does anybody know of anything?

Answer 1:

There is no such function in standard SPARQL. However, SPARQL is extensible, so you can add your own functions if you want (of course, at the price of losing portability of your query). For example, see this tutorial on how to do this in Sesame's SPARQL engine.

I also imagine that some triplestores with extended support for full-text search (like OWLIM, or Virtuoso) may have some built-in support for this kind of thing, but I do not know this for sure.

Edit

Assuming you want something like Levenshtein distance, you could have a function ex:ldistance(?string1, ?string2) that, given two strings, returns the edit distance between them. So ex:ldistance("room", "root") would return 1, ex:ldistance("room", "door") would return 2, and so on. You could then use this to query for a given distance, e.g. to get all strings whose distance to "room" is less than 2:

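PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
# ex: is a placeholder namespace for the custom extension function
PREFIX ex: <http://example.org/>

SELECT ?x ?string1
WHERE {
       ?x rdfs:label ?string1 .
       FILTER(ex:ldistance("room", ?string1) < 2)
}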

or returning all matching strings ordered by their distance:

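PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
# ex: is a placeholder namespace for the custom extension function
PREFIX ex: <http://example.org/>

SELECT ?x ?string1 ?ldistance
WHERE {
       ?x rdfs:label ?string1 .
       BIND(ex:ldistance("room", ?string1) AS ?ldistance)
}
ORDER BY ?ldistance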

However, as said, the function ex:ldistance does not actually exist in SPARQL, so you will need to create it yourself, as an extension.
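
For reference, here is a minimal sketch of the logic such an extension function might implement, written in plain Python rather than against any particular triplestore API. The name ldistance mirrors the hypothetical ex:ldistance above, and closest is an extra helper I made up to show the "most similar word" use case from the question. Lowercasing both inputs is an assumption based on the case-insensitive comparison the question asks for.

```python
from typing import Iterable


def ldistance(a: str, b: str) -> int:
    """Levenshtein (edit) distance between a and b, ignoring case."""
    a, b = a.lower(), b.lower()
    # previous[j] = distance between the prefix of a processed so far
    # (initially empty) and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb (or match)
            ))
        previous = current
    return previous[-1]


def closest(target: str, candidates: Iterable[str]) -> str:
    """Hypothetical helper: the candidate with the smallest distance to target."""
    return min(candidates, key=lambda c: ldistance(target, c))


if __name__ == "__main__":
    print(ldistance("room", "root"))
    print(ldistance("room", "door"))
    print(closest("room", ["door", "root", "table"]))
```

How you wire this into a SPARQL engine depends entirely on the engine. If registering a custom function is not an option, you can also fetch the candidate labels with an ordinary query and rank them client-side with something like closest.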



Tags: rdf sparql