Finding superclasses of an entity in SPARQL

Posted 2019-07-25 21:50

Question:

I want to build a named-entity recognizer using Wikipedia data. For each word, I need all of its superclasses so I can tell which category (Place, Human, Organization or None) it falls into. I searched the Internet a lot and found some pages, such as:

  • SPARQL query to find all sub classes and a super class of a given class

When I run the query from that page, I get "No matching records found", even with the word used on the page and after trying other namespaces. Another one is:

  • Extracting hierarchy for dbpedia entity using SPARQL

This is very similar to what I am doing, but it also returns "No matching records found".

I think the queries in these links are logically correct, but I have no idea why they return nothing for me. I also tried to learn SPARQL from the examples on these pages:

  • https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/queries/examples
  • https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/queries

but I didn't find anything about finding the superclasses of a word.

Here are some of the queries that returned no results for me:

PREFIX rdfs:<http://www.w3.org/2000/01/rdf-schema#>
PREFIX ns:<http://dbpedia.org/>

SELECT ?subClass ?label WHERE { 
    ?subClass rdfs:subClassOf ns:Albert . 
    ?subClass rdfs:label ?label . }

or:

SELECT * WHERE {
  dbpedia:Albert a ?c1 ; a ?c2 .
  ?c1 rdfs:subClassOf ?c2 .
}

I would really appreciate an answer to my question.

Thanks in advance

Answer 1:

Probably, you are looking for something like this query:

SELECT DISTINCT ?c WHERE {
  ?Q wdt:P31/wdt:P279? ?c .
  ?Q rdfs:label "Tom Hanks"@en
} 

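Wikidata uses its own predicates, wdt:P31 (instance of) and wdt:P279 (subclass of), instead of rdf:type and rdfs:subClassOf respectively. Note that wdt:P279? follows at most one subclass step; write wdt:P279* to follow the whole chain of superclasses.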
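To run such a query from a program, it has to be sent to the Wikidata Query Service over HTTP. The following is a minimal Python sketch that only builds the request URL and reads a SPARQL JSON results document. The helper names (build_query, build_url, extract_values) are made up for illustration, and the query uses wdt:P279* rather than wdt:P279? on the assumption that every superclass is wanted, not just the nearest one.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(label, lang="en"):
    """Build the class-lookup query for a label (hypothetical helper)."""
    # Escape backslashes and double quotes so the label is a valid string literal.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "SELECT DISTINCT ?c WHERE {\n"
        "  ?Q wdt:P31/wdt:P279* ?c .\n"
        f'  ?Q rdfs:label "{escaped}"@{lang}\n'
        "}"
    )


def build_url(label):
    """Return a GET URL that asks the endpoint for JSON results."""
    params = {"query": build_query(label), "format": "json"}
    return WDQS_ENDPOINT + "?" + urlencode(params)


def extract_values(results, var="c"):
    """Collect the values bound to one variable in a SPARQL JSON results document."""
    return [b[var]["value"] for b in results["results"]["bindings"] if var in b]


print(build_url("Tom Hanks"))
```

The printed URL can be fetched with urllib.request or any other HTTP client, and the decoded JSON response passed to extract_values.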



Answer 2:

The rdfs:subClassOf predicate relates classes to other classes; it generally does not apply to instances. You first need to connect the instance to its class via rdf:type (abbreviated as a).

SELECT * WHERE {
  <http://dbpedia.org/resource/Albert_Einstein> a ?c1 ; a ?c2 .
  ?c1 rdfs:subClassOf ?c2 .
}

I am not sure what kind of entity "Albert" is meant to be; it probably requires disambiguation. My example queries use Albert Einstein as the DBpedia resource.

Bear in mind that there can be multiple hops to the root class, depending on the level of abstraction you are interested in. This second query goes up two levels.

SELECT DISTINCT ?c3 WHERE {
  <http://dbpedia.org/resource/Albert_Einstein> a ?c1 ; a ?c2 .
  ?c1 rdfs:subClassOf ?c2 .
  ?c2 rdfs:subClassOf ?c3 .
}
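The difference between a fixed number of hops and following the hierarchy all the way up can be seen on a toy example. The sketch below is plain Python over an invented class hierarchy (the class names and edges are made up and are not taken from DBpedia); it only illustrates why a query with a hard-coded number of rdfs:subClassOf steps can miss classes higher up.

```python
# Toy subclass edges (child -> parents), invented for illustration only.
SUBCLASS_OF = {
    "Scientist": ["Person"],
    "Person": ["Agent"],
    "Agent": ["Thing"],
}


def superclasses_within(cls, max_hops):
    """Classes reachable from cls in 1..max_hops subclass steps."""
    found, frontier = set(), {cls}
    for _ in range(max_hops):
        # Move one level up from every class on the current frontier.
        frontier = {p for c in frontier for p in SUBCLASS_OF.get(c, [])} - found
        if not frontier:
            break
        found |= frontier
    return found


def all_superclasses(cls):
    """Every class reachable via one or more subclass steps (transitive closure)."""
    found, stack = set(), [cls]
    while stack:
        for parent in SUBCLASS_OF.get(stack.pop(), []):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


print(sorted(superclasses_within("Scientist", 2)))  # stops before "Thing"
print(sorted(all_superclasses("Scientist")))        # includes "Thing"
```

In SPARQL, the unbounded variant corresponds to a property path such as rdfs:subClassOf+ or rdfs:subClassOf*.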


Answer 3:

  1. Who is "Albert"?! You can only query for data that does exist in DBpedia. There is no resource http://dbpedia.org/resource/Albert

  2. Your first query uses the wrong namespace; at least I've never seen http://dbpedia.org/ used as a namespace. For resources it's usually http://dbpedia.org/resource/

  3. Your first query uses the rdfs:subClassOf predicate incorrectly if "Albert" is supposed to be a resource. That a resource :x belongs to a class :C is expressed by the RDF triple :x a :C . That the class :C has a superclass :D is expressed by the RDF triple :C rdfs:subClassOf :D .

  4. Your second query again uses an old namespace prefix, dbpedia:, which is now called dbr: and stands for exactly the namespace http://dbpedia.org/resource/. But as I mentioned in my first point, there is no resource for "Albert".

  5. What is the "superclass of a word"? Just to clarify, resources belong to a class, and a class can have superclasses.

If you want all the classes a resource belongs to, including their superclasses, you can use, e.g. for "Tom Hanks":

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT DISTINCT ?c WHERE {
  dbr:Tom_Hanks a/rdfs:subClassOf* ?c .
}
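Once the classes of a resource are known, they still have to be reduced to the categories named in the question (Place, Human, Organization or None). A rough Python sketch of that last step follows. It assumes that the DBpedia ontology classes dbo:Person, dbo:Place and dbo:Organisation are the ones of interest; the mapping and the categorize helper are illustrative choices, not something the answers prescribe.

```python
DBO = "http://dbpedia.org/ontology/"

# Assumed mapping from DBpedia ontology classes to the question's categories.
CATEGORY_BY_CLASS = {
    DBO + "Person": "Human",
    DBO + "Place": "Place",
    DBO + "Organisation": "Organization",
}


def categorize(class_iris):
    """Return the first category whose class is among class_iris, else "None"."""
    classes = set(class_iris)
    for cls, category in CATEGORY_BY_CLASS.items():
        if cls in classes:
            return category
    return "None"


# Example input: class IRIs as they might come back in the ?c column.
example = [DBO + "Person", "http://www.w3.org/2002/07/owl#Thing"]
print(categorize(example))
```

If a resource matches more than one mapped class, this sketch simply returns the first match in the order the mapping is written.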