I am using DBpedia to get page categories with SPARQL in R, but I am running into some problems. This is the query I am using:
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?categoryUri ?categoryName WHERE {
<http://dbpedia.org/resource/xxx> dcterms:subject ?categoryUri.
# xxx is a placeholder for an arbitrary word (e.g. Agree, Film, Work, Plan)
?categoryUri rdfs:label ?categoryName.
FILTER (lang(?categoryName) = "en")
}
The problems are:
Categories cannot be retrieved if the word redirects to another page (e.g. Agree -> Agreement).
Disambiguation pages are not handled by the query above, because such a word points to many sub-pages, each with its own categories (e.g. Work).
So, how can I resolve these problems? I would really appreciate any help.
SPARQL does only what you write; there is no magic behind it. If some resource :s is possibly connected to others by a property :p, add a triple pattern such as :s :p ?o to the query. Sometimes you might even use a property path to resolve the transitive closure of :p, i.e. :s :p* ?o (the * operator matches zero or more steps, so :s itself is also among the results).
With redirects resolved:
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT * WHERE
{ <http://dbpedia.org/resource/Agree> (dbo:wikiPageRedirects)* ?page
OPTIONAL
{ ?page dcterms:subject ?categoryUri}
}
Note the OPTIONAL clause, which is necessary here because not all resources in DBpedia belong to a category; without it, any page that has no dcterms:subject triple would be dropped from the results.
Including disambiguation pages:
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT * WHERE
{ <http://dbpedia.org/resource/Agree> (dbo:wikiPageRedirects)*/(dbo:wikiPageDisambiguates)* ?page
OPTIONAL
{ ?page dcterms:subject ?categoryUri}
}
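The queries above return only category URIs, while the original question also asked for the English category names. A minimal sketch of how the label lookup from the question could be combined with the redirect and disambiguation path, assuming the category resources carry rdfs:label values with an "en" language tag (as the question's own query implies); the DISTINCT keyword and the ?categoryName variable placement are my additions, not part of the answer above:

```sparql
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?page ?categoryUri ?categoryName WHERE
{ # follow zero or more redirects, then zero or more disambiguation links
  <http://dbpedia.org/resource/Agree> (dbo:wikiPageRedirects)*/(dbo:wikiPageDisambiguates)* ?page
  OPTIONAL
  { ?page dcterms:subject ?categoryUri .
    ?categoryUri rdfs:label ?categoryName .
    # keep only English labels, as in the question's query
    FILTER (lang(?categoryName) = "en")
  }
}
```

Because the label pattern and the FILTER sit inside the OPTIONAL group, a category without an English label makes the whole optional part fail for that match, so ?categoryUri is left unbound for it as well. If the URI should still be returned in that case, the label lookup could be moved into a nested OPTIONAL of its own.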