Using SPARQL to query DBPedia Company Information

I'm trying to query DBpedia using only SPARQL to find company information such as a description and a logo.

I'm rather lost with devising the SPARQL Query to do this.

    SELECT DISTINCT ?subject
                    ?employees
                    ?homepage
    WHERE
    {
      ?subject  rdf:type               <http://dbpedia.org/class/yago/Company108058098>  .
      ?subject  dbpedia2:numEmployees  ?employees
        FILTER  ( xsd:integer(?employees) >= 50000 )                                     .
      ?subject  foaf:homepage          ?homepage                                         .
    }
    ORDER BY  DESC(xsd:integer(?employees))
    LIMIT  20

I came across the query above, which finds companies with at least 50,000 employees, but I don't understand parts of it, such as why the rdf:type is <http://dbpedia.org/class/yago/Company108058098>.

All I want to know is: given a company name, how can I return a unique ID, a logo, and a description? I just want those three pieces of data back, which I can then store in my database.

Tags: rdf sparql wikipedia dbpedia

2 Answers

啃猪蹄的小仙女

#2 · 2019-05-25 22:09

To get all companies, you have to use LIMIT and OFFSET, because public endpoints usually limit the number of results per query. Based on @Joshua's answer, I wrote a small script that can be run to get all companies from the public DBpedia endpoint. Here is the gist: https://gist.github.com/szydan/e801fa687587d9eb0f6a

One can also modify the query and use it to get other entities.
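The LIMIT/OFFSET approach can be sketched roughly as follows. This is not the script from the gist, just a minimal illustration in Python. The endpoint URL, the `format` parameter, the page size, and the names `fetch_page` and `paginate` are all assumptions made for the example.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://dbpedia.org/sparql"  # assumed public endpoint URL

# ORDER BY keeps the pages stable from one request to the next.
BASE_QUERY = """
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
SELECT ?iri WHERE { ?iri a dbpedia-owl:Company }
ORDER BY ?iri
"""


def fetch_page(query):
    """Send one query to the endpoint and return its result bindings."""
    params = urllib.parse.urlencode(
        # Assumption: the endpoint accepts a `format` parameter for JSON results.
        {"query": query, "format": "application/sparql-results+json"}
    )
    with urllib.request.urlopen(f"{ENDPOINT}?{params}", timeout=60) as resp:
        return json.load(resp)["results"]["bindings"]


def paginate(base_query, page_size=1000, fetch=fetch_page):
    """Yield every binding, appending LIMIT/OFFSET until a page comes back short."""
    offset = 0
    while True:
        page = fetch(f"{base_query}\nLIMIT {page_size} OFFSET {offset}")
        yield from page
        if len(page) < page_size:
            break  # a short (or empty) page means there is nothing left
        offset += page_size
```

Calling `list(paginate(BASE_QUERY))` would then walk through the whole result set one page at a time. The `fetch` argument is only there so that the HTTP call can be swapped out, for example when testing offline.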


在下西门庆

#3 · 2019-05-25 22:13

The reason for rdf:type <http://dbpedia.org/class/yago/Company108058098> in a query like the following is (presumably) that it's a class whose instances are companies. Asking for instances of that class is a way of asking for companies.

select * { ?s rdf:type <http://dbpedia.org/class/yago/Company108058098> }
limit 10

SPARQL results

It's the same principle that lets us select Persons with:

select * { ?s a dbpedia-owl:Person }
limit 10

SPARQL results

As to your specific query, a good way to query DBpedia data is typically to start by looking at the data manually and finding the kinds of values you're interested in. For instance, you might look at Apple, Inc., whose DBpedia resource is

http://dbpedia.org/resource/Apple_Inc., which redirects to
http://dbpedia.org/page/Apple_Inc. which you can view in your browser.

For the kinds of information that you're looking for, important properties seem to be:

rdfs:label "Apple" or "Apple Inc." (which you'd use to query against), or
foaf:name "Apple Inc."
foaf:depiction http://upload.wikimedia.org/wikipedia/commons/f/fa/Apple_logo_black.svg, or
dbpedia-owl:thumbnail http://upload.wikimedia.org/wikipedia/commons/thumb/f/fa/Apple_logo_black.svg/200px-Apple_logo_black.svg.png
dbpedia-owl:abstract "english description"@en
rdf:type dbpedia-owl:Company (to help narrow down the results)

You can simply use the resource IRI as the unique identifier. Given all this, you can write a query like the following. It has multiple results, though, because there are multiple possible logos, but so it goes.

select ?iri ?logo ?description {
  ?iri a dbpedia-owl:Company ;
       dbpedia-owl:abstract ?description ;
       rdfs:label "Apple Inc."@en ;
       foaf:depiction|dbpedia-owl:thumbnail ?logo .
  filter( langMatches(lang(?description),"en") )
}

SPARQL results
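Since the question is about starting from an arbitrary company name, here is one way the query above might be generated for any name. It's only a sketch in Python: `escape_literal` and `company_query` are hypothetical helper names, the prefix declarations are spelled out in case the endpoint doesn't predefine them, and `lang` is assumed to be a trusted language tag rather than user input.

```python
PREFIXES = """\
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
"""


def escape_literal(text):
    """Escape backslashes, quotes and line breaks for a double-quoted SPARQL string."""
    return (
        text.replace("\\", "\\\\")  # backslashes first, so later escapes survive
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def company_query(name, lang="en"):
    """Return the query text for one company label (lang is assumed trusted)."""
    label = f'"{escape_literal(name)}"@{lang}'
    return PREFIXES + f"""\
select ?iri ?logo ?description {{
  ?iri a dbpedia-owl:Company ;
       dbpedia-owl:abstract ?description ;
       rdfs:label {label} ;
       foaf:depiction|dbpedia-owl:thumbnail ?logo .
  filter( langMatches(lang(?description), "{lang}") )
}}
"""
```

`company_query("Apple Inc.")` produces the same query body as above, with the prefixes prepended. Escaping the name matters because a company name containing a double quote would otherwise break the literal.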

It would be nice to be able to use

foaf:name|rdfs:label "Apple Inc."@en

as well, but the endpoint says in that case that the estimated time is too great:

Virtuoso 42000 Error The estimated execution time 9320 (sec) exceeds the limit of 3000 (sec).

I'm not sure how it estimates the time, but you can use some optionals and some values to work around it (but be sure to put distinct into the select):

select distinct ?iri ?logo ?description {
  values ?hasLogo { foaf:depiction dbpedia-owl:thumbnail }
  values ?hasName { foaf:name rdfs:label }
  ?iri a dbpedia-owl:Company ;
       dbpedia-owl:abstract ?description ;
       ?hasName "Apple Inc."@en ;
       ?hasLogo ?logo .
  filter( langMatches(lang(?description),"en") )
}

Note: At time of writing, DBpedia's endpoint is very sluggish and under maintenance, so I'm not sure yet whether this last permutation actually hits the estimated time cutoff or not. I think that it will go through, though.
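Because the query can return several rows for the same company (one per logo), you may want to collapse them before storing anything in your database. A minimal sketch in Python, assuming the results were requested in the standard SPARQL JSON results format and that keeping the first logo seen is acceptable; `one_record_per_company` is a hypothetical helper name.

```python
def one_record_per_company(bindings):
    """Collapse result rows so each IRI keeps the first logo and description seen."""
    records = {}
    for row in bindings:
        iri = row["iri"]["value"]
        # setdefault stores the first row for an IRI and ignores later duplicates.
        records.setdefault(
            iri,
            {
                "id": iri,
                "logo": row["logo"]["value"],
                "description": row["description"]["value"],
            },
        )
    return list(records.values())
```

Here `bindings` is the list found under `results` → `bindings` in the JSON response. The resource IRI doubles as the unique ID, as suggested above.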
