Query DBpedia to get meta-data for Books

Posted 2019-06-27 05:45

Question:

I have a bunch of ISBNs. I want to query DBpedia and get the meta-data of the books.

I don't understand SPARQL.

Can someone tell me how I can get the meta-data of a book from DBpedia in Java?

Answer 1:

SPARQL is both a query language and a protocol to query so-called SPARQL endpoints.

A SPARQL query that asks DBpedia for the book (or books) with the ISBN 0-553-05250-0, together with all associated properties and values, looks like this:

select distinct ?book ?prop ?obj 
where {
  ?book a dbo:Book .
  ?book ?prop ?obj .
  ?book dbp:isbn ?isbn .
  FILTER (regex(?isbn, "0-553-05250-0"))
} 
LIMIT 100

See the result of the query in your browser here.

Be aware that regex(?isbn, "0-553-05250-0") takes some time to evaluate. It may not work for all ISBNs, because

  • Wikipedia may never have a complete list of ISBNs, so neither may DBpedia
  • the same ISBN without dashes will not match a query with dashes.

Also, I noticed that some ISBN values are just a string of digits and dashes, while others contain the word "ISBN" or have "(paperback)" appended.
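One way to cope with these variations is to strip the ISBN down to its bare digits on both sides before comparing. The following is only a sketch under assumptions, not something tested against every DBpedia value: the class IsbnQueryBuilder and its two methods are hypothetical names, and the generated query relies on the endpoint supporting the SPARQL 1.1 functions STR, REPLACE and CONTAINS.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, introduced here for illustration only.
final class IsbnQueryBuilder {

    // A run of 10 to 17 characters that starts with a digit, contains digits,
    // dashes or spaces, and ends with a digit or the ISBN-10 check character X.
    private static final Pattern ISBN_LIKE = Pattern.compile("\\d[\\d\\- ]{8,15}[\\dXx]");

    // Pulls the ISBN out of values such as "ISBN 0-553-05250-0 (paperback)"
    // and returns it without dashes or spaces.
    public static String normalizeIsbn(String raw) {
        Matcher m = ISBN_LIKE.matcher(raw);
        if (!m.find()) {
            throw new IllegalArgumentException("No ISBN-like value in: " + raw);
        }
        return m.group().replaceAll("[- ]", "").toUpperCase();
    }

    // Builds a query that removes dashes and spaces from the stored value
    // before comparing, so "0553052500" also matches "0-553-05250-0".
    public static String buildQuery(String rawIsbn) {
        String isbn = normalizeIsbn(rawIsbn); // digits and X only, safe to embed
        return "PREFIX dbo: <http://dbpedia.org/ontology/> "
             + "PREFIX dbp: <http://dbpedia.org/property/> "
             + "SELECT DISTINCT ?book ?prop ?obj WHERE { "
             + "  ?book a dbo:Book . "
             + "  ?book ?prop ?obj . "
             + "  ?book dbp:isbn ?isbn . "
             + "  FILTER (CONTAINS(REPLACE(STR(?isbn), \"[- ]\", \"\"), \"" + isbn + "\")) "
             + "} LIMIT 100";
    }

    public static void main(String[] args) {
        System.out.println(buildQuery("ISBN 0-553-05250-0 (paperback)"));
    }
}
```

Because the filter still has to be applied to every ISBN value, this is unlikely to be any faster than the regex version; it only widens what counts as a match.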

You can send this query to the DBpedia endpoint either via the web form (by visiting the endpoint with your browser) or via Jena, a well-known Java toolkit for RDF and SPARQL.
Here is the query in some Java code that queries DBpedia for results and prints them to the command line (based on another Jena, SPARQL and DBpedia related question, of which there are many):

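// Imports assume Apache Jena 3.x or later (older releases used com.hp.hpl.jena.query).
import org.apache.jena.query.Query;
import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.QueryFactory;
import org.apache.jena.query.ResultSet;
import org.apache.jena.query.ResultSetFormatter;

// Jena parses the query locally, so the prefixes must be declared in the query itself.
String sparqlQueryString1 =
       "PREFIX dbo: <http://dbpedia.org/ontology/> " +
       "PREFIX dbp: <http://dbpedia.org/property/> " +
       "select distinct ?book ?prop ?obj " +
       "where { " +
       "  ?book a dbo:Book . " +
       "  ?book ?prop ?obj . " +
       "  ?book dbp:isbn ?isbn . " +
       "  FILTER (regex(?isbn, \"0-553-05250-0\")) " +
       "} " +
       "LIMIT 100";

Query query = QueryFactory.create(sparqlQueryString1);
QueryExecution qexec = QueryExecutionFactory.sparqlService("http://dbpedia.org/sparql", query);

// Run the SELECT query remotely and print the result table to the console.
ResultSet results = qexec.execSelect();
ResultSetFormatter.out(System.out, results, query);

qexec.close();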
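Since SPARQL is also a protocol, the same query can be sent without Jena as a plain HTTP GET request. The sketch below is an illustration under assumptions rather than a recommended setup: it needs Java 11 or later for java.net.http, the class name SparqlHttpExample and the method buildRequestUri are made up for this example, and it assumes the endpoint returns JSON when asked for the standard application/sparql-results+json media type.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical class name, used only for this illustration.
final class SparqlHttpExample {

    static final String ENDPOINT = "http://dbpedia.org/sparql";

    // The SPARQL protocol passes the query text in the URL-encoded "query" parameter.
    public static URI buildRequestUri(String endpoint, String sparql) {
        String encoded = URLEncoder.encode(sparql, StandardCharsets.UTF_8);
        return URI.create(endpoint + "?query=" + encoded);
    }

    public static void main(String[] args) throws Exception {
        String sparql =
              "PREFIX dbo: <http://dbpedia.org/ontology/> "
            + "PREFIX dbp: <http://dbpedia.org/property/> "
            + "SELECT DISTINCT ?book ?prop ?obj WHERE { "
            + "  ?book a dbo:Book . ?book ?prop ?obj . ?book dbp:isbn ?isbn . "
            + "  FILTER (regex(?isbn, \"0-553-05250-0\")) "
            + "} LIMIT 100";

        HttpRequest request = HttpRequest.newBuilder(buildRequestUri(ENDPOINT, sparql))
                .header("Accept", "application/sparql-results+json") // ask for JSON results
                .GET()
                .build();

        // Follow redirects in case the endpoint forwards http to https.
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();

        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println("HTTP status: " + response.statusCode());
        System.out.println(response.body()); // raw result document, not parsed here
    }
}
```

This variant returns the raw result document, so you would still need a JSON parser to pick out individual bindings; Jena does that part for you.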

My favourite SPARQL resource is Lee Feigenbaum's cheat sheet, which is a pretty comprehensive reference. Perhaps you would like to check out the tutorials Jena provides with its documentation.



Answer 2:

As far as I can tell, Wikipedia does not have an ISBN search.

Wikipedia has this page for using other ISBN search engines.

Amazon.com has an ISBN search here. I couldn't find an API for automating the ISBN search on Amazon.