Extract information from an XML file as RDF triples

Posted 2019-07-16 02:17

Question:

Could anyone recommend a tutorial, or explain how to build a Java program that extracts information from XML files and outputs it as RDF triples using an existing ontology? An example would be really helpful.

Thanks

Answer 1:

There are ready-made tools that address this problem, such as XSPARQL. You can write an XSPARQL query that queries the XML and produces RDF triples as output. This example should be pretty close to what you're looking for.



Answer 2:

Your problem is really two problems:

  • parsing XML
  • writing RDF

For Java XML parsing, there are numerous examples on the web:

  • Java and XML - Tutorial
  • Java Examples in a Nutshell, Chapter 19, XML
  • Working with XML: The Java/XML Tutorial

For RDF there are fewer resources, since it is a much more specialized field:

  • What are some good Java RDF libraries?

In the past I worked with Jena – it offers a friendly API to the semantic web stack.
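To make the two steps concrete, here is a minimal sketch that uses only the JDK: the DOM parser reads the XML, and the triples are written by hand as N-Triples lines. It does not use Jena; with Jena you would build a Model and let the library serialize it instead. The `book` element, its `id` attribute, and the `http://example.org/...` namespaces are hypothetical placeholders for your own XML vocabulary and ontology, and the sketch assumes each record has flat child elements and an `id` that is safe to put in an IRI.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class XmlToTriples {
    // Hypothetical namespaces: replace with the IRIs of your ontology and data.
    static final String ONT = "http://example.org/ontology#";
    static final String DATA = "http://example.org/data/";
    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";

    /** Step 1: parse the XML. Step 2: emit one N-Triples line per statement. */
    static List<String> convert(String xml) {
        Document doc;
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            doc = builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }

        List<String> triples = new ArrayList<>();
        NodeList books = doc.getElementsByTagName("book");
        for (int i = 0; i < books.getLength(); i++) {
            Element book = (Element) books.item(i);
            // Assumes the id attribute contains only IRI-safe characters.
            String subject = "<" + DATA + "book/" + book.getAttribute("id") + ">";
            triples.add(subject + " <" + RDF_TYPE + "> <" + ONT + "Book> .");

            // Each child element becomes an ontology property with a literal value.
            NodeList children = book.getChildNodes();
            for (int j = 0; j < children.getLength(); j++) {
                Node child = children.item(j);
                if (child.getNodeType() != Node.ELEMENT_NODE) {
                    continue; // skip whitespace and comments between elements
                }
                triples.add(subject + " <" + ONT + child.getNodeName() + "> \""
                        + escape(child.getTextContent().trim()) + "\" .");
            }
        }
        return triples;
    }

    /** Escapes the characters that must not appear raw in an N-Triples literal. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"")
                .replace("\n", "\\n").replace("\r", "\\r");
    }

    public static void main(String[] args) {
        String xml = "<library><book id=\"b1\"><title>Dune</title>"
                + "<author>Frank Herbert</author></book></library>";
        convert(xml).forEach(System.out::println);
    }
}
```

Mapping element names straight to property names only works if the XML vocabulary happens to match the ontology. In practice you would probably keep an explicit lookup table from element names to ontology properties.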



Answer 3:

I would recommend the XmlToRdf Java library.

XmlToRdf offers fast conversion by using Java's built-in SAX parser to stream-convert your XML file to RDF. A wide selection of configuration options (with sane defaults) makes it simple to adjust the conversion to your needs, including element renaming and advanced IRI generation with composite identifiers.

Output from the conversion can be written directly to a file as RDF Turtle, or added to a Sesame Repository or Jena Dataset for further processing. With Sesame and Jena it is possible to apply further SPARQL-based transformations to the data and output it in formats such as RDF Turtle and JSON-LD.
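For a sense of what SAX-based streaming conversion looks like in general, here is a rough sketch built only on the JDK's SAX parser. It is not the XmlToRdf API and makes no claim about how that library works internally. The `book` element, the `id` attribute, and the `http://example.org/...` namespaces are hypothetical, and the handler assumes flat records (no nested elements inside a record's children). Triples are written to the output as soon as each element closes, so the whole document never has to be held in memory.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class StreamingXmlToTriples extends DefaultHandler {
    // Hypothetical namespaces: replace with the IRIs of your ontology and data.
    static final String ONT = "http://example.org/ontology#";
    static final String DATA = "http://example.org/data/";

    private final Writer out;
    private final StringBuilder text = new StringBuilder();
    private String subject; // IRI of the record being read, or null outside a record

    StreamingXmlToTriples(Writer out) {
        this.out = out;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0); // discard whitespace collected between elements
        if (qName.equals("book")) {
            // Assumes the id attribute contains only IRI-safe characters.
            subject = "<" + DATA + "book/" + atts.getValue("id") + ">";
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // may be called several times per element
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("book")) {
            subject = null; // record finished
        } else if (subject != null) {
            // A child of the current record: emit its triple immediately.
            write(subject + " <" + ONT + qName + "> \"" + escape(text.toString().trim()) + "\" .\n");
        }
        text.setLength(0);
    }

    private void write(String line) {
        try {
            out.write(line);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"")
                .replace("\n", "\\n").replace("\r", "\\r");
    }

    /** Convenience wrapper for small inputs; for large files, pass a file Writer to the handler. */
    static String convert(String xml) {
        StringWriter out = new StringWriter();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new StreamingXmlToTriples(out));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return out.toString();
    }
}
```

Calling `StreamingXmlToTriples.convert(xml)` on a small document returns the N-Triples text as a string. For large inputs, the same handler can be given a buffered file writer so that memory use stays roughly constant.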