RiotException when loading a Model using Jena 2.12

Posted 2019-09-10 06:53

Question:

I've created this simple class named RDFReader for loading a model from a DBpedia URI:

import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.util.FileManager;

public class RDFReader {
    public static Model readFromURL(String URL) {
        try {
            return (new FileManager()).loadModel(URL);
        } catch (Exception e) {
            e.printStackTrace();
        }
        return null;
    }

    public static void main(String[] args) {
        RDFReader.readFromURL("http://dbpedia.org/resource/Pacific_Rim_(film)");
    }
}

I'm using Jena v2.12.1, as shown in the following snippet of my pom.xml:

    <dependency>
        <groupId>org.apache.jena</groupId>
        <artifactId>jena-core</artifactId>
        <version>2.12.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.jena</groupId>
        <artifactId>jena-arq</artifactId>
        <version>2.12.1</version>
    </dependency>

Running this code with Jena v2.12.1, I get the following exception:

org.apache.jena.riot.RiotException: [line: 21, col: 17] Unknown char: –(8211;0x2013)
at org.apache.jena.riot.system.ErrorHandlerFactory$ErrorHandlerStd.fatal(ErrorHandlerFactory.java:136)
at org.apache.jena.riot.lang.LangEngine.raiseException(LangEngine.java:163)
at org.apache.jena.riot.lang.LangEngine.nextToken(LangEngine.java:106)
at org.apache.jena.riot.lang.LangTurtleBase.triples(LangTurtleBase.java:249)
at org.apache.jena.riot.lang.LangTurtleBase.triplesSameSubject(LangTurtleBase.java:191)
at org.apache.jena.riot.lang.LangTurtle.oneTopLevelElement(LangTurtle.java:44)
at org.apache.jena.riot.lang.LangTurtleBase.runParser(LangTurtleBase.java:90)
at org.apache.jena.riot.lang.LangBase.parse(LangBase.java:42)
at org.apache.jena.riot.RDFParserRegistry$ReaderRIOTLang.read(RDFParserRegistry.java:182)
at org.apache.jena.riot.RDFDataMgr.process(RDFDataMgr.java:906)
at org.apache.jena.riot.RDFDataMgr.parse(RDFDataMgr.java:687)
at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:210)
at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:183)
at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:121)
at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:112)
at org.apache.jena.riot.adapters.RDFReaderRIOT.read(RDFReaderRIOT.java:77)
at com.hp.hpl.jena.rdf.model.impl.ModelCom.read(ModelCom.java:253)
at com.hp.hpl.jena.util.FileManager.readModelWorker(FileManager.java:377)
at com.hp.hpl.jena.util.FileManager.loadModelWorker(FileManager.java:308)
at com.hp.hpl.jena.util.FileManager.loadModel(FileManager.java:260)
at edu.polito.rdf.utils.RDFReader.readFromURL(RDFReader.java:12)
at edu.polito.rdf.utils.RDFReader.main(RDFReader.java:20)

However, with Jena v2.11.0 the code runs without any problem. So I would like to know:

  1. Why does the 2.12.1 version of Jena produce this error?
  2. Is it possible to solve this problem so that I can use Jena 2.12.1 instead of 2.11.0?

By the way, I'm using Eclipse Luna 4.4.1 and Java version "1.8.0_11".

Answer 1:

The problem is that I was trying to access the resource http://dbpedia.org/resource/Pacific_Rim_(film), whose Turtle serialization contains the character 0x2013 (en dash), which is not legal at that point in Turtle.
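To see where such a character sits in a Turtle document you have already downloaded, a small scan like the one below may help. It is only a sketch: NonAsciiLocator and the sample Turtle text are made up for illustration, the column counting is not guaranteed to match RIOT's, and a non-ASCII character is not necessarily illegal (it is fine inside a quoted literal, for instance), so treat the output as a list of places to inspect.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Jena: lists non-ASCII code points with their position.
class NonAsciiLocator {

    /** Returns one "line:col U+XXXX NAME" entry per non-ASCII code point in the text. */
    static List<String> locate(String text) {
        List<String> hits = new ArrayList<>();
        int line = 1;
        int col = 1;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (cp == '\n') {
                line++;
                col = 1;
            } else {
                if (cp > 0x7F) {
                    hits.add(String.format("%d:%d U+%04X %s", line, col, cp, Character.getName(cp)));
                }
                col++;
            }
            i += Character.charCount(cp);
        }
        return hits;
    }

    public static void main(String[] args) {
        // Invented sample: an unquoted en dash in object position, similar in spirit to the DBpedia data.
        String sample = "@prefix ex: <http://example.org/> .\nex:film ex:label 2013\u20132014 .\n";
        locate(sample).forEach(System.out::println);
    }
}
```

Character.getName is used so that the report names the character instead of leaving you to look up the code point.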

These are the answers for both questions:

  1. I was using Jena 2.12.1, which applies content negotiation and lists Turtle before RDF/XML, so it received the Turtle serialization of the DBpedia resource, which is invalid (as explained above). With version 2.11.0 the code ran without problems because that version had a bug that made the parser lenient. The bug was fixed in version 2.12.1, so reading a resource with an illegal character now throws the RiotException.

  2. The solution was simply to ask DBpedia for the RDF/XML (or, alternatively, the N-Triples) serialization of the resource, which is available at http://dbpedia.org/data/Pacific_Rim_(film).rdf. As suggested by @AndyS, the final code uses RDFDataMgr.loadModel(String uri) instead of FileManager:

import com.hp.hpl.jena.rdf.model.Model;
import org.apache.jena.riot.RDFDataMgr;

public class RDFReader {
    public static Model readFromURL(String URL) {
        // Loads the document at the given URL into a new in-memory model.
        return RDFDataMgr.loadModel(URL);
    }

    public static void main(String[] args) {
        Model model = RDFReader.readFromURL("http://dbpedia.org/data/Pacific_Rim_(film).rdf");
    }
}
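If you start from resource URIs rather than data URLs, the rewrite used above can be sketched as a small helper. DbpediaUrls and toRdfXmlUrl are hypothetical names, and the mapping from /resource/ to /data/ plus a .rdf suffix is generalized from the single pair of URLs in this answer, so it is an assumption that it holds for other resources.

```java
// Hypothetical helper, not part of Jena or any DBpedia client library.
class DbpediaUrls {
    private static final String RESOURCE_PREFIX = "http://dbpedia.org/resource/";
    private static final String DATA_PREFIX = "http://dbpedia.org/data/";

    /** Maps a DBpedia resource URI to the URL of its RDF/XML document (assumed pattern). */
    static String toRdfXmlUrl(String resourceUri) {
        if (!resourceUri.startsWith(RESOURCE_PREFIX)) {
            throw new IllegalArgumentException("Not a DBpedia resource URI: " + resourceUri);
        }
        // Keep the local name unchanged, swap the prefix and add the .rdf extension.
        return DATA_PREFIX + resourceUri.substring(RESOURCE_PREFIX.length()) + ".rdf";
    }

    public static void main(String[] args) {
        System.out.println(toRdfXmlUrl("http://dbpedia.org/resource/Pacific_Rim_(film)"));
    }
}
```

The returned string can then be passed to readFromURL as in the code above.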


Tags: rdf jena dbpedia