Can't read Protege ontology in Jena

Posted 2019-09-03 14:16

Question:

I'm new to ontologies and Java. I'm learning them now and have some theoretical knowledge. I use "apache-jena-3.1.0" in Eclipse and the Protege editor 5.0.0 beta 23.

First of all, I created a simple ontology in Jena. Something like this:

public static void main(String[] args) {
    OntModel m = ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM);
    ...
    OntClass gen1 = m.createClass(st + "Generation_1");
    OntClass gen2 = m.createClass(st + "Generation_2");
    ...
    ObjectProperty hasParent = m.createObjectProperty(st + "hasParent");
    ...
    m.write(System.out);

    try {
        m.write(new FileWriter("C:/java/family1_RDF.owl"), "RDF/XML");
        m.write(new FileWriter("C:/java/family2_N3.owl"), "N3");
    } catch (IOException e) {
        e.printStackTrace();
    }
}

It works well. I'm able to read the saved ontology in my application and to open it in the Protege editor.

Then I created a simple ontology in Protege and saved it in RDF/XML syntax. I tried to open it in my application with this code:

OntModel base = ModelFactory.createOntologyModel(OntModelSpec.OWL_DL_MEM_RDFS_INF);
try {
    base.read(new FileReader("C:/java/asutp_class.owl"), "OWL/XML");
} catch (IOException e) {
    e.printStackTrace();
}
base.write(System.out);

It didn't work. Eclipse reported the following exception:

Exception in thread "main" org.apache.jena.riot.RiotException: [line: 271, col: 120] {E210} Encoding error with non-ascii characters.
    at org.apache.jena.riot.system.ErrorHandlerFactory$ErrorHandlerStd.error(ErrorHandlerFactory.java:128)
    at org.apache.jena.riot.lang.LangRDFXML$ErrorHandlerBridge.error(LangRDFXML.java:246)
    at org.apache.jena.rdfxml.xmlinput.impl.ARPSaxErrorHandler.error(ARPSaxErrorHandler.java:37)
    at org.apache.jena.rdfxml.xmlinput.impl.XMLHandler.warning(XMLHandler.java:196)
    at org.apache.jena.rdfxml.xmlinput.impl.XMLHandler.warning(XMLHandler.java:173)
    at org.apache.jena.rdfxml.xmlinput.impl.XMLHandler.warning(XMLHandler.java:168)
    at org.apache.jena.rdfxml.xmlinput.impl.ParserSupport.warning(ParserSupport.java:207)
    at org.apache.jena.rdfxml.xmlinput.impl.ParserSupport.checkEncoding(ParserSupport.java:192)
    at org.apache.jena.rdfxml.xmlinput.impl.URIReference.resolve(URIReference.java:167)
    at org.apache.jena.rdfxml.xmlinput.states.WantDescription.startElement(WantDescription.java:63)
    at org.apache.jena.rdfxml.xmlinput.impl.XMLHandler.startElement(XMLHandler.java:111)
    at org.apache.xerces.parsers.AbstractSAXParser.startElement(Unknown Source)
    at org.apache.xerces.impl.XMLNamespaceBinder.handleStartElement(Unknown Source)
    at org.apache.xerces.impl.XMLNamespaceBinder.startElement(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanStartElement(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source)
    at org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at org.apache.jena.rdfxml.xmlinput.impl.RDFXMLParser.parse(RDFXMLParser.java:150)
    at org.apache.jena.rdfxml.xmlinput.impl.RDFXMLParser.parse(RDFXMLParser.java:134)
    at org.apache.jena.rdfxml.xmlinput.ARP.load(ARP.java:99)
    at org.apache.jena.riot.lang.LangRDFXML.parse(LangRDFXML.java:140)
    at org.apache.jena.riot.RDFParserRegistry$ReaderRIOTLang.read(RDFParserRegistry.java:187)
    at org.apache.jena.riot.RDFDataMgr.process(RDFDataMgr.java:873)
    at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:288)
    at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:273)
    at org.apache.jena.riot.adapters.RDFReaderRIOT.read(RDFReaderRIOT.java:62)
    at org.apache.jena.rdf.model.impl.ModelCom.read(ModelCom.java:245)
    at org.apache.jena.ontology.impl.OntModelImpl.read(OntModelImpl.java:2117)
    at asutp_lassification.main(asutp_lassification.java:14)

What is the problem? How can I open a Protege ontology in my Jena application?

Thanks a lot!

Answer 1:

Line 271 has a URI with the fragment "#АСУ1". When I look at the bytes, these characters are indeed not ASCII (they are d0 90 d0 a1 d0 a3 in UTF-8 encoding).
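
The byte values quoted above can be checked with a few lines of plain Java. This is only a sketch using the JDK; the class and method names (Utf8HexDemo, utf8Hex) are made up for illustration, and the Cyrillic letters are written as Unicode escapes so the source file's own encoding does not matter.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of Jena or Protege.
class Utf8HexDemo {

    // Returns the UTF-8 bytes of the text as space-separated lowercase hex.
    static String utf8Hex(String text) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+0410 U+0421 U+0423 are the Cyrillic letters in the fragment "#АСУ1".
        System.out.println(utf8Hex("\u0410\u0421\u0423")); // prints "d0 90 d0 a1 d0 a3"
    }
}
```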

RDF/XML is an old standard and requires URIs (strictly, "RDF URI References"), which means IRIs containing non-ASCII characters need to be encoded. Turtle is better at handling IRIs directly.
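
As a rough illustration of what "encoding" an IRI means here, the JDK's java.net.URI can turn non-ASCII characters into percent-escaped UTF-8 bytes. This is a sketch only: the namespace http://example.org/asutp is an assumption (the real ontology IRI is not shown in the question), the class name IriToUriDemo is hypothetical, and whether the escaped form is what you want to store in the ontology is a separate decision, since the escaped and unescaped strings are different identifiers as far as RDF is concerned.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper class, not part of Jena or Protege.
class IriToUriDemo {

    // Percent-encodes every non-ASCII character of the IRI as UTF-8 bytes.
    static String toAsciiUri(String iri) throws URISyntaxException {
        return new URI(iri).toASCIIString();
    }

    public static void main(String[] args) throws URISyntaxException {
        // Assumed namespace; the fragment is "АСУ1" written with Unicode escapes.
        String iri = "http://example.org/asutp#\u0410\u0421\u0423" + "1";
        System.out.println(toAsciiUri(iri));
        // prints "http://example.org/asutp#%D0%90%D0%A1%D0%A31"
    }
}
```

The escaped bytes D0 90, D0 A1 and D0 A3 are the same UTF-8 bytes mentioned in the answer.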



Tags: jena protege