Jena read from turtle fails

2019-02-18 05:11发布

问题:

I have just imported jena libraries to eclipse to work on rdf-s and it is my first try, but I cannot read a turtle (.ttl) file.

I tried it in the following way:

import java.io.*;
import java.util.*;
import com.hp.hpl.jena.rdf.model.*;

public class Simpsons {

public static void main(String[] args) throws IOException {
    Model model=ModelFactory.createDefaultModel();
    model.read(new FileInputStream("simpsons.ttl"),null);

}

}

The error I get is the following:

Exception in thread "main" org.apache.jena.riot.RiotException: [line: 1, col: 1 ] Content is not allowed in prolog.
at org.apache.jena.riot.system.ErrorHandlerFactory$ErrorHandlerStd.fatal(ErrorHandlerFactory.java:136)
at org.apache.jena.riot.lang.LangRDFXML$ErrorHandlerBridge.fatalError(LangRDFXML.java:252)
at com.hp.hpl.jena.rdf.arp.impl.ARPSaxErrorHandler.fatalError(ARPSaxErrorHandler.java:48)
at com.hp.hpl.jena.rdf.arp.impl.XMLHandler.warning(XMLHandler.java:209)
at com.hp.hpl.jena.rdf.arp.impl.XMLHandler.fatalError(XMLHandler.java:239)
at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
at org.apache.xerces.impl.XMLScanner.reportFatalError(Unknown Source)
at org.apache.xerces.impl.XMLDocumentScannerImpl$PrologDispatcher.dispatch(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source)
at org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
at com.hp.hpl.jena.rdf.arp.impl.RDFXMLParser.parse(RDFXMLParser.java:151)
at com.hp.hpl.jena.rdf.arp.ARP.load(ARP.java:119)
at org.apache.jena.riot.lang.LangRDFXML.parse(LangRDFXML.java:142)
at org.apache.jena.riot.RDFParserRegistry$ReaderRIOTFactoryImpl$1.read(RDFParserRegistry.java:142)
at org.apache.jena.riot.RDFDataMgr.process(RDFDataMgr.java:859)
at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:255)
at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:241)
at org.apache.jena.riot.adapters.RDFReaderRIOT_Web.read(RDFReaderRIOT_Web.java:62)
at com.hp.hpl.jena.rdf.model.impl.ModelCom.read(ModelCom.java:253)
at assignment2.Simpsons.main(Simpsons.java:11)

Please help me with some ideas because I have no clue what the problem would be as it's my very first try with Jena. I also got a hint from somewhere that I should do the following: :

It seems that Jena is not so good at discovering the RDF serialisation used in files by itself, especially for files addressed with an URL. A solution to this problem is to make a method that gets the file extension of the filename by the use of string functions and returns the appropriate RDF serialisation format in Jena’s predefined strings. You can then use your method both for reading input and writing to file in the correct serialisation format.

but I don't really understand how should I write that method.

回答1:

The read method you are using assumes that the input format is RDF/XML.

you need to use one of the other read methods.

So it would be:

public static void main(String[] args) throws IOException {
    Model model=ModelFactory.createDefaultModel();
    model.read(new FileInputStream("simpsons.ttl"),null,"TTL");
}


回答2:

Following Program will read and traverse over the TTL file

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.apache.jena.graph.Triple ;
import org.apache.jena.riot.RDFDataMgr ;
import org.apache.jena.riot.lang.PipedRDFIterator;
import org.apache.jena.riot.lang.PipedRDFStream;
import org.apache.jena.riot.lang.PipedTriplesStream;

public class ReadingTTL
{
    public static void main(String... argv) {
        final String filename = "yagoTransitiveType2.ttl";

        // Create a PipedRDFStream to accept input and a PipedRDFIterator to
        // consume it
        // You can optionally supply a buffer size here for the
        // PipedRDFIterator, see the documentation for details about recommended
        // buffer sizes
        PipedRDFIterator<Triple> iter = new PipedRDFIterator<>();
        final PipedRDFStream<Triple> inputStream = new PipedTriplesStream(iter);

        // PipedRDFStream and PipedRDFIterator need to be on different threads
        ExecutorService executor = Executors.newSingleThreadExecutor();

        // Create a runnable for our parser thread
        Runnable parser = new Runnable() {

            @Override
            public void run() {
                // Call the parsing process.
                RDFDataMgr.parse(inputStream, filename);
            }
        };

        // Start the parser on another thread
        executor.submit(parser);

        // We will consume the input on the main thread here

        // We can now iterate over data as it is parsed, parsing only runs as
        // far ahead of our consumption as the buffer size allows
        while (iter.hasNext()) {
            Triple next = iter.next();
            // Do something with each triple
            System.out.println("Subject:  "+next.getSubject());
            System.out.println("Object:  "+next.getObject());
            System.out.println("Predicate:  "+next.getPredicate());
            System.out.println("\n");
        }
    }

}