Get parse tree of a sentence using OpenNLP. Gettin

Posted 2019-04-10 17:57

OpenNLP is an Apache project on Natural Language Processing. One of the aims of an NLP program is to parse a sentence giving a tree of its grammatical structure. For example, the sentence "The sky is blue." might be parsed as

      S
     / \
   NP   VP
  / \    | \
The sky is blue.

where S is Sentence, NP is Noun-phrase, and VP is Verb-phrase. Equivalently, the tree above can be written as a parenthesized string: S(NP(The sky) VP(is blue.))
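
To make the parenthesized notation concrete, here is a minimal sketch in plain Java that builds the tree above by hand and prints it in that form. It does not use OpenNLP at all; the `BracketedTree` class and its methods are hypothetical names made up for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper: a node is either a phrase label with children or a single word. */
class BracketedTree {
    final String text;                  // phrase label ("S", "NP", ...) or a word
    final List<BracketedTree> children; // empty for a word

    BracketedTree(String text, BracketedTree... children) {
        this.text = text;
        this.children = Arrays.asList(children);
    }

    /** Renders the node as label(child child ...); a word renders as itself. */
    String toBracketed() {
        if (children.isEmpty()) {
            return text;
        }
        String inner = children.stream()
                .map(BracketedTree::toBracketed)
                .collect(Collectors.joining(" "));
        return text + "(" + inner + ")";
    }

    /** The hand-built tree for "The sky is blue." from the drawing above. */
    static BracketedTree skyExample() {
        return new BracketedTree("S",
                new BracketedTree("NP", new BracketedTree("The"), new BracketedTree("sky")),
                new BracketedTree("VP", new BracketedTree("is"), new BracketedTree("blue.")));
    }

    public static void main(String[] args) {
        System.out.println(skyExample().toBracketed()); // S(NP(The sky) VP(is blue.))
    }
}
```

A real parser would also tag each word with its part of speech, so its output has more nesting than this hand-built tree.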

I am trying to get these parenthesized strings from sentences using OpenNLP, but I can't get the example code to work.

In particular, I am following along the last part of this tutorial and my code gets stuck at initializing ParserModel.

I have downloaded the appropriate binaries from here and added opennlp-tools-1.5.3.jar (which includes classes for all of the following objects) as a library to my IntelliJ project. I also moved en-parser-chunking.bin to my working directory ("user.dir").

The following code should give me a parse tree, but it hangs indefinitely while creating the ParserModel object.

    InputStream is = new FileInputStream("en-parser-chunking.bin");
    ParserModel model = new ParserModel(is);
    Parser parser = ParserFactory.create(model);
    String sentence = "The sky is blue.";
    Parse topParses[] = ParserTool.parseLine(sentence, parser, 1);
    for (Parse p : topParses)
        p.show();
    is.close();

It's my first day of working with OpenNLP, but I can't even get this simple example to work.

2 answers
Juvenile、少年°
#2 · 2019-04-10 18:15

Your model file might be corrupted. Try downloading it again and using the fresh copy. If that doesn't help, run kill -QUIT <pid> (on Linux) while the process hangs to get a stack trace, or attach a debugger.
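
A rough sketch of both checks in plain Java, in case it helps. It assumes the OpenNLP model file is a zip archive (which is how OpenNLP packages its models, as far as I know), so a truncated download should fail to open with `java.util.zip.ZipFile`. The second method collects every thread's stack from inside the JVM, roughly what `kill -QUIT <pid>` shows from outside. `HangDiagnostics` and its method names are hypothetical, not part of OpenNLP.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.zip.ZipFile;

/** Hypothetical diagnostics helper; nothing here comes from OpenNLP itself. */
class HangDiagnostics {

    /** Returns true if the file opens as a zip archive with at least one entry. */
    static boolean looksLikeIntactZip(Path file) {
        try (ZipFile zip = new ZipFile(file.toFile())) {
            return zip.size() > 0;
        } catch (IOException e) {
            // Missing, empty, or truncated files end up here (ZipException is an IOException).
            return false;
        }
    }

    /** Formats the current stack of every live thread in this JVM. */
    static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            sb.append('"').append(t.getName()).append("\" state=").append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Default file name is the one from the question; pass another path as the first argument.
        Path model = Paths.get(args.length > 0 ? args[0] : "en-parser-chunking.bin");
        System.out.println(model + " opens as zip: " + looksLikeIntactZip(model));
        System.out.println(dumpAllThreads());
    }
}
```

Since the main thread is the one that hangs, `dumpAllThreads()` would have to be called from a separate watchdog thread started before the model is loaded.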

【Aperson】
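#3 · 2019-04-10 18:23

Try this:

    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStream;

    import opennlp.tools.cmdline.parser.ParserTool;
    import opennlp.tools.parser.Parse;
    import opennlp.tools.parser.Parser;
    import opennlp.tools.parser.ParserFactory;
    import opennlp.tools.parser.ParserModel;
    import opennlp.tools.util.InvalidFormatException;

    public static void parse() throws InvalidFormatException, IOException {
        // http://sourceforge.net/apps/mediawiki/opennlp/index.php?title=Parser#Training_Tool
        // try-with-resources closes the model stream even if loading fails
        try (InputStream is = new FileInputStream("en-parser-chunking.bin")) {
            ParserModel model = new ParserModel(is);
            Parser parser = ParserFactory.create(model);

            String sentence = "Programcreek is a very huge and useful website.";
            // Ask for only the single best parse
            Parse[] topParses = ParserTool.parseLine(sentence, parser, 1);

            for (Parse p : topParses)
                p.show(); // prints the bracketed parse to standard output
        }

        /*
         * (TOP (S (NP (NN Programcreek) ) (VP (VBZ is) (NP (DT a) (ADJP (RB
         * very) (JJ huge) (CC and) (JJ useful) ) ) ) (. website.) ) )
         */
    }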
