OpenNLP is an Apache project for natural language processing. One of the aims of an NLP program is to parse a sentence into a tree of its grammatical structure. For example, the sentence "The sky is blue." might be parsed as
       S
     /   \
   NP     VP
  /  \    |  \
The  sky  is  blue.
where S is Sentence, NP is Noun-phrase, and VP is Verb-phrase. Equivalently, the above tree can be written as a parenthesized string like this: S(NP(The sky) VP(is blue.))
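To make the tree-to-string correspondence concrete, here is a minimal sketch in plain Java. The Node class and the bracket method are hypothetical names made up for illustration; they are not part of OpenNLP, whose own Parse class would normally do this work.

```java
import java.util.ArrayList;
import java.util.List;

public class BracketTree {
    // Hypothetical minimal tree node for illustration; not an OpenNLP class.
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) {
                children.add(k);
            }
        }

        boolean isLeaf() {
            return children.isEmpty();
        }
    }

    // Leaves print as their word; inner nodes print as LABEL(child child ...).
    static String bracket(Node n) {
        if (n.isLeaf()) {
            return n.label;
        }
        StringBuilder sb = new StringBuilder(n.label).append('(');
        for (int i = 0; i < n.children.size(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(bracket(n.children.get(i)));
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        Node tree = new Node("S",
                new Node("NP", new Node("The"), new Node("sky")),
                new Node("VP", new Node("is"), new Node("blue.")));
        System.out.println(bracket(tree)); // prints S(NP(The sky) VP(is blue.))
    }
}
```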
I am trying to get the parenthesized strings from sentences using OpenNLP, but I can't get the example code to work.
In particular, I am following the last part of this tutorial, and my code gets stuck at initializing ParserModel.
I have downloaded the appropriate binaries from here and added opennlp-tools-1.5.3.jar (which includes classes for all of the following objects) as a library to my IntelliJ project. Also, I moved en-parser-chunking.bin to my "user.dir" directory.
The following is the code that should give me a parse tree, but it hangs indefinitely while creating the ParserModel object.
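import java.io.FileInputStream;
import java.io.InputStream;
import opennlp.tools.cmdline.parser.ParserTool;
import opennlp.tools.parser.Parse;
import opennlp.tools.parser.Parser;
import opennlp.tools.parser.ParserFactory;
import opennlp.tools.parser.ParserModel;

// try-with-resources closes the stream even if loading the model throws
try (InputStream is = new FileInputStream("en-parser-chunking.bin")) {
    ParserModel model = new ParserModel(is);  // execution hangs here
    Parser parser = ParserFactory.create(model);
    String sentence = "The sky is blue.";
    Parse[] topParses = ParserTool.parseLine(sentence, parser, 1);
    for (Parse p : topParses)
        p.show();
}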
It's my first day of working with OpenNLP, but I can't even get this simple example to work.
Your model file might be damaged. Try downloading it again and using the fresh copy. If that doesn't help, run kill -QUIT <pid> (under Linux) while the process hangs to get a stack trace, or use a debugger.
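If sending a signal is inconvenient (for example when running from an IDE), a similar stack dump can be produced from inside the JVM. The following is only a sketch under stated assumptions: the class name StackDump, the method names, and the 10-second delay are made up for illustration, and only standard java.lang calls are used. A daemon thread waits for the given delay and, if it has not been interrupted by then, prints the stack of every live thread, which shows where the main thread is stuck.

```java
import java.util.Map;

public class StackDump {
    // Collects the current stack trace of every live thread as text.
    static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            sb.append('"').append(t.getName()).append("\" state=").append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    // Starts a daemon thread that prints one dump to stderr after delayMillis,
    // unless it is interrupted first.
    static Thread startWatchdog(long delayMillis) {
        Thread w = new Thread(() -> {
            try {
                Thread.sleep(delayMillis);
                System.err.println(dumpAllThreads());
            } catch (InterruptedException ignored) {
                // Interrupted: the slow call finished in time, so no dump is needed.
            }
        }, "watchdog");
        w.setDaemon(true);
        w.start();
        return w;
    }

    public static void main(String[] args) {
        Thread watchdog = startWatchdog(10_000); // assumed timeout; adjust as needed
        // ... the call that hangs, e.g. new ParserModel(is), would go here ...
        watchdog.interrupt(); // reached only if the call returned in time
    }
}
```

The frames listed under the "main" thread in the dump should indicate whether the time is being spent reading the file, decompressing it, or somewhere else entirely.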