I am working on an NLP project on Windows. Whenever I run Stanford CoreNLP from the command prompt, it takes about 14-15 seconds to generate the XML output for the given input text file. I think this is because the library takes a long time to load. Could somebody please explain what the problem is and how I can resolve it? This delay is a serious problem for my project.
Answer 1:
Stanford CoreNLP uses large model files of parameters for its various components, and yes, they take a long time to load. What you want to do is start the program only once and then feed it lots of text.
How you do that depends on what you are doing:
- You can pass a -filelist to the command-line version to process a whole bunch of files at once.
- You can leave one StanfordCoreNLP object running, and send files to it and get the output back using the API.
- Depending on what NLP processing you need, you may also be able to speed start-up a lot by not loading models you are not using. See the "annotators" property.
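For the first option, the list of input files has to exist before CoreNLP is started. The following is a minimal sketch, using only the Java standard library, of one way to generate it. The class name FileListWriter, the ".txt" filter, and the default name "filelist.txt" are assumptions made for illustration; they are not part of CoreNLP.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper (not part of CoreNLP): writes one input path per line.
public class FileListWriter {

    /**
     * Collects the .txt files directly inside inputDir (non-recursive) and
     * writes their absolute paths, one per line, to listFile.
     * The list file itself is skipped in case it lives in the same directory.
     */
    public static List<String> writeFileList(Path inputDir, Path listFile) throws IOException {
        Path listPath = listFile.toAbsolutePath().normalize();
        List<String> paths;
        try (Stream<Path> entries = Files.list(inputDir)) {
            paths = entries
                    .filter(Files::isRegularFile)
                    .map(p -> p.toAbsolutePath().normalize())
                    .filter(p -> p.getFileName().toString().endsWith(".txt"))
                    .filter(p -> !p.equals(listPath))
                    .map(Path::toString)
                    .sorted()
                    .collect(Collectors.toList());
        }
        Files.write(listFile, paths, StandardCharsets.UTF_8);
        return paths;
    }

    public static void main(String[] args) throws IOException {
        // Assumed defaults: current directory in, "filelist.txt" out.
        Path inputDir = Paths.get(args.length > 0 ? args[0] : ".");
        Path listFile = Paths.get(args.length > 1 ? args[1] : "filelist.txt");
        List<String> written = writeFileList(inputDir, listFile);
        System.out.println("Wrote " + written.size() + " paths to " + listFile);
    }
}
```

The resulting file can then be handed to the command-line version through the -filelist option, so the models are loaded once for the whole batch rather than once per file.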
Update 2016: There is now more information on this on the documentation page "Understanding memory and time usage".
Answer 2:
Christopher is correct. Here is one possible solution:
import java.util.Properties;

import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;

public class SentimentAnalyzer {

    private StanfordCoreNLP pipeline;

    // Loads the models. This is the slow step, so call it only once.
    public void initializeCoreNLP() {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, parse, sentiment");
        pipeline = new StanfordCoreNLP(props);
    }

    // T is a placeholder for whatever result type you choose to return.
    public T getSentiment(String text) {
        ...
        Annotation annotation = new Annotation(text);
        pipeline.annotate(annotation); // reuses the already loaded models
        ...
        return ...
    }

    public static void main(String[] argv) {
        SentimentAnalyzer sentimentAnalyzer = new SentimentAnalyzer();
        sentimentAnalyzer.initializeCoreNLP(); // run this only once
        T t = sentimentAnalyzer.getSentiment("put text here..."); // run this multiple times
    }
}
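If several callers share the analyzer, the "run this only once" rule can be enforced in code rather than left to convention. Below is a minimal sketch using only the Java standard library. LoadOnce is a hypothetical helper, not part of CoreNLP, and the string it holds in main is a stand-in; in practice the supplier might be something like () -> new StanfordCoreNLP(props).

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical helper (not part of CoreNLP): runs an expensive loader at most
// once and hands the same instance to every later caller.
public class LoadOnce<T> {

    private final Supplier<T> loader;
    private volatile T value; // volatile so other threads see the loaded instance

    public LoadOnce(Supplier<T> loader) {
        this.loader = loader;
    }

    // Double-checked locking: only the first call pays the loading cost.
    // Assumes the loader never returns null.
    public T get() {
        T result = value;
        if (result == null) {
            synchronized (this) {
                result = value;
                if (result == null) {
                    result = loader.get();
                    value = result;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        AtomicInteger loads = new AtomicInteger();
        LoadOnce<String> pipeline = new LoadOnce<>(() -> {
            loads.incrementAndGet(); // counts how often the "model" is loaded
            return "stand-in for an expensive model";
        });
        for (int i = 0; i < 3; i++) {
            pipeline.get(); // three uses, one load
        }
        System.out.println("loads = " + loads.get()); // prints "loads = 1"
    }
}
```

Loading lazily on first use also means a program that never needs the pipeline never pays for it. The trade-off is that the first request is the slow one.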
Answer 3:
To see how to use the API, check the example code "NERDemo.java" in the downloaded CoreNLP folder.