Stanford CoreNLP very slow

Posted 2019-09-15 14:21

I am working on an NLP project on Windows. Whenever I run Stanford CoreNLP from the command prompt, it takes about 14-15 seconds to generate the XML output for the given input text file. I think this is because the library takes a long time to load. Can somebody please explain what the problem is and how I can resolve it? This delay is a big issue for my project.

3 answers
霸刀☆藐视天下
#2 · 2019-09-15 14:54

Stanford CoreNLP uses large model files of parameters for its various components, and yes, they take a long time to load. What you want to do is start the program only once and then feed it lots of text.

How you do that depends on what you are doing:

  • You can pass a -filelist to the command-line version to process a whole bunch of files at once.
  • You can leave one StanfordCoreNLP object running, and send files to it and get the output back using the API.
  • Depending on what NLP processing you need, you may also be able to speed up start-up a lot by not loading models you are not using. See the "annotators" property.

Update 2016: There is now more information on this on the documentation page "Understanding memory and time usage".
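
As a rough illustration of the first option above, here is a minimal sketch that collects the .txt files in a directory into a file list and prints a command that would process them all in one run, so the models load only once. It uses only the JDK. The class name FileListSketch, the output name filelist.txt, the example annotator list, and the `-cp "*"` classpath (which assumes you run the command from the CoreNLP distribution directory) are assumptions for illustration, not something from the original answer.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.stream.Stream;

// Hypothetical helper: builds a file list for a single batch run of CoreNLP.
public class FileListSketch {

    // Keep only .txt paths (case-insensitive), sorted for a reproducible order.
    static List<String> selectTextFiles(List<String> paths) {
        List<String> selected = new ArrayList<>();
        for (String p : paths) {
            if (p.toLowerCase(Locale.ROOT).endsWith(".txt")) {
                selected.add(p);
            }
        }
        Collections.sort(selected);
        return selected;
    }

    // The file list is plain text with one input path per line.
    static String fileListContents(List<String> paths) {
        return paths.isEmpty() ? "" : String.join("\n", paths) + "\n";
    }

    // Assumed invocation: classpath "*" only works from the CoreNLP directory.
    static String buildCommand(String annotators, String fileListPath) {
        return "java -cp \"*\" edu.stanford.nlp.pipeline.StanfordCoreNLP"
                + " -annotators " + annotators
                + " -filelist " + fileListPath;
    }

    public static void main(String[] args) throws IOException {
        // Directory to scan; defaults to the current directory.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        List<String> found = new ArrayList<>();
        try (Stream<Path> entries = Files.list(dir)) {
            entries.filter(Files::isRegularFile).forEach(p -> found.add(p.toString()));
        }
        Path listFile = Paths.get("filelist.txt");
        String contents = fileListContents(selectTextFiles(found));
        Files.write(listFile, contents.getBytes(StandardCharsets.UTF_8));
        // Print the command instead of running it, so it can be checked first.
        System.out.println(buildCommand("tokenize,ssplit,pos", listFile.toString()));
    }
}
```

Listing only the annotators you actually need in the printed command also covers the third option, since models for unused annotators are then never loaded.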

叼着烟拽天下
#3 · 2019-09-15 15:09

Christopher is correct. Here is one possible solution:

import java.util.Properties;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;

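// Sketch only: T and the "..." parts are placeholders to fill in for your own use.
public class SentimentAnalyzer {
    // The pipeline holds the loaded models, so keep one instance and reuse it.
    private StanfordCoreNLP pipeline;

    // Loading the models is the slow step; call this only once.
    public void initializeCoreNLP() {
        Properties props = new Properties();
        // List only the annotators you need; fewer annotators means fewer models to load.
        props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, parse, sentiment");
        pipeline = new StanfordCoreNLP(props);
    }

    // Cheap by comparison: reuses the already-loaded pipeline for each text.
    public T getSentiment(String text) {
        ...
        Annotation annotation = new Annotation(text);
        pipeline.annotate(annotation);
        ...
        return ...
    }

    public static void main(String[] argv) {
        SentimentAnalyzer sentimentAnalyzer = new SentimentAnalyzer();
        sentimentAnalyzer.initializeCoreNLP(); // run this only once
        T t = sentimentAnalyzer.getSentiment("put text here..."); // run this multiple times
    }
}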
一纸荒年 Trace。
#4 · 2019-09-15 15:11

To see how to use the API, check the example code in "NERDemo.java", which is in the downloaded CoreNLP folder.
