Adding a new annotator in Stanford CoreNLP

Posted 2019-06-11 06:32

I am trying to add a new annotator in Stanford CoreNLP according to the instructions in http://nlp.stanford.edu/downloads/corenlp.shtml.

"Adding a new annotator: StanfordCoreNLP also has the capacity to add a new annotator by reflection without altering the code in StanfordCoreNLP.java. To create a new annotator, extend the class edu.stanford.nlp.pipeline.Annotator and define a constructor with the signature (String, Properties). Then, add the property customAnnotatorClass.FOO=BAR to the properties used to create the pipeline. If FOO is then added to the list of annotators, the class BAR will be created, with the name used to create it and the properties file passed in."

I have created a new class for my new annotator, but I cannot work out how to pass the properties file in to it. So far I have only added the new annotator to the pipeline:

props.put("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref, regexner, color");
props.setProperty("customAnnotatorClass.color", "myPackage.myPipeline");

Is there any example code to help me?

1 answer
Melony?
#2 · 2019-06-11 07:08

You can have mine, if you like. The interesting stuff starts at // adding our own annotator property:

/** Annotates a document with our customized pipeline.
 * @param text A text to process
 * @return The annotated text
 */
private Annotation annotateText(String text) {
    Annotation doc = new Annotation(text);

    // Configure a StanfordCoreNLP pipeline with tokenization, sentence
    // splitting, POS tagging, lemmatization, NER, parsing, and our custom
    // annotator (coreference resolution is not requested below)
    Properties props = new Properties();
    // alternative POS tagger model: wsj-bidirectional
    props.setProperty(
            "pos.model",
            "edu/stanford/nlp/models/pos-tagger/wsj-bidirectional/wsj-0-18-bidirectional-distsim.tagger");

    // adding our own annotator property
    props.setProperty("customAnnotatorClass.sdclassifier",
            "edu.kit.ipd.alicenlp.ivan.analyzers.StaticDynamicClassifier");

    // configure pipeline: the custom annotator name is listed last,
    // after the annotators whose output it relies on
    props.setProperty(
            "annotators",
            "tokenize, ssplit, pos, lemma, ner, parse, sdclassifier");
    StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

    pipeline.annotate(doc);
    return doc;
}
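
On the "how do I pass the properties in" part: going by the documentation quoted in the question, the pipeline instantiates your class by reflection and hands it the annotator name plus the same Properties object you built the pipeline with, so any extra settings can simply be added to props. Below is a rough, self-contained sketch of that mechanism using only the JDK. It is not CoreNLP's actual code: the nested Annotator interface is a simplified stand-in for edu.stanford.nlp.pipeline.Annotator, and ColorAnnotator, the create helper, and the color.marker property are hypothetical names made up for the illustration.

```java
import java.lang.reflect.Constructor;
import java.util.Properties;

class CustomAnnotatorDemo {

    /** Simplified stand-in for edu.stanford.nlp.pipeline.Annotator (assumption). */
    public interface Annotator {
        String annotate(String text);
    }

    /** Hypothetical annotator with the (String, Properties) constructor. */
    public static class ColorAnnotator implements Annotator {
        private final String name;
        private final String marker;

        public ColorAnnotator(String name, Properties props) {
            this.name = name;
            // Custom settings travel in the same Properties object as the
            // pipeline configuration; "color.marker" is an invented key.
            this.marker = props.getProperty(name + ".marker", "COLOR");
        }

        @Override
        public String annotate(String text) {
            return text + " [" + name + ":" + marker + "]";
        }
    }

    /** Looks up customAnnotatorClass.<name> and instantiates it by reflection. */
    static Annotator create(String name, Properties props)
            throws ReflectiveOperationException {
        String className = props.getProperty("customAnnotatorClass." + name);
        if (className == null) {
            throw new IllegalArgumentException(
                    "No customAnnotatorClass." + name + " property set");
        }
        Constructor<?> ctor = Class.forName(className)
                .getConstructor(String.class, Properties.class);
        return (Annotator) ctor.newInstance(name, props);
    }

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.setProperty("customAnnotatorClass.color",
                ColorAnnotator.class.getName());
        props.setProperty("color.marker", "red");

        Annotator annotator = create("color", props);
        System.out.println(annotator.annotate("sky")); // prints "sky [color:red]"
    }
}
```

In a real pipeline you would not write create yourself; the point is only that the value of customAnnotatorClass.color has to be the fully qualified name of a class with a public (String, Properties) constructor, and that the constructor can read whatever else you put into props.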