French dependency parsing using CoreNLP

Posted 2019-09-09 12:07

Question:

I am following the example in this link and have downloaded the French models jar from here. I run it as follows:

java -mx1g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLP -props StanfordCoreNLP-french.properties -annotators tokenize,ssplit,pos,depparse -file french.txt -outputFormat conllu

It always loads an English dependency parser model instead of the French one:

Loading depparse model file: edu/stanford/nlp/models/parser/nndep/english_UD.gz ... PreComputed 100000, Elapsed Time: 1.341 (s)

Is this a bug?

Answer 1:

Update: I found that the default properties file does not specify a depparse model. I now pass my own config file, which sets depparse.model explicitly, and it works:

annotators = tokenize, ssplit, pos, depparse, parse

tokenize.language = fr

pos.model = edu/stanford/nlp/models/pos-tagger/french/french.tagger

parse.model = edu/stanford/nlp/models/lexparser/frenchFactored.ser.gz

depparse.model = edu/stanford/nlp/models/parser/nndep/UD_French.gz
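The same settings can also be built in code instead of a file. The following is only a minimal sketch using the standard java.util.Properties class: the class name FrenchDepparseProps and the method build() are hypothetical names chosen for illustration, and the keys and model paths are copied from the config above without checking that they exist in any particular models jar.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper (not part of CoreNLP): holds the config from the answer above.
class FrenchDepparseProps {

    // Keys and values copied verbatim from the answer's config file.
    static final String CONFIG = String.join("\n",
        "annotators = tokenize, ssplit, pos, depparse, parse",
        "tokenize.language = fr",
        "pos.model = edu/stanford/nlp/models/pos-tagger/french/french.tagger",
        "parse.model = edu/stanford/nlp/models/lexparser/frenchFactored.ser.gz",
        "depparse.model = edu/stanford/nlp/models/parser/nndep/UD_French.gz");

    // Parses the text above in the same key = value format a .properties file uses.
    static Properties build() {
        Properties props = new Properties();
        try {
            props.load(new StringReader(CONFIG));
        } catch (IOException e) {
            // Reading from an in-memory string should not fail; rethrow unchecked just in case.
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = build();
        // The key the default French properties file was missing, per the answer above.
        System.out.println("depparse.model = " + props.getProperty("depparse.model"));
    }
}
```

With the CoreNLP and French models jars on the classpath, the resulting Properties object can be passed to the StanfordCoreNLP constructor, or the same lines can be saved to a file and given to -props on the command line. If only dependency parses are needed, the parse annotator and parse.model line are presumably optional, since depparse runs on top of tokenize, ssplit and pos.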