After training a model, I’m trying to parse the test treebank. Unfortunately, this error keeps popping up:
Loading depparse model file: nndep.model.txt.gz ...
###################
#Transitions: 77
#Labels: 38
ROOTLABEL: root
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 25
at edu.stanford.nlp.parser.nndep.Classifier.preCompute(Classifier.java:663)
at edu.stanford.nlp.parser.nndep.Classifier.preCompute(Classifier.java:637)
at edu.stanford.nlp.parser.nndep.DependencyParser.initialize(DependencyParser.java:1151)
at edu.stanford.nlp.parser.nndep.DependencyParser.loadModelFile(DependencyParser.java:589)
at edu.stanford.nlp.parser.nndep.DependencyParser.loadModelFile(DependencyParser.java:493)
at edu.stanford.nlp.parser.nndep.DependencyParser.main(DependencyParser.java:1245)
If the pre-trained English model that ships with the NLP package is used, the error does not appear, so perhaps something is wrong with my trained model? There were no errors during training, however. I ran 500 iterations (the default 20000 takes over 15 hours on my 2.33 GHz Core 2 Duo CPU with 4 GB RAM – is that amount of time normal, by the way?). Train, dev and test sets are from UD 1.2; the word embeddings used are these. The error seems to happen whenever a non-English treebank is used for training (tried Swedish and Polish UD; the -tlp option is not set, so UniversalEnglish is used).
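For reference, this is roughly how training and parsing were invoked (a sketch with placeholder file names; -maxIter 500 and -embeddingSize 25 match the setup described above):

    # Training: completes without errors.
    java -cp stanford-corenlp.jar edu.stanford.nlp.parser.nndep.DependencyParser \
      -trainFile sv-ud-train.conll -devFile sv-ud-dev.conll \
      -embedFile embeddings.txt -embeddingSize 25 \
      -maxIter 500 -model nndep.model.txt.gz

    # Parsing the test treebank: throws the exception above.
    java -cp stanford-corenlp.jar edu.stanford.nlp.parser.nndep.DependencyParser \
      -model nndep.model.txt.gz -testFile sv-ud-test.conll -outFile test-output.conll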
Answering my own question, with a hint from a comment by @Jon Gauthier: it turns out that the -embeddingSize flag is also needed at the parsing stage if it was used during training (i.e., if a value other than the default 50 was used). The documentation never says so, and in fact only mentions the flag in connection with training, but the error message in the question actually hints, rather cryptically, at the origin of the problem: the out-of-bounds index is 25, which was the dimensionality of the word embeddings used.