
Issues in Gensim WordRank Embeddings

Published 2019-08-29 16:46

Question:

I am using the Gensim wrapper to obtain WordRank embeddings (I am following their tutorial) as follows.

from gensim.models.wrappers import Wordrank

model = Wordrank.train(wr_path="models", corpus_file="proc_brown_corp.txt",
                       out_name="wr_model")

model.save("wordrank")
model.save_word2vec_format("wordrank_in_word2vec.vec")

However, I am getting the following error: FileNotFoundError: [WinError 2] The system cannot find the file specified. I am wondering what I have done wrong, as everything looks correct to me. Please help me.

Moreover, I want to know whether the way I am saving the model is correct. I saw that Gensim offers the method save_word2vec_format. What is the advantage of using it instead of using the original WordRank model directly?

Answer 1:

FileNotFoundError: [WinError 2] The system cannot find the file specified.

I am going to assume that you got the traceback on this call:

model = Wordrank.train(wr_path="models", corpus_file="proc_brown_corp.txt",
                       out_name="wr_model")

The wr_path argument is supposed to point to where you have WordRank installed; more specifically, it is the path to the folder where your WordRank binary is saved. In your call it is set to "models", which is probably not that folder.

In my case it was path_to_wordrank_binary = '/home/ubuntu/wordrank', where wordrank is the folder that contains wordrank.cpp.

Then make sure that your corpus file is in the current working directory, since you passed it as a bare file name.
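A quick way to check both points before calling Wordrank.train is a small path check like the sketch below. The helper name check_wordrank_inputs is made up for illustration and is not part of Gensim. It only verifies that the two paths exist; it does not verify that WordRank was actually built inside wr_path.

```python
import os


def check_wordrank_inputs(wr_path, corpus_file):
    """Return a list of problems found; an empty list means both paths exist.

    Hypothetical helper, not a Gensim function.
    """
    problems = []
    # wr_path should be the folder that holds the WordRank install.
    if not os.path.isdir(wr_path):
        problems.append("wr_path is not an existing directory: "
                        + os.path.abspath(wr_path))
    # A relative corpus_file is resolved against the current working directory.
    if not os.path.isfile(corpus_file):
        problems.append("corpus_file not found: "
                        + os.path.abspath(corpus_file))
    return problems


if __name__ == "__main__":
    # Values taken from the question; swap in your own WordRank folder.
    for problem in check_wordrank_inputs("models", "proc_brown_corp.txt"):
        print(problem)
```

If this prints nothing, both paths resolve from the directory you are running in. The error may then come from the WordRank binaries themselves being missing or not built under wr_path.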

This is the tutorial you should be looking into.