I was following the tutorial at http://radimrehurek.com/2014/03/tutorial-on-mallet-in-python/, but I got an error when I tried to train the model:
model = models.LdaMallet(mallet_path, corpus, num_topics=10, id2word=corpus.dictionary)
IOError: [Errno 2] No such file or directory: 'c:\\users\\brlu\\appdata\\local\\temp\\c6a13a_state.mallet.gz'
Please share any thoughts you might have.
Thanks.
This can happen for two reasons:
1. There is a space in your MALLET path.
2. The MALLET_HOME environment variable is not set.
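As a rough way to check both causes before training, something like the sketch below might help. The helper name diagnose_mallet_setup and its environ parameter are hypothetical (they are not part of gensim), and the example path is only an illustration.

```python
import os


def diagnose_mallet_setup(mallet_path, environ=None):
    """Return a list of warnings for the two causes listed above.

    Hypothetical helper, not part of gensim. `environ` defaults to
    os.environ and can be replaced with a plain dict for testing.
    """
    environ = os.environ if environ is None else environ
    problems = []
    # Cause 1: a space anywhere in the path to the mallet executable
    if " " in mallet_path:
        problems.append("mallet_path contains a space: %r" % mallet_path)
    # Cause 2: MALLET_HOME missing or empty
    if not environ.get("MALLET_HOME"):
        problems.append("MALLET_HOME is not set")
    return problems


# Example path is an assumption; substitute your own installation path.
for problem in diagnose_mallet_setup("C:/Program Files/mallet/bin/mallet"):
    print(problem)
```

An empty list means neither of the two causes applies, so the error would have to come from somewhere else.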
In my case I had forgotten to import gensim's MALLET wrapper. The following code resolved the error:
import os
from gensim.models.wrappers import LdaMallet
os.environ['MALLET_HOME'] = 'C:/.../mallet-2.0.8/'
A more detailed explanation can be found here:
https://github.com/RaRe-Technologies/gensim/issues/2137
I had similar problems with gensim + MALLET on Windows:
- Make sure that MALLET_HOME is set
- Escape backslashes when setting mallet_path in Python:
mallet_path = 'c:\\mallet-2.0.7\\bin\\mallet'
LDA_model = gensim.models.LdaMallet(mallet_path, ...
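To see why the escaping matters, here is a small sketch using the same path as above. A raw string is an alternative spelling of the same value; without either form, Python reads the \b in \bin as a backspace character.

```python
# Both spellings produce the same string
escaped = 'c:\\mallet-2.0.7\\bin\\mallet'
raw = r'c:\mallet-2.0.7\bin\mallet'
print(escaped == raw)  # True

# Unescaped, '\b' is a single backspace character, so the path is corrupted
print(len('\bin'), len('\\bin'))  # 3 4
```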
Also, it might help to modify line 142 in Python\Lib\site-packages\gensim\models\ldamallet.py, since the Windows command shell does not treat single quotes as quoting characters:
change --token-regex '\S+'
to --token-regex \"\S+\"
Hope it helps