Error when implementing gensim.LdaMallet

Posted 2019-04-10 04:18

Question:

I was following the tutorial at http://radimrehurek.com/2014/03/tutorial-on-mallet-in-python/, but I got an error when I tried to train the model:

    model = models.LdaMallet(mallet_path, corpus, num_topics =10, id2word = corpus.dictionary)
    IOError: [Errno 2] No such file or directory: 'c:\\users\\brlu\\appdata\\local\\temp\\c6a13a_state.mallet.gz'

Please share any thoughts you might have.

Thanks.

Answer 1:

This can happen for two reasons:

  1. There is a space in your mallet path.
  2. The MALLET_HOME environment variable is not set.
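
Both conditions can be checked from Python before training. The sketch below is only an illustration: `check_mallet_setup` is a hypothetical helper (not part of gensim), and the path passed to it is a made-up example.

```python
import os

def check_mallet_setup(mallet_path, environ=None):
    """Return a list of problems found with the MALLET setup (empty if none)."""
    environ = os.environ if environ is None else environ
    problems = []
    # Reason 1: a space somewhere in the path to the mallet executable.
    if " " in mallet_path:
        problems.append("mallet path contains a space: %r" % mallet_path)
    # Reason 2: MALLET_HOME is missing or empty.
    if not environ.get("MALLET_HOME"):
        problems.append("MALLET_HOME environment variable is not set")
    return problems

# Hypothetical install location; substitute your own path.
for problem in check_mallet_setup(r"C:\Program Files\mallet\bin\mallet"):
    print(problem)
```

If the list comes back empty, neither of the two reasons applies and the cause is probably elsewhere.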



Answer 2:

  1. Make sure that mallet works properly from the command line.
  2. Look in your temp folder 'c:\users\brlu\appdata\local\temp\...'. If some files are there, you can deduce at which step the mallet wrapper fails. Try that step at the command line.
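
For the second step, one quick way to see what the wrapper left behind is to list the temp folder from Python. This is a rough sketch: `list_mallet_temp_files` is a hypothetical helper, and matching on '.mallet' in the file name is an assumption based on the 'c6a13a_state.mallet.gz' name in the error message.

```python
import os
import tempfile

def list_mallet_temp_files(temp_dir=None):
    """Return the sorted names of files in temp_dir that look MALLET-related."""
    # Default to the directory Python itself uses for temporary files.
    temp_dir = tempfile.gettempdir() if temp_dir is None else temp_dir
    # Assumption: MALLET-related files contain '.mallet' in their name.
    return sorted(name for name in os.listdir(temp_dir) if ".mallet" in name)

for name in list_mallet_temp_files():
    print(name)
```

An empty result suggests that the failure happens before any MALLET file is written.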


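Answer 3:

In my case I forgot to import gensim's mallet wrapper. The following code resolved the error:

    import os
    # Import the wrapper class explicitly
    from gensim.models.wrappers import LdaMallet

    # Point MALLET_HOME at the MALLET install directory before training
    os.environ['MALLET_HOME'] = 'C:/.../mallet-2.0.8/'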

A more detailed explanation can be found here: https://github.com/RaRe-Technologies/gensim/issues/2137



Answer 4:

I had similar problems with gensim + MALLET on Windows:

  1. Make sure that MALLET_HOME is set.
  2. Escape the backslashes when you set mallet_path in Python:

    mallet_path = 'c:\\mallet-2.0.7\\bin\\mallet'
    LDA_model = gensim.models.LdaMallet(mallet_path, ...
    
  3. It might also help to modify line 142 of Python\Lib\site-packages\gensim\models\ldamallet.py: change --token-regex '\S+' to --token-regex \"\S+\"

Hope it helps
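
On the backslash point, a raw string is an alternative to doubling every backslash. A small sketch using the path from this answer:

```python
# Both spellings produce the same string; the raw form avoids the doubling.
escaped = 'c:\\mallet-2.0.7\\bin\\mallet'
raw = r'c:\mallet-2.0.7\bin\mallet'
print(escaped == raw)  # True

# Why it matters: in a normal string '\b' is a backspace character,
# so an unescaped '\bin' silently loses its backslash.
unescaped_fragment = '\bin'
print(len(unescaped_fragment), len(r'\bin'))  # 3 4
```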



Answer 5:

Try the following:

    import tempfile
    tempfile.tempdir = 'some_other_non_system_temp_directory'
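
A slightly fuller sketch of the same idea, assuming the wrapper takes its temp location from the `tempfile` module (as this answer implies). `use_temp_dir` is a hypothetical helper, and the directory in the commented usage line is a placeholder.

```python
import os
import tempfile

def use_temp_dir(path):
    """Create `path` if needed and make it the default for the tempfile module."""
    os.makedirs(path, exist_ok=True)
    tempfile.tempdir = path
    # gettempdir() now reports the directory that was just set.
    return tempfile.gettempdir()

# Hypothetical usage; run this before creating the model:
# use_temp_dir('some_other_non_system_temp_directory')
```

Picking a directory without spaces in its path also avoids the first cause mentioned in Answer 1.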