-->

Cannot run Mallet TopicModel

Posted 2019-08-23 06:41

Question:

I am trying to run MALLET's topic modelling but got the following error:

Couldn't open cc.mallet.util.MalletLogger resources/logging.properties file.
Perhaps the 'resources' directories weren't copied into the 'class' directory.
Continuing.
Exception in thread "main" java.lang.IllegalArgumentException: Trouble reading file stoplists\en.txt
    at cc.mallet.pipe.TokenSequenceRemoveStopwords.fileToStringArray(TokenSequenceRemoveStopwords.java:144)
    at cc.mallet.pipe.TokenSequenceRemoveStopwords.<init>(TokenSequenceRemoveStopwords.java:73)
    at LDA.TopicModel.main(TopicModel.java:23)

I have already added all the jar files. Could you please advise what the problem is here?

Thanks,

Answer 1:

I also got the first message (the logging.properties warning), but MALLET is able to continue past it.

But the actual exception that stops you seems to be that you don't have the MALLET stop words list in the right place. I downloaded their en.txt stopwords list to a specific location and gave it a direct path instead of "stoplists/en.txt", which worked.
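To illustrate the direct-path approach, here is a minimal sketch that resolves the stoplist path up front and reports the full location if the file cannot be read. A relative path such as "stoplists/en.txt" is resolved against the JVM's working directory, which is often not the MALLET install folder when running from an IDE. The class name StoplistLocator and the method resolveStoplist are hypothetical helpers, not part of MALLET; the commented-out constructor call assumes the five-argument form of TokenSequenceRemoveStopwords shown in MALLET's own topic modelling example.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not part of MALLET): checks the stoplist location
// before it is handed to the MALLET pipe.
public class StoplistLocator {

    /**
     * Resolves a stoplist path to an absolute file and fails early, with the
     * full path in the message, if the file cannot be read.
     */
    public static File resolveStoplist(String path) {
        // Relative paths are resolved against the JVM working directory (user.dir).
        Path absolute = Paths.get(path).toAbsolutePath().normalize();
        if (!Files.isRegularFile(absolute) || !Files.isReadable(absolute)) {
            throw new IllegalArgumentException("Stoplist not readable: " + absolute);
        }
        return absolute.toFile();
    }

    public static void main(String[] args) {
        // Assumed default; pass the real location of en.txt as the first argument.
        String path = args.length > 0 ? args[0] : "stoplists/en.txt";
        try {
            File stoplist = resolveStoplist(path);
            System.out.println("Using stoplist: " + stoplist);
            // With MALLET on the classpath, the file could then be passed on, e.g.:
            // new TokenSequenceRemoveStopwords(stoplist, "UTF-8", false, false, false);
        } catch (IllegalArgumentException e) {
            // The message shows exactly where the JVM looked for the file.
            System.err.println(e.getMessage());
        }
    }
}
```

Running it with no arguments from the same working directory as the failing program should show which absolute location "stoplists/en.txt" actually maps to.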



Answer 2:

Your English stop words file (stoplists\en.txt) is missing. Either try downloading the jar files again, or just use Maven, which makes it easier to import MALLET into your Java project. In the Maven POM file, add:

<dependencies>
    <dependency>
        <groupId>cc.mallet</groupId>
        <artifactId>mallet</artifactId>
        <version>2.0.8</version>
    </dependency>
....
</dependencies>

Latest version can be found here.