Is there an alternative to the now-removed NLTK 'model' module?

Posted 2020-02-09 08:42

Question:

I've been trying to find an alternative for two straight days now and couldn't find anything relevant. I'm basically trying to get a probabilistic score for a synthesized sentence (synthesized by replacing some words in an original sentence picked from the corpus).

I tried Collocations, but the scores I'm getting aren't very helpful. So I tried using a language model instead, only to find that the seemingly helpful 'model' module has been removed from NLTK because of some bugs.

It'd be really great if someone could either point me to another n-gram model implementation in Python or, better yet, suggest some other way to solve the problem of 'scoring' the sentence.

Answer 1:

According to this open issue on the NLTK repo, NGramModel is currently not in master because of some bugs. The maintainers' current workaround is to install the code from the model branch. That branch is about 8 months behind master, though, so you might miss out on other features and bug fixes.

pip install https://github.com/nltk/nltk/tarball/model

The relevant code is here in the model branch. You could copy it into your local code if you don't want to use the outdated branch. If you really care about using it, you could try to fix the outstanding bugs and submit a pull request.
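If neither the outdated branch nor copying its code appeals to you, a small n-gram scorer is not hard to write by hand. Below is a minimal sketch, not NLTK's NGramModel: a bigram model with add-one (Laplace) smoothing in plain Python. The class name BigramModel, the padding symbols, the toy corpus and the choice of smoothing are all my own assumptions for illustration, and the smoothing is deliberately crude (the vocabulary count includes the padding symbols, and unseen words simply fall back to 1/V).

```python
import math
from collections import Counter


class BigramModel:
    """Hypothetical bigram model with add-one smoothing (not NLTK's API)."""

    def __init__(self, sentences):
        self.unigrams = Counter()
        self.bigrams = Counter()
        for tokens in sentences:
            # Pad so that sentence starts and ends are modelled too.
            padded = ["<s>"] + list(tokens) + ["</s>"]
            self.unigrams.update(padded)
            self.bigrams.update(zip(padded, padded[1:]))
        # Simplification: the padding symbols count towards the vocabulary.
        self.vocab_size = len(self.unigrams)

    def logprob(self, prev, word):
        """Smoothed log P(word | prev)."""
        numerator = self.bigrams[(prev, word)] + 1
        denominator = self.unigrams[prev] + self.vocab_size
        return math.log(numerator / denominator)

    def score(self, tokens):
        """Average log-probability per bigram; closer to 0 means more likely."""
        padded = ["<s>"] + list(tokens) + ["</s>"]
        total = sum(self.logprob(p, w) for p, w in zip(padded, padded[1:]))
        return total / (len(padded) - 1)


# Toy corpus, invented for this example.
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
    "a cat chased the dog".split(),
]
model = BigramModel(corpus)

original = model.score("the cat sat on the mat".split())
synthesized = model.score("the mat sat on the cat".split())
print(original, synthesized)
```

On this toy corpus the original sentence gets a higher (less negative) score than the version with two words swapped, which is the kind of comparison the question is after. Averaging over the number of bigrams keeps scores comparable between sentences of different lengths; for a real corpus you would probably want a better smoothing method and explicit handling of out-of-vocabulary words.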