I'm fairly new to Python and NLTK. I'm working on an application that performs spell checking (replacing an incorrectly spelled word with the correctly spelled one). I'm currently using the Enchant library on Python 2.7, via PyEnchant, together with the NLTK library. The code below is the class that handles the correction/replacement.
import enchant
from nltk.metrics import edit_distance

class SpellingReplacer(object):
    def __init__(self, dict_name='en_GB', max_dist=2):
        self.spell_dict = enchant.Dict(dict_name)
        # Use the argument rather than a hard-coded 2.
        self.max_dist = max_dist

    def replace(self, word):
        # Correctly spelled words are returned unchanged.
        if self.spell_dict.check(word):
            return word
        suggestions = self.spell_dict.suggest(word)
        # Accept the top suggestion only if it is close enough to the input.
        if suggestions and edit_distance(word, suggestions[0]) <= self.max_dist:
            return suggestions[0]
        else:
            return word
I have written a function that takes a list of words, calls replace on each word, and returns a list of the correctly spelled words.
def spell_check(word_list):
    # Build the replacer once instead of once per word.
    replacer = SpellingReplacer()
    checked_list = []
    for item in word_list:
        checked_list.append(replacer.replace(item))
    return checked_list
>>> word_list = ['car', 'colour']
>>> spell_check(word_list)
['car', 'color']
I don't really like this because it isn't very accurate, and I'm looking for a better way to check and replace misspelled words. I also need something that can pick up spelling mistakes like "caaaar". Are there better ways to perform spell checking? If so, what are they? How does Google do it, for example? Their spelling suggester is very good. Any suggestions?
I'd recommend starting by carefully reading this post by Peter Norvig. (I had to do something similar and I found it extremely useful.)
The following function, in particular, has the ideas that you need to make your spell checker more sophisticated: splitting, deleting, transposing, and inserting characters in the irregular words to 'correct' them.
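A sketch of that candidate-generation idea, modeled on the edits1 function in Norvig's essay; treat it as an approximation rather than a verbatim copy (the comments and variable names are mine, and it assumes lowercase English letters only):

```python
import string

def edits1(word):
    """Return every string that is one edit away from `word`."""
    letters = string.ascii_lowercase
    # Every way of cutting the word into a (left, right) pair.
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    # Drop one character.
    deletes = [left + right[1:] for left, right in splits if right]
    # Swap two adjacent characters.
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    # Substitute one character with another letter.
    replaces = [left + c + right[1:]
                for left, right in splits if right for c in letters]
    # Insert one letter at any position.
    inserts = [left + c + right for left, right in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)
```

For example, 'the' is in edits1('teh') through a transposition, and 'car' is in edits1('caar') through a deletion. A word like "caaaar" is three deletions away from "car", so a single round of edits will not reach it.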
Note: The above is one snippet from Norvig's spelling corrector
And the good news is that you can incrementally add to and keep improving your spell-checker.
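One such incremental step, sketched here, is to choose among the generated candidates by how often each word occurs. The WORD_COUNTS table below is a made-up toy stand-in for counts taken from a large corpus, and the function names loosely follow Norvig's essay; treat it as an illustration under those assumptions, not his exact code.

```python
import string

# Hypothetical toy frequency table; a real checker would count words in a large corpus.
WORD_COUNTS = {"the": 500, "car": 40, "cat": 30, "care": 25,
               "colour": 12, "spelling": 8}

def edits1(word):
    """Return every string that is one edit away from `word`."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [l + r[1:] for l, r in splits if r]
    transposes = [l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1]
    replaces = [l + c + r[1:] for l, r in splits if r for c in letters]
    inserts = [l + c + r for l, r in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)

def known(words):
    """Keep only the words that appear in the frequency table."""
    return set(w for w in words if w in WORD_COUNTS)

def candidates(word):
    """Prefer the word itself, then one-edit words, then two-edit words."""
    return (known([word])
            or known(edits1(word))
            or known(e2 for e1 in edits1(word) for e2 in edits1(e1))
            or {word})

def correction(word):
    """Pick the most frequent candidate; unknown words come back unchanged."""
    return max(candidates(word), key=lambda w: WORD_COUNTS.get(w, 0))
```

With this toy table, correction('teh') returns 'the' and correction('ca') returns 'car', since 'car' has a higher count than 'cat'.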
Hope that helps.
spell corrector->
You need to download a corpus to your desktop; if you store it elsewhere, change the path in the code. I have also added a few graphics using tkinter. Note that this only tackles non-word errors.
You can use the spell function from the autocorrect package. You need to install it first (I prefer Anaconda), and it only works on single words, not sentences, so that's a limitation you will face.
from autocorrect import spell
print(spell('intrerpreter'))
# output: interpreter
You can use the autocorrect library to spell-check in Python.
Example Usage:
Result: