I am using Python to clean a given sentence. Suppose that my sentence is:
What's the best way to ensure this?
I want to convert:
What's -> What is
Similarly,
must've -> must have
Also, verbs to original form,
told -> tell
Singular to plural, and so on.
I am currently exploring TextBlob, but not all of the above is possible with it.
If you want to roll your own, you can use this for contraction mapping:
http://alicebot.blogspot.com/2009/03/english-contractions-and-expansions.html
And this for verb replacements:
http://www.lexically.net/downloads/BNC_wordlists/e_lemma.txt
For the latter, you would probably want to generate a reverse dictionary that maps each conjugated form to its base form. Keep in mind that some forms are ambiguous (the same form can belong to more than one verb), so check for these and handle them explicitly.
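A minimal sketch of that reverse mapping follows. It assumes each line of the lemma list looks like `lemma -> form1,form2,...`; check the actual file before relying on that. The `SAMPLE` data and the function names are made up for illustration. Forms that map to more than one lemma are left unchanged rather than guessed.

```python
from collections import defaultdict

# Hypothetical sample in an ASSUMED "lemma -> form1,form2" layout.
# Verify the layout of the real lemma list before parsing it this way.
SAMPLE = """\
tell -> tells,told,telling
be -> am,are,is,was,were,been,being
lie -> lies,lay,lied,lain,lying
lay -> lays,laid,laying
"""

def build_reverse_lemma_map(text):
    """Map every inflected form (and each lemma itself) to its set of lemmas."""
    reverse = defaultdict(set)
    for line in text.splitlines():
        line = line.strip()
        if not line or "->" not in line:
            continue  # skip blank lines and anything that is not a mapping
        lemma, forms = line.split("->", 1)
        lemma = lemma.strip()
        reverse[lemma].add(lemma)  # a base form is its own lemma
        for form in forms.split(","):
            form = form.strip()
            if form:
                reverse[form].add(lemma)
    return reverse

def lemmatize(word, reverse):
    """Return the base form, or the word unchanged if unknown or ambiguous."""
    lemmas = reverse.get(word.lower())
    if lemmas is None or len(lemmas) != 1:
        return word
    return next(iter(lemmas))

reverse_map = build_reverse_lemma_map(SAMPLE)
print(lemmatize("told", reverse_map))
print(sorted(reverse_map["lay"]))  # "lay" is both a lemma and the past tense of "lie"
```

Keeping a set of lemmas per form makes the ambiguous cases easy to list, so you can decide case by case how to resolve them (for example with part-of-speech or tense information).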
The answers above will work perfectly well and could be better for ambiguous contractions (although I would argue that there aren't that many ambiguous cases). I would use something that is more readable and easier to maintain:
It might have some flaws I didn't think of, though.
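One way such an approach might look, as a minimal sketch: a short, ordered list of regular-expression rules, with the irregular forms handled before the generic suffixes. The function name `decontract` and the choice of rules are assumptions for illustration, not a complete or authoritative list.

```python
import re

def decontract(phrase):
    """Expand common English contractions using a few ordered regex rules."""
    # Irregular forms first, because the generic "n't" rule would mangle them.
    phrase = re.sub(r"\bwon't\b", "will not", phrase, flags=re.IGNORECASE)
    phrase = re.sub(r"\bcan't\b", "cannot", phrase, flags=re.IGNORECASE)
    # Generic suffix rules.
    phrase = re.sub(r"n't\b", " not", phrase)
    phrase = re.sub(r"'re\b", " are", phrase)
    phrase = re.sub(r"'ve\b", " have", phrase)
    phrase = re.sub(r"'ll\b", " will", phrase)
    phrase = re.sub(r"'m\b", " am", phrase)
    phrase = re.sub(r"'d\b", " would", phrase)  # ambiguous: could also mean "had"
    phrase = re.sub(r"'s\b", " is", phrase)     # ambiguous: could be "has" or a possessive
    return phrase

print(decontract("What's the best way to ensure this?"))
```

Note that this sketch lowercases the two irregular replacements (a leading "Won't" becomes "will not") and expands possessives such as "John's" to "John is".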
For the first question, there isn't a module that does this directly, so you will have to build your own. First, you will need a contraction dictionary like this one:
Then write some code to modify your text according to the dictionary, something like this:
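As a rough sketch of both steps, assuming a small hand-written dictionary (the entries below are only an illustrative subset, and `CONTRACTIONS` and `expand_contractions` are hypothetical names):

```python
import re

# Small illustrative subset; a real mapping would be much longer.
CONTRACTIONS = {
    "what's": "what is",
    "must've": "must have",
    "can't": "cannot",
    "won't": "will not",
    "i'm": "I am",
}

# One alternation of all keys, longest first so longer keys win over shorter ones.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(k) for k in sorted(CONTRACTIONS, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)

def expand_contractions(text):
    """Replace each known contraction with its expansion from the dictionary."""
    def replace(match):
        found = match.group(0)
        expanded = CONTRACTIONS[found.lower()]
        # Keep a leading capital, e.g. "What's" becomes "What is".
        if found[0].isupper():
            expanded = expanded[0].upper() + expanded[1:]
        return expanded
    return _PATTERN.sub(replace, text)

print(expand_contractions("What's the best way to ensure this?"))
```

Compiling a single pattern from the dictionary keys means the text is scanned once, and adding a new contraction only requires a new dictionary entry.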
For your second question, on changing verb tense, NodeBox's Linguistics library is very popular and highly recommended for such tasks. After downloading their zip file, unzip it and copy it to Python's site-packages directory. After doing that, you can write something like this:
Note: this library works only with Python 2; it does not yet support Python 3.