I need to convert one into 1, two into 2 and so on.
Is there a way to do this with a library or a class or anything?
This could easily be hardcoded into a dictionary if there's a limited set of numbers you'd like to parse.
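For a small, fixed vocabulary, a minimal sketch of the hardcoded approach might look like this. The names WORD_TO_NUMBER and word_to_number are made up for illustration, and the table only covers zero through ten:

```python
# Hypothetical lookup table covering only the words we expect to see.
WORD_TO_NUMBER = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}


def word_to_number(word):
    """Return the integer for one number word, ignoring case and padding."""
    key = word.strip().lower()
    if key not in WORD_TO_NUMBER:
        raise ValueError(f"Unsupported number word: {word!r}")
    return WORD_TO_NUMBER[key]
```

Anything outside the table raises ValueError, so unsupported input fails loudly rather than being silently mapped to a wrong number.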
For slightly more complex cases, you'll probably want to generate this dictionary automatically, based on the relatively simple numbers grammar. Something along the lines of this (of course, generalized...)
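As a rough illustration of generating the dictionary from the grammar instead of typing it out, the sketch below builds every entry from 0 to 99 out of two short word lists. The names UNITS, TENS and build_number_words are assumptions for this example, and accepting both "forty two" and "forty-two" is a choice made here, not something the answer specifies:

```python
# Words for 0..19; the list index is the value.
UNITS = [
    "zero", "one", "two", "three", "four", "five", "six", "seven",
    "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
    "fifteen", "sixteen", "seventeen", "eighteen", "nineteen",
]
# Words for 20, 30, ..., 90.
TENS = ["twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def build_number_words():
    """Build a {phrase: value} dictionary for every number from 0 to 99."""
    words = {word: value for value, word in enumerate(UNITS)}
    for i, ten in enumerate(TENS):
        base = 20 + 10 * i
        words[ten] = base
        for unit_value in range(1, 10):
            unit = UNITS[unit_value]
            # Accept both the spaced and the hyphenated spelling.
            words[f"{ten} {unit}"] = base + unit_value
            words[f"{ten}-{unit}"] = base + unit_value
    return words


NUMBER_WORDS = build_number_words()
```

A lookup such as NUMBER_WORDS["forty-two"] then works the same way as with a hand-written dictionary. Extending this to hundreds and thousands quickly makes the table large, which is where a small parser becomes the better fit.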
If you need something more extensive, then it looks like you'll need natural language processing tools. This article might be a good starting point.
Quick and dirty Java port of e_h's C# implementation (above). Note that both return double, not int.
I needed something a bit different since my input is from a speech-to-text conversion and the solution is not always to sum the numbers. For example, "my zipcode is one two three four five" should not convert to "my zipcode is 15".
I took Andrew's answer and tweaked it to handle a few other cases people highlighted as errors, and also added support for examples like the zipcode one I mentioned above. Some basic test cases are shown below, but I'm sure there is still room for improvement.
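To make the zipcode case concrete, here is a simplified sketch (not the tweaked version of Andrew's answer itself). It assumes that a run of consecutive single-digit words in a transcript should be concatenated, not summed. DIGIT_WORDS and replace_spoken_digits are hypothetical names, and punctuation attached to a word (such as "five.") is not handled:

```python
# Single-digit words mapped to their digit characters.
DIGIT_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}


def replace_spoken_digits(text):
    """Replace each run of digit words with the concatenated digits."""
    out = []
    run = []  # digits collected from the current run of digit words
    for token in text.split():
        digit = DIGIT_WORDS.get(token.lower())
        if digit is not None:
            run.append(digit)
            continue
        if run:
            # A non-digit word ends the run: emit it as one number string.
            out.append("".join(run))
            run = []
        out.append(token)
    if run:
        out.append("".join(run))
    return " ".join(out)
```

With this, replace_spoken_digits("my zipcode is one two three four five") returns "my zipcode is 12345". Keeping the result as a string preserves leading zeros, which matters for zipcodes.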
Some tests...
A quick solution is to use inflect.py to generate a dictionary for translation.
inflect.py has a number_to_words() function that will turn a number (e.g. 2) into its word form (e.g. 'two'). Unfortunately, its reverse (which would allow you to avoid the translation dictionary route) isn't offered. All the same, you can use that function to build the translation dictionary.
If you're willing to commit some time, it might be possible to examine the inner workings of inflect.py's number_to_words() function and build your own code to do this dynamically (I haven't tried to do this).
There's a ruby gem by Marc Burns that does it. I recently forked it to add support for years. You can call ruby code from python.
results:
"fifteen sixteen" 1516
"eighty five sixteen" 8516
"nineteen ninety six" 1996
"one hundred and seventy nine" 179
"thirteen hundred" 1300
"nine thousand two hundred and ninety seven" 9297
The majority of this code is to set up the numwords dict, which is only done on the first call.
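A sketch consistent with that description might look like the following. It assumes a (scale, increment) pair per word and a numwords dict that is cached in a mutable default argument, so it is only filled on the first call. Lowercasing the input and treating hyphens as spaces are additions made here:

```python
def text2int(textnum, numwords={}):
    # The mutable default is deliberate: it serves as a cache that is
    # populated on the first call and reused on every later call.
    if not numwords:
        units = [
            "zero", "one", "two", "three", "four", "five", "six", "seven",
            "eight", "nine", "ten", "eleven", "twelve", "thirteen",
            "fourteen", "fifteen", "sixteen", "seventeen", "eighteen",
            "nineteen",
        ]
        tens = ["", "", "twenty", "thirty", "forty", "fifty",
                "sixty", "seventy", "eighty", "ninety"]
        scales = ["hundred", "thousand", "million", "billion", "trillion"]

        numwords["and"] = (1, 0)  # "and" leaves the running value unchanged
        for idx, word in enumerate(units):
            numwords[word] = (1, idx)
        for idx, word in enumerate(tens):
            if word:
                numwords[word] = (1, idx * 10)
        for idx, word in enumerate(scales):
            # hundred -> 10**2, thousand -> 10**3, million -> 10**6, ...
            numwords[word] = (10 ** (idx * 3 or 2), 0)

    current = result = 0
    for word in textnum.lower().replace("-", " ").split():
        if word not in numwords:
            raise ValueError("Illegal word: " + word)
        scale, increment = numwords[word]
        current = current * scale + increment
        if scale > 100:
            # thousand and above close a group; bank it and start over
            result += current
            current = 0
    return result + current
```

This version sums the parts of a phrase, so text2int("nine thousand two hundred and ninety seven") gives 9297, but it does not read "nineteen ninety six" as a year.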