Does anybody know a function to convert a text representation of a number into an actual number, e.g. 'twenty thousand three hundred and five' into 20305? I have numbers written out as words in data frame rows and want to convert them to numeric values.
In package qdap, you can replace numerals with words (e.g., 1001 becomes "one thousand one"), but not the other way around:
library(qdap)
replace_number("I like 346457 ice cream cones.")
[1] "I like three hundred forty six thousand four hundred fifty seven ice cream cones."
Here's a start that should get you to hundreds of thousands.
Results:
I can tell you already that this won't work for word2num("four hundred thousand fifty"), because it doesn't know how to handle consecutive "hundred" and "thousand" terms, but the algorithm can probably be modified. Anyone should feel free to edit this if they have improvements, or build on it in their own answer. I just thought this was a fun problem to play with (for a little while).

Edit: Apparently Bill Venables has a package called english that may achieve this even better than the above code.
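One way the consecutive "hundred"/"thousand" case might be handled is to keep a running value for the current group and fold it into the total whenever a scale word appears. The following is only a minimal sketch under that assumption, not the word2num code from this answer; the function name words_to_number and its lookup tables are hypothetical, and it only covers plain English cardinals up to billions.

```r
# Hypothetical helper: convert English number words to a numeric value.
# The name and lookup tables are illustrative, not taken from any package.
words_to_number <- function(x) {
  units <- c(zero = 0, one = 1, two = 2, three = 3, four = 4, five = 5,
             six = 6, seven = 7, eight = 8, nine = 9, ten = 10,
             eleven = 11, twelve = 12, thirteen = 13, fourteen = 14,
             fifteen = 15, sixteen = 16, seventeen = 17, eighteen = 18,
             nineteen = 19, twenty = 20, thirty = 30, forty = 40,
             fifty = 50, sixty = 60, seventy = 70, eighty = 80, ninety = 90)
  scales <- c(thousand = 1e3, million = 1e6, billion = 1e9)

  # Lower-case, turn hyphens into spaces, split on whitespace, drop "and".
  tokens <- strsplit(gsub("-", " ", tolower(trimws(x))), "\\s+")[[1]]
  tokens <- tokens[tokens != "and" & nzchar(tokens)]

  total <- 0    # sum of groups already closed by a scale word
  current <- 0  # value of the group currently being built
  for (tok in tokens) {
    if (tok %in% names(units)) {
      current <- current + units[[tok]]
    } else if (tok == "hundred") {
      # A bare "hundred" is treated as "one hundred".
      current <- max(current, 1) * 100
    } else if (tok %in% names(scales)) {
      # Close the group: "four hundred" + "thousand" gives 400000.
      total <- total + max(current, 1) * scales[[tok]]
      current <- 0
    } else {
      stop("unrecognised number word: ", tok)
    }
  }
  total + current
}

words_to_number("twenty thousand three hundred and five")  # 20305
words_to_number("four hundred thousand fifty")             # 400050
```

For a data frame column, something like vapply(df$words, words_to_number, numeric(1), USE.NAMES = FALSE) should apply it row by row, where df$words is a placeholder for your own column.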
Here's what I think is a better solution.