I am trying to figure out how autocorrect algorithms can be implemented in either PHP or C#.
In short, I have a user-entered word, and minor misspellings in it should be tolerated. I also have an SQL database of correctly spelled words. I want to retrieve the correctly spelled word from the database that is closest to what the user entered.
I realize there are a zillion autocorrect packages out there, but I would like to be able to customize it, so I am looking for any information on implementing this functionality in either PHP or C#.
Many thanks, Brett
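A dictionary file and a Levenshtein distance function are going to be your best bet. The Levenshtein distance counts the single-character insertions, deletions and substitutions needed to turn one word into another, so the dictionary word with the smallest distance to the input is the closest match.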
http://us.php.net/manual/en/function.levenshtein.php
Check out the comments on that function; they include a few sample implementations.
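As a rough illustration of the dictionary-plus-distance idea, here is a minimal sketch in Python (PHP's built-in levenshtein() would replace the hand-written function). The names `closest_word` and `max_distance` and the sample word list are made up for the example; the list stands in for rows read from your SQL table.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # iterate the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def closest_word(word, dictionary, max_distance=2):
    """Return the nearest dictionary entry, or None if nothing is close enough."""
    best, best_distance = None, max_distance + 1
    for candidate in dictionary:
        d = levenshtein(word.lower(), candidate.lower())
        if d < best_distance:
            best, best_distance = candidate, d
    return best


# Hypothetical word list; in practice these would come from the database.
words = ["apple", "banana", "orange", "grape"]
print(closest_word("aplpe", words))
print(closest_word("xyzzy", words))
```

Because you own the loop, this is where customization would go: a different threshold, tie-breaking by word frequency, or cheaper costs for common typos. Comparing against every row is only practical for a modest word list.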
To take it to the next level, you could also add the soundex or metaphone functions, which will catch phonetic errors too.
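To show what a phonetic key looks like, here is a small sketch of the classic American Soundex rules in Python. PHP ships soundex() and metaphone() as built-ins, so there you would not write this yourself; the helper `phonetic_matches` and the sample words are invented for the example.

```python
# Letter groups that share a Soundex digit; vowels, y, h and w have no digit.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Four-character Soundex key: first letter plus up to three digits."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    last = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != last:
            result += code
        last = code  # a vowel resets `last`, so a repeat across a vowel is kept
    return (result + "000")[:4]


def phonetic_matches(word, dictionary):
    """All dictionary entries that share the input's Soundex key."""
    key = soundex(word)
    return [w for w in dictionary if soundex(w) == key]


print(soundex("Robert"), soundex("Rupert"))
print(phonetic_matches("recieve", ["receive", "recipe", "relieve"]))
```

Soundex keys are coarse, so several unrelated words can share one. A reasonable design is to use the phonetic key to shortlist candidates and then rank the shortlist by edit distance.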
Web or Windows? I'll assume web, since you mention PHP.
Budget or no budget? There are various web editors out there. Telerik, for example, makes a nice AJAX control that performs the spell check over AJAX and is fully customizable. I am sure some of the other vendors (Infragistics, Syncfusion, ComponentOne, etc.) have similar editors.
If you need to go open source, there are editors out there, though I am not sure which of them support customizing the word lists. As the third-party controls are relatively inexpensive (a few hundred dollars or less) and easy to customize (Telerik's is), I find them a better option than coding it yourself or ending up with an open source implementation that is hard to customize. It is worth looking at open source, however.
I am assuming you mean something like Peter Norvig's spell corrector, only written in C# or PHP (1, 2), as linked from his site.
Norvig presents it as a simplified illustration of the statistical approach behind Google's spelling suggestions.
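For orientation, here is a condensed sketch in the spirit of Norvig's corrector: generate every string within one or two edits of the input, keep those that are known words, and pick the most frequent. It is not his exact code. The `COUNTS` table is a made-up stand-in for word frequencies you would build from a text corpus or store alongside the words in your database.

```python
import string


def edits1(word):
    """All strings one delete, transpose, replace or insert away from `word`."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [L + R[1:] for L, R in splits if R]
    transposes = [L + R[1] + R[0] + R[2:] for L, R in splits if len(R) > 1]
    replaces = [L + c + R[1:] for L, R in splits if R for c in letters]
    inserts = [L + c + R for L, R in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def known(words, counts):
    """Subset of `words` that appear in the frequency table."""
    return {w for w in words if w in counts}


def correct(word, counts):
    """Prefer the word itself, then known words 1 edit away, then 2 edits away."""
    word = word.lower()
    candidates = (
        known([word], counts)
        or known(edits1(word), counts)
        or known((e2 for e1 in edits1(word) for e2 in edits1(e1)), counts)
        or {word}  # nothing close enough: return the input unchanged
    )
    return max(candidates, key=lambda w: counts.get(w, 0))


# Hypothetical frequencies, for illustration only.
COUNTS = {"spelling": 120, "spewing": 3, "correct": 200, "corrected": 40}
print(correct("speling", COUNTS))
print(correct("korrectud", COUNTS))
```

Unlike the distance-scan approach, this never compares the input against the whole dictionary; it only checks whether generated candidates exist, which maps well to indexed lookups in an SQL table.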