In PHP, is there a way to detect the language of a string? Suppose the string is in UTF-8 format.
You could do this entirely client side with Google's AJAX Language API (now defunct). It could automatically detect a string's language and translate any string written in one of the supported languages (translation is also defunct).
I tried the Text_LanguageDetect library and the results I got were not very good (for instance, the text "test" was identified as Estonian and not English).
I can recommend the Yandex Translate API, which is free for up to 1 million characters per 24 hours and up to 10 million characters per month. According to its documentation, it supports over 60 languages.
One approach might be to break the input string into words and then look up those words in an English dictionary to see how many of them are present. This approach has a few limitations:
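The dictionary lookup described above could be sketched roughly like this. It is only an illustration: the function name `dictionary_hit_ratio` and the tiny word list are made up for the example, a real word list would be loaded from a dictionary file, and the code assumes the mbstring extension is available.

```php
<?php
/**
 * Return the fraction of words in $text that appear in $dictionary.
 * $dictionary is used as a set: lowercase word => true.
 */
function dictionary_hit_ratio(string $text, array $dictionary): float
{
    // Split on anything that is not a Unicode letter; the /u flag treats the input as UTF-8.
    $words = preg_split('/[^\p{L}]+/u', mb_strtolower($text, 'UTF-8'), -1, PREG_SPLIT_NO_EMPTY);
    if (!$words) {
        // No words found, or the input was not valid UTF-8.
        return 0.0;
    }

    $hits = 0;
    foreach ($words as $word) {
        if (isset($dictionary[$word])) {
            $hits++;
        }
    }

    return $hits / count($words);
}

// Tiny illustrative word list (an assumption for this example only).
$english = array_fill_keys(['the', 'is', 'a', 'of', 'and', 'this', 'test', 'cat', 'on', 'mat'], true);

echo dictionary_hit_ratio('The cat is on the mat', $english), "\n";
echo dictionary_hit_ratio('Le chat est sur le tapis', $english), "\n";
```

A ratio close to 1 suggests the text is probably English; where to put the cut-off is a judgment call and depends on how complete the word list is.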
You can detect the language of a string in PHP with the Text_LanguageDetect PEAR package, either installed through PEAR or downloaded and used separately like a regular PHP library.
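A minimal sketch of what using Text_LanguageDetect might look like, assuming the package is installed and `Text/LanguageDetect.php` is on the include path. The wrapper function `detect_language_pear` is a hypothetical helper for this example, not part of the library; it returns null when the package cannot be found.

```php
<?php
/**
 * Hypothetical wrapper around Text_LanguageDetect.
 * Returns the most likely language name, or null if the library
 * is missing or cannot decide.
 */
function detect_language_pear(string $text): ?string
{
    // Bail out quietly if the PEAR package is not on the include path.
    if (stream_resolve_include_path('Text/LanguageDetect.php') === false) {
        return null;
    }
    require_once 'Text/LanguageDetect.php';

    $detector = new Text_LanguageDetect();

    // detectSimple() returns only the single best guess.
    return $detector->detectSimple($text);
}

var_dump(detect_language_pear('This is a longer English sentence, which gives the detector more to work with.'));
```

Longer inputs tend to give more reliable guesses than single words, since the detector has more text to compare against its language profiles.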
You cannot detect the language from the character type, and there is no foolproof way to do this.
With any method, you're just making an educated guess. There are some math-related articles on the subject out there.
Perhaps submit the string to this language guesser:
http://www.xrce.xerox.com/competencies/content-analysis/tools/guesser