How to detect the language of a document - in PHP?

Posted 2019-06-06 05:45

The basics have already been answered here. But is there a pre-built PHP lib doing the same as Lingua::Identify from CPAN?

2 answers
Viruses.
#2 · 2019-06-06 06:16

There's a PEAR package, Text_LanguageDetect, that I've used before. It gets the job done well enough. I'm not aware of any other libs that are more mature.
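For anyone who wants to try it, here is a minimal sketch of calling the package. It assumes Text_LanguageDetect is installed (for example with `pear install Text_LanguageDetect`) and reachable on the include path. The wrapper name detectWithPear is made up for this example, and it just returns null when the package is missing.

```php
<?php
// Hypothetical helper (not part of the package): returns the detected
// language name, or null if the PEAR package is unavailable or undecided.
function detectWithPear(string $text): ?string
{
    if (!class_exists('Text_LanguageDetect')) {
        // Text/LanguageDetect.php is the file the PEAR package installs.
        $loaded = @include_once 'Text/LanguageDetect.php';
        if (!$loaded || !class_exists('Text_LanguageDetect')) {
            return null;
        }
    }

    $detector = new Text_LanguageDetect();

    // detectSimple() returns the single most likely language name.
    // detect($text, 3) would instead return the top 3 as language => score.
    return $detector->detectSimple($text);
}

var_dump(detectWithPear('Ceci est un court exemple de texte écrit en français.'));
```

Longer samples tend to give more reliable results than a single short sentence.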

霸刀☆藐视天下
#3 · 2019-06-06 06:19

1- You could do it yourself (the hard way) - detecting both language and codepage by looking at character and n-gram frequencies. You would need lots of "training" data, but it's doable.

2- You could run a Perl script to do the detection for you (much easier).
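To give an idea of what option 1 involves, here is a rough sketch of trigram-based detection using rank-order ("out of place") profile comparison. It is one common way to do it, not necessarily what any particular library does. It assumes UTF-8 input and the mbstring extension, the function names are invented for this example, and the training strings are toy data that is far too small for real use. Codepage detection is not covered.

```php
<?php
// Rank the most frequent character trigrams in $text (0 = most frequent).
function trigramProfile(string $text, int $size = 300): array
{
    // Lowercase, keep letters only, collapse everything else to single spaces.
    $clean = preg_replace('/[^\p{L}]+/u', ' ', mb_strtolower($text, 'UTF-8'));
    $clean = ' ' . trim($clean) . ' ';

    $counts = [];
    $len = mb_strlen($clean, 'UTF-8');
    for ($i = 0; $i + 3 <= $len; $i++) {
        $gram = mb_substr($clean, $i, 3, 'UTF-8');
        $counts[$gram] = ($counts[$gram] ?? 0) + 1;
    }
    arsort($counts);

    // Map trigram => rank.
    return array_flip(array_slice(array_keys($counts), 0, $size));
}

// Sum of rank differences; trigrams missing from $lang cost $maxPenalty.
function outOfPlace(array $doc, array $lang, int $maxPenalty = 300): int
{
    $distance = 0;
    foreach ($doc as $gram => $rank) {
        $distance += isset($lang[$gram]) ? abs($rank - $lang[$gram]) : $maxPenalty;
    }
    return $distance;
}

// Return the training language whose profile is closest to the text.
function guessLanguage(string $text, array $training): ?string
{
    $doc = trigramProfile($text);
    if (!$doc) {
        return null; // nothing to compare
    }
    $best = null;
    $bestScore = PHP_INT_MAX;
    foreach ($training as $name => $sample) {
        $score = outOfPlace($doc, trigramProfile($sample));
        if ($score < $bestScore) {
            $best = $name;
            $bestScore = $score;
        }
    }
    return $best;
}

// Toy training data (assumption: real use needs large corpora per language).
$training = [
    'english' => 'the quick brown fox jumps over the lazy dog and the cat is in the house with the other animals',
    'french'  => 'le renard brun saute par dessus le chien paresseux et le chat est dans la maison avec les autres animaux',
];
echo guessLanguage('the dog is in the house', $training), "\n";
```

In practice you would build each language profile once from a large corpus and cache it, rather than recomputing it on every call as this sketch does.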
