I have to analyze an image that contains both English and Japanese text. When I run tesseract with the default language (eng), some Japanese characters are lost. If I instead run tesseract with Japanese (-l jpn), some English characters are lost (e.g. "Email"). How can I run a single process that recognizes both English and Japanese characters? Thanks.
Answer 1:
Since tesseract 3.02 it is possible to specify multiple languages for the -l parameter.
-l lang The language to use. If none is specified, English is assumed. Multiple languages may be specified, separated by plus characters. Tesseract uses 3-character ISO 639-2 language codes.
An example:
tesseract myscan.png out -l deu+eng
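For the English and Japanese case in the question, the same idea would presumably be `-l eng+jpn`. Below is a minimal sketch of a wrapper that builds the `-l` value from a list of language codes. The function names `join_langs` and `ocr_multilang` and the file names in the usage note are made up for illustration, and it assumes the trained data for every requested language (e.g. jpn) is installed.

```shell
# Join language codes with '+', the separator the -l option expects.
# The body runs in a subshell so the IFS change does not leak out.
join_langs() (
  IFS='+'
  printf '%s\n' "$*"
)

# Hypothetical helper: ocr_multilang IMAGE OUTPUT_BASE LANG [LANG...]
# Runs a single tesseract process with all the given languages.
ocr_multilang() (
  image=$1
  outbase=$2
  shift 2
  tesseract "$image" "$outbase" -l "$(join_langs "$@")"
)
```

Usage would look like `ocr_multilang myscan.png out eng jpn`, which runs `tesseract myscan.png out -l eng+jpn`. If recognition fails for one language, `tesseract --list-langs` should show whether its data is actually installed.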