Does Tesseract neglect any nontext area in a scann

Published 2019-04-02 00:02

一夜七次

Question:

I'm using Tesseract, but I don't know whether it ignores non-text areas and targets text only. Do I have to remove non-text areas as a preprocessing step to get better output?

Answer 1:

Tesseract has a pretty good algorithm for detecting text, but it will occasionally produce false-positive matches, for example by reading part of a picture or a border as characters.
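One way to see and reduce such false positives after the fact is to look at the per-word confidence Tesseract reports. The sketch below is only an illustration, not something from the original answer: it assumes the `pytesseract` wrapper and Pillow are installed, the helper names `keep_confident_words` and `ocr_confident_words` are hypothetical, and the cutoff of 60 is an arbitrary starting point that you would need to tune on your own scans.

```python
def keep_confident_words(data, min_conf=60.0):
    """Return the words whose reported confidence is at least min_conf.

    `data` is the dict produced by pytesseract.image_to_data(...,
    output_type=Output.DICT). Rows that are not words (blocks, lines)
    come with empty text and a confidence of -1, so they are dropped too.
    """
    words = []
    for text, conf in zip(data["text"], data["conf"]):
        # conf may be an int, a float, or a string depending on the
        # pytesseract version, so normalise it with float().
        if text.strip() and float(conf) >= min_conf:
            words.append(text)
    return words


def ocr_confident_words(image_path, min_conf=60.0):
    """Run Tesseract on an image file and keep only confident words."""
    # Imported here so the filtering helper above works without
    # Tesseract installed (assumption: pytesseract and Pillow are available).
    import pytesseract
    from PIL import Image

    data = pytesseract.image_to_data(
        Image.open(image_path), output_type=pytesseract.Output.DICT
    )
    return keep_confident_words(data, min_conf)
```

Calling `ocr_confident_words("scan.png")` on a page with a logo or photo would typically drop the low-confidence fragments Tesseract finds in those regions, though a confidence filter is a complement to preprocessing rather than a replacement for it.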

Ideally, you would pre-process the image before submitting it to tesseract. Some time ago I engaged in a similar task, so I suggest you take a look at the following material: