Junk results when using Tesseract OCR and tess-two

Posted 2019-01-20 19:13

迷人小祖宗

Question:

I have developed an OCR application using the Tesseract OCR library, based on the following projects:

android-ocr
tesseract

However, I sometimes get junk data as results. Can anyone suggest what else I can do to get accurate results?

Answer 1:

For specific help with your case, you should provide your test images and any code you are using. As a general rule of thumb, the following help in getting accurate results:

Use a high-resolution image if needed; 300 DPI is the minimum
Make sure there are no shadows or bends in the image
If there is any skew, correct the image in code before running OCR
Use a dictionary to help get good results
Adjust the text size (12 pt font is ideal)
Binarize the image and use image processing algorithms to remove noise
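
To make the binarization item above concrete, here is a minimal sketch in plain Java. It uses Otsu's method, which is one common way to choose a global threshold; the answer does not name a specific algorithm, so treat this choice as an assumption. The class and method names (`OtsuBinarizer`, `threshold`, `binarize`) are hypothetical, and the input is assumed to be an array of 8-bit grayscale values (0 to 255), so pixels read from an Android bitmap would first have to be converted to grayscale.

```java
/** Hypothetical helper: global binarization using Otsu's threshold. */
final class OtsuBinarizer {

    /** Returns the Otsu threshold (0..255) for 8-bit grayscale values. */
    static int threshold(int[] gray) {
        // Build a 256-bin histogram of the gray levels.
        int[] hist = new int[256];
        for (int g : gray) {
            hist[g & 0xFF]++;
        }

        long total = gray.length;
        long sumAll = 0;
        for (int i = 0; i < 256; i++) {
            sumAll += (long) i * hist[i];
        }

        long weightLow = 0;   // number of pixels with value <= t
        long sumLow = 0;      // sum of values of those pixels
        double bestVariance = -1.0;
        int best = 0;

        for (int t = 0; t < 256; t++) {
            weightLow += hist[t];
            if (weightLow == 0) {
                continue;     // no pixels in the low class yet
            }
            long weightHigh = total - weightLow;
            if (weightHigh == 0) {
                break;        // every pixel is already in the low class
            }
            sumLow += (long) t * hist[t];

            double meanLow = (double) sumLow / weightLow;
            double meanHigh = (double) (sumAll - sumLow) / weightHigh;
            double diff = meanLow - meanHigh;

            // Between-class variance (up to a constant factor).
            double between = (double) weightLow * weightHigh * diff * diff;
            if (between > bestVariance) {
                bestVariance = between;
                best = t;
            }
        }
        return best;
    }

    /** Maps values above the threshold to 255 (white), all others to 0 (black). */
    static int[] binarize(int[] gray) {
        int t = threshold(gray);
        int[] out = new int[gray.length];
        for (int i = 0; i < gray.length; i++) {
            out[i] = (gray[i] & 0xFF) > t ? 255 : 0;
        }
        return out;
    }
}
```

A global threshold like this tends to work when lighting is even. For images with shadows, an adaptive (local) threshold is usually a better fit, which is one reason the advice above about avoiding shadows matters.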

On top of all this, many other image processing functions can help increase accuracy, depending on your image: deskew, perspective correction, line removal, border removal, dot removal, despeckle, and more.
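
As an illustration of the dot removal and despeckle idea, the following sketch removes isolated black pixels from an already binarized image. It is a deliberately simple approach under stated assumptions, not a production filter: the class name `Despeckle` and its method are hypothetical, the image is assumed to be a row-major array where 0 is ink and 255 is background, and a "speck" is defined here as an ink pixel with no ink pixel among its eight neighbors.

```java
/** Hypothetical helper: removes single-pixel specks from a binary image. */
final class Despeckle {

    /**
     * Returns a copy of the image in which every black pixel (0) that has
     * no black pixel among its 8 neighbors is turned white (255).
     */
    static int[] removeIsolatedPixels(int[] binary, int width, int height) {
        int[] out = binary.clone();
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int idx = y * width + x;
                if (binary[idx] != 0) {
                    continue; // background pixel, nothing to do
                }
                boolean hasInkNeighbor = false;
                for (int dy = -1; dy <= 1 && !hasInkNeighbor; dy++) {
                    for (int dx = -1; dx <= 1; dx++) {
                        if (dx == 0 && dy == 0) {
                            continue; // skip the pixel itself
                        }
                        int nx = x + dx;
                        int ny = y + dy;
                        if (nx < 0 || ny < 0 || nx >= width || ny >= height) {
                            continue; // neighbor lies outside the image
                        }
                        if (binary[ny * width + nx] == 0) {
                            hasInkNeighbor = true;
                            break;
                        }
                    }
                }
                if (!hasInkNeighbor) {
                    out[idx] = 255; // isolated speck: erase it
                }
            }
        }
        return out;
    }
}
```

Because the neighbor test reads from the original array and writes to a copy, the result does not depend on scan order. Note that at low resolutions this can also erase legitimate marks such as small periods or the dots on "i", so it is best applied to high-resolution scans.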