Tesseract improvements and image pre-processing st

Posted 2019-09-06 09:26

Question:

I am working with the Tesseract library, and below is the input for Tesseract:

In the initial implementation I used only the "MRZ" zone of the ID card, but the actual intention is to scan the entire document and get all the text on the ID card.

I have gone through this document, and the first step it gives for improving the quality of Tesseract's output is that the image should be 300 dpi.

1) How do I convert the captured camera image in iOS to 300 dpi?

2) What are the best contrast and brightness levels for Tesseract to give the best output?

3) Is there any other pre-processing step that I can apply to an image to get good accuracy?

4) What is the recommended image resolution for better accuracy?

5) I have used "int tesseract::TESSDLL_API::MeanTextConf" to get the confidence score. With this confidence score for each character, can I decide that a recognized character is accurate when its score is above some percentage? If I am wrong, can you please explain the use of the "MeanTextConf" method?

Answer 1:

I wrote several generic OCR blog posts on image pre-processing and "how OCR works best" some time ago. Please find them here: http://www.ocr-it.com/user-scenario-process-digital-camera-pictures-and-ocr-to-extract-specific-numbers

In general, getting high enough resolution should be the first step. A low-resolution image simply does not have enough information per letter to read characters reliably. Then I do adaptive binarization, where the image is converted to black & white using a threshold chosen so that backgrounds disappear and characters remain clear, without extra noise or holes in them. Then, optionally, you can segment the image into its various fields and process each field separately with specific settings, such as "digits only" for the number, "M|F" for the sex field, etc.
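
To make the adaptive binarization step more concrete, here is a minimal sketch in plain Python. It is not the code from the blog posts and not part of Tesseract: the function name `adaptive_binarize` and the `window` and `offset` parameters are made up for illustration, and the local-mean rule is just one common variant of adaptive thresholding. Each pixel is compared with the average brightness of its own neighbourhood rather than with one global value, so an unevenly lit background can still come out white.

```python
def adaptive_binarize(gray, window=15, offset=10):
    """Binarize a grayscale image with a local-mean threshold.

    gray   -- list of rows, each a list of ints in 0..255 (hypothetical input format)
    window -- side length of the square neighbourhood, in pixels (assumed default)
    offset -- how much darker than the local mean a pixel must be to count as ink
    Returns a new image of the same size containing only 0 (ink) and 255 (background).
    """
    height = len(gray)
    width = len(gray[0]) if height else 0

    # Integral image with an extra zero row and column, so the sum over any
    # rectangle can be read with four lookups.
    integral = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += gray[y][x]
            integral[y + 1][x + 1] = integral[y][x + 1] + row_sum

    half = window // 2
    result = []
    for y in range(height):
        # Clamp the neighbourhood to the image borders.
        y0, y1 = max(0, y - half), min(height, y + half + 1)
        out_row = []
        for x in range(width):
            x0, x1 = max(0, x - half), min(width, x + half + 1)
            area = (y1 - y0) * (x1 - x0)
            total = (integral[y1][x1] - integral[y0][x1]
                     - integral[y1][x0] + integral[y0][x0])
            # Ink if the pixel is darker than (local mean - offset);
            # both sides are multiplied by area to stay in integers.
            is_ink = gray[y][x] * area < total - offset * area
            out_row.append(0 if is_ink else 255)
        result.append(out_row)
    return result
```

For a real camera image you would first convert it to grayscale and then tune `window` to be somewhat larger than the stroke width of the characters; the defaults above are only placeholders.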