Reading text from image using Tesseract and OpenCV

Posted 2019-09-02 08:13

Question:

I'm trying to write a program that reads the information off a nutritional label, but Tesseract struggles to recognise almost any of the text. I've tried a number of image processing techniques in OpenCV, but none of them helps much.

Here are some of my better-looking attempts (which happen to be the simplest):

Tango bottle label unedited

Tango bottle label edited

Output:

200k], Saturates, 09

Irn Bru bottle label unedited

Irn Bru bottle label edited

Output:

The processing here is just a conversion to grey scale, followed by a 3x3 Gaussian blur and Otsu binarisation.
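For readers unfamiliar with the last step: Otsu binarisation picks the threshold automatically by maximising the between-class variance of the grey-level histogram. Below is a minimal plain-Java sketch of that idea, assuming 8-bit grey values (0 to 255) in a flat array. The class and method names are hypothetical, and this is not OpenCV's implementation, so the exact threshold it returns may differ from OpenCV's when several thresholds separate the classes equally well.

```java
final class OtsuSketch {
    /** Returns a threshold t chosen by Otsu's criterion; pixels > t count as foreground. */
    static int otsuThreshold(int[] pixels) {
        // Histogram of grey levels (values assumed to lie in 0..255).
        int[] hist = new int[256];
        for (int p : pixels) hist[p]++;

        long total = pixels.length;
        long sumAll = 0;
        for (int v = 0; v < 256; v++) sumAll += (long) v * hist[v];

        long weightBg = 0;     // number of pixels with value <= t
        long sumBg = 0;        // sum of those pixel values
        double bestVar = -1;
        int best = 0;
        for (int t = 0; t < 256; t++) {
            weightBg += hist[t];
            if (weightBg == 0) continue;      // no background pixels yet
            long weightFg = total - weightBg;
            if (weightFg == 0) break;         // no foreground pixels left
            sumBg += (long) t * hist[t];
            double meanBg = (double) sumBg / weightBg;
            double meanFg = (double) (sumAll - sumBg) / weightFg;
            // Between-class variance (up to a constant factor).
            double between = (double) weightBg * weightFg
                    * (meanBg - meanFg) * (meanBg - meanFg);
            if (between > bestVar) {
                bestVar = between;
                best = t;
            }
        }
        return best;
    }

    /** Maps every pixel above the threshold to 255 and the rest to 0. */
    static int[] binarize(int[] pixels, int t) {
        int[] out = new int[pixels.length];
        for (int i = 0; i < pixels.length; i++) {
            out[i] = pixels[i] > t ? 255 : 0;
        }
        return out;
    }
}
```

Because the threshold is global, a label with uneven lighting or a curved surface can still binarise poorly even when the threshold is optimal for the histogram as a whole.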

I would appreciate any help on how to make the text more readable using OpenCV or any other image processing library.

Would it be simpler to forgo Tesseract and use machine learning for this instead?

Answer 1:

First of all, read this StackOverflow answer on OCR preprocessing.

The most important steps described there are image binarization and image denoising.

Here is an example:

Original Image

Grey Scale

Unsharp Masking

Binarization

Now ready to apply OCR

Java code

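// `original` is the input Mat; the working Mats must exist before use
Mat grey = new Mat(), blur = new Mat(), unsharp = new Mat(), binary = new Mat();

// Grey scale (use COLOR_BGR2GRAY if the image was loaded with Imgcodecs.imread)
Imgproc.cvtColor(original, grey, Imgproc.COLOR_RGB2GRAY, 0);

// Unsharp masking: unsharp = 1.5 * grey - 0.5 * blur
Imgproc.GaussianBlur(grey, blur, new Size(0, 0), 3);
Core.addWeighted(grey, 1.5, blur, -0.5, 0, unsharp);

// Binarization with a fixed threshold of 127
Imgproc.threshold(unsharp, binary, 127, 255, Imgproc.THRESH_BINARY);

// Write the result as PNG with default settings; imwrite takes a path string
String ocrImage = "ocrImage.png";
Imgcodecs.imwrite(ocrImage, binary);

/*initialize OCR ...*/
// Assumes the Bytedeco JavaCPP bindings, where GetUTF8Text() returns a BytePointer
lept.PIX image = pixRead(ocrImage);
api.SetImage(image);
String ocrOutput = api.GetUTF8Text().getString();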
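To make the unsharp masking step more concrete, here is a small plain-Java sketch of the arithmetic on a single row of grey values: sharpened = (1 + amount) * original - amount * blurred, which with amount = 0.5 gives the 1.5 and -0.5 weights used above. It is only an illustration under simplifying assumptions: a 3-tap box blur stands in for the Gaussian blur, the data is one-dimensional, and the class and method names are hypothetical.

```java
final class UnsharpSketch {
    /** 3-tap box blur over one row of grey values; border pixels are replicated. */
    static double[] boxBlur3(double[] row) {
        int n = row.length;
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            double left = row[Math.max(i - 1, 0)];
            double right = row[Math.min(i + 1, n - 1)];
            out[i] = (left + row[i] + right) / 3.0;
        }
        return out;
    }

    /** Unsharp mask: (1 + amount) * original - amount * blurred, clamped to [0, 255]. */
    static double[] unsharp(double[] row, double amount) {
        double[] blurred = boxBlur3(row);
        double[] out = new double[row.length];
        for (int i = 0; i < row.length; i++) {
            double v = (1 + amount) * row[i] - amount * blurred[i];
            // Clamp like an 8-bit image would saturate.
            out[i] = Math.min(255, Math.max(0, v));
        }
        return out;
    }

    public static void main(String[] args) {
        // A soft edge between a dark region (90) and a lighter region (150).
        double[] row = {90, 90, 90, 150, 150, 150};
        System.out.println(java.util.Arrays.toString(unsharp(row, 0.5)));
    }
}
```

Flat regions are left unchanged, while the pixels on either side of the edge are pushed apart (the dark side gets darker, the light side lighter). That extra contrast at character edges is what helps the subsequent thresholding.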