Extracting lines from an image to feed to OCR - Te

Posted 2019-03-21 15:50

Question:

I was watching this talk from PyCon (http://youtu.be/B1d9dpqBDVA?t=15m34s). Around the 15:33 mark, the speaker talks about extracting lines from an image (a receipt) and then feeding them to the OCR engine so that the text can be extracted more accurately.

I have a similar need: I'm passing images to an OCR engine. However, I don't quite understand what he means by extracting lines from an image. What are some open-source tools that I can use to extract lines from an image?

Answer 1:

Take a look at the technique used to detect the skew angle of a text.

Groups of lines are used to isolate the text in the image (this is the interesting part).

From this result you can easily detect the upper and lower limits of each line of text. The text itself will be located between them. I've faced a similar problem before; the code might be useful to you.

All you need to do from here is crop each pair of lines and feed that as an image to Tesseract.
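The cropping step can be sketched roughly as follows. This is not the code the answer refers to: it swaps in a plain horizontal projection profile (checking each pixel row for ink) instead of Hough-based line detection, and it assumes the image is already deskewed and binarized with 1 for ink and 0 for background. The names `find_text_line_bands`, `crop_bands`, and `min_height` are made up for illustration.

```python
def find_text_line_bands(binary_rows, min_height=2):
    """Return (top, bottom) row ranges, bottom exclusive, for each run of rows with ink.

    binary_rows: a sequence of pixel rows, each a sequence of 0/1 values (1 = ink).
    Assumption: the page is deskewed, so text lines are separated by empty rows.
    """
    bands = []
    start = None  # first row of the band currently being scanned, if any
    for y, row in enumerate(binary_rows):
        has_ink = any(row)
        if has_ink and start is None:
            start = y  # upper limit of a new text line
        elif not has_ink and start is not None:
            if y - start >= min_height:  # ignore bands too thin to be text
                bands.append((start, y))
            start = None
    # Close a band that runs to the bottom edge of the image.
    if start is not None and len(binary_rows) - start >= min_height:
        bands.append((start, len(binary_rows)))
    return bands


def crop_bands(binary_rows, bands):
    """Slice out one sub-image per (top, bottom) band."""
    return [binary_rows[top:bottom] for top, bottom in bands]


# Tiny synthetic "page": two text lines separated by blank rows.
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 1, 1, 0, 0, 0],
]
bands = find_text_line_bands(page)
crops = crop_bands(page, bands)
```

The same functions should also work on a 2-D NumPy array, since they only iterate over rows and slice them. Each crop could then be handed to the OCR engine, for example through `pytesseract.image_to_string` if the pytesseract wrapper is installed.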



Answer 2:

I can suggest a simple technique for feeding images to OCR. First perform some operations to get the ROI (Region of Interest) of your image, then binarize it and localize the area containing the text. Next, find the contours; by choosing a threshold value and a minimum contour area, you can keep just the regions you need and feed the resulting image to the OCR engine. (Sorry for the poor explanation.)



Answer 3:

Direct answer: you extract lines from an image with the Hough Transform. You can find an analytical guide here. Text lines can be detected as well. Karlphillip's answer is also based on the Hough Transform.