Image processing - rotate scanned document to alig

Posted 2019-02-19 06:14

Question:

I have an OCR C# project where I get a scanned document with text in it, and I need to return the text in the document.

I already have a solution for parsing the text; however, we are stuck on the case where the scanned document is rotated (to the right or to the left).

Suppose there is no noise in the image (all pixels are either white or black). Can anyone help us with an algorithm to straighten the image at runtime, without a human looking at it?

Thanks

Answer 1:

Use the Hough transform to detect the strongest line orientation, which should be the orientation of the text lines. The basic premise of the Hough transform is to map points from x-y coordinates into an r-theta parameter space, where r is a line's distance from the origin and theta is its orientation.
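For reference, the answer does not spell out the mapping, but the parameterization usually assumed for the Hough transform is the normal form of a line. Under that assumption, every black pixel (x, y) votes for all (r, theta) pairs satisfying:

```latex
r = x \cos\theta + y \sin\theta
```

Here theta is the angle of the line's normal, so pixels lying on one straight line all produce the same (r, theta) pair, and that bin collects the most votes.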

Once the image is transformed, combine the votes that share the same theta to find the strongest orientation.

Because this method votes into discrete r and theta bins, the angular resolution is only as fine as the number of theta bins used. So instead of covering -180 to +180 degrees in one-degree increments, you might want to restrict the range, either to get a more accurate angle or to gain speed.
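As a rough illustration of the voting described above, here is a minimal pure-Python sketch. It assumes the usual parameterization r = x*cos(theta) + y*sin(theta), a binary image given as a list of rows with 1 for black, and a skew that stays within a bounded range. The function name `estimate_skew_degrees`, its parameters, and the sum-of-squares scoring per angle are my assumptions, not something the answer specifies.

```python
import math

def estimate_skew_degrees(pixels, min_deg=-15.0, max_deg=15.0, step_deg=0.5):
    """Estimate the tilt of the dominant line direction in a binary image.

    pixels: list of rows, each a list of 0 (white) or 1 (black).
    With y growing downward (image convention), a positive result means
    the text lines slope downward to the right.
    """
    points = [(x, y) for y, row in enumerate(pixels)
              for x, value in enumerate(row) if value]
    best_angle, best_score = 0.0, -1
    steps = int(round((max_deg - min_deg) / step_deg))
    for i in range(steps + 1):
        skew = min_deg + i * step_deg
        # A line tilted by `skew` has its normal at skew + 90 degrees.
        theta = math.radians(skew + 90.0)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        votes = {}
        for x, y in points:
            r = int(round(x * cos_t + y * sin_t))  # discrete r bin
            votes[r] = votes.get(r, 0) + 1
        # Assumption: score an angle by the sum of squared bin counts, so
        # that several parallel text lines reinforce each other.
        score = sum(count * count for count in votes.values())
        if score > best_score:
            best_angle, best_score = skew, score
    return best_angle
```

Narrowing `min_deg`/`max_deg` or shrinking `step_deg` is the accuracy versus speed trade-off mentioned above: the cost grows with the number of candidate angles times the number of black pixels.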



Answer 2:

(I'm not an expert; I'm writing this post out of curiosity.)

IMHO, this problem can be solved cost-effectively with a brute-force, trial-and-error approach, because there are not many possible wrong orientations.

I think you can easily determine the bounding box of the text. This bounding box can be wrongly oriented in only two ways: rotated clockwise or rotated counterclockwise. So with at most two rotations of the image (rotations that make the bounding box upright) you can find the correct orientation.

That is, you could find the correct document orientation without further processing of the image to determine the text alignment. Determining the text alignment would be a rather expensive operation, I think.
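One possible way to "make the bounding box upright" is sketched below. This is my reading, not necessarily what the answer intends: it assumes the text forms a roughly rectangular block, tries a range of candidate rotations on the black-pixel coordinates, and keeps the one whose axis-aligned bounding box has the smallest area. The names `bounding_box_area` and `upright_correction` and the default search range are hypothetical.

```python
import math

def bounding_box_area(points, angle_deg):
    """Area of the axis-aligned box around `points` after rotating them
    by angle_deg about the origin."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    xs = [x * cos_a - y * sin_a for x, y in points]
    ys = [x * sin_a + y * cos_a for x, y in points]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))

def upright_correction(points, max_deg=15.0, step_deg=0.5):
    """Return the candidate rotation (in degrees) that gives the smallest
    bounding box. Assumes the true skew lies within +/- max_deg."""
    steps = int(round(2 * max_deg / step_deg))
    candidates = [-max_deg + i * step_deg for i in range(steps + 1)]
    return min(candidates, key=lambda d: bounding_box_area(points, d))
```

For a rectangular block the bounding-box area is smallest when the block is axis-aligned, which is why the minimum marks the upright position. It cannot tell an upright page from an upside-down one.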

UPDATE

I'm suggesting that we don't have to find the exact rotation angle. If the bounding box is upright, the page is either at the correct angle or rotated by 180 degrees.

1) make the bounding box upright
2) run OCR and check the result; if it is OK, we are done
3) rotate by 180 degrees
4) run OCR again; this time the page must be at the correct angle
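The steps above can be sketched as follows, assuming the image has already been made upright (step 1) and is held as a list of pixel rows. `run_ocr` and `looks_valid` are hypothetical callables standing in for whatever OCR engine and sanity check the project already has; they are not a real library API.

```python
def rotate_180(pixels):
    """Rotate a 2D image (list of rows) by 180 degrees."""
    return [row[::-1] for row in pixels[::-1]]

def ocr_with_flip_fallback(pixels, run_ocr, looks_valid):
    """Run OCR on an upright image; if the result fails the check,
    rotate by 180 degrees and run OCR once more.

    run_ocr:     hypothetical callable, image -> recognized text
    looks_valid: hypothetical callable, text -> bool (e.g. a dictionary
                 hit rate or an OCR confidence threshold)
    """
    text = run_ocr(pixels)
    if looks_valid(text):
        return text
    # The page was upside down, so the second pass sees it the right way up.
    return run_ocr(rotate_180(pixels))
```

The quality of this approach depends almost entirely on `looks_valid`: it has to reject the garbage that OCR produces on upside-down text.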

If we really have to find the exact rotation angle, I think it must start with finding the likely shapes of characters such as 'o', 'c', or 'm' (excluding italic fonts), or with finding the relative location of the period ('.'). I think this would require complicated operations.