I need a programmatic way of taking a scanned image (let's assume PNG or any other convenient image format) and breaking it up into many smaller images. The scanned image is a grid, and the boxes of the grid will always be the same size and in the same relative location. Because the image is scanned, they are not necessarily in the same absolute location. Each box contains a character; ideally, I'd like to save each character as its own image file, without any of the box border.
I prefer PHP and ImageMagick, which I think will be the right combination of tools. However, I'm flexible if there's a much better way to do it.
Here is the start of an algorithmic approach to the problem...
I'm using this image I created for test purposes, called box.jpg, with dimensions of 352x232 pixels:
The goal is to identify the red box and extract the 'Dave' picture.
My algorithmic approach would be as follows:
1. Scale the picture to one that has the original width, but a height of only 1 pixel; at the same time, convert it to grayscale and increase the contrast; then use the textual description of each pixel's properties that ImageMagick can emit. This way you should be able to find the two spots where the vertical red line pixels accumulated to the extreme color value. (The horizontal red line pixels together with the gray letter pixels will have a more 'average' color value.)
2. Do the same in the other direction: scale the picture to one that has the original height, but a width of only 1 pixel (convert to grayscale, increase the contrast, use the textual description... yadda-yadda). You'll find the two spots where the horizontal red line pixels accumulated to the extreme color value. (The vertical red line pixels combined with the gray letter pixels will have a more 'average' color value.)
3. Identify the location of each of the color value peaks in each of the two results: this will give you the geometry of the sub-image to extract from the original.
4. Extract the sub-image from the original. Crop each side as needed.
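The idea behind the first three steps can be sketched in plain Python, as a stand-in for what ImageMagick's scaling does rather than a replacement for it. Everything here is an assumption for illustration: the function names, the tiny synthetic image, and the threshold of 150. Averaging a column corresponds to squashing the image to a height of 1 pixel; averaging a row corresponds to squashing it to a width of 1 pixel.

```python
def column_means(img):
    """Average each column: comparable to scaling the image to a height of 1."""
    height = len(img)
    return [sum(row[x] for row in img) / height for x in range(len(img[0]))]


def row_means(img):
    """Average each row: comparable to scaling the image to a width of 1."""
    return [sum(row) / len(row) for row in img]


def dark_runs(profile, threshold):
    """Return (start, end) index pairs of consecutive values below threshold."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value < threshold and start is None:
            start = i                      # a dark run begins here
        elif value >= threshold and start is not None:
            runs.append((start, i - 1))    # the run ended on the previous index
            start = None
    if start is not None:                  # run reaches the end of the profile
        runs.append((start, len(profile) - 1))
    return runs


# Synthetic 12x8 grayscale image (assumption): white background (255),
# a dark box border (76) spanning x=2..9 and y=1..6, one gray "letter" pixel.
WIDTH, HEIGHT = 12, 8
img = [[255] * WIDTH for _ in range(HEIGHT)]
for x in range(2, 10):
    img[1][x] = img[6][x] = 76
for y in range(1, 7):
    img[y][2] = img[y][9] = 76
img[3][5] = 128

col_lines = dark_runs(column_means(img), threshold=150)
row_lines = dark_runs(row_means(img), threshold=150)
print(col_lines)  # [(2, 2), (9, 9)]
print(row_lines)  # [(1, 1), (6, 6)]
```

Columns and rows that only cross the border or the letter stay well above the threshold, which is why only the full-length lines show up as runs. On a real scan the threshold would need tuning, since skew and noise flatten the peaks.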
I can't elaborate the complete algorithm in detail, but here are the commands I'd use for steps 1 and 2.
Command for Step 1
Result for Step 1
This is the content of columns.txt:
(Note: It is a bit confusing that ImageMagick calls the color value #FFFFFF sometimes white and sometimes gray(255,255,255), and likewise calls #000000 sometimes black and sometimes gray(0,0,0)... Maybe a bug? Anyway, it doesn't block us here...)
Command for Step 2
Result for Step 2
This is the content of rows.txt (this time I dropped the confusing color names):
From these two results we can reliably conclude:
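Reading such listings by eye works for one image; for a batch, the positions can be pulled out programmatically. Below is a minimal sketch, assuming 8-bit grayscale output in which every pixel line carries a six-digit hex value. The function name and the sample text are made up for illustration and are not the actual content of columns.txt or rows.txt. Because only the hex value is read, the inconsistent color names do not matter.

```python
import re

# Matches pixel lines such as "12,0: (76,76,76)  #4C4C4C  gray(76,76,76)".
PIXEL_LINE = re.compile(r"^(\d+),(\d+):.*?#([0-9A-Fa-f]{6})")


def parse_pixels(text):
    """Return {(x, y): gray_level} from pixel enumeration text.

    Assumes 8 bits per channel and a grayscale image, so the first two hex
    digits carry the gray level. Trailing color names are ignored.
    """
    pixels = {}
    for line in text.splitlines():
        match = PIXEL_LINE.match(line)
        if match:  # the header line and blank lines do not match
            x, y, hex_value = match.groups()
            pixels[(int(x), int(y))] = int(hex_value[:2], 16)
    return pixels


# Made-up sample in the style of the listings (assumption, not real output).
sample = """# ImageMagick pixel enumeration: 4,1,255,gray
0,0: (255,255,255)  #FFFFFF  white
1,0: (76,76,76)  #4C4C4C  gray(76,76,76)
2,0: (255,255,255)  #FFFFFF  gray(255,255,255)
3,0: (0,0,0)  #000000  black
"""

levels = parse_pixels(sample)
darkest_first = sorted(levels, key=levels.get)
print(darkest_first[:2])  # [(3, 0), (1, 0)]
```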
Hence, our command to cut the sub-image from the original one could be:
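The geometry for such a crop follows directly from the positions of the border lines. The sketch below builds a geometry string in ImageMagick's WxH+X+Y notation, which is the form its -crop option accepts. The function name, the input format, and the border positions are all hypothetical; they are not the values derived from box.jpg.

```python
def crop_geometry(col_lines, row_lines):
    """Build a WxH+X+Y geometry string for the area inside the box border.

    col_lines and row_lines each hold two (start, end) index pairs: the
    left/right and the top/bottom border lines (hypothetical input format).
    """
    (_, left_end), (right_start, _) = col_lines
    (_, top_end), (bottom_start, _) = row_lines
    x = left_end + 1             # first column after the left border
    y = top_end + 1              # first row after the top border
    width = right_start - x      # columns strictly between the two borders
    height = bottom_start - y    # rows strictly between the two borders
    return f"{width}x{height}+{x}+{y}"


# Hypothetical border positions, each line 3 pixels thick.
print(crop_geometry([(40, 42), (300, 302)], [(20, 22), (200, 202)]))
# 257x177+43+23
```

Starting one pixel past each border line is what keeps the box border out of the extracted image.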
Or, to make the image's dimensions easier to distinguish from the white background of this web page:
Resulting image:
You can now apply OCR on the image: