I have an image like this one:
and I would like to have a black number written on a white background so that I can use OCR to recognise it. How could I achieve that in Python?
Many thanks,
John.
If you just want to turn a white-on-black image to black-on-white, that's trivial; it's just invert:
from PIL import Image, ImageOps
img = Image.open('zero.jpg')
inverted = ImageOps.invert(img)
inverted.save('invzero.png')
If you also want to do some basic processing like increasing the contrast, see the other functions in the ImageOps module, like autocontrast. They're all pretty easy to use, but if you get stuck, you can always ask a new question. For more complex enhancements, look around the rest of PIL: ImageEnhance can be used to sharpen an image, ImageFilter can do edge detection and unsharp masking, and so on. You may also want to convert the image to greyscale (mode 'L', 8 bits per pixel) or even black and white (mode '1', 1 bit per pixel); both are handled by the Image.convert method.
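As a rough illustration of how those pieces might be chained, here is a minimal sketch that takes a light-on-dark image and returns a 1-bit black-on-white copy. The helper name prepare_for_ocr and the default cutoff of 128 are assumptions made for this example, not part of PIL, and the right cutoff will depend on your images.

```python
from PIL import Image, ImageOps


def prepare_for_ocr(img, threshold=128):
    """Return a black-on-white, 1-bit copy of a light-on-dark image.

    Hypothetical helper for illustration; `threshold` is an assumed cutoff.
    """
    grey = img.convert('L')                       # 8-bit greyscale
    inverted = ImageOps.invert(grey)              # light digits become dark
    stretched = ImageOps.autocontrast(inverted)   # spread values over 0..255
    # Pixels at or above the cutoff become white (255), the rest black (0).
    binary = stretched.point(lambda p: 255 if p >= threshold else 0)
    return binary.convert('1')                    # 1 bit per pixel
```

With the file names used above, this could be called as prepare_for_ocr(Image.open('zero.jpg')).save('invzero.png'). Thresholding before the final convert('1') avoids the dithering that convert applies by default to intermediate grey values.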
Of course you have to know what processing you want to do. One thing you might want to try is playing around with the image in Photoshop or GIMP and keeping track of what operations you do, then looking for how to implement those operations in PIL. (It might be simpler to just use gimp-fu scripting in the first place instead of trying to use PIL…)
You don't need to manipulate the image for OCR. For example, you could just use pytesser:
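from PIL import Image
from pytesser import image_to_string  # import only the function that is used
im = Image.open('wjNL6.jpg')
text = image_to_string(im)  # run OCR on the image as it is
print(text)  # parenthesised form works on both Python 2 and 3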
Output:
0