I am having some problems with pytesseract. I need to configure Tesseract so that it reads single digits and accepts only numbers, because the number zero is often confused with the letter 'O'.
Like this:
target = pytesseract.image_to_string(im,config='-psm 7',config='outputbase digits')
Many thanks,
Niall
The reason you are having trouble is that character restriction does not work in version 4.0. You have to force legacy mode (oem 0) to make it limit the characters it finds. This is a bug in Tesseract that the team has not yet addressed.
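As a rough sketch of forcing legacy mode, the call could look something like the following. The name ocr_digits_legacy and the constant are made up for illustration, psm 7 is taken from the question, and I am assuming a Tesseract 4 install (which uses the double-dash --oem and --psm flags) with traineddata that still includes the legacy engine.

```python
# Config string that forces the legacy engine (oem 0) and restricts output to digits.
# Assumes Tesseract 4.x flag syntax; the installed traineddata must contain the legacy model.
LEGACY_DIGITS_CONFIG = "--oem 0 --psm 7 -c tessedit_char_whitelist=0123456789"


def ocr_digits_legacy(image):
    """Hypothetical helper: OCR an image with the legacy engine, digits only."""
    # Imported here so the config constant can be inspected without pytesseract installed.
    import pytesseract

    text = pytesseract.image_to_string(image, config=LEGACY_DIGITS_CONFIG)
    return text.strip()
```

You would call it with a PIL image, for example ocr_digits_legacy(Image.open("digits.png")), where the file name is just a placeholder.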
tesseract-4.0.0a supports the psm (page segmentation mode) option. If you want single character recognition, set psm = 10. And if your text consists of numbers only, you can set tessedit_char_whitelist=0123456789.
Here is a sample usage of image_to_string with multiple parameters.
Hope this helps.
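For reference, one way to pass several options at once is to join them into a single config string, since Python does not allow the config keyword to be given twice in one call. This is only a sketch: build_digit_config and read_digits are hypothetical helper names, not part of pytesseract, and the flag syntax assumes Tesseract 4 (3.x used a single dash, as in -psm).

```python
def build_digit_config(psm=10, oem=None, whitelist="0123456789"):
    """Hypothetical helper: combine several Tesseract options into one config string."""
    parts = [f"--psm {psm}"]  # page segmentation mode (10 = single character)
    if oem is not None:
        parts.append(f"--oem {oem}")  # OCR engine mode (0 = legacy engine)
    parts.append(f"-c tessedit_char_whitelist={whitelist}")  # allowed characters
    return " ".join(parts)


def read_digits(image, psm=10, oem=None):
    """Hypothetical helper: run image_to_string with the combined options."""
    # Imported here so build_digit_config can be used without pytesseract installed.
    import pytesseract

    config = build_digit_config(psm=psm, oem=oem)
    return pytesseract.image_to_string(image, config=config).strip()
```

For a whole line of digits rather than a single character, read_digits(im, psm=7) would keep the psm 7 setting from the question.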