I'm trying to write a program that reads the information from a nutritional label, but Tesseract is struggling to read anything at all. I've tried a number of different image processing techniques using OpenCV, but nothing seems to help much.
Here are some of my better-looking attempts (which also happen to be the simplest):
Tango bottle label unedited
Tango bottle label edited
Output:
200k], Saturates, 09
Irn Bru bottle label unedited
Irn Bru bottle label edited
Output:
The processing is just converting the images to greyscale, applying a 3x3 Gaussian blur, and then Otsu binarisation.
I would appreciate any help on how to make the text more readable using OpenCV or any other image processing library.
Would it be simpler to forgo Tesseract and use machine learning for this?