I am getting the error below from Tesseract for an image file of 5+ MB.
Tesseract Open Source OCR Engine v3.01 with Leptonica
Page 0
Image too large: (39667, 56133)
Error during processing.
Is there a limit on file size, or is there a parameter that resolves this issue?
I'd appreciate your help.
It's not the file size but the image dimensions (width and height in pixels) that exceed Tesseract's limits. I have no problem getting Tesseract to recognize a 16 MB image. Resize or rescale your image so that it fits within the limits, then try again.
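To make the resize suggestion concrete, here is a minimal sketch of how the target dimensions could be computed while preserving the aspect ratio. It assumes a per-side limit of 32767 pixels (the largest signed 16-bit value). `Size`, `kMaxSide` and `FitWithinLimit` are hypothetical names chosen for illustration and are not part of Tesseract or Leptonica. The resampling itself would still be done with whatever image tool or library you prefer.

```cpp
#include <algorithm>
#include <cassert>

// Assumed per-side limit: the largest signed 16-bit value.
constexpr int kMaxSide = 32767;

// Hypothetical helper type: image dimensions in pixels.
struct Size {
  int width;
  int height;
};

// Returns dimensions with (approximately) the same aspect ratio whose
// longer side does not exceed `limit`. Images already within the limit
// are returned unchanged. Integer arithmetic avoids rounding surprises.
Size FitWithinLimit(Size in, int limit = kMaxSide) {
  assert(in.width > 0 && in.height > 0 && limit > 0);
  const long long longest = std::max(in.width, in.height);
  if (longest <= limit) {
    return in;
  }
  Size out;
  // Scale both sides by limit / longest, rounding down, at least 1 px.
  out.width = static_cast<int>(
      std::max(1LL, static_cast<long long>(in.width) * limit / longest));
  out.height = static_cast<int>(
      std::max(1LL, static_cast<long long>(in.height) * limit / longest));
  return out;
}
```

For the image in the question, `FitWithinLimit({39667, 56133})` gives 23155 x 32767, which is a scale factor of roughly 0.58. Keep in mind that shrinking a scan this much reduces the effective resolution of the text, so splitting the page into tiles may be the better choice if the characters become too small.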
The maximum width and height are each 32767 pixels (MAX_INT16).
From the source code (file baseapi.cpp):
if (tesseract_->ImageWidth() > MAX_INT16 ||
    tesseract_->ImageHeight() > MAX_INT16) {
  tprintf("Image too large: (%d, %d)\n",
          tesseract_->ImageWidth(), tesseract_->ImageHeight());