Installed Tesseract to do some OCR testing with Selenium WebDriver (Java).
This is my Maven dependency for Tess4J:
<dependency>
<groupId>net.sourceforge.tess4j</groupId>
<artifactId>tess4j</artifactId>
<version>2.0.0</version>
<scope>test</scope>
</dependency>
Installed Tesseract 3.03.00 via brew and set TESSDATA_PREFIX to the path
/usr/local/Cellar/tesseract/3.04.00/share/tessdata
However, when I ran the following command
sudo find / -name tessdata
I found the tessdata folder in 4 different locations:
/Users/<username>/Downloads/Tess4J/tessdata
/Users/<username>/tesseract-ocr/tessdata
/usr/local/Cellar/tesseract/3.04.00/share/tessdata
/usr/local/share/tessdata
Now I am not sure whether I have set TESSDATA_PREFIX correctly, since I get the following error when I run my JUnit test:
Error opening data file ./tessdata/eng.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory.
Failed loading language 'eng'
Tesseract couldn't load any languages!
AdaptedTemplates != NULL:Error:Assert failed:in file adaptmatch.cpp, line 174
TESSDATA_PREFIX should be set to the parent directory of tessdata, i.e., /usr/local/Cellar/tesseract/3.04.00/share/.
For those having problems with the Tesseract path (which is likely to happen), I've seen that with some wrappers you can usually pass the path of tessdata as the first parameter when creating the instance.
With others you can set it programmatically beforehand.
For Python and Tesserwrap, this is what I did: https://stackoverflow.com/a/38821791/2480481
I know that's macOS, but the key point of the answer is "put the path at the instance and pray"; it worked for me.
This problem is resolved by setting the path programmatically using the method instance.setDatapath(parentPath);
You might see a pop-up saying "The last time you opened java, it unexpectedly quit while reopening windows. Do you want to try to reopen its windows again?". Click "Reopen" and it will not appear again.
All my tests are running perfectly fine now.
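Since the question lists several tessdata folders, it may help to check which parent directory actually contains tessdata/eng.traineddata before setting the data path programmatically. The following is only a rough sketch: the class name TessdataLocator and the method findTessdataParent are made up for illustration, the candidate list uses two of the parent directories from the find output above, and the Tess4J call is shown as a comment so the snippet compiles without Tess4J on the classpath.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of Tess4J or Tesseract.
public class TessdataLocator {

    /**
     * Returns the first candidate directory that contains
     * tessdata/<language>.traineddata, i.e. a usable parent of tessdata.
     */
    public static Optional<Path> findTessdataParent(List<Path> candidates, String language) {
        for (Path parent : candidates) {
            Path trained = parent.resolve("tessdata").resolve(language + ".traineddata");
            if (Files.isRegularFile(trained)) {
                return Optional.of(parent);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Parent directories of two of the tessdata folders reported by find.
        List<Path> candidates = Arrays.asList(
                Paths.get("/usr/local/Cellar/tesseract/3.04.00/share"),
                Paths.get("/usr/local/share"));

        Optional<Path> parent = findTessdataParent(candidates, "eng");
        if (parent.isPresent()) {
            System.out.println("Use as data path: " + parent.get());
            // With Tess4J available, the result would then be handed over, e.g.:
            //   ITesseract instance = new Tesseract();
            //   instance.setDatapath(parent.get().toString());
        } else {
            System.out.println("No candidate contains tessdata/eng.traineddata");
        }
    }
}
```

If none of the candidates matches, the "Error opening data file" message from the question is the expected outcome, so the check narrows the problem down to the path rather than the test itself.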