Tesseract MacOS Error opening data file ./tessdata

Posted 2019-05-18 20:19

Question:

I installed Tesseract to do some OCR testing with Selenium WebDriver (Java).

This is my Maven dependency for Tess4J:

<dependency>
<groupId>net.sourceforge.tess4j</groupId>
<artifactId>tess4j</artifactId>
<version>2.0.0</version>
<scope>test</scope>
</dependency>

I installed Tesseract 3.03.00 via brew and set TESSDATA_PREFIX to the path

/usr/local/Cellar/tesseract/3.04.00/share/tessdata

However, when I ran the following command:

sudo find / -name tessdata 

I found the tessdata folder in four different locations:

/Users/<username>/Downloads/Tess4J/tessdata
/Users/<username>/tesseract-ocr/tessdata
/usr/local/Cellar/tesseract/3.04.00/share/tessdata
/usr/local/share/tessdata

Now I am not sure whether I have set TESSDATA_PREFIX correctly, because I get the following error when I run my JUnit test:

    Error opening data file ./tessdata/eng.traineddata
    Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory.
    Failed loading language 'eng'
    Tesseract couldn't load any languages!
    AdaptedTemplates != NULL:Error:Assert failed:in file adaptmatch.cpp, line 174

Answer 1:

TESSDATA_PREFIX should be set to the parent directory of tessdata, i.e., /usr/local/Cellar/tesseract/3.04.00/share/.
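
To check whether a given prefix satisfies this rule, something like the following sketch may help. It only uses the Java standard library; the class name TessdataPrefixCheck and the helper isValidPrefix are made up for illustration and are not part of Tesseract or Tess4J. It assumes the layout described in the error message, i.e. that the language file lives at <prefix>/tessdata/eng.traineddata.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Tesseract or Tess4J.
class TessdataPrefixCheck {

    // Returns true if <prefix>/tessdata/<lang>.traineddata is a regular file,
    // i.e. the prefix is the parent directory of tessdata.
    static boolean isValidPrefix(String prefix, String lang) {
        if (prefix == null || prefix.isEmpty()) {
            return false;
        }
        Path trained = Paths.get(prefix, "tessdata", lang + ".traineddata");
        return Files.isRegularFile(trained);
    }

    public static void main(String[] args) {
        // Reads the variable as seen by the JVM, which may differ from the shell
        // if the tests are launched from an IDE.
        String prefix = System.getenv("TESSDATA_PREFIX");
        System.out.println("TESSDATA_PREFIX = " + prefix);
        System.out.println("eng.traineddata found: " + isValidPrefix(prefix, "eng"));
    }
}
```

If the check fails for a prefix that ends in tessdata, the variable is probably pointing one level too deep.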



Answer 2:

I resolved this problem by setting the path programmatically with the method instance.setDatapath(parentPath).
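
With several tessdata folders on the machine, as in the question, one way to choose the value for parentPath is to probe the candidates and take the parent of the first one that really contains eng.traineddata. The sketch below is an assumption about how that could be done, not the code I actually used; DatapathResolver and findParentPath are hypothetical names, and the Tess4J call itself is left as a comment so the example runs without the library.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of Tess4J.
class DatapathResolver {

    // Returns the parent of the first tessdata directory that contains
    // <lang>.traineddata, or an empty Optional if none does.
    static Optional<Path> findParentPath(List<Path> tessdataDirs, String lang) {
        for (Path dir : tessdataDirs) {
            if (Files.isRegularFile(dir.resolve(lang + ".traineddata"))) {
                return Optional.ofNullable(dir.toAbsolutePath().getParent());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Two of the locations reported by the find command in the question.
        List<Path> candidates = Arrays.asList(
                Paths.get("/usr/local/Cellar/tesseract/3.04.00/share/tessdata"),
                Paths.get("/usr/local/share/tessdata"));

        Optional<Path> parent = findParentPath(candidates, "eng");
        System.out.println("parentPath: " + parent.map(Path::toString).orElse("none found"));

        // With Tess4J on the classpath, the value would then be handed over:
        //   instance.setDatapath(parent.get().toString());
    }
}
```

Whether the datapath should be the parent of tessdata or the tessdata folder itself may depend on the Tess4J and Tesseract versions in use, so it is worth trying both if the error persists.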

You might see a pop-up saying "The last time you opened java, it unexpectedly quit while reopening windows. Do you want to try to reopen its windows again?". Click "Reopen" and it will not appear again.

All my tests are running perfectly fine now.



Answer 3:

For those having problems with the path in Tesseract (which is likely to happen): I've seen that in most wrappers you can pass the path of tessdata as the first parameter when creating the instance.

In others, you can set it programmatically beforehand.

For Python and Tesserwrap, this is what I did: https://stackoverflow.com/a/38821791/2480481

I know this question is about macOS, but the key point of the answer is "put the path at the instance and pray"; it worked for me.



Tags: tesseract