I'm trying to install Tesseract-OCR on my server. I have added what I believe are the correct repos, but when I try to install it, the package is not found.
I tried adding rpmforge, but to no avail. Any ideas from somebody who has done this before or is familiar with adding and searching through repos?
I used these instructions, which worked correctly on CentOS:
Install Tesseract OCR libs from sources in CentOS
Download Leptonica and Tesseract sources:
$ wget http://www.leptonica.org/source/leptonica-1.69.tar.gz
$ wget https://tesseract-ocr.googlecode.com/files/tesseract-ocr-3.02.02.tar.gz
Configure, compile, install libs:
$ tar xzvf leptonica-1.69.tar.gz
$ cd leptonica-1.69
$ ./configure
$ make
$ sudo make install
$ cd ..
$ tar xzf tesseract-ocr-3.02.02.tar.gz
$ cd tesseract-ocr
$ ./autogen.sh
$ ./configure
$ make
$ sudo make install
$ sudo ldconfig
Download languages (english) and copy to tessdata folder:
$ wget http://tesseract-ocr.googlecode.com/files/tesseract-ocr-3.02.eng.tar.gz
$ tar xzf tesseract-ocr-3.02.eng.tar.gz
$ sudo cp tesseract-ocr/tessdata/* /usr/local/share/tessdata
and enjoy it ;)
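To confirm that the build above actually landed on the PATH, something like the following could be used. It is only a minimal sketch: `check_tesseract` is a name I made up, and it assumes a release that supports `--list-langs` (3.02 or later, as far as I know).

```shell
# Hypothetical helper: report whether tesseract is usable after installation.
check_tesseract() {
  if ! command -v tesseract >/dev/null 2>&1; then
    echo "tesseract not found on PATH"
    return 1
  fi
  # Print the first line of the version banner (some releases write it to stderr).
  tesseract --version 2>&1 | head -n 1
  # List the language files found in the tessdata folder.
  tesseract --list-langs 2>&1
}
```

Run `check_tesseract` after `sudo ldconfig`. If it reports a missing shared library instead of a version, the linker may not be searching /usr/local/lib on your system, which depends on your setup.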
I recommend trying to install from an RPM, available here: http://pkgs.org/download/tesseract
There are also several dependencies: libpng-devel, libjpeg-devel, libtiff-devel, zlib and leptonica.
The last two can also be found on the same RPM site.
This worked for me:
/usr/bin/yum --enablerepo epel-testing install tesseract.x86_64 tesseract-langpack-fra.noarch
Tesseract is not in the epel repository but in the epel-testing repo, which is not enabled by default.
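Since the question was also about searching through repos, here is a rough sketch of how you could check which repo offers the package before installing. The function name `find_tesseract_repo` and the `YUM` override variable are my own inventions for illustration. The repo ids `epel` and `epel-testing` are assumed to be configured already.

```shell
# Hypothetical helper: print the first repo that offers the tesseract package.
# YUM can be overridden (e.g. YUM=dnf); it defaults to yum.
find_tesseract_repo() {
  yum_cmd=${YUM:-yum}
  for repo in epel epel-testing; do
    # Look only in this one repo; the command exits non-zero if nothing matches.
    if "$yum_cmd" --disablerepo='*' --enablerepo="$repo" list available tesseract >/dev/null 2>&1; then
      echo "$repo"
      return 0
    fi
  done
  echo "none"
  return 1
}
```

If it prints `epel-testing`, the install command above with `--enablerepo epel-testing` should find the package. If it prints `none`, the repos are probably not set up, or the package is already installed and so no longer listed as available.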
I have written a bash script to install Tesseract 3.05 on Centos 7. This fetches and installs all dependencies, and also installs language files for English, Hindi, Bengali and Thai.
Code available on GitHub
https://github.com/EisenVault/install-tesseract-redhat-centos
Hope this helps.
Install Tesseract OCR libs from sources (updated as of 14 July 2018)
Download Leptonica and Tesseract sources:
$ wget http://www.leptonica.com/source/leptonica-1.76.0.tar.gz
$ wget https://sourceforge.net/projects/tesseract-ocr-alt/files/tesseract-ocr-3.02.02.tar.gz
Configure, compile, install Leptonica:
$ tar xzvf leptonica-1.76.0.tar.gz
$ cd leptonica-1.76.0
$ ./configure && make && sudo make install
Configure, compile, install Tesseract:
$ cd ..
$ tar xzf tesseract-ocr-3.02.02.tar.gz
$ cd tesseract-ocr
$ ./autogen.sh && ./configure && make && sudo make install && sudo ldconfig
Download language file:
I am downloading the English language file (eng.traineddata) here. You can see the complete list of language files at the link below and download the ones you need.
https://github.com/tesseract-ocr/tesseract/wiki/Data-Files#data-files-for-version-302
Download languages (english) and copy to tessdata folder:
$ wget https://sourceforge.net/projects/tesseract-ocr-alt/files/tesseract-ocr-3.02.eng.tar.gz
$ tar xzf tesseract-ocr-3.02.eng.tar.gz
$ sudo cp tesseract-ocr/tessdata/* /usr/local/share/tessdata
Now your Tesseract OCR is installed and ready to use!
Example:
$ tesseract /path/to/input/test.jpg /path/to/output/abc -l eng
Tesseract appends .txt to the output base, so this writes /path/to/output/abc.txt.
Enjoy!!!
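If you have a whole folder of images, the single command above can be wrapped in a loop. This is just a sketch under my own assumptions: the function names `ocr_output_base` and `ocr_dir` are made up, and it only picks up files ending in .jpg.

```shell
# Hypothetical helper: derive the output base from an image path,
# e.g. /path/to/input/test.jpg -> test
ocr_output_base() {
  name=$(basename "$1")
  printf '%s\n' "${name%.*}"
}

# Hypothetical helper: OCR every .jpg in a directory.
# Arguments: input dir, output dir, language code (defaults to eng).
ocr_dir() {
  in_dir=$1
  out_dir=$2
  lang=${3:-eng}
  mkdir -p "$out_dir"
  for img in "$in_dir"/*.jpg; do
    # Skip the unexpanded pattern when the directory has no .jpg files.
    [ -e "$img" ] || continue
    # tesseract adds the .txt extension to the output base itself.
    tesseract "$img" "$out_dir/$(ocr_output_base "$img")" -l "$lang"
  done
}
```

For example, `ocr_dir /path/to/input /path/to/output eng` would write one .txt file per image into the output folder.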
yum install --nogpgcheck tesseract
After installation, run the following command to test it:
tesseract --version