Installing Tesseract-OCR on CentOS 6

Posted 2019-01-19 15:46

Question:

I'm trying to install Tesseract-OCR on my server. I have added what I believe are the correct repos, but when I try to install it, the package is not found.

I tried adding rpmforge, but to no avail. Any ideas from somebody who has done this before or is familiar with adding and searching through repos?

Answer 1:

I used these instructions, which worked correctly on CentOS:

Install Tesseract OCR libs from sources on CentOS

Download the Leptonica and Tesseract sources:

$ wget http://www.leptonica.org/source/leptonica-1.69.tar.gz
$ wget https://tesseract-ocr.googlecode.com/files/tesseract-ocr-3.02.02.tar.gz

Configure, compile, install libs:

 $ tar xzvf leptonica-1.69.tar.gz      
 $ cd leptonica-1.69      
 $ ./configure
 $ make
 $ sudo make install

 $ cd ..
 $ tar xzf tesseract-ocr-3.02.02.tar.gz
 $ cd tesseract-ocr
 $ ./autogen.sh
 $ ./configure
 $ make
 $ sudo make install
 $ sudo ldconfig

Download the language data (English) and copy it to the tessdata folder:

$ wget http://tesseract-ocr.googlecode.com/files/tesseract-ocr-3.02.eng.tar.gz       
$ tar xzf tesseract-ocr-3.02.eng.tar.gz       
$ sudo cp tesseract-ocr/tessdata/* /usr/local/share/tessdata

and enjoy it ;)
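
Before starting the source build above, it can save time to confirm that the build tools are present. The following is a minimal sketch and not part of the original instructions: `missing_tools` is a hypothetical helper name, and the tool list in the usage comment (gcc, g++, make, autoconf, automake, libtool) is an assumption about what `./autogen.sh` and `./configure` typically need.

```shell
# missing_tools: print each named command that is NOT found on PATH.
# Hypothetical helper; the tool list passed to it is an assumption.
missing_tools() {
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Example use (not run automatically):
#   missing_tools gcc g++ make autoconf automake libtool
```

If the function prints anything, those tools need to be installed before `./configure` is likely to succeed.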



Answer 2:

I recommend installing from an RPM, available here: http://pkgs.org/download/tesseract. There are also several dependencies: libpng-devel, libjpeg-devel, libtiff-devel, zlib and leptonica. The last two can also be found on the same RPM site.
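
To see which of those dependencies are still missing before installing the RPM, something like the following sketch could be used. It relies on `rpm -q`, which exits non-zero when a package is not installed. `missing_rpms` is a hypothetical helper name; the package names in the usage comment are the ones listed in this answer.

```shell
# missing_rpms: print each package name that rpm reports as not installed.
# Hypothetical helper built on the exit status of `rpm -q`.
missing_rpms() {
    for pkg in "$@"; do
        # rpm -q exits non-zero when the package is absent
        if ! rpm -q "$pkg" >/dev/null 2>&1; then
            printf '%s\n' "$pkg"
        fi
    done
}

# Example use (not run automatically):
#   missing_rpms libpng-devel libjpeg-devel libtiff-devel zlib leptonica
```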



Answer 3:

This worked for me:

/usr/bin/yum --enablerepo epel-testing install tesseract.x86_64 tesseract-langpack-fra.noarch

tesseract is not in the epel repository but in the epel-testing repo, which is not enabled by default.
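
To check which repository actually carries the package before installing, a sketch along these lines may help. `repo_has_package` is a hypothetical helper. It assumes the repository IDs passed to it (for example `epel` and `epel-testing`) are already configured on the machine, and it only reports packages that are available but not yet installed.

```shell
# repo_has_package REPO PKG: succeed if yum lists PKG as available
# when REPO is the only enabled repository.
repo_has_package() {
    # Disable every repo, then re-enable just the one being probed
    yum --disablerepo='*' --enablerepo="$1" list available "$2" >/dev/null 2>&1
}

# Example use (not run automatically):
#   for repo in epel epel-testing; do
#       if repo_has_package "$repo" tesseract; then
#           echo "tesseract is available in $repo"
#       fi
#   done
```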



Answer 4:

I have written a bash script to install Tesseract 3.05 on CentOS 7. It fetches and installs all dependencies, and also installs the language files for English, Hindi, Bengali and Thai.

The code is available on GitHub:

https://github.com/EisenVault/install-tesseract-redhat-centos

Hope this helps.



Answer 5:

Install Tesseract OCR libs from sources (UPDATED as of 14th July 2018)

Download the Leptonica and Tesseract sources:

$ wget http://www.leptonica.com/source/leptonica-1.76.0.tar.gz

$ wget https://sourceforge.net/projects/tesseract-ocr-alt/files/tesseract-ocr-3.02.02.tar.gz

Configure, compile, install Leptonica:

$ tar xzvf leptonica-1.76.0.tar.gz
$ cd leptonica-1.76.0
$ ./configure && make && sudo make install

Configure, compile, install Tesseract:

$ cd ..
$ tar xzf tesseract-ocr-3.02.02.tar.gz
$ cd tesseract-ocr
$ ./autogen.sh && ./configure && make && sudo make install && sudo ldconfig

Download language file:

I am downloading the English language file (eng.traineddata) here. The complete list of language files is at the link below, so you can download whichever ones you need: https://github.com/tesseract-ocr/tesseract/wiki/Data-Files#data-files-for-version-302

Download the language data (English) and copy it to the tessdata folder:

$ wget https://sourceforge.net/projects/tesseract-ocr-alt/files/tesseract-ocr-3.02.eng.tar.gz
$ tar xzf tesseract-ocr-3.02.eng.tar.gz
$ sudo cp tesseract-ocr/tessdata/* /usr/local/share/tessdata
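
A quick check that the language data landed where Tesseract expects it can be sketched as follows. `has_lang` is a hypothetical helper. Its default directory is the `/usr/local/share/tessdata` path used in the copy command above, and the optional second argument exists only so the function can be pointed at a different directory.

```shell
# has_lang LANG [DIR]: succeed if DIR contains LANG.traineddata.
# DIR defaults to the tessdata path used by the source install.
has_lang() {
    dir="${2:-/usr/local/share/tessdata}"
    [ -f "$dir/$1.traineddata" ]
}

# Example use (not run automatically):
#   has_lang eng || echo "eng.traineddata is missing" >&2
```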

Now your Tesseract OCR is installed and ready to use! Example:

$ tesseract /path/to/input/test.jpg /path/to/output/abc -l eng

(Tesseract appends .txt to the output name itself, so this writes /path/to/output/abc.txt.)

Enjoy!!!
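
Tesseract treats its second argument as an output base name and appends `.txt` itself, so for batch jobs it is convenient to derive that base from the input file name. A minimal sketch, with `outbase_for` as a hypothetical helper name and the directories in the usage comment as placeholders:

```shell
# outbase_for INPUT OUTDIR: print OUTDIR/<input file name without its
# last extension>, suitable as tesseract's output base argument.
outbase_for() {
    name=$(basename "$1")
    # ${name%.*} strips the shortest trailing ".something"
    printf '%s/%s\n' "$2" "${name%.*}"
}

# Example use (not run automatically): OCR every JPEG in a directory.
#   for img in /path/to/input/*.jpg; do
#       tesseract "$img" "$(outbase_for "$img" /path/to/output)" -l eng
#   done
```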



Answer 6:


yum install --nogpgcheck tesseract

After installation, test it by entering the following command: tesseract --version
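
In a provisioning script the same check can be automated. The sketch below assumes the first line of the version output has the form `tesseract <version>`, and it merges stderr into stdout because some Tesseract releases print the version banner on stderr. `tess_version` is a hypothetical helper name.

```shell
# tess_version: print the version reported by tesseract, or fail if the
# tesseract command cannot be found.
tess_version() {
    command -v tesseract >/dev/null 2>&1 || return 1
    # 2>&1 because some releases write the version banner to stderr;
    # the version is assumed to be the second field of the first line
    tesseract --version 2>&1 | awk 'NR == 1 { print $2 }'
}

# Example use (not run automatically):
#   tess_version || echo "tesseract is not installed" >&2
```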