How to implement Tesseract to run with project in

I have a C++ project in Visual Studio 2010 and wish to use OCR. I came across many "tutorials" for Tesseract but sadly, all I got was a headache and wasted time.

In my project I have an image stored as a Mat. One solution to my problem is to save this Mat as an image file (image.jpg, for example) and then call the Tesseract executable like this:

system("tesseract.exe image.jpg out");

This produces an output file, out.txt, and then I call

infile.open ("out.txt");

to read the output from Tesseract.
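
For reference, the file-based workaround described above can be sketched roughly as follows. The helper names (buildTesseractCommand, readTextFile, ocrViaExecutable) are made up for illustration, and the sketch assumes tesseract.exe can be found on the PATH and that the image has already been written to disk.

```cpp
#include <cstdlib>
#include <fstream>
#include <sstream>
#include <string>

// Builds the command line "tesseract.exe <image> <outputbase>".
// Tesseract itself appends ".txt" to the output base name.
std::string buildTesseractCommand(const std::string& imagePath,
                                  const std::string& outputBase) {
    return "tesseract.exe \"" + imagePath + "\" \"" + outputBase + "\"";
}

// Reads a whole text file into a string.
// Returns an empty string if the file cannot be opened.
std::string readTextFile(const std::string& path) {
    std::ifstream in(path.c_str());
    std::ostringstream buffer;
    buffer << in.rdbuf();
    return buffer.str();
}

// Runs the executable on an image saved on disk and returns the recognized
// text, or an empty string if the command reports a failure.
std::string ocrViaExecutable(const std::string& imagePath,
                             const std::string& outputBase) {
    const std::string command = buildTesseractCommand(imagePath, outputBase);
    if (std::system(command.c_str()) != 0) {
        return std::string();
    }
    return readTextFile(outputBase + ".txt");
}
```

Every call pays for a process launch plus two trips through the file system, which is the per-frame cost in question.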

It all works like a charm, but it is not an optimal solution. In my project I am processing video, so saving an image, launching the .exe, and then writing and reading a text file at 10+ FPS is not what I am looking for. I want to integrate Tesseract into my existing code so that I can pass a Mat as an argument and immediately get the result back as a String.

Do you know of any good tutorial (preferably step-by-step) for using Tesseract OCR with Visual Studio 2010? Or do you have your own solution?

Tags: c++ opencv ocr tesseract

3 Answers

我欲成王，谁敢阻挡

#2 · 2019-01-14 05:08

It has been a long time since the last reply, but this may help others:

I think you must also add "liblept168.lib" and "liblept168d.lib" to Additional Dependencies.
Copy "liblept168.dll" and "liblept168d.dll" to the folder that contains your exe.
Add #include to your code.

(This answer should have been a comment on Bruce's answer. Sorry for the confusion.)


混吃等死

#3 · 2019-01-14 05:16

OK, I figured it out, but it works only in the Release, Win32 configuration (not Debug or x64). The Debug configuration produces many linker errors.

So,

1. First, download the prepared library folder (Tesseract + Leptonica) here:

Mirror 1 (Google Drive)

Mirror 2 (MediaFire)

2. Extract tesseract.zip to C:\

3. In Visual Studio, go under C/C++ > General > Additional Include Directories

Insert C:\tesseract\include

4. Under Linker > General > Additional Library Directories

Insert C:\tesseract\lib

5. Under Linker > Input > Additional Dependencies

Add:

liblept168.lib
libtesseract302.lib
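
As a possible alternative to step 5, MSVC can also be told which libraries to link from the source file itself through #pragma comment(lib, ...). The fragment below is only a sketch: it assumes the two library names listed above, it still relies on steps 3 and 4 for the include and library directories, and the _MSC_VER guard is there only so that other compilers skip it.

```cpp
// MSVC-only alternative to listing the libraries under
// Linker > Input > Additional Dependencies.
#ifdef _MSC_VER
#pragma comment(lib, "liblept168.lib")
#pragma comment(lib, "libtesseract302.lib")
#endif
```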

Sample code should look like this:

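#include <tesseract/baseapi.h>
#include <leptonica/allheaders.h>
#include <cstdlib>
#include <iostream>
#include <string>

int main() {
    tesseract::TessBaseAPI api;

    // Init() returns 0 on success.
    if (api.Init("", "eng", tesseract::OEM_DEFAULT) != 0) {
        std::cerr << "Could not initialize Tesseract." << std::endl;
        return 1;
    }
    // Page segmentation mode 7: treat the image as a single line of text.
    api.SetPageSegMode(tesseract::PSM_SINGLE_LINE);
    api.SetOutputName("out");

    std::cout << "File name: ";
    std::string image;
    std::cin >> image;

    // ProcessPages() opens the file itself, so no separate pixRead() is needed.
    STRING text_out;
    if (!api.ProcessPages(image.c_str(), NULL, 0, &text_out)) {
        std::cerr << "Could not process " << image << std::endl;
        return 1;
    }

    std::cout << text_out.string();

    std::system("pause");  // Windows only: keeps the console window open
    return 0;
}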

For using Tesseract with OpenCV and Mat images, look HERE
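
A rough sketch of what passing a Mat to Tesseract in memory might look like. It relies on TessBaseAPI::SetImage (the raw-buffer overload) and GetUTF8Text, and assumes the Mat is 8-bit, single-channel; a colour frame would first have to be converted, for example with cv::cvtColor. The function name recognizeGray is hypothetical, and it is written as a template only so that the snippet stands on its own without the Tesseract and OpenCV headers. In a real project Api would be tesseract::TessBaseAPI and Image would be cv::Mat.

```cpp
#include <cstddef>
#include <string>

// Recognizes text in an 8-bit, single-channel image held in memory.
// Api is expected to behave like an already initialized tesseract::TessBaseAPI;
// Image is expected to behave like a cv::Mat of type CV_8UC1.
template <typename Api, typename Image>
std::string recognizeGray(Api& api, const Image& gray) {
    // SetImage(data, width, height, bytes_per_pixel, bytes_per_line):
    // grayscale uses 1 byte per pixel, and step is the row stride in bytes.
    api.SetImage(gray.data, gray.cols, gray.rows, 1,
                 static_cast<int>(gray.step));

    // GetUTF8Text() returns a heap-allocated buffer that the caller
    // has to release with delete[].
    char* raw = api.GetUTF8Text();
    std::string text = raw ? raw : "";
    delete[] raw;
    return text;
}
```

With the real headers included, the call would be something like recognizeGray(api, gray), with api initialized once before the video loop rather than once per frame.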


我只想做你的唯一

#4 · 2019-01-14 05:34

You need to use the library through the API.

Most probably:

Start by downloading the libs ( https://code.google.com/p/tesseract-ocr/downloads/detail?name=tesseract-3.02.02-win32-lib-include-dirs.zip&can=2&q= ). They were compiled with Visual Studio 2008, but that should be good enough.
Use the API directly (for an example, look at an open source project that uses it: https://code.google.com/p/qtesseract/source/browse/#svn%2Ftrunk%2Ftessdata ) and read the links from this answer: How can i use tesseract ocr(or any other free ocr) in small c++ project?

