I am working on an Android application that performs real-time OCR using OpenCV and the Tesseract library. The performance is very poor, even on my Galaxy S III. Are there any ways to improve it? Here is my code:
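Mat mGray = new Mat();
capture.retrieve(mGray);
Bitmap bmp = Bitmap.createBitmap(mGray.cols(), mGray.rows(), Bitmap.Config.ARGB_8888);
// createBitmap only allocates an empty bitmap; copy the frame into it
// (org.opencv.android.Utils) before handing it to Tesseract.
Utils.matToBitmap(mGray, bmp);
tessBaseApi.setImage(bmp);
String recognizedText = tessBaseApi.getUTF8Text();
Log.i("Reg", recognizedText);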
Does passing a Bitmap to the Tesseract API slow down recognition? What pre-processing should I perform before handing the image to Tesseract?
One thing to try is binarizing the image with adaptive thresholding (adaptiveThreshold in OpenCV), which copes with uneven lighting better than a single global threshold.
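To make it clearer what adaptive thresholding does, here is a minimal plain-Java sketch of the mean variant. In the app you would presumably call OpenCV's Imgproc.adaptiveThreshold on the Mat instead; the class and method names below are hypothetical, and the window is simply truncated at the image border, which is an assumption of this sketch rather than OpenCV's exact border handling. The sample row has a lighting gradient in which the text on the bright side (96) is lighter than the background on the dark side (92), so no single global threshold can separate the two.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of OpenCV or Tesseract.
class AdaptiveThresholdSketch {

    /**
     * Mean adaptive threshold on a grayscale image gray[row][col] with values 0..255.
     * A pixel becomes 255 if it is brighter than (local mean - c), otherwise 0.
     * The local window is blockSize x blockSize, truncated at the image border.
     */
    static int[][] adaptiveThresholdMean(int[][] gray, int blockSize, int c) {
        int rows = gray.length;
        int cols = gray[0].length;
        int r = blockSize / 2;

        // Integral image: sum[y][x] holds the sum of gray[0..y-1][0..x-1].
        long[][] sum = new long[rows + 1][cols + 1];
        for (int y = 0; y < rows; y++) {
            for (int x = 0; x < cols; x++) {
                sum[y + 1][x + 1] = gray[y][x] + sum[y][x + 1] + sum[y + 1][x] - sum[y][x];
            }
        }

        int[][] out = new int[rows][cols];
        for (int y = 0; y < rows; y++) {
            for (int x = 0; x < cols; x++) {
                int y0 = Math.max(0, y - r), y1 = Math.min(rows - 1, y + r);
                int x0 = Math.max(0, x - r), x1 = Math.min(cols - 1, x + r);
                long area = (long) (y1 - y0 + 1) * (x1 - x0 + 1);
                long total = sum[y1 + 1][x1 + 1] - sum[y0][x1 + 1] - sum[y1 + 1][x0] + sum[y0][x0];
                // pixel > mean - c, multiplied through by area to stay in integers.
                out[y][x] = (gray[y][x] * area > total - c * area) ? 255 : 0;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Background fades from 200 to 92; dark "text" pixels sit at columns 2 and 7.
        int[] row = {200, 188, 96, 164, 152, 140, 128, 116, 104, 92};
        int[][] gray = {row.clone(), row.clone(), row.clone()};
        for (int[] line : adaptiveThresholdMean(gray, 3, 10)) {
            System.out.println(Arrays.toString(line));
        }
    }
}
```

With a 3x3 window and c = 10, only the two text columns come out black in this sample; the block size and constant would need tuning for real camera frames.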
You can have Tesseract run only recognition pass 1, so that it skips passes 2 through 9
when it calls recog_all_words().
To do this, change the following line in baseapi.cpp
and rebuild your Tesseract library project:
if (tesseract_->recog_all_words(page_res_, monitor, NULL, NULL, 0)) {
Change it to:
if (tesseract_->recog_all_words(page_res_, monitor, NULL, NULL, 1)) {
Some things that might make it faster are:
- Select a smaller region of mGray that contains your text before calling createBitmap, so that the heavier methods that follow process a smaller image.
- Change Bitmap.Config.ARGB_8888 to Bitmap.Config.RGB_565: your image is grayscale, so it does not need an ARGB bitmap. Check that your Tesseract wrapper accepts this config before relying on it.
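To give a feel for how much the two suggestions above can save, here is a small plain-Java sketch. It crops a region of interest out of a row-major 8-bit grayscale buffer and compares pixel-buffer sizes at 4 bytes per pixel (ARGB_8888) and 2 bytes per pixel (RGB_565). The class, the method names, the 640x480 frame and the ROI coordinates are all made up for illustration; with OpenCV you would more likely take the region with Mat.submat(Rect) rather than copying bytes by hand.

```java
// Hypothetical illustration class; not part of OpenCV, Android or Tesseract.
class RoiCropSketch {

    /** Copies the w x h rectangle with top-left corner (x, y) out of a row-major grayscale buffer. */
    static byte[] crop(byte[] gray, int width, int height, int x, int y, int w, int h) {
        if (x < 0 || y < 0 || w <= 0 || h <= 0 || x + w > width || y + h > height) {
            throw new IllegalArgumentException("ROI lies outside the image");
        }
        byte[] out = new byte[w * h];
        for (int row = 0; row < h; row++) {
            // Copy one ROI row at a time from the source image.
            System.arraycopy(gray, (y + row) * width + x, out, row * w, w);
        }
        return out;
    }

    /** Size of a bitmap pixel buffer: ARGB_8888 uses 4 bytes per pixel, RGB_565 uses 2. */
    static long bitmapBytes(int w, int h, int bytesPerPixel) {
        return (long) w * h * bytesPerPixel;
    }

    public static void main(String[] args) {
        int width = 640, height = 480;          // assumed preview frame size
        byte[] frame = new byte[width * height];
        byte[] roi = crop(frame, width, height, 120, 200, 400, 80); // assumed text region

        System.out.println("ROI pixels: " + roi.length);
        System.out.println("Full frame, ARGB_8888: " + bitmapBytes(width, height, 4) + " bytes");
        System.out.println("ROI only,  RGB_565:   " + bitmapBytes(400, 80, 2) + " bytes");
    }
}
```

The smaller pixel count matters more than the bitmap config, since every later step (conversion, thresholding, recognition) scales with the number of pixels.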
Use multithreading, but make sure to create one TessBaseAPI instance per thread; don't share instances between threads. With N threads (N >= number of cores), throughput can scale up to roughly the number of cores, provided there are enough independent images to process.
What I do is create N threads, each of which creates its own TessBaseAPI object in its own context (in the run method) and waits for OCR requests in a loop until interrupted.
...
...
@Override
public void run() {
    // Each worker thread creates and configures its own TessBaseAPI instance.
    TessBaseAPI tessBaseApi = new TessBaseAPI();
    tessBaseApi.init(Ocrrrer.DATA_PATH, "eng");
    // Skip loading the dictionary DAWGs.
    setTessVariable(tessBaseApi, "load_system_dawg", "0");
    setTessVariable(tessBaseApi, "load_freq_dawg", "0");
    setTessVariable(tessBaseApi, "load_unambig_dawg", "0");
    setTessVariable(tessBaseApi, "load_punc_dawg", "0");
    setTessVariable(tessBaseApi, "load_number_dawg", "0");
    setTessVariable(tessBaseApi, "load_fixed_length_dawgs", "0");
    setTessVariable(tessBaseApi, "load_bigram_dawg", "0");
    // Further segmentation and correction settings.
    setTessVariable(tessBaseApi, "wordrec_enable_assoc", "0");
    setTessVariable(tessBaseApi, "tessedit_enable_bigram_correction", "0");
    setTessVariable(tessBaseApi, "assume_fixed_pitch_char_segment", "1");
    // Restrict recognition to the characters that can occur in the input.
    setTessVariable(tessBaseApi, TessBaseAPI.VAR_CHAR_WHITELIST, "1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ<");
    Log.d(TAG, "Training file loaded");
    while (!interrupted()) {
        reentrantLock.lock();
        try {
            Log.d(TAG, this.getName() + " wait for OCR");
            // Block until a new job is signalled.
            jobToDo.await();
            Log.d(TAG, this.getName() + " input arrived. Do OCR");
            this.ocrResult = doOcr(tessBaseApi);
            ocrDone.signalAll();
        } catch (InterruptedException e) {
            // Interrupted while waiting: stop this worker.
            return;
        } finally {
            try {
                reentrantLock.unlock();
            } catch (Exception ex) {
                // Ignored.
            }
        }
    }
}
...
...
You can see that the tessBaseApi object is local to the run method, so it is never shared between threads.
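For comparison, the same one-engine-per-thread rule can be sketched with a fixed thread pool and a ThreadLocal instead of the lock and condition loop above. This is a different technique from the code in this answer, not a rewrite of it. FakeEngine is a hypothetical stand-in for TessBaseAPI so that the sketch runs on a plain JVM (Java 8 or later); in the app, the ThreadLocal initializer would presumably create and init a real TessBaseAPI instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration class; not part of Tesseract or tess-two.
class PerThreadOcrSketch {

    /** Stand-in for an OCR engine that must not be shared between threads. */
    static class FakeEngine {
        private final Thread owner = Thread.currentThread();

        String recognize(String image) {
            // Fail loudly if an engine is ever used outside the thread that created it.
            if (owner != Thread.currentThread()) {
                throw new IllegalStateException("engine shared between threads");
            }
            return image.toUpperCase(Locale.ROOT); // placeholder for real OCR work
        }
    }

    // One engine per worker thread, created lazily on first use in that thread.
    static final ThreadLocal<FakeEngine> ENGINE = ThreadLocal.withInitial(FakeEngine::new);

    /** Runs every image through the pool and returns the results in input order. */
    static List<String> recognizeAll(List<String> images, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String image : images) {
                Callable<String> task = () -> ENGINE.get().recognize(image);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int threads = Runtime.getRuntime().availableProcessors();
        System.out.println(recognizeAll(Arrays.asList("abc", "def", "ghi"), threads));
    }
}
```

The pool's queue takes over the job hand-off that the lock and conditions do by hand, while the ThreadLocal keeps each engine confined to the thread that created it.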