I am working on an Android app that requires OCR. I decided to use the Tesseract API, but I keep getting this error:
E/Tesseract(native): Could not initialize Tesseract API with language=eng!
- I have already copied the file "eng.traineddata" to the location.
- I am using Android Studio 2.1.2 (SDK 23)
- Testing on a device with API 22, Android Lollipop 5.1.1 (I have read about the permission issue on Marshmallow)
Here is the code I am using:
public void reads(View view) {
TextView textView = (TextView) findViewById(R.id.textView);
int rotation = 0;
try {
ExifInterface exifInterface = new ExifInterface(mCurrentPhotoPath);
int orientation = exifInterface.getAttributeInt(ExifInterface.TAG_ORIENTATION,ExifInterface.ORIENTATION_NORMAL);
switch (orientation){
case ExifInterface.ORIENTATION_ROTATE_90: rotation = 90; break;
case ExifInterface.ORIENTATION_ROTATE_180: rotation = 180; break;
case ExifInterface.ORIENTATION_ROTATE_270: rotation = 270; break;
}
} catch (Exception e) {
// EXIF data could not be read; keep rotation = 0
}
int w = imageBitmap.getWidth();
int h = imageBitmap.getHeight();
if (rotation != 0) {
Matrix matrix = new Matrix();
matrix.preRotate(rotation);
imageBitmap = Bitmap.createBitmap(imageBitmap,0,0,w,h,matrix,false);
} else {
imageBitmap = Bitmap.createBitmap(imageBitmap,0,0,w,h);
}
imageBitmap = imageBitmap.copy(Bitmap.Config.ARGB_8888,true);
TessBaseAPI ReadIt = new TessBaseAPI();
ReadIt.init("/storage/emulated/0/","eng");
ReadIt.setImage(imageBitmap);
String Text = ReadIt.getUTF8Text();
if (Text!=null) textView.setText(Text);
}
I have added this line to the dependencies in my build.gradle:
compile 'com.rmtheis:tess-two:6.0.2'
Also, I downloaded "eng.traineddata" and placed it directly in a folder named tessdata inside the directory stated above.
Are you using tess-two? In your code:
TessBaseAPI ReadIt = new TessBaseAPI();
ReadIt.init("/storage/emulated/0/","eng");
the "/storage/emulated/0/" path should point to the parent directory of your data files. That directory must contain a subdirectory named "tessdata", and the trained data files go inside it. See
https://github.com/rmtheis/tess-two/blob/d7a45fd2e08b7ec315cd1e29d1a7e0c72fb24a66/tess-two/src/com/googlecode/tesseract/android/TessBaseAPI.java#L176
Read more at:
Could not initialize Tesseract API with language=eng!
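To make the directory layout concrete, here is a minimal sketch in plain Java of a check you could run before calling init. The class and method names (TessDataCheck, expectedTrainedData, isUsable) are hypothetical helpers, not part of tess-two, and the sketch assumes the layout described above: the data path is the parent of a "tessdata" folder that holds "<lang>.traineddata".

```java
import java.io.File;

// Hypothetical helper, not part of tess-two.
final class TessDataCheck {

    // Builds the location where the trained data is assumed to live:
    // <dataPath>/tessdata/<lang>.traineddata
    static File expectedTrainedData(String dataPath, String lang) {
        File tessdataDir = new File(dataPath, "tessdata");
        return new File(tessdataDir, lang + ".traineddata");
    }

    // True only if the file exists, is a regular readable file, and is not empty.
    // This cannot tell whether the file matches the engine version.
    static boolean isUsable(String dataPath, String lang) {
        File f = expectedTrainedData(dataPath, lang);
        return f.isFile() && f.canRead() && f.length() > 0;
    }
}
```

If isUsable("/storage/emulated/0/", "eng") returns false, the file is most likely in the wrong folder or the app cannot read it, so it is worth logging the path returned by expectedTrainedData and comparing it with where the file actually is.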
If you don't use Marshmallow and still have the problem, try cleaning and rebuilding the project.
On Marshmallow, declare the permissions in the manifest and also request them at runtime in the Activity:
In manifest:
<uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE"/>
<uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE"/>
In onCreate:
if (ContextCompat.checkSelfPermission(this,
Manifest.permission.WRITE_EXTERNAL_STORAGE)
!= PackageManager.PERMISSION_GRANTED) {
// Should we show an explanation?
if (ActivityCompat.shouldShowRequestPermissionRationale(this,
Manifest.permission.WRITE_EXTERNAL_STORAGE)) {
// Explain to the user why storage access is needed, then request the permission again
} else {
ActivityCompat.requestPermissions(this,
new String[]{Manifest.permission.WRITE_EXTERNAL_STORAGE},
1);
}
}
I had this same issue, and the problem was that Marshmallow (API 23) requires your app to request read/write permission to storage at runtime. This blog post solved my problem.
In my Main Activity I have the following:
@Override
protected void onCreate(Bundle savedInstanceState) {
...
...
getStorageAccessPermissions(); // Request storage read/write permissions from the user
}
@TargetApi(23)
private void getStorageAccessPermissions() {
int hasWriteStoragePermission = checkSelfPermission(Manifest.permission.WRITE_EXTERNAL_STORAGE);
if (hasWriteStoragePermission != PackageManager.PERMISSION_GRANTED) {
requestPermissions(new String[] {Manifest.permission.WRITE_EXTERNAL_STORAGE}, REQUEST_CODE_WRITE_EXTERNAL_PERMISSIONS);
}
}
Where REQUEST_CODE_WRITE_EXTERNAL_PERMISSIONS is an integer constant declared as a field of the Activity class.
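Note that requestPermissions is asynchronous: the user's answer arrives later in the Activity's onRequestPermissionsResult(int requestCode, String[] permissions, int[] grantResults) callback, so Tesseract should only be initialized from external storage once the permission has actually been granted. A minimal sketch of the check you might do inside that callback follows. PermissionResults and allGranted are hypothetical names, and the constant is a local copy of PackageManager.PERMISSION_GRANTED so the snippet compiles without the Android SDK.

```java
// Hypothetical helper; in an Activity you would use PackageManager.PERMISSION_GRANTED directly.
final class PermissionResults {

    // Local mirror of android.content.pm.PackageManager.PERMISSION_GRANTED (value 0).
    static final int PERMISSION_GRANTED = 0;

    // Returns true only if every requested permission was granted.
    // An empty or null array is treated as "not granted", since a cancelled
    // request delivers empty result arrays.
    static boolean allGranted(int[] grantResults) {
        if (grantResults == null || grantResults.length == 0) {
            return false;
        }
        for (int result : grantResults) {
            if (result != PERMISSION_GRANTED) {
                return false;
            }
        }
        return true;
    }
}
```

In the callback you would first compare requestCode with REQUEST_CODE_WRITE_EXTERNAL_PERMISSIONS, and only run the OCR setup when allGranted(grantResults) is true.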
In a class that I have extending TessBaseAPI I added the following just for logging purposes to make sure that I actually can access the storage.
/* Checks if external storage is available to at least write to and returns the path name */
private static String isExternalStorageWritable() {
String state = Environment.getExternalStorageState();
String retval = "External storage is not writable";
if (Environment.MEDIA_MOUNTED.equals(state)) {
retval = Environment.getExternalStorageDirectory().toString();
}
return retval;
}
/* Checks if external storage is available to at least read from and returns the path name */
private static String isExternalStorageReadable() {
String state = Environment.getExternalStorageState();
String retval = "External storage is not readable";
if (Environment.MEDIA_MOUNTED.equals(state) ||
Environment.MEDIA_MOUNTED_READ_ONLY.equals(state)) {
retval = Environment.getExternalStorageDirectory().toString();
}
return retval;
}
tess-two doesn't use the newest version of the OCR engine; it uses Tesseract 3.05, so we have to use the trained data from here. The newer data appears to use a different, neural-network-based model, whereas the models before 4.0 worked differently.
I tried using the data from here and here. Those data sets are only compatible with the newest version of Tesseract, 4.0 (source), so they won't work if you are using an older version of Tesseract.
Newer versions of tess-two check to make sure that the training data files can be found on the device. If those training data files are not found, a more informative message than the error message you're seeing will be shown.
So when you see this error message on newer versions of tess-two, it means that the training data files were found in the expected location, but they are the wrong version or are otherwise unreadable. Check to make sure you're using the right version of the training data files.