Before marking this as a duplicate, please read the whole question first.
What I am able to do at present is as follows:
- To get image and crop the desired part for OCR.
- Process the image using tesseract and leptonica.
- When the document is cropped into chunks, i.e. 1 character per image, it gives 96% accuracy.
- If I don't do that, and the document has a white background with black text, it gives almost the same accuracy.
For example, if the input is like this photo:
Photo start
Photo end
What I want is to be able to get the same accuracy for this photo
without generating blocks.
The code I used to initialize tesseract and extract text from the image is as follows.
To initialize tesseract:
In the .h file:
tesseract::TessBaseAPI *tesseract;
uint32_t *pixels;
In the .m file:
tesseract = new tesseract::TessBaseAPI();
tesseract->Init([dataPath cStringUsingEncoding:NSUTF8StringEncoding], "eng");
tesseract->SetPageSegMode(tesseract::PSM_SINGLE_LINE);
tesseract->SetVariable("tessedit_char_whitelist", "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ");
tesseract->SetVariable("language_model_penalty_non_freq_dict_word", "1");
tesseract->SetVariable("language_model_penalty_non_dict_word", "1");
tesseract->SetVariable("tessedit_flip_0O", "1");
tesseract->SetVariable("tessedit_single_match", "0");
tesseract->SetVariable("textord_noise_normratio", "5");
tesseract->SetVariable("matcher_avg_noise_size", "22");
tesseract->SetVariable("image_default_resolution", "450");
tesseract->SetVariable("editor_image_text_color", "40");
tesseract->SetVariable("textord_projection_scale", "0.25");
tesseract->SetVariable("tessedit_minimal_rejection", "1");
tesseract->SetVariable("tessedit_zero_kelvin_rejection", "1");
To get text from the image:
- (void)processOcrAt:(UIImage *)image
{
[self setTesseractImage:image];
tesseract->Recognize(NULL);
char* utf8Text = tesseract->GetUTF8Text();
int conf = tesseract->MeanTextConf();
NSArray *arr = [[NSArray alloc]initWithObjects:[NSString stringWithUTF8String:utf8Text],[NSString stringWithFormat:@"%d%@",conf,@"%"], nil];
[self performSelectorOnMainThread:@selector(ocrProcessingFinished:)
withObject:arr
waitUntilDone:YES];
delete [] utf8Text; // GetUTF8Text() allocates with new[], so release with delete []
}
- (void)ocrProcessingFinished:(NSArray *)result
{
UIAlertView *alt = [[UIAlertView alloc]initWithTitle:@"Data" message:[result objectAtIndex:0] delegate:self cancelButtonTitle:nil otherButtonTitles:@"OK", nil];
[alt show];
}
But I don't get proper output for the number plate image: the result is either null or garbage data.
If I use the first image, i.e. a white background with black text, the output is 89 to 95% accurate.
Please help me out.
Any suggestions will be appreciated.
Update
Thanks to @jcesar for providing the link, and to @konstantin pribluda for the valuable information and guidance.
I am now able to convert images into (almost) proper black-and-white form, so the recognition is better for all images :)
I still need help with proper binarization of images. Any ideas will be appreciated.
I was able to get near-instant results with the demo photo provided, and it produced the correct letters.
I pre-processed the image using GPUImage
and then sent the processed image to TESS.
This left ' marks in place of the -, but these are easy to remove. Depending on your image set you may have to fine-tune it a bit, but it should get you moving in the right direction.
Let me know if you have problems using it. It's from a project I'm working on, and I didn't want to strip everything out or create a project from scratch for it.
I daresay that tesseract will be overkill for your purpose. You do not need dictionary matching to improve recognition quality (you do not have such a dictionary, though you may have a means to compute a checksum on the license number), and you have a font optimised for OCR. Best of all, you have markers (the orange and blue colour areas nearby are good) to find the region in the image.
In my OCR apps I use human-assisted area-of-interest retrieval (just an aiming-help overlay over the camera preview). Usually one uses something like a Haar cascade to locate interesting features such as faces. You may also calculate the centroid of the orange area, or just the bounding box of the orange pixels, simply by traversing the whole image and storing the leftmost / rightmost / topmost / bottommost pixels of a suitable colour.
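The bounding-box scan described above could look roughly like the following C++ sketch. It assumes a tightly packed RGBA buffer (4 bytes per pixel, row-major). The names `BoundingBox`, `isOrange` and `orangeBoundingBox` are made up for illustration, and the colour thresholds are guesses that would need tuning on real photos.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Axis-aligned bounding box in pixel coordinates (inclusive on all sides).
struct BoundingBox {
    int left, top, right, bottom;
    bool found;  // false when no pixel matched; the coordinates are then meaningless
};

// Hypothetical colour test: strong red, medium green, weak blue.
// These thresholds are illustrative assumptions, not measured values.
inline bool isOrange(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    return r > 180 && g > 80 && g < 190 && b < 100;
}

// Single pass over the image, tracking the leftmost / rightmost /
// topmost / bottommost pixel that passes the colour test.
BoundingBox orangeBoundingBox(const std::vector<std::uint8_t>& rgba,
                              int width, int height) {
    assert(width >= 0 && height >= 0);
    assert(rgba.size() >= static_cast<std::size_t>(width) * height * 4);
    BoundingBox box{width, height, -1, -1, false};
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const std::size_t i = (static_cast<std::size_t>(y) * width + x) * 4;
            if (!isOrange(rgba[i], rgba[i + 1], rgba[i + 2])) continue;
            if (x < box.left) box.left = x;
            if (x > box.right) box.right = x;
            if (y < box.top) box.top = y;
            if (y > box.bottom) box.bottom = y;
            box.found = true;
        }
    }
    return box;
}
```

A single stray orange pixel will stretch the box, so in practice one might require a minimum pixel count per row or column before accepting an edge.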
As for recognition itself, I would recommend using invariant moments (not sure whether they are implemented in tesseract, but you can easily port them from our Java project: http://sourceforge.net/projects/javaocr/).
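To give an idea of what invariant moments are, here is a minimal C++ sketch that computes the first two Hu moment invariants of a binary glyph mask. This is one common form of invariant moments and is an assumption on my part; it is not taken from javaocr, which may use a different set. `MomentFeatures` and `huMoments` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// First two Hu invariants: unchanged by translation and rotation,
// and approximately unchanged by scaling on a pixel grid.
struct MomentFeatures {
    double phi1;
    double phi2;
};

// mask: row-major, one byte per pixel, non-zero means "ink".
MomentFeatures huMoments(const std::vector<std::uint8_t>& mask,
                         int width, int height) {
    assert(width >= 0 && height >= 0);
    assert(mask.size() >= static_cast<std::size_t>(width) * height);
    // Raw moments up to order 1 give the area and the centroid.
    double m00 = 0.0, m10 = 0.0, m01 = 0.0;
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            if (mask[static_cast<std::size_t>(y) * width + x]) {
                m00 += 1.0; m10 += x; m01 += y;
            }
    if (m00 == 0.0) return {0.0, 0.0};  // empty glyph
    const double cx = m10 / m00, cy = m01 / m00;
    // Second-order central moments (taken about the centroid).
    double mu20 = 0.0, mu02 = 0.0, mu11 = 0.0;
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            if (mask[static_cast<std::size_t>(y) * width + x]) {
                const double dx = x - cx, dy = y - cy;
                mu20 += dx * dx; mu02 += dy * dy; mu11 += dx * dy;
            }
    // Normalise: eta_pq = mu_pq / m00^(1 + (p+q)/2), and p + q = 2 here.
    const double norm = m00 * m00;
    const double eta20 = mu20 / norm, eta02 = mu02 / norm, eta11 = mu11 / norm;
    const double d = eta20 - eta02;
    return {eta20 + eta02, d * d + 4.0 * eta11 * eta11};
}
```

A recogniser would compute such a feature vector for each segmented character and compare it with stored vectors for the known glyphs, for example by nearest distance. Two invariants are unlikely to separate a full alphabet, so a real classifier would use more features.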
I tried my demo app on a monitor image and it recognized the digits on the spot (it is not trained for characters).
As for binarisation (separating black from white), I would recommend the Sauvola method, as it gives the best tolerance to luminance changes (also implemented in our OCR project).
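For reference, a minimal C++ sketch of Sauvola thresholding on an 8-bit grayscale buffer follows. Each pixel is compared against T = m * (1 + k * (s / R - 1)), where m and s are the mean and standard deviation of the surrounding window. This is not the javaocr code; `sauvolaBinarize` is a hypothetical name, and the defaults (radius 7, k = 0.5, R = 128) are common starting values that will probably need tuning.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns a buffer of the same size with 255 for background and 0 for ink.
std::vector<std::uint8_t> sauvolaBinarize(const std::vector<std::uint8_t>& gray,
                                          int width, int height,
                                          int radius = 7, double k = 0.5,
                                          double R = 128.0) {
    assert(width > 0 && height > 0 && radius >= 0);
    assert(gray.size() >= static_cast<std::size_t>(width) * height);
    // Integral images of values and squared values, padded with one zero
    // row and column, so any window sum costs four lookups.
    const std::size_t stride = static_cast<std::size_t>(width) + 1;
    std::vector<double> sum(stride * (height + 1), 0.0);
    std::vector<double> sq(stride * (height + 1), 0.0);
    for (int y = 0; y < height; ++y) {
        double rowSum = 0.0, rowSq = 0.0;
        for (int x = 0; x < width; ++x) {
            const double v = gray[static_cast<std::size_t>(y) * width + x];
            rowSum += v;
            rowSq += v * v;
            sum[(y + 1) * stride + (x + 1)] = sum[y * stride + (x + 1)] + rowSum;
            sq[(y + 1) * stride + (x + 1)] = sq[y * stride + (x + 1)] + rowSq;
        }
    }
    std::vector<std::uint8_t> out(static_cast<std::size_t>(width) * height);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            // Window clamped to the image borders.
            const int x0 = std::max(0, x - radius);
            const int y0 = std::max(0, y - radius);
            const int x1 = std::min(width - 1, x + radius);
            const int y1 = std::min(height - 1, y + radius);
            const double area = static_cast<double>(x1 - x0 + 1) * (y1 - y0 + 1);
            auto box = [&](const std::vector<double>& I) {
                return I[(y1 + 1) * stride + (x1 + 1)] - I[y0 * stride + (x1 + 1)]
                     - I[(y1 + 1) * stride + x0] + I[y0 * stride + x0];
            };
            const double mean = box(sum) / area;
            const double var = std::max(0.0, box(sq) / area - mean * mean);
            const double threshold = mean * (1.0 + k * (std::sqrt(var) / R - 1.0));
            const std::size_t i = static_cast<std::size_t>(y) * width + x;
            out[i] = gray[i] > threshold ? 255 : 0;
        }
    }
    return out;
}
```

Because the threshold follows the local mean, a dim background can still come out white while the darker strokes on it come out black, which a single global threshold cannot do. The window should be at least as wide as a character stroke.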
Hi all, thanks for your replies. From all of them I was able to reach the following conclusion:
The above 4 steps are combined into one method, as below:
Note:
UPDATE :
Just replace the code of the above method (getRGBAsFromImage:) with this one. The result is the same, but it takes only 0.1 to 0.3 seconds.