I've just realized that if I run OCR only on the regions that contain text, it is a lot faster. So what I did was detect the text regions in the image and then run OCR on each of them. This is the result of the "detecting text regions" step using OpenCV (I used it to draw the rectangles on the image):
The only remaining problem is that I can't arrange the text results in the order they appear in the original image. In this case, it should be:
circle oval triangle square trapezium
diamond rhombus parallelogram rectangle pentagon
hexagon heptagon octagon nonagon decagon
Some other cases:
Basically, any other image that has text on it.
So I'm trying to sort the array of rectangles (origin point, width and height) and then rearrange the text associated with each of them.
Further information
I don't know if it's necessary, but here is the code I used:
How I detected the text regions
+ (NSMutableArray *)detectLetters:(UIImage *)image
{
    cv::Mat img;
    UIImageToMat(image, img);
    if (img.channels() != 1) {
        NSLog(@"NOT A GRAYSCALE IMAGE! CONVERTING TO GRAYSCALE.");
        cv::cvtColor(img, img, CV_BGR2GRAY);
    }

    // The array of detected text regions (rectangles)
    NSMutableArray *array = [[NSMutableArray alloc] init];

    cv::Mat img_gray = img, img_sobel, img_threshold, element;

    // Edge detection
    cv::Sobel(img_gray, img_sobel, CV_8U, 1, 0, 3, 1, 0, cv::BORDER_DEFAULT);
    cv::threshold(img_sobel, img_threshold, 0, 255, CV_THRESH_OTSU + CV_THRESH_BINARY);

    // Close small gaps between characters so each text line becomes one blob
    element = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(17, 3));
    cv::morphologyEx(img_threshold, img_threshold, CV_MOP_CLOSE, element);

    // Find outer contours (mode 0 = RETR_EXTERNAL, method 1 = CHAIN_APPROX_NONE)
    std::vector<std::vector<cv::Point> > contours;
    cv::findContours(img_threshold, contours, 0, 1);

    std::vector<std::vector<cv::Point> > contours_poly(contours.size());
    for (size_t i = 0; i < contours.size(); i++) {
        if (contours[i].size() > 50) {
            cv::approxPolyDP(cv::Mat(contours[i]), contours_poly[i], 3, true);
            cv::Rect appRect(cv::boundingRect(cv::Mat(contours_poly[i])));
            // Keep only regions that are wider than tall (likely text lines)
            if (appRect.width > appRect.height) {
                [array addObject:[NSValue valueWithCGRect:CGRectMake(appRect.x, appRect.y, appRect.width, appRect.height)]];
            }
        }
    }
    return array;
}
This is the OCR process (using Tesseract):
NSMutableArray *arr = [STOpenCV detectLetters:img];
CFTimeInterval totalStartTime = CACurrentMediaTime();
NSMutableString *res = [[NSMutableString alloc] init];
for (int i = 0; i < arr.count; i++) {
    NSLog(@"\n-------------\nPROCESSING REGION %d/%lu", i + 1, (unsigned long)arr.count);
    // Restrict OCR to the region detected in the previous step
    tesseract.rect = [[arr objectAtIndex:i] CGRectValue];
    CFTimeInterval startTime = CACurrentMediaTime();
    NSLog(@"Start to recognize: %f", startTime);
    [tesseract recognize];
    NSString *result = [tesseract recognizedText];
    NSLog(@"Result: %@", result);
    [res appendString:result];
    CFTimeInterval elapsedTime = CACurrentMediaTime() - startTime;
    NSLog(@"FINISHED: %f", elapsedTime);
}
What you want is to sort the array of rects by row first, comparing their vertical centers (y + height/2, since the rect origin is the top-left corner), and then by x (left to right) for rects whose centers fall on the same text line.
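A minimal C++ sketch of that reading-order sort, using a plain `Rect` struct standing in for `cv::Rect`/`CGRect` (top-left origin). Note that "same line, then x" is not a strict weak ordering for a single comparator, so this version first sorts by vertical center, then groups rects into lines with a height-based tolerance, and finally sorts each line left to right:

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

// Stand-in for cv::Rect / CGRect: origin (x, y) is the top-left corner.
struct Rect { int x, y, width, height; };

static int centerY(const Rect &r) { return r.y + r.height / 2; }

// Sort rects into reading order: group into lines by vertical center
// (tolerance = half the rect height), then sort each line by x.
std::vector<Rect> sortReadingOrder(std::vector<Rect> rects) {
    // 1. Sort everything top to bottom by vertical center.
    std::sort(rects.begin(), rects.end(), [](const Rect &a, const Rect &b) {
        return centerY(a) < centerY(b);
    });
    // 2. Walk the sorted list, extending the current line while the next
    //    rect's center stays within half a rect-height of the line anchor.
    size_t i = 0;
    while (i < rects.size()) {
        size_t j = i + 1;
        while (j < rects.size() &&
               std::abs(centerY(rects[j]) - centerY(rects[i])) <
                   std::max(rects[i].height, rects[j].height) / 2) {
            ++j;
        }
        // 3. Within the line [i, j), sort left to right.
        std::sort(rects.begin() + i, rects.begin() + j,
                  [](const Rect &a, const Rect &b) { return a.x < b.x; });
        i = j;
    }
    return rects;
}
```

The same grouping can be done on the `NSValue`-wrapped `CGRect`s before the OCR loop, so the recognized strings are appended in reading order.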