I want to recognize the numbers in the following image.
I am currently using the Tess4J library in an Eclipse Java project, but it only recognizes characters on a plain-color background. For this image, it could not even detect that there are characters (numbers) in it. Please help me find a way to accomplish this task.
Here is my current code:
import net.sourceforge.tess4j.*;
import java.io.File;

public class Main {
    public static void main(String[] args) {
        File imageFile = new File("image.png");
        Tesseract instance = Tesseract.getInstance();
        try {
            String result = instance.doOCR(imageFile);
            System.out.println(result);
        } catch (TesseractException e) {
            System.err.println(e.getMessage());
        }
    }
}
I would also like to know whether there is a way to count the squares separated by the yellow lines.
Thank you.
If your image is representative, then all you need as a first step is a binarization at a threshold close to the maximum value, followed by discarding small components.
f = Import["http://i.stack.imgur.com/6AXwH.jpg"]
step1 = SelectComponents[Binarize[ColorConvert[f, "Grayscale"], 0.9],
"Count", #1 > 100 &]
Now, if you know that the digits cannot be too tall or too thin (this depends on the image dimensions), then you can filter the remaining components based on their bounding boxes.
SelectComponents[step1, "BoundingBox",
And[10 < #[[2, 1]] - #[[1, 1]] < 100, 50 < #[[2, 2]] - #[[1, 2]] < 100] &]
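Since the question is about a Java project, here is a rough sketch of the same two filtering steps (threshold near white, drop small components, keep components whose bounding box is digit-sized) using only the JDK. The class and method names are made up for illustration, the grayscale weights and the 8-connectivity are assumptions, and the numeric limits simply mirror the ones used above, so they will need tuning for other image sizes.

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; names and signatures are illustrative only.
final class DigitCandidates {

    /** Marks pixels whose gray level (scaled to 0..1) exceeds the threshold. */
    static boolean[][] binarize(BufferedImage img, double threshold) {
        int w = img.getWidth(), h = img.getHeight();
        boolean[][] mask = new boolean[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int rgb = img.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF, g = (rgb >> 8) & 0xFF, b = rgb & 0xFF;
                // Assumption: Rec. 601 luma weights stand in for the grayscale conversion.
                double gray = (0.299 * r + 0.587 * g + 0.114 * b) / 255.0;
                mask[y][x] = gray > threshold;
            }
        }
        return mask;
    }

    /** Bounding boxes of 8-connected components that have more than minCount pixels. */
    static List<Rectangle> largeComponentBoxes(boolean[][] mask, int minCount) {
        int h = mask.length, w = mask[0].length;
        boolean[][] seen = new boolean[h][w];
        List<Rectangle> boxes = new ArrayList<>();
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                if (!mask[y][x] || seen[y][x]) continue;
                // Iterative flood fill that tracks pixel count and extent.
                int count = 0, minX = x, maxX = x, minY = y, maxY = y;
                ArrayDeque<int[]> stack = new ArrayDeque<>();
                stack.push(new int[] {x, y});
                seen[y][x] = true;
                while (!stack.isEmpty()) {
                    int[] p = stack.pop();
                    count++;
                    minX = Math.min(minX, p[0]);
                    maxX = Math.max(maxX, p[0]);
                    minY = Math.min(minY, p[1]);
                    maxY = Math.max(maxY, p[1]);
                    for (int dy = -1; dy <= 1; dy++) {
                        for (int dx = -1; dx <= 1; dx++) {
                            int nx = p[0] + dx, ny = p[1] + dy;
                            if (nx < 0 || ny < 0 || nx >= w || ny >= h) continue;
                            if (mask[ny][nx] && !seen[ny][nx]) {
                                seen[ny][nx] = true;
                                stack.push(new int[] {nx, ny});
                            }
                        }
                    }
                }
                if (count > minCount) {
                    boxes.add(new Rectangle(minX, minY, maxX - minX + 1, maxY - minY + 1));
                }
            }
        }
        return boxes;
    }

    /** Keeps boxes with minW < width < maxW and minH < height < maxH (in pixels). */
    static List<Rectangle> digitSized(List<Rectangle> boxes,
                                      int minW, int maxW, int minH, int maxH) {
        List<Rectangle> kept = new ArrayList<>();
        for (Rectangle r : boxes) {
            if (r.width > minW && r.width < maxW && r.height > minH && r.height < maxH) {
                kept.add(r);
            }
        }
        return kept;
    }
}
```

With an image loaded through `ImageIO.read`, something like `digitSized(largeComponentBoxes(binarize(img, 0.9), 100), 10, 100, 50, 100)` would give candidate digit rectangles. Each one could then be cropped with `img.getSubimage(...)` and handed to the OCR engine; check the Tess4J Javadoc of your version for a `doOCR` overload that accepts a `BufferedImage`.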
To separate each of the regions, you could consider using a color space with a channel dedicated to yellow. CMYK is a possibility here, and again all you need is a threshold at a high value, together with basic morphological closings to complete the lines (since in your example the lines do not extend to the border of the image). Instead of using morphological closings here, you could detect the lines using the Hough transform or RANSAC, for example.
rects = Closing[
Closing[Binarize[ColorSeparate[f, "CMYK"][[3]], 0.9],
ConstantArray[1, {1, 15}]], ConstantArray[1, {15, 1}]] (* left image *)
Colorize[MorphologicalComponents[ColorNegate[rects]],
ColorFunction -> "Rainbow"] (* right image *)
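The region-counting step could be sketched in plain Java along the following lines. This is only an approximation of the code above: the yellow channel comes from a naive RGB-to-CMYK formula (an assumption, and one that also rates very dark pixels with little blue as yellow, so an extra brightness check may be needed), the closings use a centered line of 15 pixels, pixels outside the image are ignored so that lines ending near the border get extended to it, and the background regions are counted with 4-connectivity. All names are hypothetical.

```java
import java.awt.image.BufferedImage;
import java.util.ArrayDeque;

// Hypothetical helper class; names and signatures are illustrative only.
final class YellowGrid {

    /** Marks pixels whose naive CMYK yellow component (0..1) exceeds the threshold. */
    static boolean[][] yellowMask(BufferedImage img, double threshold) {
        int w = img.getWidth(), h = img.getHeight();
        boolean[][] mask = new boolean[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int rgb = img.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF, g = (rgb >> 8) & 0xFF, b = rgb & 0xFF;
                int max = Math.max(r, Math.max(g, b));
                // Naive CMYK: K = 1 - max/255 and Y = (1 - b/255 - K) / (1 - K) = (max - b) / max.
                double yellow = max == 0 ? 0.0 : (max - b) / (double) max;
                mask[y][x] = yellow > threshold;
            }
        }
        return mask;
    }

    /** Dilates (grow = true) or erodes (grow = false) with a centered line of length 2 * half + 1. */
    static boolean[][] lineMorph(boolean[][] in, int half, boolean horizontal, boolean grow) {
        int h = in.length, w = in[0].length;
        boolean[][] out = new boolean[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                boolean value = !grow; // dilation starts at false, erosion at true
                for (int k = -half; k <= half; k++) {
                    int nx = horizontal ? x + k : x;
                    int ny = horizontal ? y : y + k;
                    if (nx < 0 || ny < 0 || nx >= w || ny >= h) continue; // ignore outside pixels
                    if (in[ny][nx] == grow) { value = grow; break; }
                }
                out[y][x] = value;
            }
        }
        return out;
    }

    /** Morphological closing: dilation followed by erosion with the same line element. */
    static boolean[][] close(boolean[][] in, int half, boolean horizontal) {
        return lineMorph(lineMorph(in, half, horizontal, true), half, horizontal, false);
    }

    /** Counts 4-connected regions of pixels that are not part of a line. */
    static int countRegions(boolean[][] lines) {
        int h = lines.length, w = lines[0].length;
        boolean[][] seen = new boolean[h][w];
        int[][] steps = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};
        int regions = 0;
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                if (lines[y][x] || seen[y][x]) continue;
                regions++;
                ArrayDeque<int[]> stack = new ArrayDeque<>();
                stack.push(new int[] {x, y});
                seen[y][x] = true;
                while (!stack.isEmpty()) {
                    int[] p = stack.pop();
                    for (int[] s : steps) {
                        int nx = p[0] + s[0], ny = p[1] + s[1];
                        if (nx < 0 || ny < 0 || nx >= w || ny >= h) continue;
                        if (!lines[ny][nx] && !seen[ny][nx]) {
                            seen[ny][nx] = true;
                            stack.push(new int[] {nx, ny});
                        }
                    }
                }
            }
        }
        return regions;
    }
}
```

Mirroring the order used above, `countRegions(close(close(yellowMask(img, 0.9), 7, true), 7, false))` would give the number of squares, where a half-length of 7 corresponds to the 15-pixel structuring elements.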
The tools used here are very simple, and almost any image processing library will provide them. There are also more robust approaches that could be taken, but for the given image they are not needed.