I'm trying to figure out what technologies I would need to process images for characters.
Specifically, in this example, I need to extract the hashtag that is circled. You can see it here:
Any implementations would be of great assistance.
OCR works well with scanned documents. What you are referring to is text detection in general images, which requires other techniques (sometimes OCR is used as part of the flow).
I'm not aware of any "production ready" implementations.
For general information, try Google Scholar with: "text detection in images".
A specific method that worked well for me is the 'stroke width transform' (SWT). It's not hard to implement, and I believe there are also some implementations available online.
There are a few alternatives: Java OCR implementation
They mention the following tools:
And a few others.
This list of links can also be useful: http://www.javawhat.com/showCategory.do?id=2138003
Generally, this kind of task requires lots of trial and testing. The best tool probably depends much more on the profile of your input data than on anything else.
You can check this article: http://www.codeproject.com/Articles/196168/Contour-Analysis-for-Image-Recognition-in-C
It comes with the math theory and an implementation in C# with OpenCV (unfortunately, though there is not that much to rewrite if you decide to implement it in Java). You will have to use Visual Studio and rebuild against your OpenCV version if you would like to test it, but it is worth it.
It is possible to solve this problem with OpenCV + Tesseract, though I think there might be easier ways. OpenCV is an open source library used to build computer vision applications and Tesseract is an open source OCR engine.
Before we start, let me clarify something: that is not a circle, it's a rounded rectangle.
I'm sharing the source code of the application that I wrote to demonstrate how the problem can be solved, as well as some tips on what's going on. This answer is not supposed to educate anybody on digital image processing, and the reader is expected to have a minimal understanding of this field.
I will describe very briefly what the larger sections of the code do. Most of the next chunk of code came from squares.cpp, a sample application that ships with OpenCV to detect squares in images.
Ok, so our program begins at:
What grayscale looks like:
What binary looks like:
What blue looks like:
What binary looks like at this point:
What binary looks like at this point:
What output looks like:
Alright! We solved the first part of the problem which was finding the rounded rectangle. You can see in the image above that the rectangular shape was detected and green lines were drawn over the original image for educational purposes.
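For readers who want something to experiment with, here is a rough Python sketch of the detection stage described above: grayscale, binary threshold, contour extraction, then keeping only four-cornered convex shapes. It loosely follows the idea behind squares.cpp but it is not the original program. The function names, the Otsu threshold, the 2% polygon tolerance, the 0.3 cosine limit and the minimum area of 1000 pixels are all my assumptions, and a rounded rectangle may well need different values before it collapses to four corners.

```python
import math


def angle_cosine(p0, p1, p2):
    """Cosine of the angle at vertex p0 formed by the points p1 and p2."""
    dx1, dy1 = p1[0] - p0[0], p1[1] - p0[1]
    dx2, dy2 = p2[0] - p0[0], p2[1] - p0[1]
    denom = math.hypot(dx1, dy1) * math.hypot(dx2, dy2)
    return (dx1 * dx2 + dy1 * dy2) / (denom + 1e-10)


def is_rectangle_like(quad, max_cosine=0.3):
    """True if all four corners are close to 90 degrees (assumed tolerance)."""
    if len(quad) != 4:
        return False
    worst = 0.0
    for i in range(4):
        prev_pt, corner, next_pt = quad[i - 1], quad[i], quad[(i + 1) % 4]
        worst = max(worst, abs(angle_cosine(corner, prev_pt, next_pt)))
    return worst < max_cosine


def find_rectangles(image_path, min_area=1000.0):
    """Return the corner points of rectangle-like contours in an image file."""
    import cv2  # third-party (opencv-python), only needed for this function

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)

    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Otsu picks the threshold automatically; a fixed value may suit your input better.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # [-2] picks the contour list in both OpenCV 3.x and 4.x return conventions.
    contours = cv2.findContours(binary, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)[-2]

    found = []
    for contour in contours:
        epsilon = 0.02 * cv2.arcLength(contour, True)
        approx = cv2.approxPolyDP(contour, epsilon, True)
        quad = [tuple(int(v) for v in pt[0]) for pt in approx]
        if (len(quad) == 4
                and cv2.isContourConvex(approx)
                and cv2.contourArea(approx) > min_area
                and is_rectangle_like(quad)):
            found.append(quad)
    return found
```

The two geometry helpers are plain Python so they can be checked without OpenCV installed. If nothing is detected on your image, the threshold step and the polygon tolerance are the first things I would tune.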
The second part is much easier. It begins by creating a ROI (Region of Interest) in the original image so we can crop the image to the area inside the rounded rectangle. Once this is done, the cropped image is saved on the disk as a TIFF file, which is then fed to Tesseract to do its magic:
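A minimal sketch of the cropping step in Python, assuming the four corner points of the detected shape are already known. The helper names and the small inset (used to keep the border line itself out of the crop) are hypothetical choices of mine, not part of the original application.

```python
def bounding_roi(quad, img_w, img_h, inset=0):
    """Axis-aligned (x, y, width, height) box around the corner points.

    The box is shrunk by `inset` pixels on each side and clamped to the image.
    """
    xs = [p[0] for p in quad]
    ys = [p[1] for p in quad]
    left = max(min(xs) + inset, 0)
    top = max(min(ys) + inset, 0)
    right = min(max(xs) - inset, img_w)
    bottom = min(max(ys) - inset, img_h)
    if right <= left or bottom <= top:
        raise ValueError("empty region of interest")
    return left, top, right - left, bottom - top


def crop_and_save(image_path, quad, out_path="cropped.tiff", inset=3):
    """Crop the image to the ROI and write it to disk as a TIFF file."""
    import cv2  # third-party (opencv-python), only needed for this function

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)

    img_h, img_w = image.shape[:2]
    x, y, w, h = bounding_roi(quad, img_w, img_h, inset)
    crop = image[y:y + h, x:x + w]  # NumPy slicing: rows (y) first, then columns (x)

    if not cv2.imwrite(out_path, crop):
        raise OSError("could not write " + out_path)
    return out_path
```

For example, corners at (10, 20), (110, 20), (110, 60) and (10, 60) in a 200x100 image give the box (10, 20, 100, 40) with no inset.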
What crop looks like:
When this application finishes its job, it creates a file named cropped.tiff on the disk. Go to the command line and invoke Tesseract on the cropped image to detect the text in it. The command creates a file named out.txt with the detected text.
Tesseract has an API that you can use to add the OCR feature to your application.
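If you would rather drive the Tesseract step from a script than type it by hand, something like the following Python sketch should do. It assumes the tesseract executable is installed and on your PATH, and it relies on the command-line form "tesseract <image> <output base>", which writes the text to <output base>.txt. The function names are hypothetical, and the hashtag regex is just my guess at what counts as a hashtag for this question.

```python
import re
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base):
    """Argument list for: tesseract <image> <output base>."""
    return ["tesseract", str(image_path), str(output_base)]


def run_tesseract(image_path="cropped.tiff", output_base="out"):
    """Run the Tesseract CLI and return the text it wrote to <output base>.txt."""
    subprocess.run(build_tesseract_command(image_path, output_base), check=True)
    return Path(str(output_base) + ".txt").read_text(encoding="utf-8").strip()


def extract_hashtags(text):
    """Pull '#word' tokens out of the OCR output (assumed hashtag pattern)."""
    return re.findall(r"#\w+", text)


if __name__ == "__main__":
    # Requires cropped.tiff to exist and Tesseract to be installed.
    print(extract_hashtags(run_tesseract()))
```

OCR output is often noisy, so expect to clean up the result (for example a '#' read as 'H' or '4') before trusting the extracted tag.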
This solution is not robust, and you will probably have to make some changes here and there to make it work for other test cases.