In my experience, OCR libraries tend to merely output the text found within an image but not where the text was found. Is there an OCR library that outputs both the words found within an image as well as the coordinates (x, y, width, height) where those words were found?
You may also take a look at the Gamera framework (http://gamera.informatik.hsnr.de/). It is a set of tools that allows you to build your own OCR engine. Nevertheless, the fastest way is to use the hOCR (http://en.wikipedia.org/wiki/HOCR) output of Tesseract or OCRopus.
I'm using TessNet (a Tesseract C# wrapper) and I'm getting word coordinates with the following code:
You can use the hocr "configfile" with tesseract like so:
This will output a mostly HTML5 document with elements like:
While I'm pretty sure that's not how you're supposed to use XML, I found it easier than digging into the tesseract API.
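For anyone who wants to pull the coordinates out of that hOCR output programmatically, here is a minimal Python sketch using only the standard library. It assumes the words are marked with the class ocrx_word and carry a title attribute of the form "bbox x0 y0 x1 y1", which is what the hOCR format describes; older Tesseract builds may use a different class name. The SAMPLE markup, the HocrWordParser class and the extract_words helper are made up for illustration and are not part of any library.

```python
import re
from html.parser import HTMLParser

# Matches the "bbox x0 y0 x1 y1" property inside an hOCR title attribute.
BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")

# Hand-written sample in the shape of hOCR output (illustrative values only).
SAMPLE = """
<span class='ocr_line' title='bbox 36 92 618 130'>
  <span class='ocrx_word' id='word_1' title='bbox 36 92 96 116; x_wconf 95'>Hello</span>
  <span class='ocrx_word' id='word_2' title='bbox 109 92 185 116; x_wconf 91'>world</span>
</span>
"""


class HocrWordParser(HTMLParser):
    """Collects (text, x, y, width, height) for each hOCR word element."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._tag = None    # tag name of the word element being read, if any
        self._bbox = None   # (x0, y0, x1, y1) of that word
        self._text = []

    def handle_starttag(self, tag, attrs):
        if self._tag is not None:
            return  # already inside a word; nested markup is just skipped
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        match = BBOX_RE.search(attrs.get("title") or "")
        # Assumption: words use the class "ocrx_word".
        if "ocrx_word" in classes and match:
            self._tag = tag
            self._bbox = tuple(int(v) for v in match.groups())
            self._text = []

    def handle_data(self, data):
        if self._tag is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Assumption: a word element does not contain another element
        # with the same tag name, so the first matching end tag closes it.
        if self._tag is None or tag != self._tag:
            return
        x0, y0, x1, y1 = self._bbox
        # hOCR stores two corners; convert to x, y, width, height.
        self.words.append(("".join(self._text).strip(), x0, y0, x1 - x0, y1 - y0))
        self._tag = None


def extract_words(hocr):
    """Return a list of (text, x, y, width, height) tuples."""
    parser = HocrWordParser()
    parser.feed(hocr)
    return parser.words


for word in extract_words(SAMPLE):
    print(word)
```

Note that hOCR gives the top-left and bottom-right corners, so the width and height asked for in the question have to be computed by subtraction.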
P.S. I realize that several comments and answers allude to this solution, but none of them actually show how to use the hocr option or describe the output you get from that.
For Java developers:
I recommend using Tesseract with Tess4j.
You can find an example of how to locate words in an image in one of the Tess4j tests:
https://github.com/nguyenq/tess4j/blob/master/src/test/java/net/sourceforge/tess4j/TessAPITest.java#L449-L517
hOCR is one of the output formats of the Tesseract OCR engine. It contains each word together with its coordinates, plus some additional information such as the confidence level of the word recognition.
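As a rough illustration of where that confidence lives: in hOCR the properties of a word sit in its title attribute, separated by semicolons, and Tesseract reports the word confidence there as x_wconf. The sketch below splits such a title string into its properties. The function names parse_title and word_confidence are hypothetical helpers, and the example string is made up.

```python
def parse_title(title):
    """Split an hOCR title attribute into {property_name: [values, ...]}."""
    props = {}
    for part in title.split(";"):
        fields = part.split()
        if fields:
            props[fields[0]] = fields[1:]
    return props


def word_confidence(title):
    """Return the x_wconf value as a float, or None when it is absent."""
    values = parse_title(title).get("x_wconf")
    return float(values[0]) if values else None


example = "bbox 36 92 96 116; x_wconf 95"  # illustrative title string
print(parse_title(example))
print(word_confidence(example))
```

With this you could, for example, drop words whose confidence falls below a threshold of your choosing before using their coordinates.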
Most commercial OCR engines will return word and character coordinate positions, but you have to work with their SDKs to extract the information. Even Tesseract OCR will return position information, but it has not been easy to get to. Version 3.01 will make it easier, but a DLL interface is still being worked on.
Unfortunately, most free OCR programs use Tesseract OCR in its basic form and they only report the raw ASCII results.
www.transym.com - Transym OCR - outputs coordinates. www.rerecognition.com - Kasmos engine returns coordinates.
Also Caere Omnipage, Mitek, Abbyy, Charactell return character positions.