OCR (Optical Character Recognition) for on-screen

Posted 2019-06-25 12:17

Question:

I'm trying to create a piece of software that automates a PC by capturing a screenshot and then running OCR (Optical Character Recognition) on it to find, for example, a particular button to click. I've got the mouse and keyboard control part working, but now I need an OCR engine to process the screenshot. What I've discovered is that Tesseract OCR does not seem to work very well with on-screen text: either the text is too small, or some characters appear to be connected, for example K and X. How should I go about this?

P.S. This is for an automated test program.

Answer 1:

I am not sure if this really fits the bill for you, but some of the better OCR that I have seen in automation is done by Tevron's CitraTest. It has a library of fonts included, and if a font set is not present, they will create a new one based on your submissions. Negative factors with this tool would be cost and the usual issues related to variable screen resolution.



Answer 2:

Perhaps look at this question on image enhancement prior to OCR. Otherwise this question is pretty similar to "OCR for .NET".
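
As a rough illustration of what "image enhancement prior to OCR" can mean for small on-screen text, here is a minimal sketch that enlarges a grayscale image with bilinear interpolation and then thresholds it to black and white. It assumes the screenshot has already been converted to a list of rows of 0-255 integers; the function names `upscale_bilinear` and `binarize` and the default threshold of 128 are made up for this example and are not part of Tesseract or any other library. In practice an imaging library would do this faster.

```python
def upscale_bilinear(pixels, factor):
    """Enlarge a grayscale image (list of rows of 0-255 ints) by an integer factor."""
    src_h, src_w = len(pixels), len(pixels[0])
    dst_h, dst_w = src_h * factor, src_w * factor
    out = []
    for y in range(dst_h):
        # Map the output row back to a fractional row in the source image.
        sy = y * (src_h - 1) / (dst_h - 1) if dst_h > 1 else 0.0
        y0 = int(sy)
        y1 = min(y0 + 1, src_h - 1)
        fy = sy - y0
        row = []
        for x in range(dst_w):
            # Same mapping for the column.
            sx = x * (src_w - 1) / (dst_w - 1) if dst_w > 1 else 0.0
            x0 = int(sx)
            x1 = min(x0 + 1, src_w - 1)
            fx = sx - x0
            # Blend the four neighbouring source pixels.
            top = pixels[y0][x0] * (1 - fx) + pixels[y0][x1] * fx
            bottom = pixels[y1][x0] * (1 - fx) + pixels[y1][x1] * fx
            row.append(round(top * (1 - fy) + bottom * fy))
        out.append(row)
    return out


def binarize(pixels, threshold=128):
    """Map every pixel to pure black (0) or pure white (255)."""
    return [[255 if p >= threshold else 0 for p in row] for row in pixels]


# Example: enlarge a tiny 2x2 patch four times, then threshold it.
patch = [[0, 255], [255, 0]]
enhanced = binarize(upscale_bilinear(patch, 4))
```

Whether enlarging and thresholding actually helps depends on the font and on the anti-aliasing used on screen, so treat the scale factor and threshold as values to experiment with.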

If you are feeling really bold, you can always whip up a simple perceptron or neural-network-based approach :-)
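
To give an idea of how small the perceptron option can be, here is a toy sketch that learns to tell a "K" glyph from an "X" glyph. The 5x5 bitmaps, the helper names (`to_vector`, `predict`, `train_perceptron`) and the learning rate are all invented for illustration; a real setup would need segmented character images from actual screenshots and many samples per character.

```python
# Hypothetical 5x5 glyph bitmaps; '#' is ink, '.' is background.
K_GLYPH = ["#..#.",
           "#.#..",
           "##...",
           "#.#..",
           "#..#."]
X_GLYPH = ["#...#",
           ".#.#.",
           "..#..",
           ".#.#.",
           "#...#"]


def to_vector(glyph):
    """Flatten a glyph into a list of 0/1 features, one per pixel."""
    return [1 if ch == "#" else 0 for row in glyph for ch in row]


def predict(weights, bias, x):
    """Return 1 if the weighted sum is positive, else 0."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """Classic perceptron rule: adjust the weights only on misclassified samples."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for x, target in zip(samples, labels):
            pred = predict(weights, bias, x)
            if pred != target:
                delta = lr * (target - pred)
                weights = [w + delta * xi for w, xi in zip(weights, x)]
                bias += delta
                errors += 1
        if errors == 0:  # every sample classified correctly, stop early
            break
    return weights, bias


# Label 1 means "K", label 0 means "X".
samples = [to_vector(K_GLYPH), to_vector(X_GLYPH)]
weights, bias = train_perceptron(samples, [1, 0])
```

A single perceptron only separates two classes with a linear boundary, so recognising a full character set would take one such unit per character or a proper multi-layer network.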