Question:

I'm trying to create a piece of software that automates the PC by capturing a screenshot and then running OCR (Optical Character Recognition) on it to look for, say, a particular button to click. I already have the mouse and keyboard control part working, but now I need an OCR engine to process the screenshot. What I have found is that Tesseract OCR does not seem to work very well with on-screen text. Either the text is too small, or some of the characters appear to be joined together, for example K and X. How should I go about this?

P.S. This is for an automated test program.

Answer 1:

I am not sure whether this really fits the bill for you, but some of the better OCR I have seen in automation is done by Tevron's CitraTest. It ships with a library of fonts, and if a font set is not present, they will create a new one based on your submissions. The negative factors with this tool are its cost and the usual issues related to variable screen resolution.
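
For reference, the "OCR the screenshot, then find the button to click" step from the question could look roughly like the sketch below. It is not tied to CitraTest or to any particular engine. It assumes the OCR results arrive as a dict of parallel lists with the keys `text`, `conf`, `left`, `top`, `width` and `height`, which is the shape pytesseract's `image_to_data(..., output_type=Output.DICT)` returns. The function name `find_label_center` and its `scale` and `min_conf` parameters are made up for this illustration. `scale` is there because enlarging the screenshot before OCR is a commonly suggested workaround for small on-screen text, and the click coordinates then have to be mapped back to the real screen.

```python
def find_label_center(ocr, label, scale=1.0, min_conf=0.0):
    """Return the (x, y) screen position of the first OCR word matching `label`.

    `ocr` is assumed to be a dict of parallel lists keyed by "text", "conf",
    "left", "top", "width" and "height". `scale` is the factor by which the
    screenshot was enlarged before OCR (1.0 means it was not resized).
    Returns None when no word matches.
    """
    wanted = label.strip().lower()
    for i, text in enumerate(ocr["text"]):
        # Compare case-insensitively and ignore surrounding whitespace.
        if text.strip().lower() != wanted:
            continue
        # Confidence may arrive as a string or a number, so normalise it.
        if float(ocr["conf"][i]) < min_conf:
            continue
        # Centre of the word's bounding box, in OCR-image pixels.
        cx = ocr["left"][i] + ocr["width"][i] / 2
        cy = ocr["top"][i] + ocr["height"][i] / 2
        # Map back to real screen pixels if the image was enlarged.
        return (round(cx / scale), round(cy / scale))
    return None


# Hypothetical wiring, assuming Pillow and pytesseract are installed:
#   from PIL import ImageGrab
#   import pytesseract
#   shot = ImageGrab.grab()
#   big = shot.resize((shot.width * 2, shot.height * 2))
#   data = pytesseract.image_to_data(big, output_type=pytesseract.Output.DICT)
#   target = find_label_center(data, "OK", scale=2.0, min_conf=60)
```

This only matches single-word labels exactly. A multi-word caption such as "Save As" would need neighbouring words on the same line to be joined first, and touching glyphs like the K and X mentioned in the question may still be misread regardless of how the results are searched.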