Is there any way to improve tesseract OCR with sma

Posted 2019-02-12 07:59

I'm trying to use tesseract-OCR via python-tesseract to read a low resolution font that looks like this:

[image: sample of the low-resolution font]

Unfortunately that image returns

ZIJZHZI

I think the resolution is too low, and that is causing problems. I've tried magnifying the image and cropping it down to individual characters, but neither of these provides much improvement. Is there anything else I should consider doing, preferably something that could be done using the Python Imaging Library? Or should I just give up and train tesseract?

For what it's worth, PIL has the following built-in filters:

BLUR, CONTOUR, DETAIL, EDGE_ENHANCE,
EDGE_ENHANCE_MORE, EMBOSS, FIND_EDGES,
SMOOTH, SMOOTH_MORE, and SHARPEN

Tags: ocr tesseract python-imaging-library

1 answer

Rolldiameter

#2 · 2019-02-12 08:26

I've tried to magnify the image with:

  convert -resize 400% in.bmp out.bmp

And then read it:

  tesseract out.bmp res

The result is correct:
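Since the question asks for something that works from the Python Imaging Library, the same enlarge-then-read approach could be sketched in Python roughly as follows. This is only a sketch under assumptions: `enlarge` and `ocr_enlarged` are hypothetical helper names, `pytesseract` is assumed to be the wrapper in use, and bicubic resampling is a guess, since the `convert` command above does not name a filter.

```python
from PIL import Image


def enlarge(img, factor=4):
    """Return a copy of img scaled by `factor` in each dimension (4 is 400%)."""
    new_size = (img.width * factor, img.height * factor)
    # BICUBIC is an assumed resampling choice, not something the answer states.
    return img.resize(new_size, Image.BICUBIC)


def ocr_enlarged(path, factor=4):
    """Enlarge the image at `path`, then hand it to tesseract."""
    # Imported here so enlarge() can be used without pytesseract installed.
    import pytesseract

    with Image.open(path) as img:
        big = enlarge(img, factor)
    return pytesseract.image_to_string(big)
```

Calling `ocr_enlarged("in.bmp")` would roughly mirror the two commands above. Whether it reads the text as well as the ImageMagick version did may depend on the resampling filter and the scale factor, so both are worth experimenting with.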
