How to fill the gaps in letters after Canny edge d

Posted 2020-07-23 04:07

Question:

I'm trying to do Arabic OCR using Tesseract, but the OCR doesn't work unless the letters are filled with black. How do I fill the gaps left after Canny edge detection?

Here is a sample image and sample code:

import tesserocr
from PIL import Image
import pytesseract
import matplotlib.pyplot as plt
import cv2
import imutils
import numpy as np

image = cv2.imread(r'c:\ahmed\test3.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

gray = cv2.bilateralFilter(gray,30,40,40)
#gray = cv2.GaussianBlur(gray,(1,1), 0)
gray =cv2.fastNlMeansDenoising(gray ,None, 4, 7, 21)

image = cv2.adaptiveThreshold(gray,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
            cv2.THRESH_BINARY,11,2)
k = cv2.getStructuringElement(cv2.MORPH_RECT, (1, 1))

blur = cv2.medianBlur(image,3)
erode = cv2.erode(blur, k)
dilat = cv2.dilate(erode,k)
cv2.imshow("gray", dilat)

#cv2.imshow("dilation", img_dilation)
#thresh = cv2.Canny(thresh, 70, 200)

#crop_img = gray[215:215+315, 783:783+684]
#cv2.imshow("cropped", crop_img)

#resize = imutils.resize(blur, width = 460)
#cv2.imshow("resize", resize)

text = pytesseract.image_to_string(dilat, lang='ara')
print(text)
with open(r"c:\ahmed\file.txt", "w", encoding="utf-8") as myfile:
    myfile.write(text)
cv2.waitKey(0)

Result:

This is a sample image that won't work with either thresholding or Canny.

Answer 1:

In this case, because the text is black, it is best to simply find all the black pixels.

One very simple way to accomplish this using NumPy is as follows:

import matplotlib.pyplot as pp
import numpy as np

image = pp.imread(r'/home/cris/tmp/Zuv3p.jpg')
bin = np.all(image<100, axis=2)

This finds all pixels where all three channels are below 100. I picked the threshold of 100 somewhat arbitrarily; there are probably better ways to choose it. :)
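To hand the result to Tesseract as in the question, the boolean mask can be turned into an 8-bit image with black text on a white background. This is a minimal sketch rather than part of the original answer: the helper name `dark_pixels_to_ocr_image` and the tiny synthetic image are made up for illustration, and it assumes an 8-bit RGB (or RGBA) array of shape (height, width, channels).

```python
import numpy as np

def dark_pixels_to_ocr_image(image, threshold=100):
    """Return a uint8 image: 0 (black) where every color channel is below
    `threshold`, 255 (white) everywhere else."""
    # Hypothetical helper. Only the first three channels are tested,
    # so an alpha channel, if present, is ignored.
    mask = np.all(image[..., :3] < threshold, axis=2)
    return np.where(mask, 0, 255).astype(np.uint8)

# Synthetic 2x2 RGB image: black, dark gray, pure red, white.
sample = np.array([[[0, 0, 0], [90, 90, 90]],
                   [[255, 0, 0], [255, 255, 255]]], dtype=np.uint8)

ocr_ready = dark_pixels_to_ocr_image(sample)
print(ocr_ready)
```

The black and dark-gray pixels come out as 0 and the red and white pixels as 255. The resulting array could then be passed to pytesseract.image_to_string(ocr_ready, lang='ara'), in the same way the question passes its own array.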


Notes:

1- When working with color input, converting to a gray-value image as the first step is usually a bad idea. This throws away a lot of information. Sometimes it's appropriate, but in this case it is better not to.

2- Edge detection is really nice, but is usually the wrong approach. Use edge detection when you need to find edges. Use something else when you don't want just the edges.
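As a small illustration of note 1, the sketch below compares thresholding a gray-value conversion with the per-channel test. The three pixel values are hypothetical, and the gray conversion uses the standard luma weights (0.299, 0.587, 0.114), which are also the weights OpenCV documents for its RGB-to-gray conversion.

```python
import numpy as np

# Three hypothetical RGB pixels: black ink, saturated blue, white paper.
pixels = np.array([[[0, 0, 0], [0, 0, 255], [255, 255, 255]]], dtype=np.uint8)

# Weighted sum of the channels gives the gray value of each pixel.
gray = pixels @ np.array([0.299, 0.587, 0.114])

# Thresholding the gray values: blue has a gray value of about 29,
# so it is classified as dark together with the black pixel.
gray_mask = gray < 100

# Per-channel test: blue is rejected because its B channel is 255.
color_mask = np.all(pixels < 100, axis=2)

print(gray_mask)
print(color_mask)
```

Here the gray-value threshold marks both the black and the blue pixel as text, while the per-channel test keeps only the black one. That is the kind of information the gray conversion discards.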


Edit: If for some reason np.all complains about the data type (it doesn't for me), you should be able to convert its input to the right type:

bin = np.all(np.array(image<100, dtype=bool), axis=2)

or maybe

bin = np.all(np.array(image<100, dtype=np.uint8), axis=2)
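On the choice of threshold: the value 100 above was picked arbitrarily. One possible way to choose it from the data is Otsu's method, which is not what the answer uses and is shown here only as a sketch. It relies on the fact that "all channels below t" is the same as "the largest channel below t", so the problem reduces to a single-channel image. The function name `otsu_threshold` and the synthetic image are made up for illustration, and the input is assumed to be 8-bit.

```python
import numpy as np

def otsu_threshold(values):
    """Otsu threshold t for uint8 data; values <= t form the dark class."""
    hist = np.bincount(values.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                  # probability of the dark class
    mu = np.cumsum(prob * np.arange(256))    # cumulative mean of the dark class
    numerator = (mu[-1] * omega - mu) ** 2
    denominator = omega * (1.0 - omega)
    # Between-class variance; left at 0 where one of the classes is empty.
    sigma_b = np.zeros(256)
    valid = denominator > 0
    sigma_b[valid] = numerator[valid] / denominator[valid]
    return int(np.argmax(sigma_b))

# Synthetic 4x4 RGB image: top half dark "ink", bottom half bright "paper".
image = np.empty((4, 4, 3), dtype=np.uint8)
image[:2] = (10, 20, 30)
image[2:] = (220, 210, 200)

max_channel = image.max(axis=2)   # a pixel is dark only if its largest channel is dark
t = otsu_threshold(max_channel)
mask = max_channel <= t           # True for the dark pixels
print(t)
print(mask)
```

On real scans the histogram is rarely this cleanly bimodal, so the resulting threshold should be checked against the image rather than trusted blindly.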