How to split noise and text from the image for pre

Posted 2019-08-27 01:34

Question:

I am running OCR on subtitles in TV footage (using Tesseract 3.x with C++). As a preprocessing step for OCR, I am trying to separate the text from the background.

Here's the original image:

And, preprocessed image:

The OCR result is: Sicemn clone

As the preprocessed image above shows, some "fog" remains around the letters, which prevents the OCR module from doing its job properly.

Is there any way to detect this "fog" programmatically and remove it, or some image processing that removes or reduces it in the preprocessed image?

Since the preprocessing logic is heavily optimized to handle many different images, I would rather find a way to "clean" the preprocessed image than modify the preprocessing logic (optimizing for this picture could affect other pictures).

Any suggestions are very welcome.


Update

sixela's answer is great and works in most cases. It does not work when the background also contains colors similar to the text.

Example of a case where it does not work:

Example of the result:

The Gaussian filter seems to cause a problem with this type of footage. This implies that different footage may require a different approach.

Answer 1:

I managed to get a clearer (not perfect) image by using morphological operations and thresholding.

Here is how:

  1. Convert the original image to greyscale
  2. Apply a Gaussian blur (9x9 kernel) to denoise the greyscale image
  3. Apply a top-hat morphological operation (3x3 kernel) to extract the white text
  4. Apply Otsu thresholding
  5. Dilate the result
  6. Apply an inverted binary threshold to turn the white text black
I finally obtained the following image:

Running OCR on it gives this text: "Since vou don'k"

PS: This result can of course be improved by tweaking the parameters (the kernel sizes, for example), but I hope it can guide you. I used OpenCV in Python to quickly try out those methods.

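import cv2

# Step 1: load the image directly as greyscale
image = cv2.imread('./inputImg.png', cv2.IMREAD_GRAYSCALE)
# Step 2: Gaussian blur (9x9 kernel) to denoise
imgBlur = cv2.GaussianBlur(image, (9, 9), 0)
# Step 3: top-hat with a 3x3 kernel keeps small bright details (the white text)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
imgTH = cv2.morphologyEx(imgBlur, cv2.MORPH_TOPHAT, kernel)
# Step 4: Otsu picks the threshold automatically, so the 0 passed here is ignored
_, imgBin = cv2.threshold(imgTH, 0, 250, cv2.THRESH_BINARY | cv2.THRESH_OTSU)

# Step 5: dilate to thicken the text strokes
imgdil = cv2.dilate(imgBin, kernel)
# Step 6: invert so the text is dark on a light background
_, imgBin_Inv = cv2.threshold(imgdil, 0, 250, cv2.THRESH_BINARY_INV)

# Show the intermediate results for visual inspection
cv2.imshow('original', image)
cv2.imshow('bin', imgBin)
cv2.imshow('dil', imgdil)
cv2.imshow('inv', imgBin_Inv)

cv2.imwrite('./output.png', imgBin_Inv)
cv2.waitKey(0)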

After this I ran Tesseract on the output image with this command:

tesseract output.png stdout