Python OpenCV matchTemplate on grayscale image with masking

Posted 2020-06-21 03:23

Question:

I have a project where I want to locate a bunch of arrows in images that look like so: ibb.co/dSCAYQ with the following template: ibb.co/jpRUtQ

I'm using cv2's template matching feature in Python. My algorithm is to rotate the template 360 degrees and match for each rotation. I get the following result: ibb.co/kDFB7k

As you can see, it works well except for the 2 arrows that are really close, such that another arrow is in the black region of the template.

I am trying to use a mask, but it seems that cv2 is not applying my mask at all; no matter what values the mask array has, the matching result is the same. I have been trying this for two days, but cv2's limited documentation is not helping.

Here is my code:

import numpy as np
import cv2
import os
from scipy import misc, ndimage

STRIPPED_DIR = ...  # image dir
TMPL_DIR = ...      # template dir
MATCH_THRESH = 0.9
MATCH_RES = 1  #specifies degree-interval at which to match

def make_templates():
    base = misc.imread(os.path.join(TMPL_DIR,'base.jpg')) # The templ that I rotate to make 360 templates
    for deg in range(360):
        print('making template: ' + str(deg))
        tmpl = ndimage.rotate(base, deg)
        misc.imsave(os.path.join(TMPL_DIR, 'tmp' + str(deg) + '.jpg'), tmpl)

def make_masks():
    for deg in range(360):
        tmpl = cv2.imread(os.path.join(TMPL_DIR, 'tmp' + str(deg) + '.jpg'), 0)
        ret2, mask = cv2.threshold(tmpl, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)
        cv2.imwrite(os.path.join(TMPL_DIR, 'mask' + str(deg) + '.jpg'), mask)

def match(img_name):
    img_rgb = cv2.imread(os.path.join(STRIPPED_DIR, img_name))
    img_gray = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)

    for deg in range(0, 360, MATCH_RES):
        tmpl = cv2.imread(os.path.join(TMPL_DIR, 'tmp' + str(deg) + '.jpg'), 0)
        mask = cv2.imread(os.path.join(TMPL_DIR, 'mask' + str(deg) + '.jpg'), 0)
        w, h = tmpl.shape[::-1]
        res = cv2.matchTemplate(img_gray, tmpl, cv2.TM_CCORR_NORMED, mask=mask)
        loc = np.where( res >= MATCH_THRESH)

        for pt in zip(*loc[::-1]):
            cv2.rectangle(img_rgb, pt, (pt[0] + w, pt[1] + h), (0,0,255), 2)
        cv2.imwrite('res.png',img_rgb)

Some things that I think could be wrong but not sure how to fix:

  1. The number of channels the mask/tmpl/img should have. I have tried an example with colored 4-channel PNGs (from a Stack Overflow example), but I'm not sure how it translates to grayscale or 3-channel jpegs.
  2. The values of the mask array, e.g. should masked-out pixels be 1 or 255?

Any help is greatly appreciated.

UPDATE: I fixed a trivial error in my code; mask=mask must be passed as a keyword argument to matchTemplate(). That, combined with using mask values of 255, made the difference. However, now I get a ton of false positives, like so: http://ibb.co/esfTnk Note that the false positives are more strongly correlated than the true positives. Any pointers on how to fix my masks to resolve this? Right now I am simply using a black-and-white conversion of my templates.

Answer 1:

You've already figured out the first questions, but I'll expand a bit on them:

For a binary mask, it should be of type uint8, where the values are simply zero or non-zero: locations where the mask is zero are ignored, and locations where it is non-zero are included in the match. You can instead pass a float32 mask, in which case the values act as per-pixel weights: 0 means ignore, 1 means full weight, and 0.5 includes the pixel but gives it half as much weight as another pixel. Note that a mask is only supported for TM_SQDIFF and TM_CCORR_NORMED, but that's fine since you're using the latter. Masks for matchTemplate are single channel only. And as you found out, mask is not a positional argument, so it must be passed by keyword, mask=your_mask. All of this is pretty explicit in this page on the OpenCV docs.

Now to the new issue:

It's related to the method you're using and the fact that you're using jpgs. Have a look at the formulas for the normed methods. Where the image is completely zero, you'll get faulty results because you'll be dividing by zero. But that's not the exact problem, because division by zero returns nan, and a comparison like np.nan >= value always evaluates to False, so you'll never draw a square from nan values.
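You can see the nan behavior with a few lines of numpy:

```python
import numpy as np

res = np.array([np.nan, 0.5, 0.95])
print(res >= 0.9)   # [False False  True] -- nan never passes the threshold
```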

Instead, the problem is right at the edge cases, where you get a hint of a non-zero value. Because you're using jpg images, not all black pixels are exactly 0; in fact, many aren't. Note from the formula that you're dividing by the norms of the windowed values, and those norms will be extremely small when the window contains values like 1, 2, or 5, so the correlation value blows up. You should use TM_SQDIFF instead (it's the only other method that allows a mask). Additionally, because you're using jpg, most of your masks are worthless, since any non-zero value (even 1) counts as an inclusion. You should use pngs for the masks. As long as the templates have a proper mask, it shouldn't matter whether you use jpg or png for the templates.

With TM_SQDIFF, instead of looking for the maximum values, you're looking for the minimum: you want the smallest difference between the template and the image patch. You know that the difference should be really small, exactly 0 for a pixel-perfect match, which you probably won't get. You can play around with the threshold a little bit. Note that you're always going to get pretty close values for every rotation, because of the nature of your template: the little arrowhead hardly contributes that many pixels, and it's not guaranteed that the one-degree discretization is exactly right (unless you made the images that way). Even an arrow facing the totally wrong direction is still going to score extremely close, since there's a lot of overlap; and an arrow facing close to the right direction will score nearly the same as the exactly right one.

Preview what the result of the square difference is while you're running the code:

res = cv2.matchTemplate(img_gray, tmpl, cv2.TM_SQDIFF, mask=mask)
cv2.imshow("result", res.astype(np.uint8))
if cv2.waitKey(0) & 0xFF == ord('q'):
    break

You can see that basically every orientation of template matches closely.

Anyways, it seems a threshold of 8 nailed it:

The only thing I modified in your code was changing to pngs for all images, switching to TM_SQDIFF, making sure loc looks for values less than the threshold instead of greater than, and using a MATCH_THRESH of 8. At least I think that's all I changed. Have a look just in case:

import numpy as np
import cv2
import os
from scipy import misc, ndimage

STRIPPED_DIR = ...
TMPL_DIR = ...
MATCH_THRESH = 8
MATCH_RES = 1  #specifies degree-interval at which to match

def make_templates():
    base = misc.imread(os.path.join(TMPL_DIR,'base.jpg')) # The templ that I rotate to make 360 templates
    for deg in range(360):
        print('making template: ' + str(deg))
        tmpl = ndimage.rotate(base, deg)
        misc.imsave(os.path.join(TMPL_DIR, 'tmp' + str(deg) + '.png'), tmpl)

def make_masks():
    for deg in range(360):
        tmpl = cv2.imread(os.path.join(TMPL_DIR, 'tmp' + str(deg) + '.png'), 0)
        ret2, mask = cv2.threshold(tmpl, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)
        cv2.imwrite(os.path.join(TMPL_DIR, 'mask' + str(deg) + '.png'), mask)

def match(img_name):
    img_rgb = cv2.imread(os.path.join(STRIPPED_DIR, img_name))
    img_gray = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)

    for deg in range(0, 360, MATCH_RES):
        tmpl = cv2.imread(os.path.join(TMPL_DIR, 'tmp' + str(deg) + '.png'), 0)
        mask = cv2.imread(os.path.join(TMPL_DIR, 'mask' + str(deg) + '.png'), 0)
        w, h = tmpl.shape[::-1]
        res = cv2.matchTemplate(img_gray, tmpl, cv2.TM_SQDIFF, mask=mask)

        loc = np.where(res < MATCH_THRESH)
        for pt in zip(*loc[::-1]):
            cv2.rectangle(img_rgb, pt, (pt[0] + w, pt[1] + h), (0,0,255), 2)
        cv2.imwrite('res.png',img_rgb)