How can I find the endpoints of a binary skeleton image?

Posted 2020-07-11 09:04

Question:

I have a skeleton as binary pixels, such as this:

I would like to find the coordinates of the end points of this skeleton (in this case there are four), using OpenCV if applicable.

Efficiency is important as I'm analysing a number of these in real time from a video feed and need to be doing lots of other things at the same time.

(Note, apologies that the screenshot above has resizing artefacts, but it is an 8-connected skeleton I am working with.)

Answer 1:

Given the tags on your questions and the answers in your profile, I'm going to assume you want a C++ implementation. When you skeletonize an object, the result should be 1 pixel thick. Therefore, one thing that I could suggest is to find the pixels that are non-zero in your image, then search the 3 x 3 window centred on each such pixel (the pixel itself plus its 8-connected neighbourhood) and count the pixels that are non-zero. If the count is exactly 2 (the pixel itself plus one neighbour), then that is a candidate for a skeleton endpoint. Note that I'm also going to ignore the border so we don't go out of bounds. If the count is 1, it's a noisy isolated pixel so we should ignore it. If it's 3 or more, then you're examining either a point in the interior of the skeleton or a point where multiple lines are connected together, so this shouldn't be an endpoint either.
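To make the counting rule concrete, here is a minimal sketch on a tiny hand-made X-shaped skeleton, using plain Python lists instead of OpenCV. The names window_count, find_endpoints and grid are made up for this illustration, and the grid is assumed to have an all-zero border so the window never leaves the image.

```python
def window_count(grid, r, c):
    """Count non-zero pixels in the 3 x 3 window centred at (r, c), centre included."""
    return sum(
        1
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if grid[r + dr][c + dc] != 0
    )


def find_endpoints(grid):
    """Return (row, col) of every non-border skeleton pixel whose window count is 2."""
    endpoints = []
    for r in range(1, len(grid) - 1):
        for c in range(1, len(grid[0]) - 1):
            if grid[r][c] != 0 and window_count(grid, r, c) == 2:
                endpoints.append((r, c))
    return endpoints


# Hypothetical 7 x 7 skeleton: two diagonals crossing at (3, 3)
grid = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 0, 1, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 1, 0, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

print(window_count(grid, 2, 2))  # 3: an interior pixel (itself plus two neighbours)
print(window_count(grid, 3, 3))  # 5: the junction (itself plus four neighbours)
print(find_endpoints(grid))      # [(1, 1), (1, 5), (5, 1), (5, 5)]
```

The four tips of the X are the only pixels with a count of exactly 2, so only they are reported.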

I honestly can't think of any algorithm other than checking all of the skeleton pixels against this criterion, so the complexity will be O(mn), where m and n are the rows and columns of your image. For each skeleton pixel, the 8-pixel neighbourhood check takes constant time. In practice the scan is much cheaper than that bound suggests: every pixel is still visited once, but the majority of your pixels will be 0, so the neighbourhood check won't happen most of the time.
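As a rough illustration of that cost argument, the sketch below counts how many pixels one pass visits and how many of them actually trigger the 3 x 3 neighbourhood check. The function name scan_cost and the test image are made up for this example; it only counts work and does not detect endpoints.

```python
def scan_cost(grid):
    """Return (pixels_visited, window_checks) for one pass over the non-border pixels."""
    visited = 0
    checks = 0
    for r in range(1, len(grid) - 1):
        for c in range(1, len(grid[0]) - 1):
            visited += 1
            if grid[r][c] != 0:
                checks += 1  # only skeleton pixels pay for the 3 x 3 window
    return visited, checks


# Hypothetical 6 x 10 image holding one horizontal line of 6 skeleton pixels
grid = [[0] * 10 for _ in range(6)]
for c in range(2, 8):
    grid[3][c] = 1

visited, checks = scan_cost(grid)
print(visited, checks)  # 32 6
```

All 32 non-border pixels are visited, but only the 6 skeleton pixels need the more expensive window check.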

As such, this is something that I would try, assuming that your image is stored in a cv::Mat structure called im, that it is a single-channel (grayscale) image, and that it is of type uchar. I'm also going to store the co-ordinates of the skeleton end points in a std::vector<int>. Every time we detect a skeleton end point, we will add two integers to the vector: the row and column of that end point.

// Requires <vector> and the OpenCV core header (e.g. <opencv2/core.hpp>)

// Declare variable to count neighbourhood pixels
int count;

// To store a pixel intensity
uchar pix;

// To store the ending co-ordinates
std::vector<int> coords;

// For each pixel in our image...
for (int i = 1; i < im.rows-1; i++) {
    for (int j = 1; j < im.cols-1; j++) {

        // See what the pixel is at this location
        pix = im.at<uchar>(i,j);

        // If not a skeleton point, skip
        if (pix == 0)
            continue;

        // Reset counter
        count = 0;     

        // For each pixel in the neighbourhood
        // centered at this skeleton location...
        for (int y = -1; y <= 1; y++) {
            for (int x = -1; x <= 1; x++) {

                // Get the pixel in the neighbourhood
                pix = im.at<uchar>(i+y,j+x);

                // Count if non-zero
                if (pix != 0)
                    count++;
            }
        }

        // If count is exactly 2 (this pixel plus one neighbour),
        // add co-ordinates to vector
        if (count == 2) {
            coords.push_back(i);
            coords.push_back(j);
        }
    }
}

If you want to show the co-ordinates when you're done, just check every pair of elements in this vector:

// Requires <iostream>
for (size_t i = 0; i < coords.size() / 2; i++)
    std::cout << "(" << coords.at(2*i) << "," << coords.at(2*i+1) << ")\n";

To be complete, here's a Python implementation as well. I'm using some of numpy's functions to make this easier for myself. Assuming that your image is stored in img, which is also a grayscale image, and importing the OpenCV library and numpy (i.e. import cv2, import numpy as np), this is the equivalent code:

# Find row and column locations that are non-zero
(rows,cols) = np.nonzero(img)

# Initialize empty list of co-ordinates
skel_coords = []

# For each non-zero pixel...
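for (r,c) in zip(rows,cols):

    # Skip border pixels so the 3 x 3 neighbourhood stays inside the image
    # (the C++ version ignores the border for the same reason)
    if r == 0 or c == 0 or r == img.shape[0]-1 or c == img.shape[1]-1:
        continue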

    # Extract an 8-connected neighbourhood
    (col_neigh,row_neigh) = np.meshgrid(np.array([c-1,c,c+1]), np.array([r-1,r,r+1]))

    # Cast to int to index into image
    col_neigh = col_neigh.astype('int')
    row_neigh = row_neigh.astype('int')

    # Convert into a single 1D array and check for non-zero locations
    pix_neighbourhood = img[row_neigh,col_neigh].ravel() != 0

    # If the number of non-zero locations equals 2, add this to 
    # our list of co-ordinates
    if np.sum(pix_neighbourhood) == 2:
        skel_coords.append((r,c))

To show the co-ordinates of the end points, you can do:

print("".join(["(" + str(r) + "," + str(c) + ")\n" for (r,c) in skel_coords]))

Minor note: This code is untested. I don't have C++ OpenCV installed on this machine so hopefully what I wrote will work. If it doesn't compile, you can certainly translate what I have done into the right syntax. Good luck!



Answer 2:

A bit late, but this still might be useful for people!

There's a way of doing the exact same thing as @rayryeng suggests, but with the built-in functions of OpenCV! This makes it much smaller, and probably way faster (especially with Python, if you are using that, as I am!) It is the same solution as this one.

Basically, what we are trying to find is the pixels that are non-zero, with exactly one non-zero neighbor. So what we do is use OpenCV's built-in filter2D function to convolve the skeleton image with a custom kernel that we make. I just learned about convolution and kernels, and this page is really helpful at explaining what these things mean.

So, what kernel would work? How about

[[1, 1,1],
 [1,10,1],
 [1, 1,1]]? 

Then, after applying this kernel to a 0/1 skeleton image, any pixel with the value 11 (10 for the pixel itself plus 1 for its single neighbor) is one that we want!
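To see where the 11 comes from, here is a small worked sketch that applies the same kernel by hand in plain Python. It is a stand-in for illustration, not what filter2D does internally: kernel_response is a made-up helper, the image is assumed to hold only 0s and 1s, and pixels outside the image are simply treated as 0, which may differ from OpenCV's default border handling.

```python
KERNEL = [[1, 1, 1],
          [1, 10, 1],
          [1, 1, 1]]


def kernel_response(skel, r, c):
    """Weighted 3 x 3 sum at (r, c); pixels outside the image count as 0."""
    total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if 0 <= rr < len(skel) and 0 <= cc < len(skel[0]):
                total += KERNEL[dr + 1][dc + 1] * skel[rr][cc]
    return total


# Hypothetical 0/1 skeleton: a short diagonal stroke with two endpoints
skel = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0],
]

responses = [[kernel_response(skel, r, c) for c in range(5)] for r in range(5)]
endpoints = [(r, c) for r in range(5) for c in range(5) if responses[r][c] == 11]

print(responses[2][2])  # 12: middle pixel, 10 for itself plus 2 neighbours
print(responses[1][2])  # 2: background pixel next to two skeleton pixels
print(endpoints)        # [(1, 1), (3, 3)]
```

A background pixel can never score more than 8, so the centre weight of 10 keeps skeleton pixels and background pixels apart: 10 is an isolated pixel, 11 an endpoint, 12 an ordinary interior pixel, and 13 or more a junction.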

Here is what I use:

def skeleton_endpoints(skel):
    # make our input nice (binary 0/1, uint8), possibly necessary
    skel = skel.copy()
    skel[skel!=0] = 1
    skel = np.uint8(skel)

    # apply the convolution
    kernel = np.uint8([[1,  1, 1],
                       [1, 10, 1],
                       [1,  1, 1]])
    src_depth = -1
    filtered = cv2.filter2D(skel,src_depth,kernel)

    # now look through to find the value of 11
    # this returns a mask of the endpoints, but if you just want the coordinates, you could simply return np.where(filtered==11)
    out = np.zeros_like(skel)
    out[np.where(filtered==11)] = 1
    return out

Hope this helps!