Partitioning images based on their white space

Posted 2020-08-07 02:48

I have lots of images of three objects with a white background separated by white space. For example,

[Image: three objects on a white background, separated by white space]

Is it possible to split this image (and ones like it) into three images automatically? It would be great if this also worked from the command line.

2 Answers
做自己的国王
#2 · 2020-08-07 03:23

You need to sum up the pixel values of every column in the image and compare each sum with the theoretical sum for a column in which all pixels are white (i.e., the number of rows times 255). Add every column that matches this criterion to a list of indices. If there is not always a perfectly clean white line between the objects (e.g. due to compression artifacts), you can use a slightly lower threshold instead of the full-white sum.

Now go through your list of indices. Remove the run of adjacent indices that starts at the first column, and likewise the run that ends at the far right of the image, since these are the outer margins. Group the remaining indices into runs of adjacent columns. For each group, count the number of indices and calculate the mean index.

Now take the two largest groups; the mean index of each is a column at which to crop.

You can do this in a rather small Python script with OpenCV, or in a C++ OpenCV program.
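The steps above can be sketched roughly as follows. This is only an illustration, not a tested tool: it uses plain NumPy on an already-loaded grayscale array rather than OpenCV, and the function name `find_cut_columns` and its parameters `n_cuts`, `white_level` and `tolerance` are made up for this example.

```python
import numpy as np


def find_cut_columns(gray, n_cuts=2, white_level=255, tolerance=0.0):
    """Return the column indices at which to cut a grayscale image.

    gray:      2-D array (rows x columns) with a white background.
    tolerance: fraction of the full-white column sum that may be missing
               (0.0 demands pure white; raise it for compression artifacts).
    """
    gray = np.asarray(gray)
    n_rows, n_cols = gray.shape

    # Sum every column and compare with the sum of an all-white column.
    full_white = n_rows * white_level
    col_sums = gray.sum(axis=0, dtype=np.int64)
    white_cols = np.flatnonzero(col_sums >= full_white * (1.0 - tolerance))
    if white_cols.size == 0:
        return []

    # Split the sorted indices into runs of adjacent columns.
    breaks = np.flatnonzero(np.diff(white_cols) > 1) + 1
    groups = np.split(white_cols, breaks)

    # Drop the runs that touch the left or right image border.
    groups = [g for g in groups if g[0] != 0 and g[-1] != n_cols - 1]

    # Keep the widest runs and cut at the mean index of each.
    groups.sort(key=len, reverse=True)
    return sorted(int(g.mean()) for g in groups[:n_cuts])
```

The returned columns can then be used as crop boundaries. If fewer than `n_cuts` interior white runs exist, the list is simply shorter, so the caller should check its length.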

beautiful°
#3 · 2020-08-07 03:32

As @ypnos said, you want to collapse the rows by summation, or averaging. That will leave you with a vector the width of the image. Next clip everything below a high threshold, remembering that high numbers correspond to high brightness. This will select the white space:

[Figure: the result of clipping the collapsed brightness.]
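To make the collapsing and clipping step concrete, here is a small sketch on an in-memory array. The helper name `white_columns` is hypothetical, and the default threshold of 200 is just a starting value to tune per image set.

```python
import numpy as np


def white_columns(gray, thresh=200):
    """Collapse the rows by averaging and flag the bright columns.

    Returns the per-column mean brightness and a boolean mask that is
    True wherever that mean exceeds the threshold.
    """
    profile = np.asarray(gray, dtype=float).mean(axis=0)  # one value per column
    mask = profile > thresh
    return profile, mask
```

Multiplying `profile` by `mask` gives the clipped curve: zero over the objects and the original brightness over the white space.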

Then you simply cluster the remaining indices and select the middle two clusters (since the outer two belong to the bordering white space). In Python this looks like so:

import sklearn.cluster, PIL.Image, numpy, sys, os.path
# import matplotlib.pyplot as plt

def split(fn, thresh=200):

    img = PIL.Image.open(fn)
    dat = numpy.array(img.convert(mode='L'))
    h, w = dat.shape
    dat = dat.mean(axis=0)
    # plt.plot(dat * (dat > thresh))

    path, fname = os.path.split(fn)
    base, ext = os.path.splitext(fname)

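    # Columns brighter than the threshold count as white space.
    white = numpy.flatnonzero(dat > thresh).reshape(-1, 1)
    # Four clusters: left border, the two gaps, right border.
    guesses = numpy.linspace(0, len(dat), 4).reshape(-1, 1)
    km = sklearn.cluster.KMeans(n_clusters=4, init=guesses, n_init=1)
    km.fit(white)
    # The two middle cluster centers are the cut positions.
    c1, c2 = (int(c) for c in sorted(km.cluster_centers_[:, 0])[1:3])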

    img.crop((0, 0, c1, h)).save(os.path.join(path, base + '_1' + ext))
    img.crop((c1, 0, c2, h)).save(os.path.join(path, base + '_2' + ext))
    img.crop((c2, 0, w, h)).save(os.path.join(path, base + '_3' + ext))

if __name__ == "__main__":
    split(sys.argv[1], int(sys.argv[2]) if len(sys.argv) > 2 else 200)

One shortcoming of this method is that it may stumble on images whose objects are bright (so the white space is not identified properly) or are not separated by a clean vertical line (e.g., because they overlap in the composite). In such cases line detection, which is not constrained to vertical lines, would work better. I leave implementing that to someone else.
