Python numpy array with multiple conditions to ite

Posted 2020-06-29 04:27

Question:

I'm filtering some images to remove unnecessary background, and so far I have had the best success with checking pixel BGR values (using OpenCV). The problem is that iterating over the image with two nested loops is far too slow:

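h, w, channels = img.shape
for x in range(0, h):          # x indexes rows (height)
    for y in range(0, w):      # y indexes columns (width)
        pixel = img[x, y]
        # OpenCV stores channels in BGR order
        blue = pixel[0]
        green = pixel[1]
        red = pixel[2]

        # Rule 1: strongly green pixels are background
        if green > 110:
            img[x, y] = [0, 0, 0]
            continue

        # Rule 2: moderately green pixels with little blue and red
        if blue < 100 and red < 50 and green > 80:
            img[x, y] = [0, 0, 0]
            continue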

There are a couple more similar if-statements, but you get the idea. The problem is that this takes around 10 seconds for a 672x1250 image on an i7.

Now, I can easily do the first if statement like so:

img[np.where((img > [0,110,0]).all(axis=2))] = [0,0,0]

It's much, much faster, but I can't seem to express the other if-statements, which combine multiple conditions, using np.where.

Here's what I've tried:

img[np.where((img < [100,0,0]).all(axis=2)) & ((img < [0,0,50]).all(axis=2)) & ((img > [0,80,0]).all(axis=2))] = [0,0,0]

But it throws an error:

ValueError: operands could not be broadcast together with shapes (2,0) (1250,672)

Any ideas on how to properly process the image using np.where (or anything that's faster than two nested loops) would help a lot!

Answer 1:

You could express the conditions (without np.where) like this:

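import numpy as np
# Small random test image (randint's upper bound is exclusive, so values are 0..254)
img = np.random.randint(255, size=(4,4,3))
# Views of the individual channels, each of shape (height, width)
blue, green, red = img[..., 0], img[..., 1], img[..., 2]
# Combine the per-channel conditions into one 2-D boolean mask and zero the matching pixels
img[(green > 110) | ((blue < 100) & (red < 50) & (green > 80))] = [0,0,0]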
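
As for the attempt in the question, two things appear to go wrong. First, the call to np.where is closed after the first condition, so the & operators combine its tuple of index arrays with full-size boolean arrays, which is the likely source of the broadcast error. Second, comparing whole pixels against [100,0,0] also demands green < 0 and red < 0, which can never hold. The sketch below illustrates this on a tiny hand-made image (the pixel values are made up for illustration) and shows that np.where still works once the mask is built per channel:

```python
import numpy as np

# Tiny illustrative image in BGR order; the values are assumptions for this demo
img = np.zeros((2, 2, 3), dtype=np.uint8)
img[0, 0] = [50, 90, 20]    # blue < 100, red < 50, green > 80: should be cleared
img[0, 1] = [200, 90, 20]   # blue too high: should be kept

# Whole-pixel comparison: also requires green < 0 and red < 0, so nothing matches
whole_pixel = (img < [100, 0, 0]).all(axis=2)

# Per-channel comparison: yields the intended 2-D boolean mask
blue, green, red = img[..., 0], img[..., 1], img[..., 2]
per_channel = (blue < 100) & (red < 50) & (green > 80)

# np.where is optional here, but its (rows, cols) result can be used for indexing
rows, cols = np.where(per_channel)
img[rows, cols] = [0, 0, 0]
```

Indexing directly with the boolean mask, as in the answer's one-liner, does the same job without the intermediate index arrays.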

In [229]: %%timeit img = np.random.randint(255, size=(672,1250,3))
   .....: blue, green, red = img[..., 0], img[..., 1], img[..., 2]
   .....: img[(green > 110) | ((blue < 100) & (red < 50) & (green > 80))] = [0,0,0]
   .....: 
100 loops, best of 3: 14.9 ms per loop

In [240]: %%timeit img = np.random.randint(255, size=(672,1250,3))
   .....: using_loop(img)
   .....: 
1 loop, best of 3: 1.39 s per loop

where using_loop(img) executes the double loop posted in the question. On this test image the boolean-mask version is roughly 90 times faster (14.9 ms versus 1.39 s).
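
The definition of using_loop is not shown in the answer, so the following is only a plausible reconstruction, limited to the two rules quoted in the question. It is paired with a hypothetical using_mask helper to check that both approaches produce the same image on random data:

```python
import numpy as np

def using_loop(img):
    """Assumed reconstruction of the question's double loop (first two rules only)."""
    h, w, _ = img.shape
    for row in range(h):
        for col in range(w):
            blue, green, red = img[row, col]
            if green > 110 or (blue < 100 and red < 50 and green > 80):
                img[row, col] = [0, 0, 0]
    return img

def using_mask(img):
    """Hypothetical helper wrapping the boolean-mask one-liner."""
    blue, green, red = img[..., 0], img[..., 1], img[..., 2]
    # The mask is fully computed before any pixel is modified
    mask = (green > 110) | ((blue < 100) & (red < 50) & (green > 80))
    img[mask] = [0, 0, 0]
    return img

# Small random BGR image; a fixed seed keeps the check reproducible
rng = np.random.default_rng(0)
sample = rng.integers(0, 256, size=(32, 48, 3), dtype=np.uint8)

loop_result = using_loop(sample.copy())
mask_result = using_mask(sample.copy())
same = np.array_equal(loop_result, mask_result)
```

Both functions modify the array in place, which is why each one is given its own copy of the sample.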