Count the occurrences of a specific value and remo

Posted 2019-09-07 10:37

Question:

I want to count the occurrences of a specific value (in my case -1) in a numpy array and delete them at the same time.

I managed to do it; here is what I've done:

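import numpy as np

a = np.array([1, 2, 0, -1, 3, -1, -1])
b = np.count_nonzero(a == -1)        # count the -1 values (a[a==-1] would return the values, not the count)
a = np.delete(a, np.where(a == -1))  # drop the -1 values
print("a -> ", a) # a ->  [1 2 0 3]
print("b -> ", b) # b ->  3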

Is there a more optimised way to do it?

Answer 1:

Something like this? Using NumPy as you did is probably more optimised, though.

a = [x for x in a if x != -1]
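The comprehension above only removes the values; it does not count them. One possible way to get the count as well is to compare the lengths before and after filtering. This is a minimal sketch on a plain Python list; the names `values`, `kept`, and `removed` are illustrative and not from the original post.

```python
values = [1, 2, 0, -1, 3, -1, -1]

# Keep every element that is not the target value.
kept = [x for x in values if x != -1]

# The number of removed elements is the difference in length.
removed = len(values) - len(kept)

print(kept)     # [1, 2, 0, 3]
print(removed)  # 3
```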


Answer 2:

First, an in-place count-and-delete operation on a list:

In [100]: al=a.tolist(); cnt=0
In [101]: for i in range(len(a)-1,-1,-1):
     ...:     if al[i]==-1:
     ...:         del al[i]
     ...:         cnt += 1

In [102]: al
Out[102]: [1, 2, 0, 3]
In [103]: cnt
Out[103]: 3

It operates in place, but has to work from the end. The list comprehension alternative makes a new list, but is often easier to write and read.
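To see why the in-place loop has to run from the end, here is a small sketch comparing a deliberately naive forward pass with the backward pass. The variable names (`data`, `forward`, `backward`, `count`) are made up for illustration.

```python
data = [1, 2, 0, -1, 3, -1, -1]

# Forward pass (buggy on purpose): deleting shifts later items left,
# so the element that slides into slot i is never examined.
forward = list(data)
i = 0
while i < len(forward):
    if forward[i] == -1:
        del forward[i]
    i += 1

# Backward pass: deletions only shift items that were already visited.
backward = list(data)
count = 0
for i in range(len(backward) - 1, -1, -1):
    if backward[i] == -1:
        del backward[i]
        count += 1

print(forward)   # [1, 2, 0, 3, -1]  (one -1 survives)
print(backward)  # [1, 2, 0, 3]
print(count)     # 3
```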

The cleanest array operation uses a boolean mask.

In [104]: idx = a==-1
In [105]: idx
Out[105]: array([False, False, False,  True, False,  True,  True], dtype=bool)
In [106]: np.sum(idx)  # or np.count_nonzero(idx)
Out[106]: 3
In [107]: a[~idx]
Out[107]: array([1, 2, 0, 3])

One way or another, you have to identify all elements that match the target. Once you have the mask, counting is trivial, and masking is just as easy.
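The mask-based steps can be bundled into one small helper. This is only a sketch: `count_and_remove` is a hypothetical name, not a NumPy function, and it assumes a one-dimensional array.

```python
import numpy as np

def count_and_remove(arr, value):
    """Return (filtered_array, count) using a single boolean mask."""
    mask = arr == value                   # True where the element matches
    count = int(np.count_nonzero(mask))   # number of matches
    return arr[~mask], count              # keep only the non-matching elements

a = np.array([1, 2, 0, -1, 3, -1, -1])
a, n_removed = count_and_remove(a, -1)
print(a)          # [1 2 0 3]
print(n_removed)  # 3
```

The mask is computed once and reused for both the count and the selection, which avoids scanning the array twice for matches.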

np.delete has to be told which items to delete, and one way or another it constructs a new array that contains all but the 'deleted' ones. Because of its generality it will almost always be slower than a direct action like this masking.
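The speed claim is easy to check on your own machine. The sketch below times both approaches with `timeit`; the array size, value range, and repeat count are arbitrary assumptions, and the helper names `with_delete` and `with_mask` are hypothetical. No timings are quoted here because they depend on hardware and NumPy version.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
big = rng.integers(-1, 5, size=100_000)  # random values from -1 to 4

def with_delete(arr):
    # Find the matching indices, then let np.delete build the new array.
    return np.delete(arr, np.flatnonzero(arr == -1))

def with_mask(arr):
    # Select the non-matching elements directly with a boolean mask.
    return arr[arr != -1]

# Both approaches should yield identical results.
same = np.array_equal(with_delete(big), with_mask(big))

t_delete = timeit.timeit(lambda: with_delete(big), number=50)
t_mask = timeit.timeit(lambda: with_mask(big), number=50)
print(f"identical: {same}, np.delete: {t_delete:.4f} s, mask: {t_mask:.4f} s")
```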

np.where with a single argument (equivalent to np.nonzero) uses count_nonzero to determine how many values it will return.
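The relationship between these functions can be checked from the outside. The sketch below does not inspect NumPy internals; it only confirms that single-argument `np.where` and `np.nonzero` return the same indices, and that the number of indices equals `np.count_nonzero` of the mask.

```python
import numpy as np

a = np.array([1, 2, 0, -1, 3, -1, -1])
mask = a == -1

idx_where = np.where(mask)      # tuple holding one index array
idx_nonzero = np.nonzero(mask)  # same indices

print(idx_where[0])                                   # [3 5 6]
print(np.array_equal(idx_where[0], idx_nonzero[0]))   # True
print(len(idx_where[0]) == np.count_nonzero(mask))    # True
```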

So I'm proposing the same actions as you are doing, but in a little more direct way.