I want to count the occurrences of a specific value (in my case -1) in a numpy array and delete them at the same time.
Here is what I've done so far:
import numpy as np
a = np.array([1, 2, 0, -1, 3, -1, -1])
b = a[a==-1]  # the matching elements; their number is the count
a = np.delete(a, np.where(a==-1))
print("a -> ", a) # a -> [1 2 0 3]
print("b -> ", len(b)) # b -> 3
Is there a more optimised way to do it?
Something like this?
a = [x for x in a if x != -1]
Using numpy as you did is probably faster, though.
First, a list in-place count and delete operation:
In [100]: al=a.tolist(); cnt=0
In [101]: for i in range(len(a)-1,-1,-1):
     ...:     if al[i]==-1:
     ...:         del al[i]
     ...:         cnt += 1
In [102]: al
Out[102]: [1, 2, 0, 3]
In [103]: cnt
Out[103]: 3
It operates in place, but has to work from the end so that each deletion does not shift the indices still to be visited. The list comprehension alternative makes a new list, but is often easier to write and read.
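For comparison, the list comprehension route can also yield the count, since the number of removed items is just the difference in lengths. A minimal sketch using the sample data from the question; the names `values`, `target`, `kept`, and `removed` are illustrative, not from the original posts:

```python
values = [1, 2, 0, -1, 3, -1, -1]
target = -1

# Build a new list that omits every occurrence of the target.
kept = [x for x in values if x != target]

# The count of removed items is the difference in lengths.
removed = len(values) - len(kept)

print(kept)     # [1, 2, 0, 3]
print(removed)  # 3
```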
The cleanest array operation uses a boolean mask.
In [104]: idx = a==-1
In [105]: idx
Out[105]: array([False, False, False, True, False, True, True], dtype=bool)
In [106]: np.sum(idx) # or np.count_nonzero(idx)
Out[106]: 3
In [107]: a[~idx]
Out[107]: array([1, 2, 0, 3])
You have to identify, one way or another, all elements that match the target. Once you have the mask, counting is trivial, and so is selecting the elements to keep.
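The mask-based steps can be wrapped in a small function so the comparison is evaluated only once. This is a sketch for 1-D arrays; `count_and_remove` is a hypothetical helper name, not a NumPy function:

```python
import numpy as np

def count_and_remove(arr, value):
    """Hypothetical helper: return (count, array without `value`)."""
    mask = arr == value                  # True where the element matches
    count = int(np.count_nonzero(mask))  # number of matches
    return count, arr[~mask]             # boolean indexing returns a new array

a = np.array([1, 2, 0, -1, 3, -1, -1])
count, kept = count_and_remove(a, -1)
print(count)  # 3
print(kept)   # [1 2 0 3]
```

Note that `arr[~mask]` makes a copy, so the input array is left unchanged.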
np.delete has to be told which items to delete, and one way or another it constructs a new array that contains all but the 'deleted' ones. Because of its generality it will almost always be slower than a direct action like this masking.
np.where (aka np.nonzero when called with only a condition) uses count_nonzero to determine how many values it will return.
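A quick check of these relationships, as a sketch on the sample array (the variable names are illustrative): with a single argument, `np.where` returns the same indices as `np.nonzero`, the number of indices matches `np.count_nonzero`, and `np.delete` with those indices produces the same array as masking.

```python
import numpy as np

a = np.array([1, 2, 0, -1, 3, -1, -1])
mask = a == -1

# With only a condition, np.where returns the same index tuple as np.nonzero.
idx_where = np.where(mask)[0]
idx_nonzero = np.nonzero(mask)[0]
same_indices = np.array_equal(idx_where, idx_nonzero)

# The number of indices equals the count of True values in the mask.
same_count = len(idx_where) == np.count_nonzero(mask)

# Deleting by index and selecting with the inverted mask agree.
via_delete = np.delete(a, idx_where)
via_mask = a[~mask]
same_result = np.array_equal(via_delete, via_mask)

print(idx_where)                              # [3 5 6]
print(same_indices, same_count, same_result)  # True True True
```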
So I'm proposing the same actions as you are doing, but in a little more direct way.