I have an array:
a = np.array([[ 0, 1, 2, 0, 0, 0],
[ 0, 4, 1, 35, 0, 10],
[ 0, 0, 5, 4, 0, 4],
[ 1, 2, 5, 4, 0, 4]])
I need to select only the first run of consecutive 0s at the start of each row:
[[ True False False False False False]
[ True False False False False False]
[ True True False False False False]
[ False False False False False False]]
I tried:
a[np.arange(len(a)), a.argmax(1): np.arange(len(a)), [0,0,0]] = True
But this is wrong.
You can use np.cumsum.
Assumption: you are looking for zeros only at the start of each row.
a = np.array([[ 0, 1, 2, 0, 0, 0],
[ 0, 4, 1, 35, 0, 10],
[ 0, 0, 5, 4, 0, 4]])
a.cumsum(axis=1) == 0
array([[ True, False, False, False, False, False],
[ True, False, False, False, False, False],
[ True, True, False, False, False, False]], dtype=bool)
Basis: the comparison holds True for as long as the cumulative sum is 0 along each row.
Error-prone: an array with negative ints would cause this to fail. E.g. for [-1, 1], this would evaluate to True at position 1.
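To make the negative-integer caveat concrete, here is a minimal sketch on a small made-up array `b` (not from the question). It also shows one possible workaround, which is my own variation rather than part of the answer above: take the cumulative sum of the boolean mask `b != 0` instead of the values, so that positive and negative entries cannot cancel out.

```python
import numpy as np

# Hypothetical test array with a negative entry (not from the question).
b = np.array([[-1, 1, 0],
              [ 0, 0, 3]])

# Cumulative sum of the raw values: -1 + 1 cancels to 0 in row 0,
# so positions 1 and 2 are wrongly reported as True.
naive = b.cumsum(axis=1) == 0

# Cumulative count of non-zero entries: stays 0 only while every
# element seen so far in the row is exactly 0.
robust = (b != 0).cumsum(axis=1) == 0

print(naive)
print(robust)
```

The second form counts non-zero elements rather than summing values, so it should not depend on the sign of the data.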
You might use np.minimum.accumulate on the condition a == 0 (along each row). Any non-zero element gives False, so every element after the first non-zero one is also set to False by the accumulated minimum:
np.minimum.accumulate(a == 0, axis=1)
#array([[ True, False, False, False, False, False],
# [ True, False, False, False, False, False],
# [ True, True, False, False, False, False],
# [False, False, False, False, False, False]], dtype=bool)
Here's one with argmin + broadcasting:
(a==0).argmin(1)[:,None] > np.arange(a.shape[1])
Explanation with a sample step-by-step run
1) Input array:
In [207]: a
Out[207]:
array([[ 0, 1, 2, 0, 0, 0],
[ 0, 4, 1, 35, 0, 10],
[ 0, 0, 5, 4, 0, 4],
[ 1, 2, 5, 4, 0, 4]])
2) Mask of zeros:
In [208]: (a==0)
Out[208]:
array([[ True, False, False, True, True, True],
[ True, False, False, False, True, False],
[ True, True, False, False, True, False],
[False, False, False, False, True, False]], dtype=bool)
3) Get the index of the first False in each row, which marks the end of the leading run of True values. For any row whose first element is non-zero (this includes rows with no zeros at all), argmin returns 0. The next task is to use broadcasting to create a mask that is True from the first column and stops being True at those argmin indices. This is done with a broadcasted comparison against a range array covering all columns.
In [209]: (a==0).argmin(1)
Out[209]: array([1, 1, 2, 0])
In [210]: (a==0).argmin(1)[:,None] > np.arange(a.shape[1])
Out[210]:
array([[ True, False, False, False, False, False],
[ True, False, False, False, False, False],
[ True, True, False, False, False, False],
[False, False, False, False, False, False]], dtype=bool)
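One edge case the run above does not exercise is a row made up entirely of zeros. There the mask has no False, so argmin also returns 0 and the row comes out all False. The sketch below uses a made-up array `c` to show this, together with one possible guard using np.where; the guard and the names `stop`, `plain` and `fixed` are my own additions, not part of the original answer.

```python
import numpy as np

# Hypothetical array: row 0 is all zeros, row 2 starts with a non-zero.
c = np.array([[0, 0, 0],
              [0, 5, 0],
              [7, 0, 0]])

z = c == 0
first_false = z.argmin(1)            # 0 for the all-zero row as well
cols = np.arange(c.shape[1])

# Plain version: the all-zero row is reported as all False.
plain = first_false[:, None] > cols

# Guarded version: where a row is entirely zero, use the row length
# as the stopping index so the whole row stays True.
stop = np.where(z.all(1), c.shape[1], first_false)
fixed = stop[:, None] > cols

print(plain)
print(fixed)
```

If all-zero rows cannot occur in your data, the plain one-liner is enough.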
Timings
In [196]: a = np.random.randint(0,9,(5000,5000))
In [197]: %timeit a.cumsum(axis=1) == 0 #@Brad Solomon
...: %timeit np.minimum.accumulate(a == 0, axis=1) #@Psidom
...: %timeit (a==0).argmin(1)[:,None] > np.arange(a.shape[1])
...:
10 loops, best of 3: 69 ms per loop
10 loops, best of 3: 64.9 ms per loop
10 loops, best of 3: 32.8 ms per loop
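The timings only compare like with like if the three expressions return the same mask. A quick sanity check on the array from the question could look like this (the variable names are mine; the array has no negative values and no all-zero rows, so the three approaches are expected to agree here):

```python
import numpy as np

a = np.array([[0, 1, 2,  0, 0,  0],
              [0, 4, 1, 35, 0, 10],
              [0, 0, 5,  4, 0,  4],
              [1, 2, 5,  4, 0,  4]])

via_cumsum = a.cumsum(axis=1) == 0
via_accumulate = np.minimum.accumulate(a == 0, axis=1)
via_argmin = (a == 0).argmin(1)[:, None] > np.arange(a.shape[1])

# All three should produce the same boolean mask for this input.
print(np.array_equal(via_cumsum, via_accumulate))
print(np.array_equal(via_accumulate, via_argmin))
```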