I have an array like db (which will have a shape of approximately (1e6, 300)) and a mask = [1, 0, 1] vector. The first column of each row is the target flag, and the remaining columns are the vector that is compared against the mask. I want to create an out vector that contains a one wherever the corresponding row in db matches the mask and has target == 1, and zeros everywhere else.
import numpy as np

db = np.array([       # out for mask = [1, 0, 1]
    # target, vector
    [1, 1, 0, 1],     # 1
    [0, 1, 1, 1],     # 0 (fits the mask but target == 0)
    [0, 0, 1, 0],     # 0
    [1, 1, 0, 1],     # 1
    [0, 1, 1, 0],     # 0
    [1, 0, 0, 0],     # 0
])
I have defined a vline function that applies a mask to each array line, using np.array_equal(mask, mask & vector) to check that vectors such as 101 and 111 fit the mask, and then retains only the indices where target == 1.
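To make the fit test concrete, here is a minimal sketch of the np.array_equal(mask, mask & vector) check on its own. The helper name fits and the extra vector 110 are only illustrative and are not part of the code in this question.

```python
import numpy as np

mask = np.array([1, 0, 1])

def fits(vector, mask):
    # A vector fits when every position set in the mask is also set in the vector.
    return np.array_equal(mask, mask & vector)

results = {
    "101": fits(np.array([1, 0, 1]), mask),  # same bits as the mask
    "111": fits(np.array([1, 1, 1]), mask),  # extra bit set, still fits
    "110": fits(np.array([1, 1, 0]), mask),  # last mask bit missing
}
print(results)
```

Under this check, positions where the mask is 0 are ignored, so only the positions where the mask is 1 decide whether a vector fits.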
out is initialized to a list of six zeros:
out = [0, 0, 0, 0, 0, 0]
The vline function is defined as:
def vline(idx, mask):
    line = db[idx]
    target, vector = line[0], line[1:]
    if np.array_equal(mask, mask & vector):
        if target == 1:
            out[idx] = 1
I get the correct result by applying this function line by line in a for loop:
def check_mask(db, out, mask=[1, 0, 1]):
    # idx iterates over the row indices of db (without enumerate)
    for idx in np.arange(db.shape[0]):
        vline(idx, mask=mask)
    return out
assert check_mask(db, out, [1, 0, 1]) == [1, 0, 0, 1, 0, 0] # it works !
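For comparison, the same result can probably be obtained without a Python-level loop by relying on NumPy broadcasting. This is only a sketch under the assumption that db holds nothing but 0/1 integers. The names target, vectors, row_fits and out_vec are introduced here for illustration.

```python
import numpy as np

db = np.array([
    [1, 1, 0, 1],
    [0, 1, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 1, 1, 0],
    [1, 0, 0, 0],
])
mask = np.array([1, 0, 1])

target = db[:, 0]    # first column, shape (6,)
vectors = db[:, 1:]  # remaining columns, shape (6, 3)

# mask (3,) broadcasts against vectors (6, 3), so every row is tested at once.
row_fits = ((vectors & mask) == mask).all(axis=1)

# Keep only the rows that fit the mask and have target == 1.
out_vec = (row_fits & (target == 1)).astype(int)
print(out_vec.tolist())
```

The row-wise comparison here is the whole-array counterpart of the np.array_equal call inside vline.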
Now I want to vectorize vline by creating a ufunc:
ufunc_vline = np.frompyfunc(vline, 2, 1)
out = [0, 0, 0, 0, 0, 0]
ufunc_vline(db, [1, 0, 1])
print(out)
But the ufunc complains about broadcasting inputs with those shapes:
In [217]: ufunc_vline(db, [1, 0, 1])
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-217-9008ebeb6aa1> in <module>()
----> 1 ufunc_vline(db, [1, 0, 1])
ValueError: operands could not be broadcast together with shapes (6,4) (3,)
In [218]:
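A minimal sketch of what the error message seems to indicate: a ufunc built with np.frompyfunc first broadcasts its inputs against each other and then calls the Python function once per pair of scalar elements, so it never receives a row index together with a whole mask. The record function and the small arrays below are hypothetical and only serve to show that behaviour.

```python
import numpy as np

# Shapes (6, 4) and (3,) are incompatible: the trailing dimensions are 4 and 3.
try:
    np.broadcast(np.empty((6, 4)), np.empty(3))
    compatible = True
except ValueError:
    compatible = False
print(compatible)

# With compatible shapes, the wrapped function is called element by element.
calls = []

def record(a, b):
    # Store each pair of scalars that the ufunc passes in.
    calls.append((a, b))
    return 0

np.frompyfunc(record, 2, 1)(np.array([[1, 2], [3, 4]]), np.array([10, 20]))
print(calls)
```

Even if the shapes did line up, vline would be handed individual elements of db rather than row indices, which suggests that np.frompyfunc is not a direct replacement for the row-wise loop.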