I have 2 numpy arrays:
aa = np.random.rand(5,5)
bb = np.random.rand(5,5)
How can I create a new array which has a value of 1 when both aa and bb exceed 0.5?
With a focus on performance, a few approaches can be suggested. One method is to compute the boolean array of valid positions and convert it to an int datatype with the .astype() method. Another is to use np.where, which selects between 0 and 1 based on the same boolean array. So essentially there are two methods: one that relies on efficient datatype conversion and one that uses selection. The boolean array itself can be obtained in two ways: with plain comparisons combined through &, or with np.logical_and. With two ways to get the boolean array and two ways to convert it to an int array, we end up with four implementations, listed below -
out1 = ((aa>0.5) & (bb>0.5)).astype(int)
out2 = np.logical_and(aa>0.5, bb>0.5).astype(int)
out3 = np.where((aa>0.5) & (bb>0.5),1,0)
out4 = np.where(np.logical_and(aa>0.5, bb>0.5), 1, 0)
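As a quick sanity check, the four variants can be compared against each other. This is only a sketch: the seeded generator and the names rng, mask and all_equal are illustrative assumptions, not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily, only for reproducibility
aa = rng.random((5, 5))
bb = rng.random((5, 5))

# Boolean array that is True where both inputs exceed 0.5
mask = (aa > 0.5) & (bb > 0.5)

out1 = mask.astype(int)
out2 = np.logical_and(aa > 0.5, bb > 0.5).astype(int)
out3 = np.where(mask, 1, 0)
out4 = np.where(np.logical_and(aa > 0.5, bb > 0.5), 1, 0)

# All four approaches should produce the same 0/1 array
all_equal = all(np.array_equal(out1, other) for other in (out2, out3, out4))
print(all_equal)
```

Since the results agree, the choice between them comes down to speed and readability.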
You can also switch to lower-precision integer types, which shouldn't hurt since we are only setting the values to 0 and 1 anyway. The benefit should be a noticeable speedup, because the smaller output array needs less memory. The dtype can be passed as a string ('int8', 'uint8') or as a NumPy type (np.int8, np.uint8). The variants of the earlier listed approaches using these datatypes would be -
out5 = ((aa>0.5) & (bb>0.5)).astype('int8')
out6 = np.logical_and(aa>0.5, bb>0.5).astype('int8')
out7 = ((aa>0.5) & (bb>0.5)).astype('uint8')
out8 = np.logical_and(aa>0.5, bb>0.5).astype('uint8')
out9 = ((aa>0.5) & (bb>0.5)).astype(np.int8)
out10 = np.logical_and(aa>0.5, bb>0.5).astype(np.int8)
out11 = ((aa>0.5) & (bb>0.5)).astype(np.uint8)
out12 = np.logical_and(aa>0.5, bb>0.5).astype(np.uint8)
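To make the memory argument concrete, one can compare the size of the outputs. This is a rough sketch; the names out_default and out_small are illustrative, and the size of the default int depends on the platform and NumPy version.

```python
import numpy as np

aa = np.random.rand(1000, 1000)
bb = np.random.rand(1000, 1000)
mask = (aa > 0.5) & (bb > 0.5)

out_default = mask.astype(int)     # platform default integer (commonly 4 or 8 bytes per element)
out_small = mask.astype(np.uint8)  # 1 byte per element

# Same values, very different memory footprint
print(out_default.dtype, out_default.nbytes)
print(out_small.dtype, out_small.nbytes)
print(np.array_equal(out_default, out_small))
```

The uint8 output holds the same 0/1 values in a fraction of the memory, which is a plausible reason for the faster conversion.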
Runtime test (as we are focusing on performance with this post) -
In [17]: # Input arrays
...: aa = np.random.rand(1000,1000)
...: bb = np.random.rand(1000,1000)
...:
In [18]: %timeit ((aa>0.5) & (bb>0.5)).astype(int)
...: %timeit np.logical_and(aa>0.5, bb>0.5).astype(int)
...: %timeit np.where((aa>0.5) & (bb>0.5),1,0)
...: %timeit np.where(np.logical_and(aa>0.5, bb>0.5), 1, 0)
...:
100 loops, best of 3: 9.13 ms per loop
100 loops, best of 3: 9.16 ms per loop
100 loops, best of 3: 10.4 ms per loop
100 loops, best of 3: 10.4 ms per loop
In [19]: %timeit ((aa>0.5) & (bb>0.5)).astype('int8')
...: %timeit np.logical_and(aa>0.5, bb>0.5).astype('int8')
...: %timeit ((aa>0.5) & (bb>0.5)).astype('uint8')
...: %timeit np.logical_and(aa>0.5, bb>0.5).astype('uint8')
...:
...: %timeit ((aa>0.5) & (bb>0.5)).astype(np.int8)
...: %timeit np.logical_and(aa>0.5, bb>0.5).astype(np.int8)
...: %timeit ((aa>0.5) & (bb>0.5)).astype(np.uint8)
...: %timeit np.logical_and(aa>0.5, bb>0.5).astype(np.uint8)
...:
100 loops, best of 3: 5.6 ms per loop
100 loops, best of 3: 5.61 ms per loop
100 loops, best of 3: 5.63 ms per loop
100 loops, best of 3: 5.63 ms per loop
100 loops, best of 3: 5.62 ms per loop
100 loops, best of 3: 5.62 ms per loop
100 loops, best of 3: 5.62 ms per loop
100 loops, best of 3: 5.61 ms per loop
In [20]: %timeit 1 * ((aa > 0.5) & (bb > 0.5)) #@BPL's vectorized soln
100 loops, best of 3: 10.2 ms per loop
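Outside IPython, a similar comparison can be sketched with the standard timeit module. The candidates dictionary, its labels and the repeat counts below are assumptions made for illustration; absolute numbers will differ across machines and NumPy versions.

```python
import timeit

import numpy as np

aa = np.random.rand(1000, 1000)
bb = np.random.rand(1000, 1000)

# A representative subset of the approaches timed above
candidates = {
    "astype(int)": lambda: ((aa > 0.5) & (bb > 0.5)).astype(int),
    "astype(np.uint8)": lambda: ((aa > 0.5) & (bb > 0.5)).astype(np.uint8),
    "np.where": lambda: np.where((aa > 0.5) & (bb > 0.5), 1, 0),
    "1 * mask": lambda: 1 * ((aa > 0.5) & (bb > 0.5)),
}

timings = {}
for name, fn in candidates.items():
    # Best of 3 repeats of 10 calls each, converted to milliseconds per call
    best = min(timeit.repeat(fn, number=10, repeat=3)) / 10
    timings[name] = best * 1e3
    print(f"{name:>18}: {timings[name]:.2f} ms per call")
```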
What about this?
import numpy as np
aa = np.random.rand(5, 5)
bb = np.random.rand(5, 5)
print(aa)
print(bb)
# Multiplying the boolean mask by 1 converts True/False to 1/0
cc = 1 * ((aa > 0.5) & (bb > 0.5))
print(cc)
When the elements of aa and bb at a given index both exceed 0.5, the new array gets a 1 at that index:
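import numpy as np

aa = np.random.rand(5, 5)
bb = np.random.rand(5, 5)
# Preallocate the output so it can be indexed by (row, column)
new_arr = np.zeros((5, 5), dtype=int)
for i in range(5):
    for j in range(5):
        # Compare individual elements, not whole rows
        if aa[i, j] > 0.5 and bb[i, j] > 0.5:
            new_arr[i, j] = 1
        else:
            new_arr[i, j] = 0  # or any value you want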
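The same element-wise rule, including a custom value for the "else" case, can also be written without explicit loops using np.where. This is a minimal sketch; fill_value is a hypothetical stand-in for whatever value is wanted where the condition fails.

```python
import numpy as np

aa = np.random.rand(5, 5)
bb = np.random.rand(5, 5)

fill_value = -1  # hypothetical placeholder for "any value you want"

# 1 where both arrays exceed 0.5, fill_value everywhere else
new_arr = np.where((aa > 0.5) & (bb > 0.5), 1, fill_value)
print(new_arr)
```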