Let's say we have the following function:
    def f(x, y):
        if y == 0:
            return 0
        return x/y
This works fine with scalar values. Unfortunately, when I try to use numpy arrays for x and y, the comparison y == 0 is treated as an array operation, which results in an error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-13-9884e2c3d1cd> in <module>()
----> 1 f(np.arange(1,10), np.arange(10,20))
<ipython-input-10-fbd24f17ea07> in f(x, y)
1 def f(x, y):
----> 2 if y == 0:
3 return 0
4 return x/y
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
I tried to use np.vectorize but it doesn't make a difference; the code still fails with the same error. np.vectorize is one option which gives the result I expect.
The only solution that I can think of is to use np.where on the y array with something like:
    def f(x, y):
        return np.where(y == 0, 0, x/y)
which doesn't work for scalars.
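To make the scalar problem concrete, here is a minimal sketch (the name f_where is only illustrative). Both branches passed to np.where are evaluated up front, so with plain Python numbers the division raises before np.where gets a chance to pick the 0 branch:

```python
import numpy as np

def f_where(x, y):
    # x / y is computed in full before np.where selects between the branches.
    return np.where(y == 0, 0, x / y)

# Arrays work; the division by zero only produces a warning, silenced here.
with np.errstate(divide="ignore", invalid="ignore"):
    arr_result = f_where(np.array([1.0, 2.0]), np.array([0.0, 2.0]))

# Plain Python scalars fail: 1 / 0 raises before np.where is even called.
try:
    f_where(1, 0)
    scalar_failed = False
except ZeroDivisionError:
    scalar_failed = True
```

Even when y is a nonzero scalar, the return value is a 0-d array rather than a plain number, which may or may not matter to the caller.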
Is there a better way to write a function which contains an if statement? It should work with both scalars and arrays.
You can use a masked array, which performs the division only where y != 0.

I wonder what problem you're facing with np.vectorize. It works fine on my system. Note that the result dtype is determined by the result of the first element; you can also set the desired output type yourself. There are more examples in the docs.
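As a sketch of the np.vectorize point, using the question's f and made-up sample arrays: without otypes, the output dtype is taken from the first call, which here returns the integer 0, so later quotients get truncated. Passing otypes=[float] avoids that.

```python
import numpy as np

def f(x, y):
    if y == 0:
        return 0
    return x / y

# dtype inferred from the first element's result (the integer 0 here).
f_vec = np.vectorize(f)
# dtype fixed explicitly, so fractional results are preserved.
f_vec_float = np.vectorize(f, otypes=[float])

x = np.array([1, 2, 3])
y = np.array([0, 2, 4])

inferred = f_vec(x, y)        # integer dtype; 3/4 is truncated
explicit = f_vec_float(x, y)  # float dtype
```

Both wrappers also accept plain scalars, though the return value is then a 0-d array.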

One way is to convert x and y to numpy arrays inside your function. This will work when one of x or y is a scalar and the other is a numpy array. It will also work if they are both arrays that can be broadcast. It won't work if they're arrays of incompatible shapes (e.g., 1D arrays of different lengths), but it's not clear what the desired behavior would be in that case anyway.

A kind of clunky but effective way is to basically pre-process the data.
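As a rough sketch of these two ideas (the names f_asarray and f_preprocessed are hypothetical, and this is one possible reading of "pre-process", not necessarily the code the answers had in mind): the first converts the inputs with np.asarray and divides only where y is nonzero, the second replaces the zeros in y with a harmless value before dividing.

```python
import numpy as np

def f_asarray(x, y):
    # Convert scalars and sequences alike to arrays (0-d for scalars).
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Start from zeros, then divide only at positions where y != 0.
    out = np.zeros(np.broadcast(x, y).shape)
    np.divide(x, y, out=out, where=(y != 0))
    return out

def f_preprocessed(x, y):
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    zero = (y == 0)
    # Pre-process: swap zeros for 1 so the division never sees a zero.
    safe_y = np.where(zero, 1.0, y)
    return np.where(zero, 0.0, x / safe_y)
```

Both accept scalars, arrays, or a mix, and neither triggers a divide-by-zero warning. For scalar inputs the result is a 0-d array, so wrap it in float() if a plain number is needed.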

I timed all the different approaches. The first function is the quickest, and has no warnings. The time ratios are similar if x or y are scalars. For higher-dimensional arrays, the masked array approach gets relatively faster (it's still the slowest though).
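To run a comparison like this on your own machine, a small timeit harness along the following lines could be used. The candidate functions and their order are assumptions for illustration and do not correspond to the numbering above; no timings are claimed here, since they depend on array size, hardware, and NumPy version.

```python
import timeit
import numpy as np

def f_where(x, y):
    # Evaluates x / y everywhere, so warnings are silenced explicitly.
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(y == 0, 0.0, x / y)

def f_divide_where(x, y):
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    out = np.zeros(np.broadcast(x, y).shape)
    np.divide(x, y, out=out, where=(y != 0))
    return out

def _scalar_f(x, y):
    return 0.0 if y == 0 else x / y

f_vectorized = np.vectorize(_scalar_f, otypes=[float])

def time_candidates(x, y, number=100):
    """Return a dict mapping each candidate's name to its total seconds for `number` calls."""
    candidates = {
        "where": f_where,
        "divide_where": f_divide_where,
        "vectorize": f_vectorized,
    }
    return {
        name: timeit.timeit(lambda fn=fn: fn(x, y), number=number)
        for name, fn in candidates.items()
    }
```

Calling time_candidates(np.arange(1000.0), np.arange(1000.0)) returns the elapsed seconds per candidate, which can then be compared as ratios.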