I have a Python function given below:

    def myfun(x):
        if x > 0:
            return 0
        else:
            return np.exp(x)

where np is the numpy library. I want to vectorize the function in numpy, so I use:

    vec_myfun = np.vectorize(myfun)
I ran a test to evaluate the efficiency. First I generate a vector of 100 random numbers:

    x = np.random.randn(100)

Then I run the following code to obtain the runtime:

    %timeit np.exp(x)
    %timeit vec_myfun(x)
The runtime for np.exp(x) is 1.07 µs ± 24.9 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each).
The runtime for vec_myfun(x) is 71.2 µs ± 1.68 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each).
My question is: compared to np.exp, vec_myfun has only one extra step to check the value of x, but it runs much more slowly than np.exp. Is there an efficient way to vectorize myfun to make it as efficient as np.exp?
Just thinking outside of the box, what about implementing a function piecewise_exp() that basically multiplies np.exp() with arr < 0?

Writing the code proposed so far as functions:
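A minimal sketch of what these functions could look like. The names piecewise_exp() and bnaeker_exp() come from the text, but their bodies are assumptions, as is the otypes argument passed to np.vectorize. The mask uses arr <= 0 rather than arr < 0 so that x == 0 gives exp(0) == 1, as in myfun().

```python
import numpy as np


def myfun(x):
    # Scalar reference implementation from the question.
    if x > 0:
        return 0.0  # float, so both branches return the same type
    else:
        return np.exp(x)


# otypes (an addition here) fixes the output dtype instead of letting
# np.vectorize infer it from the result for the first element.
vec_myfun = np.vectorize(myfun, otypes=[float])


def piecewise_exp(arr):
    # Assumed body: multiply exp by a boolean mask (False -> 0, True -> 1).
    arr = np.asarray(arr, dtype=float)
    return np.exp(arr) * (arr <= 0)


def bnaeker_exp(arr):
    # Assumed body: select between 0 and exp(arr) with np.where.
    arr = np.asarray(arr, dtype=float)
    return np.where(arr > 0, 0.0, np.exp(arr))
```

One caveat with the mask-multiplication form: for very large positive inputs np.exp overflows to inf, and inf * 0 is nan, so it is only safe where exp stays finite.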
And testing that everything is consistent:
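One way to check this, as a sketch: compare each vectorized variant against a plain Python loop over the scalar rule. The compact function bodies below are assumptions, and the fixed seed is chosen only for reproducibility.

```python
import math

import numpy as np


def piecewise_exp(arr):
    # Assumed body: exp masked by a boolean array.
    return np.exp(arr) * (arr <= 0)


def bnaeker_exp(arr):
    # Assumed body: np.where selection.
    return np.where(arr > 0, 0.0, np.exp(arr))


rng = np.random.default_rng(0)  # fixed seed, for reproducibility only
x = rng.standard_normal(100)

# Reference values from the scalar rule: 0 for x > 0, exp(x) otherwise.
reference = np.array([0.0 if v > 0 else math.exp(v) for v in x])

for func in (piecewise_exp, bnaeker_exp):
    assert np.allclose(func(x), reference), func.__name__
consistent = True
```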
Doing some micro-benchmarks for small inputs:
... and for larger inputs:
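The timings themselves are not reproduced here. As a sketch, assuming the function bodies below, the same comparison for small and large inputs can be run outside IPython with the standard timeit module. The sizes 100 and 100000 mirror those mentioned in the text, the helper bench() is hypothetical, and the numbers will differ by machine.

```python
import timeit

import numpy as np


def myfun(x):
    return 0.0 if x > 0 else np.exp(x)


vec_myfun = np.vectorize(myfun, otypes=[float])


def piecewise_exp(arr):
    return np.exp(arr) * (arr <= 0)  # assumed body


def bnaeker_exp(arr):
    return np.where(arr > 0, 0.0, np.exp(arr))  # assumed body


def bench(func, arr, repeat=3):
    # Best-of-`repeat` average time per call, in seconds.
    # Fewer calls per repeat for large arrays to keep the run short.
    number = max(1, 20_000 // arr.size)
    timer = timeit.Timer(lambda: func(arr))
    return min(timer.repeat(repeat=repeat, number=number)) / number


candidates = {
    "np.exp": np.exp,
    "piecewise_exp": piecewise_exp,
    "bnaeker_exp": bnaeker_exp,
    "vec_myfun": vec_myfun,
}

results = {}
for n in (100, 100_000):
    x = np.random.randn(n)
    for name, func in candidates.items():
        results[(name, n)] = bench(func, x)
        print(f"n={n:>6}  {name:<14} {results[(name, n)] * 1e6:12.2f} µs")
```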
This shows that piecewise_exp() is faster than anything else proposed so far and reasonably approaches the speed of np.exp(), especially for larger inputs, for which np.where() gets more inefficient since it uses integer indexing instead of boolean masks.

EDIT

Also, the performance of the np.where() version (bnaeker_exp()) does depend on the number of elements of the array actually satisfying the condition. If none of them does (like when you test on x = np.random.rand(100)), it is slightly faster than the boolean array multiplication version (piecewise_exp()) (128 µs ± 3.26 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each) on my machine for n = 100000).

ufuncs
like np.exp have a where parameter, which is used together with an out array that supplies the values for the skipped elements. This actually skips the calculation where the where mask is false.

In contrast:
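A minimal sketch of the two forms being compared. The array values and the variable names are illustrative assumptions; out is preallocated with zeros because positions where the mask is false simply keep whatever out already holds. The second half repeats the comparison with log and records warnings to show where the difference matters.

```python
import warnings

import numpy as np

arr = np.array([-3.0, -2.0, -1.0, 0.0, 1.0, 2.0])

# ufunc form: exp is only evaluated where the mask is True; elsewhere the
# preallocated zeros in `out` are left untouched.
out = np.zeros_like(arr)
np.exp(arr, out=out, where=arr <= 0)

# np.where form: np.exp(arr) is evaluated for every element first, and only
# then does np.where pick between that result and 0.
selected = np.where(arr <= 0, np.exp(arr), 0.0)

# Same comparison with log, which is not defined for arr <= 0.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    log_masked = np.log(arr, out=np.zeros_like(arr), where=arr > 0)
    n_masked = len(caught)  # warnings raised by the masked ufunc call
    log_where = np.where(arr > 0, np.log(arr), 0.0)
    n_where = len(caught) - n_masked  # warnings raised by the np.where form

print(out)
print(n_masked, n_where)
```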
An expression such as np.where(arr > 0, 0, np.exp(arr)) calculates np.exp(arr) first for all of arr (that's normal Python evaluation order), and then performs the where selection. With exp that isn't a big deal, but with log it could cause problems.

Use np.where:
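As a sketch, applied to the x from the question. The otypes argument is an addition here: without it, np.vectorize infers the output dtype from the result for the first element, which is the integer 0 whenever x[0] > 0.

```python
import numpy as np


def myfun(x):
    if x > 0:
        return 0
    else:
        return np.exp(x)


# otypes is an addition: it pins the output dtype to float.
vec_myfun = np.vectorize(myfun, otypes=[float])

x = np.random.randn(100)

# Both branches are computed as whole arrays in compiled code, then selected.
y = np.where(x > 0, 0, np.exp(x))

# The result matches the element-by-element version.
assert np.allclose(y, vec_myfun(x))
```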
For comparison, your vectorized function runs in about 30 microseconds on my machine.
As to why it runs slower, it's just much more complicated than np.exp. It's doing lots of type deduction, broadcasting, and possibly making many calls to the actual method. Much of this happens in Python itself, while nearly everything in the call to np.exp (and the np.where version here) is in C.