I have a vector and wish to make another vector of the same length whose k-th component is
The question is: how can we vectorize this for speed? NumPy vectorize() is actually a for loop, so it doesn't count.
Veedrac pointed out that "There is no way to apply a pure Python function to every element of a NumPy array without calling it that many times". Since I'm using NumPy functions rather than "pure Python" ones, I suppose it's possible to vectorize, but I don't know how.
import numpy as np
from scipy.integrate import quad
ws = 2 * np.random.random(10) - 1
n = len(ws)
integrals = np.empty(n)
def f(x, w):
if w < 0: return np.abs(x * w)
else: return np.exp(x) * w
def temp(x): return np.array([f(x, w) for w in ws]).sum()
def integrand(x, w): return f(x, w) * np.log(temp(x))
## Python for loop
for k in range(n):
integrals[k] = quad(integrand, -1, 1, args = ws[k])[0]
## NumPy vectorize
integrals = np.vectorize(quad)(integrand, -1, 1, args = ws)[0]
On a side note, is a Cython for loop always faster than NumPy vectorization?
@zaq's
answer focusing onquad
is spot on. So I'll look at some other aspects of the problem.In recent https://stackoverflow.com/a/41205930/901925 I argue that
vectorize
is of most value when you need to apply the full broadcasting mechanism to a function that only takes scalar values. Yourquad
qualifies as taking scalar inputs. But you are only iterating on one array,ws
. Thex
that is passed on to your functions is generated byquad
itself.quad
andintegrand
are still Python functions, even if they usenumpy
operations.cython
improves low level iteration, stuff that it can convert toC
code. Your primary iteration is at a high level, calling an imported function,quad
. Cython can't touch or rewrite that.You might be able to speed up
integrand
(and on down) withcython
, but first focus on getting the most speed from that with regularnumpy
code.With
if w<0
w
must be scalar. Can it be written so it works with an arrayw
? If so, thencould be rewritten as
Alternatively, since both
x
andw
are scalar, you might get a bit of speed improvement by usingmath.exp
etc instead ofnp.exp
. Same forlog
andabs
.I'd try to write
f(x,w)
so it takes arrays for bothx
andw
, returning a 2d result. If so, thentemp
andintegrand
would also work with arrays. Sincequad
feeds a scalarx
, that may not help here, but with other integrators it could make a big difference.If
f(x,w)
can be evaluated on a regular nx10 grid ofx=np.linspace(-1,1,n)
andws
, then an integral (of sorts) just requires a couple of summations over that space.The function
quad
executes an adaptive algorithm, which means the computations it performs depend on the specific thing being integrated. This cannot be vectorized in principle.In your case, a
for
loop of length 10 is a non-issue. If the program takes long, it's because integration takes long, not because you have afor
loop.When you absolutely need to vectorize integration (not in the example above), use a non-adaptive method, with the understanding that precision may suffer. These can be directly applied to a 2D NumPy array obtained by evaluating all of your functions on some regularly spaced 1D array (a
linspace
). You'll have to choose the linspace yourself since the methods aren't adaptive.2**n+1
for some integern
.