NumPy vectorization with integration

2019-01-19 22:31发布

I have a vector enter image description here and wish to make another vector of the same length whose k-th component is

enter image description here

The question is: how can we vectorize this for speed? NumPy vectorize() is actually a for loop, so it doesn't count.

Veedrac pointed out that "There is no way to apply a pure Python function to every element of a NumPy array without calling it that many times". Since I'm using NumPy functions rather than "pure Python" ones, I suppose it's possible to vectorize, but I don't know how.

import numpy as np
from scipy.integrate import quad
ws = 2 * np.random.random(10) - 1
n  = len(ws)
integrals = np.empty(n)

def f(x, w):
    if w < 0: return np.abs(x * w)
    else:     return np.exp(x) * w

def temp(x): return np.array([f(x, w) for w in ws]).sum()

def integrand(x, w): return f(x, w) * np.log(temp(x))

## Python for loop
for k in range(n):
    integrals[k] = quad(integrand, -1, 1, args = ws[k])[0]

## NumPy vectorize
integrals = np.vectorize(quad)(integrand, -1, 1, args = ws)[0]

On a side note, is a Cython for loop always faster than NumPy vectorization?

2条回答
叛逆
2楼-- · 2019-01-19 22:39

@zaq's answer focusing on quad is spot on. So I'll look at some other aspects of the problem.

In recent https://stackoverflow.com/a/41205930/901925 I argue that vectorize is of most value when you need to apply the full broadcasting mechanism to a function that only takes scalar values. Your quad qualifies as taking scalar inputs. But you are only iterating on one array, ws. The x that is passed on to your functions is generated by quad itself. quad and integrand are still Python functions, even if they use numpy operations.

cython improves low level iteration, stuff that it can convert to C code. Your primary iteration is at a high level, calling an imported function, quad. Cython can't touch or rewrite that.

You might be able to speed up integrand (and on down) with cython, but first focus on getting the most speed from that with regular numpy code.

def f(x, w):
    if w < 0: return np.abs(x * w)
    else:     return np.exp(x) * w

With if w<0 w must be scalar. Can it be written so it works with an array w? If so, then

 np.array([f(x, w) for w in ws]).sum()

could be rewritten as

 fn(x, ws).sum()

Alternatively, since both x and w are scalar, you might get a bit of speed improvement by using math.exp etc instead of np.exp. Same for log and abs.

I'd try to write f(x,w) so it takes arrays for both x and w, returning a 2d result. If so, then temp and integrand would also work with arrays. Since quad feeds a scalar x, that may not help here, but with other integrators it could make a big difference.

If f(x,w) can be evaluated on a regular nx10 grid of x=np.linspace(-1,1,n) and ws, then an integral (of sorts) just requires a couple of summations over that space.

查看更多
放我归山
3楼-- · 2019-01-19 22:45

The function quad executes an adaptive algorithm, which means the computations it performs depend on the specific thing being integrated. This cannot be vectorized in principle.

In your case, a for loop of length 10 is a non-issue. If the program takes long, it's because integration takes long, not because you have a for loop.

When you absolutely need to vectorize integration (not in the example above), use a non-adaptive method, with the understanding that precision may suffer. These can be directly applied to a 2D NumPy array obtained by evaluating all of your functions on some regularly spaced 1D array (a linspace). You'll have to choose the linspace yourself since the methods aren't adaptive.

  • numpy.trapz is the simplest and least precise
  • scipy.integrate.simps is equally easy to use and more precise (Simpson's rule requires an odd number of samples, but the method works around having an even number, too).
  • scipy.integrate.romb is in principle of higher accuracy than Simpson (for smooth data) but it requires the number of samples to be 2**n+1 for some integer n.
查看更多
登录 后发表回答