Fast cumulative sum and power operator

2019-08-19 05:13发布

问题:

I have a forecast algorithm that works the trend of time series up to a given horizon using the following code:

import numpy as np
horizon = 91
phi = 0.2
trend = -0.004
trend_up_to_horizon = np.cumsum(phi ** np.arange(horizon) + 1) * self.trend

In this example the first two trend_up_horizon values are:

array([-0.008 , -0.0128])

Is there a computationally faster way to achieve this? At the moment this takes a long time as I guess using the np.cumsum method and ** operator are expensive.

Thanks for any help

回答1:

you could use Cython to make it a tiny bit faster, but it's not much

running %timeit on your basic np.cumsum(phi ** np.arange(horizon) + 1) * trend says it takes 17.5µs on my laptop, which isn't much

a Cython version that does the equivalent is:

import numpy as np
cimport numpy as np
cimport cython

@cython.boundscheck(False)
def do_cumsum(size_t horizon, double phi, double trend):
    cdef np.ndarray[double, ndim=1] out = np.empty(horizon, dtype=np.float)
    cdef double csum = 0
    cdef int i

    for i in range(horizon):
        csum += phi ** i + 1
        out[i] = csum * trend

    return out

which reduces the time of do_cumsum(horizon, phi, trend) down to 6.9µs, while if I switch to single precision/32bit floats this goes down to 4.5µs

that said, microseconds aren't much and you're probably better off focusing your efforts elsewhere



回答2:

You can do this operation quite a bit faster. As you already assumed the (unnecessary) power operator is the main problem here.

In addition to that Numpy don't have a special implementation for power(float64,int64), where the exponent is a small positive integer. Instead Numpy always calculates power(float64,float64) which is a much more complicated task.

Numba und Numexpr have a special implementation for the simple case power(float64,int64), so let's try this in a first step.

First approach

import numpy as np
import numba as nb

horizon = 91
phi = 0.2
trend = -0.004

@nb.njit()
def pow_cumsum(horizon,phi,trend):
    out=np.empty(horizon)
    csum=0.
    for i in range(horizon):
        csum+=phi**i+1
        out[i]=csum*trend
    return out

As said before directly calculating a power is unnecessary, the algorithm can be rewritten to avoid this completely.

Second approach

@nb.njit()
def pow_cumsum_2(horizon,phi,trend):
    out=np.empty(horizon)

    out[0]=2.*trend
    TMP=2.
    val=phi
    for i in range(horizon-1):
        TMP=(val+1+TMP)
        out[i+1]=TMP*trend
        val*=phi
    return out

Timings

%timeit np.cumsum(phi ** np.arange(horizon) + 1) * trend
7.44 µs ± 89.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit pow_cumsum(horizon,phi,trend)
882 ns ± 4.91 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit pow_cumsum_2(horizon,phi,trend)
559 ns ± 3.8 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)