How to get the spline basis used by scipy.interpol

2020-07-21 06:45发布

问题:

I need to evaluate b-splines in python. To do so i wrote the code below which works very well.

import numpy as np
import scipy.interpolate as si

def scipy_bspline(cv,n,degree):
    """ bspline basis function
        c        = list of control points.
        n        = number of points on the curve.
        degree   = curve degree
    """
    # Create a range of u values
    c = cv.shape[0]
    kv = np.clip(np.arange(c+degree+1)-degree,0,c-degree)
    u  = np.linspace(0,c-degree,n)
    
    # Calculate result
    return np.array(si.splev(u, (kv,cv.T,degree))).T

Giving it 6 control points and asking it to evaluate 100k points on curve is a breeze:

# Control points
cv = np.array([[ 50.,  25., 0.],
       [ 59.,  12., 0.],
       [ 50.,  10., 0.],
       [ 57.,   2., 0.],
       [ 40.,   4., 0.],
       [ 40.,   14., 0.]])

n = 100000  # 100k Points
degree = 3 # Curve degree
points_scipy = scipy_bspline(cv,n,degree) #cProfile clocks this at 0.012 seconds

Here's a plot of "points_scipy":

Now, to make this faster i could compute a basis for all 100k points on the curve, store that in memory, and when i need to draw a curve all i would need to do is multiply the new control point positions with the stored basis to get the new curve. To prove my point i wrote a function that uses DeBoor's algorithm to compute my basis:

def basis(c, n, degree):
    """ bspline basis function
        c        = number of control points.
        n        = number of points on the curve.
        degree   = curve degree
    """
    # Create knot vector and a range of samples on the curve
    kv = np.array([0]*degree + range(c-degree+1) + [c-degree]*degree,dtype='int') # knot vector
    u  = np.linspace(0,c-degree,n) # samples range
    
    # Cox - DeBoor recursive function to calculate basis
    def coxDeBoor(u, k, d):
        # Test for end conditions
        if (d == 0):
            if (kv[k] <= u and u < kv[k+1]):
                return 1
            return 0
        
        Den1 = kv[k+d] - kv[k]
        Den2 = 0
        Eq1  = 0
        Eq2  = 0
        
        if Den1 > 0:
            Eq1 = ((u-kv[k]) / Den1) * coxDeBoor(u,k,(d-1))
            
        try:
            Den2 = kv[k+d+1] - kv[k+1]
            if Den2 > 0:
                Eq2 = ((kv[k+d+1]-u) / Den2) * coxDeBoor(u,(k+1),(d-1))
        except:
            pass
        
        return Eq1 + Eq2
    

    # Compute basis for each point
    b = np.zeros((n,c))
    for i in xrange(n):
        for k in xrange(c):
            b[i][k%c] += coxDeBoor(u[i],k,degree)

    b[n-1][-1] = 1
                
    return b

Now lets use this to compute a new basis, multiply that by the control points and confirm that we're getting the same results as splev:

b = basis(len(cv),n,degree) #5600011 function calls (600011 primitive calls) in 10.975 seconds
points_basis = np.dot(b,cv) #3 function calls in 0.002 seconds
print np.allclose(points_basis,points_scipy) # Returns True

My extremely slow function returned 100k basis values in 11 seconds, but since these values only need to be computed once, calculating the points on curve ended up being 6 times faster this way than doing it via splev.

The fact that I was able to get the exact same results from both my method and splev leads me to believe that internally splev probably calculates a basis like i do, except much faster.

So my goal is to find out how to calculate my basis fast, store it in memory and just use np.dot() to compute the new points on curve, and my question is: Is it possible to use spicy.interpolate to get the basis values that (i presume) splev uses to compute it's result? And if so, how?

[ADDENDUM]

Following unutbu's and ev-br's very useful insight on how scipy calculates a spline's basis, I looked up the fortran code and wrote a equivalent to the best of my abilities:

def fitpack_basis(c, n=100, d=3, rMinOffset=0, rMaxOffset=0):
    """ fitpack's spline basis function
        c = number of control points.
        n = number of points on the curve.
        d = curve degree
    """
    # Create knot vector
    kv = np.array([0]*d + range(c-d+1) + [c-d]*d, dtype='int')

    # Create sample range
    u = np.linspace(rMinOffset, rMaxOffset + c - d, n)  # samples range
    
    # Create buffers
    b  = np.zeros((n,c)) # basis
    bb = np.zeros((n,c)) # basis buffer
    left  = np.clip(np.floor(u),0,c-d-1).astype(int)   # left  knot vector indices
    right = left+d+1 # right knot vector indices

    # Go!
    nrange = np.arange(n)
    b[nrange,left] = 1.0
    for j in xrange(1, d+1):
        crange = np.arange(j)[:,None]
        bb[nrange,left+crange] = b[nrange,left+crange]        
        b[nrange,left] = 0.0
        for i in xrange(j):
            f = bb[nrange,left+i] / (kv[right+i] - kv[right+i-j])
            b[nrange,left+i] = b[nrange,left+i] + f * (kv[right+i] - u)
            b[nrange,left+i+1] = f * (u - kv[right+i-j])
            
    return b

Testing against unutbu's version of my original basis function:

fb = fitpack_basis(c,n,d) #22 function calls in 0.044 seconds
b = basis(c,n,d) #81 function calls (45 primitive calls) in 0.013 seconds  ~5 times faster
print np.allclose(b,fb) # Returns True

My function is 5 times slower, but still relatively fast. What I like about it is that it lets me use sample ranges that go beyond the boundaries, which is something of use in my application. For example:

print fitpack_basis(c,5,d,rMinOffset=-0.1,rMaxOffset=.2)
[[ 1.331  -0.3468  0.0159 -0.0002  0.      0.    ]
[ 0.0208  0.4766  0.4391  0.0635  0.      0.    ]
[ 0.      0.0228  0.4398  0.4959  0.0416  0.    ]
[ 0.      0.      0.0407  0.3621  0.5444  0.0527]
[ 0.      0.     -0.0013  0.0673 -0.794   1.728 ]]

So for that reason I will probably use fitpack_basis since it's relatively fast. But i would love suggestions for improving its performance and hopefully get closer to unutbu's version of the original basis function i wrote.

回答1:

fitpack_basis uses a double loop which iteratively modifies elements in bb and b. I don't see a way to use NumPy to vectorize these loops since the value of bb and b at each stage of the iteration depends on the values from previous iterations. In situations like this sometimes Cython can be used to improve the performance of the loops.

Here is a Cython-ized version of fitpack_basis, which runs as fast as bspline_basis. The main ideas used to boost speed using Cython is to declare the type of every variable, and rewrite all uses of NumPy fancy indexing as loops using plain integer indexing.

See this page for instructions on how to build the code and run it from python.

import numpy as np
cimport numpy as np
cimport cython

ctypedef np.float64_t DTYPE_f
ctypedef np.int64_t DTYPE_i

@cython.boundscheck(False)
@cython.wraparound(False)
@cython.nonecheck(False)
def cython_fitpack_basis(int c, int n=100, int d=3, 
                         double rMinOffset=0, double rMaxOffset=0):
    """ fitpack's spline basis function
        c = number of control points.
        n = number of points on the curve.
        d = curve degree
    """
    cdef Py_ssize_t i, j, k, l
    cdef double f

    # Create knot vector
    cdef np.ndarray[DTYPE_i, ndim=1] kv = np.array(
        [0]*d + range(c-d+1) + [c-d]*d, dtype=np.int64)

    # Create sample range
    cdef np.ndarray[DTYPE_f, ndim=1] u = np.linspace(
        rMinOffset, rMaxOffset + c - d, n)

    # basis
    cdef np.ndarray[DTYPE_f, ndim=2] b  = np.zeros((n,c)) 
    # basis buffer
    cdef np.ndarray[DTYPE_f, ndim=2] bb = np.zeros((n,c)) 
    # left  knot vector indices
    cdef np.ndarray[DTYPE_i, ndim=1] left = np.clip(np.floor(u), 0, c-d-1).astype(np.int64)   
    # right knot vector indices
    cdef np.ndarray[DTYPE_i, ndim=1] right = left+d+1 

    for k in range(n):
        b[k, left[k]] = 1.0

    for j in range(1, d+1):
        for l in range(j):
            for k in range(n):
                bb[k, left[k] + l] = b[k, left[k] + l] 
                b[k, left[k]] = 0.0
        for i in range(j):
            for k in range(n):
                f = bb[k, left[k]+i] / (kv[right[k]+i] - kv[right[k]+i-j])
                b[k, left[k]+i] = b[k, left[k]+i] + f * (kv[right[k]+i] - u[k])
                b[k, left[k]+i+1] = f * (u[k] - kv[right[k]+i-j])
    return b

Using this timeit code to benchmark it's performance,

import timeit
import numpy as np
import cython_bspline as CB
import numpy_bspline as NB

c = 6
n = 10**5
d = 3

fb = NB.fitpack_basis(c, n, d)
bb = NB.bspline_basis(c, n, d) 
cfb = CB.cython_fitpack_basis(c,n,d) 

assert np.allclose(bb, fb) 
assert np.allclose(cfb, fb) 
# print(NB.fitpack_basis(c,5,d,rMinOffset=-0.1,rMaxOffset=.2))

timing = dict()
timing['NB.fitpack_basis'] = timeit.timeit(
    stmt='NB.fitpack_basis(c, n, d)', 
    setup='from __main__ import NB, c, n, d', 
    number=10)
timing['NB.bspline_basis'] = timeit.timeit(
    stmt='NB.bspline_basis(c, n, d)', 
    setup='from __main__ import NB, c, n, d', 
    number=10)
timing['CB.cython_fitpack_basis'] = timeit.timeit(
    stmt='CB.cython_fitpack_basis(c, n, d)', 
    setup='from __main__ import CB, c, n, d', 
    number=10)

for func_name, t in timing.items():
    print "{:>25}: {:.4f}".format(func_name, t)

it appears Cython can make the fitpack_basis code run as fast as (and perhaps a bit faster) than bspline_basis:

         NB.bspline_basis: 0.3322
  CB.cython_fitpack_basis: 0.2939
         NB.fitpack_basis: 0.9182


回答2:

Here is a version of coxDeBoor which is (on my machine) 840x faster than the original.

In [102]: %timeit basis(len(cv), 10**5, degree)
1 loops, best of 3: 24.5 s per loop

In [104]: %timeit bspline_basis(len(cv), 10**5, degree)
10 loops, best of 3: 29.1 ms per loop

import numpy as np
import scipy.interpolate as si

def scipy_bspline(cv, n, degree):
    """ bspline basis function
        c        = list of control points.
        n        = number of points on the curve.
        degree   = curve degree
    """
    # Create a range of u values
    c = len(cv)
    kv = np.array(
        [0] * degree + range(c - degree + 1) + [c - degree] * degree, dtype='int')
    u = np.linspace(0, c - degree, n)

    # Calculate result
    arange = np.arange(n)
    points = np.zeros((n, cv.shape[1]))
    for i in xrange(cv.shape[1]):
        points[arange, i] = si.splev(u, (kv, cv[:, i], degree))

    return points


def memo(f):
    # Peter Norvig's
    """Memoize the return value for each call to f(args).
    Then when called again with same args, we can just look it up."""
    cache = {}

    def _f(*args):
        try:
            return cache[args]
        except KeyError:
            cache[args] = result = f(*args)
            return result
        except TypeError:
            # some element of args can't be a dict key
            return f(*args)
    _f.cache = cache
    return _f


def bspline_basis(c, n, degree):
    """ bspline basis function
        c        = number of control points.
        n        = number of points on the curve.
        degree   = curve degree
    """
    # Create knot vector and a range of samples on the curve
    kv = np.array([0] * degree + range(c - degree + 1) +
                  [c - degree] * degree, dtype='int')  # knot vector
    u = np.linspace(0, c - degree, n)  # samples range

    # Cox - DeBoor recursive function to calculate basis
    @memo
    def coxDeBoor(k, d):
        # Test for end conditions
        if (d == 0):
            return ((u - kv[k] >= 0) & (u - kv[k + 1] < 0)).astype(int)

        denom1 = kv[k + d] - kv[k]
        term1 = 0
        if denom1 > 0:
            term1 = ((u - kv[k]) / denom1) * coxDeBoor(k, d - 1)

        denom2 = kv[k + d + 1] - kv[k + 1]
        term2 = 0
        if denom2 > 0:
            term2 = ((-(u - kv[k + d + 1]) / denom2) * coxDeBoor(k + 1, d - 1))

        return term1 + term2

    # Compute basis for each point
    b = np.column_stack([coxDeBoor(k, degree) for k in xrange(c)])
    b[n - 1][-1] = 1

    return b

# Control points
cv = np.array([[50.,  25., 0.],
               [59.,  12., 0.],
               [50.,  10., 0.],
               [57.,   2., 0.],
               [40.,   4., 0.],
               [40.,   14., 0.]])

n = 10 ** 6
degree = 3  # Curve degree
points_scipy = scipy_bspline(cv, n, degree)

b = bspline_basis(len(cv), n, degree)
points_basis = np.dot(b, cv)  
print np.allclose(points_basis, points_scipy)

The majority of the speedup is achieved by making coxDeBoor compute a vector of results instead of a single value at a time. Notice that u is removed from the arguments passed to coxDeBoor. Instead, the new coxDeBoor(k, d) calculates what was before np.array([coxDeBoor(u[i], k, d) for i in xrange(n)]).

Since NumPy array arithmetic has the same syntax as scalar arithmetic, very little code needed to change. The only syntactic change was in the end condition:

if (d == 0):
    return ((u - kv[k] >= 0) & (u - kv[k + 1] < 0)).astype(int)

(u - kv[k] >= 0) and (u - kv[k + 1] < 0) are boolean arrays. astype changes the array values to 0 and 1. So whereas before a single 0 or 1 was returned, now a whole array of 0s and 1s are returned -- one for each value in u.

Memoization can also improve performance, since the recurrence relations causes coxDeBoor(k, d) to be called for the same values of k and d more than once. The decorator syntax

@memo
def coxDeBoor(k, d):
    ...

is equivalent to

def coxDeBoor(k, d):
    ...
coxDeBoor = memo(coxDeBoor)

and the memo decorator causes coxDeBoor to record in a cache a mapping from (k, d) pairs of arguments to returned values. If coxDeBoor(k, d) is called again, then value from the cache will be returned instead of re-computing the value.


scipy_bspline is still faster, but at least bspline_basis plus np.dot is in the ballpark, and may be useful if you want to re-use b with many control points, cv.

In [109]: %timeit scipy_bspline(cv, 10**5, degree)
100 loops, best of 3: 17.2 ms per loop

In [104]: %timeit b = bspline_basis(len(cv), 10**5, degree)
10 loops, best of 3: 29.1 ms per loop

In [111]: %timeit points_basis = np.dot(b, cv)
100 loops, best of 3: 2.46 ms per loop