I need to evaluate B-splines in Python. To do so I wrote the code below, which works very well.
import numpy as np
import scipy.interpolate as si
def scipy_bspline(cv, n, degree):
    """ bspline basis function
        cv = array of control points
        n = number of points on the curve
        degree = curve degree
    """
    # Create knot vector and a range of u values
    c = cv.shape[0]
    kv = np.clip(np.arange(c + degree + 1) - degree, 0, c - degree)
    u = np.linspace(0, c - degree, n)

    # Calculate result
    return np.array(si.splev(u, (kv, cv.T, degree))).T
Giving it 6 control points and asking it to evaluate 100k points on the curve is a breeze:
# Control points
cv = np.array([[ 50.,  25.,  0.],
               [ 59.,  12.,  0.],
               [ 50.,  10.,  0.],
               [ 57.,   2.,  0.],
               [ 40.,   4.,  0.],
               [ 40.,  14.,  0.]])

n = 100000  # 100k points
degree = 3  # Curve degree
points_scipy = scipy_bspline(cv, n, degree)  # cProfile clocks this at 0.012 seconds
Here's a plot of `points_scipy`:
Now, to make this faster I could compute a basis for all 100k points on the curve once, store it in memory, and whenever I need to draw a curve, just multiply the new control point positions by the stored basis to get the new curve. To prove the point, I wrote a function that uses the Cox-de Boor recursion to compute my basis:
def basis(c, n, degree):
    """ bspline basis function
        c = number of control points
        n = number of points on the curve
        degree = curve degree
    """
    # Create knot vector and a range of samples on the curve
    kv = np.array([0] * degree + list(range(c - degree + 1)) + [c - degree] * degree, dtype='int')  # knot vector
    u = np.linspace(0, c - degree, n)  # samples range

    # Cox-de Boor recursive function to calculate basis
    def coxDeBoor(u, k, d):
        # Test for end conditions
        if d == 0:
            if kv[k] <= u < kv[k + 1]:
                return 1
            return 0

        Eq1 = 0
        Den1 = kv[k + d] - kv[k]
        if Den1 > 0:
            Eq1 = ((u - kv[k]) / Den1) * coxDeBoor(u, k, d - 1)

        Eq2 = 0
        if k + d + 1 < len(kv):  # guard against running off the end of the knot vector
            Den2 = kv[k + d + 1] - kv[k + 1]
            if Den2 > 0:
                Eq2 = ((kv[k + d + 1] - u) / Den2) * coxDeBoor(u, k + 1, d - 1)

        return Eq1 + Eq2

    # Compute basis for each point
    b = np.zeros((n, c))
    for i in range(n):
        for k in range(c):
            b[i, k] += coxDeBoor(u[i], k, degree)
    b[n - 1][-1] = 1  # the half-open intervals leave the last sample at 0; pin it to the last control point
    return b
Now let's use this to compute a new basis, multiply it by the control points, and confirm that we get the same results as splev:
b = basis(len(cv),n,degree) #5600011 function calls (600011 primitive calls) in 10.975 seconds
points_basis = np.dot(b,cv) #3 function calls in 0.002 seconds
print(np.allclose(points_basis, points_scipy)) # Returns True
My extremely slow function returned 100k basis values in 11 seconds, but since these values only need to be computed once, calculating the points on the curve ended up being 6 times faster this way than doing it via splev.
The fact that I was able to get exactly the same results from both my method and splev leads me to believe that internally splev probably calculates a basis much as I do, only much faster.
So my goal is to find out how to calculate my basis fast, store it in memory, and just use np.dot() to compute the new points on the curve. My question is: is it possible to use scipy.interpolate to get the basis values that (I presume) splev uses to compute its result? And if so, how?
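As an aside for readers on a newer SciPy (1.8 and later): `scipy.interpolate.BSpline.design_matrix` computes exactly this kind of basis matrix directly (as a sparse matrix), so the precompute-then-dot approach no longer needs hand-rolled code. A minimal sketch, assuming the same clamped knot vector that `scipy_bspline` builds above:

```python
import numpy as np
import scipy.interpolate as si

# Same control points, knot vector and sample range as scipy_bspline above
cv = np.array([[50., 25., 0.], [59., 12., 0.], [50., 10., 0.],
               [57.,  2., 0.], [40.,  4., 0.], [40., 14., 0.]])
c, degree, n = cv.shape[0], 3, 100
kv = np.clip(np.arange(c + degree + 1) - degree, 0, c - degree).astype(float)
u = np.linspace(0, c - degree, n)

# Sparse (n, c) matrix of basis values; requires SciPy >= 1.8
b = si.BSpline.design_matrix(u, kv, degree).toarray()

# Re-usable: dot the stored basis with any new control point positions
points = b @ cv
```

The returned matrix can be kept sparse (skip `.toarray()`) for large `n`, since at most `degree + 1` basis functions are nonzero per row.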
[ADDENDUM]
Following unutbu's and ev-br's very useful insight into how scipy calculates a spline's basis, I looked up the Fortran code and wrote an equivalent to the best of my abilities:
def fitpack_basis(c, n=100, d=3, rMinOffset=0, rMaxOffset=0):
    """ fitpack's spline basis function
        c = number of control points
        n = number of points on the curve
        d = curve degree
    """
    # Create knot vector
    kv = np.array([0] * d + list(range(c - d + 1)) + [c - d] * d, dtype='int')

    # Create sample range
    u = np.linspace(rMinOffset, rMaxOffset + c - d, n)  # samples range

    # Create buffers
    b = np.zeros((n, c))   # basis
    bb = np.zeros((n, c))  # basis buffer
    left = np.clip(np.floor(u), 0, c - d - 1).astype(int)  # left knot vector indices
    right = left + d + 1  # right knot vector indices

    # Go!
    nrange = np.arange(n)
    b[nrange, left] = 1.0
    for j in range(1, d + 1):
        crange = np.arange(j)[:, None]
        bb[nrange, left + crange] = b[nrange, left + crange]
        b[nrange, left] = 0.0
        for i in range(j):
            f = bb[nrange, left + i] / (kv[right + i] - kv[right + i - j])
            b[nrange, left + i] = b[nrange, left + i] + f * (kv[right + i] - u)
            b[nrange, left + i + 1] = f * (u - kv[right + i - j])
    return b
Testing against unutbu's version of my original basis function:
fb = fitpack_basis(c,n,d) #22 function calls in 0.044 seconds
b = basis(c,n,d) #81 function calls (45 primitive calls) in 0.013 seconds, ~3 times faster
print(np.allclose(b,fb)) # Returns True
My function is about 3 times slower, but still relatively fast. What I like about it is that it lets me use sample ranges that go beyond the knot boundaries, which is useful in my application. For example:
print(fitpack_basis(c,5,d,rMinOffset=-0.1,rMaxOffset=.2))
[[ 1.331 -0.3468 0.0159 -0.0002 0. 0. ]
[ 0.0208 0.4766 0.4391 0.0635 0. 0. ]
[ 0. 0.0228 0.4398 0.4959 0.0416 0. ]
[ 0. 0. 0.0407 0.3621 0.5444 0.0527]
[ 0. 0. -0.0013 0.0673 -0.794 1.728 ]]
So for that reason I will probably use fitpack_basis, since it's relatively fast. But I would love suggestions for improving its performance, hopefully getting closer to unutbu's version of the original basis function I wrote.
Here is a version of `coxDeBoor` which is (on my machine) 840x faster than the original. The majority of the speedup is achieved by making `coxDeBoor` compute a vector of results instead of a single value at a time. Notice that `u` is removed from the arguments passed to `coxDeBoor`. Instead, the new `coxDeBoor(k, d)` calculates what was before `np.array([coxDeBoor(u[i], k, d) for i in range(n)])`. Since NumPy array arithmetic has the same syntax as scalar arithmetic, very little code needed to change. The only syntactic change is in the end condition: `(u - kv[k] >= 0)` and `(u - kv[k + 1] < 0)` are boolean arrays, and `astype` converts their values to 0s and 1s. So whereas before a single 0 or 1 was returned, now a whole array of 0s and 1s is returned, one for each value in `u`.

Memoization can also improve performance, since the recurrence relation causes `coxDeBoor(k, d)` to be called for the same values of `k` and `d` more than once. The decorator syntax `@memo` is equivalent to `coxDeBoor = memo(coxDeBoor)`, and the `memo` decorator causes `coxDeBoor` to record in a `cache` a mapping from `(k, d)` pairs of arguments to returned values. If `coxDeBoor(k, d)` is called again, the value from the `cache` is returned instead of being recomputed.

`scipy_bspline` is still faster, but at least `bspline_basis` plus `np.dot` is in the ballpark, and may be useful if you want to re-use `b` with many sets of control points `cv`.
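The answer's original code blocks did not survive in this copy. Reconstructed from the description above, a sketch of the vectorized, memoized basis might look like the following (an inline `cache` dict stands in for the `memo` decorator, and `bspline_basis` is the name the answer uses for the wrapper):

```python
import numpy as np

def bspline_basis(c, n, degree):
    """Vectorized, memoized Cox-de Boor basis (reconstruction)."""
    # Same knot vector and sample range as the original basis()
    kv = np.array([0] * degree + list(range(c - degree + 1))
                  + [c - degree] * degree, dtype='int')
    u = np.linspace(0, c - degree, n)
    cache = {}  # maps (k, d) -> length-n array of basis values

    def coxDeBoor(k, d):
        # Returns the basis value at every sample in u at once
        if (k, d) in cache:
            return cache[(k, d)]
        if d == 0:
            # Boolean arrays converted to arrays of 0s and 1s
            result = ((u - kv[k] >= 0) & (u - kv[k + 1] < 0)).astype(int)
        else:
            Eq1 = 0
            Den1 = kv[k + d] - kv[k]
            if Den1 > 0:
                Eq1 = ((u - kv[k]) / Den1) * coxDeBoor(k, d - 1)
            Eq2 = 0
            if k + d + 1 < len(kv):
                Den2 = kv[k + d + 1] - kv[k + 1]
                if Den2 > 0:
                    Eq2 = ((kv[k + d + 1] - u) / Den2) * coxDeBoor(k + 1, d - 1)
            result = Eq1 + Eq2
        cache[(k, d)] = result
        return result

    # One column per control point, then pin the endpoint as before
    b = np.column_stack([coxDeBoor(k, degree) for k in range(c)])
    b[n - 1][-1] = 1
    return b
```

The memoization matters because `coxDeBoor(k + 1, d - 1)` for one `k` is the same call as `coxDeBoor(k, d - 1)` for the next, so without the cache each level of the recursion is computed roughly twice.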
`fitpack_basis` uses a double loop which iteratively modifies elements in `bb` and `b`. I don't see a way to use NumPy to vectorize these loops, since the values of `bb` and `b` at each stage of the iteration depend on values from previous iterations. In situations like this, Cython can sometimes be used to improve the performance of the loops. Here is a Cython-ized version of `fitpack_basis`, which runs as fast as `bspline_basis`. The main ideas used to boost speed with Cython are to declare the type of every variable, and to rewrite all uses of NumPy fancy indexing as loops using plain integer indexing. See this page for instructions on how to build the code and run it from Python.
Using this timeit code to benchmark its performance, it appears Cython can make the `fitpack_basis` code run as fast as (and perhaps a bit faster than) `bspline_basis`.