I have the following optimization problem. Given two np.arrays X
,Y
and a function K
I would like to compute as fast as possible the matrix incidence gram_matrix where the (i,j)-th
element is computed as K(X[i],Y[j])
.
Here there an implementation using nested for-loops, which are acknowledged to be the slowest to solve these kind of problems.
def proxy_kernel(X,Y,K):
gram_matrix = np.zeros((X.shape[0], Y.shape[0]))
for i, x in enumerate(X):
for j, y in enumerate(Y):
gram_matrix[i, j] = K(x, y)
return gram_matrix
Any help is truly appreciated.
np.vectorize
does make some improvement in speed - about 2x (here I'm using math.atan2
as an black box function that takes 2 scalar arguments).
In [486]: X=np.linspace(0,1,100)
In [487]: K_vect=np.vectorize(math.atan2)
In [488]: timeit proxy_kernel(X,X,math.atan2)
100 loops, best of 3: 7.84 ms per loop
In [489]: timeit K_vect(X[:,None],X)
100 loops, best of 3: 3.49 ms per loop
In [502]: timeit np.arctan2(X[:,None],X) # numpy compiled loop
1000 loops, best of 3: 871 µs per loop
where
def proxy_kernel(X,Y,K):
gram_matrix = np.zeros((X.shape[0], Y.shape[0]))
for i, x in enumerate(X):
for j, y in enumerate(Y):
gram_matrix[i, j] = K(x, y)
return gram_matrix
As long as K
is a black box, you are limited by the time it takes to invoke K
the X.shape[0]*Y.shape[0]
times. You can try to minimize the iteration time, but you are still limited by all those function calls.
https://stackoverflow.com/a/29733040/901925 speeds up the calculation with a Gausian kernel, by taking advantage of the axis
parameter of the np.linalg.norm
function.
You can surely at least vectorize the inner loop:
def proxy_kernel_vect(X, Y, K):
K_vect = np.vectorize(K)
gram_matrix = np.zeros((X.shape[0], Y.shape[0]))
for i, x in enumerate(X):
gram_matrix[i] = K_vect(x, Y)
return gram_matrix
This yields a nice improvement with relatively long arrays:
In [15]: a = np.array(range(1000))
...: b = np.array(range(1000))
...:
In [16]: %timeit proxy_kernel(a, b, k)
1 loops, best of 3: 665 ms per loop
In [17]: %timeit proxy_kernel_vect(a, b, k)
1 loops, best of 3: 266 ms per loop
where k
is just lambda x, y: x+y
.
You can also try vectorize
decorator from numba module.
You particular problem is easily solved using vectorize
and numpy broadcasting:
from numba import vectorize
@vectorize(['float64(float64, float64)'])
def K(x,y):
return x + y
a = np.arange(1000)
b = np.arange(1000)
gram_array = K(a[None,:], b[:, None])