I have the following optimization problem. Given two np.arrays X, Y and a function K, I would like to compute as fast as possible the matrix gram_matrix, where the (i,j)-th element is computed as K(X[i], Y[j]).
Here is an implementation using nested for-loops, which are acknowledged to be the slowest way to solve this kind of problem.
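import numpy as np

def proxy_kernel(X, Y, K):
    # One row per element of X, one column per element of Y
    gram_matrix = np.zeros((X.shape[0], Y.shape[0]))
    for i, x in enumerate(X):
        for j, y in enumerate(Y):
            gram_matrix[i, j] = K(x, y)  # one Python-level call per pair
    return gram_matrix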
Any help is truly appreciated.
np.vectorize does make some improvement in speed, about 2x (here I'm using math.atan2 as a black-box function that takes 2 scalar arguments).
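A minimal sketch of what the np.vectorize approach might look like, assuming X and Y are 1-D arrays of scalars and K takes two scalars (math.atan2 stands in for the black box; the name gram_vectorized is made up for this illustration):

```python
import math
import numpy as np

def gram_vectorized(X, Y, K):
    """Apply a scalar function K to every pair (X[i], Y[j])."""
    vK = np.vectorize(K)
    # X[:, None] has shape (n, 1) and Y[None, :] has shape (1, m);
    # broadcasting makes vK visit all n*m pairs.
    return vK(X[:, None], Y[None, :])

X = np.linspace(-1.0, 1.0, 4)
Y = np.linspace(0.5, 2.0, 3)
G = gram_vectorized(X, Y, math.atan2)  # G[i, j] == math.atan2(X[i], Y[j])
```

np.vectorize still calls K once per pair in Python, so any gain comes from cheaper iteration, not from fewer calls.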
As long as K is a black box, you are limited by the time it takes to invoke K X.shape[0]*Y.shape[0] times. You can try to minimize the iteration time, but you are still limited by all those function calls.

https://stackoverflow.com/a/29733040/901925 speeds up the calculation with a Gaussian kernel by taking advantage of the axis parameter of the np.linalg.norm function.

You can surely at least vectorize the inner loop:
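One way this could look, as a sketch only: it assumes K accepts a whole NumPy array as its second argument and returns one value per element (true for k = lambda x, y: x+y, not for an arbitrary black box). The function name is made up for illustration.

```python
import numpy as np

def proxy_kernel_inner_vectorized(X, Y, K):
    # Assumption: K(x, Y) works elementwise over the array Y.
    gram_matrix = np.zeros((X.shape[0], Y.shape[0]))
    for i, x in enumerate(X):
        # Fill a whole row with one call instead of looping over Y
        gram_matrix[i, :] = K(x, Y)
    return gram_matrix

k = lambda x, y: x + y
X = np.arange(5, dtype=float)
Y = np.arange(3, dtype=float)
G = proxy_kernel_inner_vectorized(X, Y, k)
```

This cuts the number of Python-level calls from X.shape[0]*Y.shape[0] to X.shape[0]. If K only handles scalars, wrapping it in np.vectorize first would keep the same structure.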
This yields a nice improvement with relatively long arrays, where k is just lambda x, y: x+y.

You can also try the vectorize decorator from the numba module.

Your particular problem is easily solved using vectorize and numpy broadcasting:
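A hedged sketch of that idea, assuming the kernel is the simple scalar k(x, y) = x + y on float64 inputs. The fallback to np.vectorize is my addition so the snippet still runs where numba is not installed (without the compilation speed-up); make_ufunc is a made-up name.

```python
import numpy as np

try:
    from numba import vectorize  # optional dependency
    # Compile a ufunc for float64 inputs
    make_ufunc = vectorize(["float64(float64, float64)"])
except ImportError:
    # Fallback (assumption for this sketch): plain NumPy wrapper, no compilation
    make_ufunc = np.vectorize

@make_ufunc
def k(x, y):
    return x + y

X = np.arange(5, dtype=np.float64)
Y = np.arange(3, dtype=np.float64)

# Broadcasting a column against a row evaluates k on every (X[i], Y[j]) pair
gram_matrix = k(X[:, None], Y[None, :])
```

With numba, the loop over all pairs runs in compiled code, which only works when K itself can be compiled; a truly opaque Python function would not benefit.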