Efficient element-wise function computation in Pyt

I have the following optimization problem. Given two np.arrays X,Y and a function K I would like to compute as fast as possible the matrix incidence gram_matrix where the (i,j)-th element is computed as K(X[i],Y[j]).

Here there an implementation using nested for-loops, which are acknowledged to be the slowest to solve these kind of problems.

def proxy_kernel(X,Y,K):
    gram_matrix = np.zeros((X.shape[0], Y.shape[0]))
    for i, x in enumerate(X):
        for j, y in enumerate(Y):
            gram_matrix[i, j] = K(x, y)
    return gram_matrix

Any help is truly appreciated.

标签： python numpy scikit-learn vectorization

3条回答

干净又极端

2楼-- · 2020-03-30 07:33

np.vectorize does make some improvement in speed - about 2x (here I'm using math.atan2 as an black box function that takes 2 scalar arguments).

In [486]: X=np.linspace(0,1,100)
In [487]: K_vect=np.vectorize(math.atan2)

In [488]: timeit proxy_kernel(X,X,math.atan2)
100 loops, best of 3: 7.84 ms per loop

In [489]: timeit K_vect(X[:,None],X)
100 loops, best of 3: 3.49 ms per loop

In [502]: timeit np.arctan2(X[:,None],X)  # numpy compiled loop
1000 loops, best of 3: 871 µs per loop

where

def proxy_kernel(X,Y,K):
    gram_matrix = np.zeros((X.shape[0], Y.shape[0]))
    for i, x in enumerate(X):
            for j, y in enumerate(Y):
                    gram_matrix[i, j] = K(x, y)
    return gram_matrix

As long as K is a black box, you are limited by the time it takes to invoke K the X.shape[0]*Y.shape[0] times. You can try to minimize the iteration time, but you are still limited by all those function calls.

https://stackoverflow.com/a/29733040/901925 speeds up the calculation with a Gausian kernel, by taking advantage of the axis parameter of the np.linalg.norm function.

0人赞添加讨论(0) 举报

forever°为你锁心

3楼-- · 2020-03-30 07:53

You can surely at least vectorize the inner loop:

def proxy_kernel_vect(X, Y, K):
    K_vect = np.vectorize(K)
    gram_matrix = np.zeros((X.shape[0], Y.shape[0]))
    for i, x in enumerate(X):
        gram_matrix[i] = K_vect(x, Y)
    return gram_matrix

This yields a nice improvement with relatively long arrays:

In [15]: a = np.array(range(1000))
    ...: b = np.array(range(1000))
    ...: 

In [16]: %timeit proxy_kernel(a, b, k)
1 loops, best of 3: 665 ms per loop

In [17]: %timeit proxy_kernel_vect(a, b, k)
1 loops, best of 3: 266 ms per loop

where k is just lambda x, y: x+y.

0人赞添加讨论(0) 举报

Luminary・发光体

4楼-- · 2020-03-30 07:55

You can also try vectorize decorator from numba module.

You particular problem is easily solved using vectorize and numpy broadcasting:

from numba import vectorize

@vectorize(['float64(float64, float64)'])
def K(x,y):
    return x + y

a = np.arange(1000)
b = np.arange(1000)

gram_array = K(a[None,:], b[:, None])

0人赞添加讨论(0) 举报

Efficient element-wise function computation in Pyt

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间