Original question
I am correlating a row P of size n with every column of a matrix O of size n×m. I crafted the following code:
import numpy as np
def ColumnWiseCorrcoef(O, P):
    n = P.size
    # Center O column-wise and center P, then combine the dot products
    # into Pearson correlations in one vectorized expression.
    DO = O - (np.sum(O, 0) / np.double(n))
    DP = P - (np.sum(P) / np.double(n))
    return np.dot(DP, DO) / np.sqrt(np.sum(DO ** 2, 0) * np.sum(DP ** 2))
It is more efficient than the naive approach:
def ColumnWiseCorrcoefNaive(O, P):
    # Row 0 of the full (m+1)x(m+1) correlation matrix holds the
    # correlations of P with each column of O.
    return np.corrcoef(P, O.T)[0, 1:O[0].size + 1]
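As a quick sanity check (not part of the original post), one can verify that the two implementations agree up to floating-point tolerance:

import numpy as np
O = np.reshape(np.random.rand(100000), (1000, 100))
P = np.random.rand(1000)
# Both functions should produce the same 100 correlation coefficients.
assert np.allclose(ColumnWiseCorrcoef(O, P), ColumnWiseCorrcoefNaive(O, P))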
Here are the timings I get with numpy-1.7.1-MKL on an Intel Core CPU:
O = np.reshape(np.random.rand(100000), (1000,100))
P = np.random.rand(1000)
%timeit -n 1000 A = ColumnWiseCorrcoef(O, P)
1000 loops, best of 3: 787 us per loop
%timeit -n 1000 B = ColumnWiseCorrcoefNaive(O, P)
1000 loops, best of 3: 2.96 ms per loop
Now the question: can you suggest an even faster version of the code for this problem? Squeezing out an additional 20% would be great.
UPDATE May 2017
After quite some time I got back to this problem, reran the tests, and extended the task.
Using einsum, I have extended the code to the case where P is not a row but a matrix, so the task becomes correlating all columns of O with all columns of P; a sketch of this approach follows below.
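The full code lives in the repository linked below. As a minimal sketch of the einsum approach for the matrix case (the function name, argument shapes, and contraction strings here are my own illustration, not necessarily what the repository uses):

import numpy as np

def ColumnWiseCorrcoef2D(O, P):
    # O is n x m, P is n x t; returns the m x t matrix of Pearson
    # correlations between every column of O and every column of P.
    DO = O - O.mean(axis=0)
    DP = P - P.mean(axis=0)
    # Cross products of the centered columns.
    cov = np.einsum('nm,nt->mt', DO, DP)
    # Per-column sums of squares, combined via an outer product.
    varO = np.einsum('nm,nm->m', DO, DO)
    varP = np.einsum('nt,nt->t', DP, DP)
    return cov / np.sqrt(varO[:, None] * varP[None, :])

With P passed as an n×1 matrix this reduces to the original row case.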
Curious about how the same problem can be solved in other languages commonly used for scientific computing, I implemented it (with help from other people) in MATLAB, Julia, and R. MATLAB and Julia are the fastest, and each has a dedicated routine for computing column-wise correlation. R also has a dedicated routine but is the slowest.
In the current version of numpy (1.12.1 from Anaconda), einsum still wins over the dedicated functions I used.
All the scripts and timings are available at https://github.com/ikizhvatov/efficient-columnwise-correlation.