Covariance with a column

Posted 2019-06-08 09:32

Question:

If I have a NumPy array X with X.shape=(m,n) and a column vector y with y.shape=(m,1), how can I calculate the covariance of each column of X with y without using a for loop? I expect the result to be of shape (m,1) or (1,m).

Answer 1:

Assuming the output is meant to have shape (1,n), i.e. one scalar covariance for each column of A with B, giving n scalars for n columns, there are two approaches that apply the covariance formula directly. Here A stands for X and B for y from the question.

Approach #1: With Broadcasting

np.sum((A - A.mean(0))*(B - B.mean(0)),0)/B.size

Approach #2: With Matrix-multiplication

np.dot((B - B.mean(0)).T,(A - A.mean(0)))/B.size
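As a quick sanity check, both approaches can be compared against np.cov. This is only a minimal sketch: the function names cov_broadcast and cov_matmul, the random test data, and the sizes m=6, n=3 are made up for illustration and are not part of the original answer.

```python
import numpy as np

def cov_broadcast(A, B):
    # Approach #1: B of shape (m,1) broadcasts against A of shape (m,n)
    return np.sum((A - A.mean(0)) * (B - B.mean(0)), 0) / B.size

def cov_matmul(A, B):
    # Approach #2: (1,m) dot (m,n) gives a (1,n) result
    return np.dot((B - B.mean(0)).T, (A - A.mean(0))) / B.size

# Hypothetical test data, chosen only for this illustration
rng = np.random.default_rng(0)
m, n = 6, 3
A = rng.normal(size=(m, n))
B = rng.normal(size=(m, 1))

out1 = cov_broadcast(A, B)  # shape (n,)
out2 = cov_matmul(A, B)     # shape (1, n)

# Reference: full covariance matrix of the n+1 stacked columns,
# normalized by m (ddof=0) to match the division by B.size above.
# The last column holds the covariance of each column with B.
ref = np.cov(np.hstack([A, B]), rowvar=False, ddof=0)[:n, n]

print(np.allclose(out1, ref), np.allclose(out2.ravel(), ref))
```

Note that both approaches divide by B.size (that is, m), which gives the population covariance. np.cov normalizes by m-1 by default, so to match that behaviour you would divide by (B.size - 1) instead.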