Python - How to find a correlation between two vec

Posted 2019-05-28 08:56

Question:

Given two vectors X and Y, I have to find their correlation, i.e. their linear dependence/independence. Both vectors have equal dimension. The result should be a floating point number from [-1.0 .. 1.0].

Example:

X=[-1, 2,    0]
Y=[ 4, 2, -0.3]

Find y = cor(X,Y) such that y belongs to [-1.0 .. 1.0].

It should be a simple construction involving a list-comprehension. No external library is allowed.

UPDATE: ok, if the dot product is enough, then here is my solution:

nX = 1/(sum([x*x for x in X]) ** 0.5)
nY = 1/(sum([y*y for y in Y]) ** 0.5)
cor = sum([(x*nX)*(y*nY)  for x,y in zip(X,Y) ])

Is that right?

Answer 1:

Sounds like a dot product to me.

Solve the equation for the cosine of the angle between the two vectors, which is always in the range [-1, 1], and you'll have what you want.

It equals the dot product divided by the product of the two vectors' magnitudes.
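A minimal sketch of that cosine formula in plain Python, in case it helps. The function name cosine_similarity and the zero-vector check are my own additions, not something from the question:

```python
def cosine_similarity(xs, ys):
    """Cosine of the angle between two equal-length vectors (hypothetical helper)."""
    if len(xs) != len(ys):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(xs, ys))
    norm_x = sum(x * x for x in xs) ** 0.5
    norm_y = sum(y * y for y in ys) ** 0.5
    # The angle is undefined if either vector has zero length.
    if norm_x == 0 or norm_y == 0:
        raise ValueError("cosine is undefined for a zero vector")
    return dot / (norm_x * norm_y)


X = [-1, 2, 0]
Y = [4, 2, -0.3]
print(cosine_similarity(X, Y))
```

For the vectors in the question the dot product is -4 + 4 + 0 = 0, so the cosine comes out as 0: the two vectors are orthogonal.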



Answer 2:

Since the range is supposed to be [-1, 1], I think the Pearson correlation would suit your purposes.

The dot product would also work, but you have to normalize the vectors before computing it. Note that it covers the full [-1, 1] range only if the vectors also contain negative values; if all components are non-negative, the result stays in [0, 1].
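For comparison, here is a rough sketch of the Pearson correlation using only list comprehensions and built-ins. It is the cosine of the two vectors after subtracting each vector's mean, which is why it can be negative even for all-positive data. The name pearson and the error handling are assumptions of mine:

```python
def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (hypothetical helper)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Center both vectors on their means.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    # A constant vector has no variance, so the correlation is undefined.
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (var_x * var_y) ** 0.5


X = [-1, 2, 0]
Y = [4, 2, -0.3]
print(pearson(X, Y))
```

For the example vectors this gives roughly -0.29, whereas the plain (uncentered) cosine of the same vectors is 0, so the two measures are not interchangeable.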



Answer 3:

Don't assume because a formula is algebraically correct that its direct implementation in code will work. There can be numerical problems with some definitions of correlation.

See "How to calculate correlation accurately".
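One common source of such problems is the one-pass textbook formula, which subtracts two large, nearly equal sums and can lose most of its significant digits when the data has a large mean. The following is only a sketch of one more careful variant, not necessarily what the linked post recommends: it centers the data first (two passes), sums with math.fsum from the standard library, and clamps the result to guard against rounding just outside [-1, 1]. The function name pearson_two_pass is hypothetical:

```python
import math


def pearson_two_pass(xs, ys):
    """Two-pass Pearson correlation with compensated summation (a sketch)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    # Pass 1: means, using fsum to limit accumulated rounding error.
    mean_x = math.fsum(xs) / n
    mean_y = math.fsum(ys) / n
    # Pass 2: sums over centered values, which stay small even if
    # the raw data sits far from zero.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = math.fsum(a * b for a, b in zip(dx, dy))
    var_x = math.fsum(a * a for a in dx)
    var_y = math.fsum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    r = cov / math.sqrt(var_x * var_y)
    # Rounding can push r marginally outside the valid range.
    return max(-1.0, min(1.0, r))


# Perfectly linear data with a large offset.
xs = [1e9 + i for i in range(5)]
ys = [2 * x + 3 for x in xs]
print(pearson_two_pass(xs, ys))
```

The math module is part of the standard library, so this still avoids third-party packages; if even that is off limits, sum can replace math.fsum at some cost in accuracy.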