How do I calculate a correlation matrix in Python? I have a vector of n elements in which each element has 5 dimensions. For example, my vector looks like
[ [0.1, .32, .2, 0.4, 0.8], [.23, .18, .56, .61, .12], [.9, .3, .6, .5, .3], [.34, .75, .91, .19, .21] ]
In this case the vector has 4 elements, and each element has 5 dimensions. What is the easiest way to construct the matrix?
Thanks
Using numpy, you could use np.corrcoef:
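A minimal sketch of such a call, using the data from the question (the names `data` and `corr` are just for illustration). Note that by default `np.corrcoef` treats each row as one variable, so this gives the 4x4 matrix of correlations between the four elements:

```python
import numpy as np

# Data from the question: 4 elements, 5 values each.
data = [
    [0.1, .32, .2, 0.4, 0.8],
    [.23, .18, .56, .61, .12],
    [.9, .3, .6, .5, .3],
    [.34, .75, .91, .19, .21],
]

# With the default rowvar=True, each row is one variable,
# so the result holds the Pearson correlation between every pair of rows.
corr = np.corrcoef(data)

print(corr.shape)  # (4, 4)
print(corr)
```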
As I almost missed that comment by @Anton Tarasenko, I'll provide a new answer. So given your array:
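For reference, here is the array from the question written out as a NumPy array (the name `a` is an assumption made for this sketch):

```python
import numpy as np

# The 4 elements from the question, one per row; the 5 dimensions are the columns.
a = np.array([
    [0.1, .32, .2, 0.4, 0.8],
    [.23, .18, .56, .61, .12],
    [.9, .3, .6, .5, .3],
    [.34, .75, .91, .19, .21],
])

print(a.shape)  # (4, 5)
```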
If you want the correlation matrix of your dimensions (columns), which I assume, you can use numpy (note the transpose!):
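A sketch of the NumPy version, assuming the data is held in an array called `a` (a name picked for this example). The transpose matters because `np.corrcoef` treats rows as variables by default; passing `rowvar=False` should give the same result without transposing:

```python
import numpy as np

a = np.array([
    [0.1, .32, .2, 0.4, 0.8],
    [.23, .18, .56, .61, .12],
    [.9, .3, .6, .5, .3],
    [.34, .75, .91, .19, .21],
])

# Transpose so that each of the 5 columns becomes a row (= one variable).
corr_np = np.corrcoef(a.T)

# Same computation, telling NumPy that columns are the variables.
corr_np_alt = np.corrcoef(a, rowvar=False)

print(corr_np.shape)  # (5, 5)
print(corr_np)
```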
Or if you have it in Pandas anyhow:
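A sketch of the Pandas version, assuming the data is loaded into a DataFrame called `df` (name chosen for this example). `DataFrame.corr()` correlates columns and uses Pearson correlation by default, so no transpose is needed:

```python
import pandas as pd

a = [
    [0.1, .32, .2, 0.4, 0.8],
    [.23, .18, .56, .61, .12],
    [.9, .3, .6, .5, .3],
    [.34, .75, .91, .19, .21],
]

# Each inner list becomes a row; the 5 dimensions become columns 0..4.
df = pd.DataFrame(a)

# Pairwise Pearson correlation between the columns.
corr_pd = df.corr()

print(corr_pd.shape)  # (5, 5)
print(corr_pd)
```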
Both give the same 5x5 correlation matrix, with one row and one column per dimension.
You can also use np.array if you don't want to write your matrix all over again.
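One way to read this, as a sketch: if the nested list already exists in your code (called `vectors` here, a name made up for this example), wrap it in `np.array` once and reuse it instead of retyping the values:

```python
import numpy as np

# Assume this nested list already exists somewhere in your code.
vectors = [
    [0.1, .32, .2, 0.4, 0.8],
    [.23, .18, .56, .61, .12],
    [.9, .3, .6, .5, .3],
    [.34, .75, .91, .19, .21],
]

# Convert once, then reuse the same array for either orientation.
arr = np.array(vectors)

row_corr = np.corrcoef(arr)                # rows as variables
col_corr = np.corrcoef(arr, rowvar=False)  # columns as variables

print(row_corr.shape)  # (4, 4)
print(col_corr.shape)  # (5, 5)
```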
Here is a pretty good example of calculating a correlation matrix from multiple time series using Python. The included source code calculates the correlation matrix for a set of Forex currency pairs using Pandas, NumPy, and matplotlib to produce a graph of the correlations.
The sample data is a set of historical data files, and the output is a single correlation matrix and a plot. The code is very well documented.
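The linked code is not reproduced here, but the general idea can be sketched as follows. This is not that code: the pair names are only labels and the prices are synthetic random walks generated for illustration, not real market data. It computes the correlation matrix of the period-to-period returns of several series:

```python
import numpy as np
import pandas as pd

# Synthetic prices (NOT real Forex data): three random walks around 1.0.
rng = np.random.default_rng(0)
steps = rng.normal(loc=0.0, scale=0.001, size=(250, 3))
prices = pd.DataFrame(
    1.0 + steps.cumsum(axis=0),
    columns=["EURUSD", "GBPUSD", "USDJPY"],  # hypothetical labels
)

# Correlate returns rather than raw prices, since trending price
# levels can look correlated even when their movements are unrelated.
returns = prices.pct_change().dropna()
corr = returns.corr()

print(corr.shape)  # (3, 3)
print(corr)
```

The resulting DataFrame can then be passed to a plotting library to draw the matrix as a heatmap.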