I have a data set x
. And I use cov(x)
to calculate the covariance of x
. I want to calculate the inverse square root of cov(x)
. But I get negative eigenvalue of cov(x)
.
Here is my code
S11=cov(x)
S=eigen(S11,symmetric=TRUE)
R=solve(S$vectors %*% diag(sqrt(S$values)) %*% t(S$vectors))
This is the eigenvalue of S
.
c(0.897249923338732, 0.814314811717616, 0.437109871173458, 0.334921280373883,
0.291910583884559, 0.257388456770167, 0.166787180227719, 0.148268784967556,
0.121401731579852, 0.0588333377333529, 0.0519459283467876, 0.0472867806813002,
0.0438199555429584, 0.0355421239839632, 0.0325106968911777, 0.0282860419784165,
0.0222240269478354, 0.0174657163114068, 0.012318267910606, 0.00980611646284724,
0.00969450391092417, 0.00804912897151307, 0.00788628666010145,
0.00681419055130702, 0.00664707528670254, 0.00591471779140177,
0.00581608875646686, 0.0057489828718098, 0.00564645095578336,
0.00521029715741059, 0.00503304953884416, 0.0048677189522647,
0.00395692706081966, 0.00391665618240403, 0.00389825739725093,
0.00383611535401152, 0.00374242176786387, 0.0035160324422885,
0.00299245160843966, 0.0029501156885799, 0.00289484923017341,
0.00287327878694529, 0.0028447265712214, 0.00274130080219099,
0.00273159993035393, 0.00265595612239575, 0.00261856622830277,
0.0020004125628823, 0.00199834766485368, 0.00199579695856402,
0.00198945452395265, 0.00197999810684363, 0.00195954105720554,
0.00195502875017394, 0.00194143254092788, 0.00192530399875842,
0.00191287435824908, 0.00187418676921454, 0.00184304720875652,
0.00181132707713659, 0.00167004122321738, 0.00132136106130093,
0.001001001001001, 0.001001001001001, 0.001001001001001, 0.00100089827907564,
0.000999613336959707, 0.000999285885989665, 0.000995390174780253,
0.000990809217795241, 0.000987333916025995, 0.000984260717691378,
0.000982735942052615, 0.000971684328336702, 0.000964125499180901,
0.000961900381008093, 0.000947883827257506, 0.000922293473088298,
0.000862086463606162, 0.000829687294735196, 0.000732694198613695,
1.95782839335209e-17, 4.13905030077713e-18, 2.02289095736911e-18,
8.72989281345777e-19, 3.79161425300691e-19, -7.97468731082902e-20)
While in theory an estimated covariance matrix must be positive (semi-)definite, i.e. no negative values, in practice floating-point error can violate this. To me it's no surprise that an 87-by-87 matrix could have a tiny negative (about -1*10^(-19)) eigenvalue.
Depending on what you want to do, you could use
?nearPD
from theMatrix
package to force your covariance matrix to be positive-definite:Also, it will probably be more efficient to compute the Cholesky decomposition (
?chol
) of your matrix first and then invert it (this is easy in principle -- I think you can usebacksolve()
).