Computing eigenvectors of a sparse matrix in R

2020-06-23 06:47发布

I am trying to compute the m first eigenvectors of a large sparse matrix in R. Using eigen() is not realistic because large means N > 106 here.

So far I figured out that I should use ARPACK from the igraph package, which can deal with sparse matrices. However I can't get it to work on a very simple (3x3) matrix:

library(Matrix)
library(igraph)

TestDiag <- Diagonal(3, 3:1)
TestMatrix <- t(sparseMatrix(i = c(1, 1, 2, 2, 3), j = c(1, 2, 1, 2, 3), x = c(3/5, 4/5, -4/5, 3/5, 1)))
TestMultipliedMatrix <- t(TestMatrix) %*% TestDiag %*% TestMatrix

And then using the code given in example of the help of the arpack() function to extract the 2 first eigenvectors :

func <- function(x, extra=NULL) { as.vector(TestMultipliedMatrix %*% x) } 
arpack(func, options=list(n = 3, nev = 2, ncv = 3, sym=TRUE, which="LM", maxiter=200), complex = FALSE)

I get an error message:

Error in arpack(func, options = list(n = 3, nev = 2, ncv = 3, sym = TRUE,  :
  At arpack.c:1156 : ARPACK error, NCV must be greater than NEV and less than or equal to N

I don't understand this error, as ncv (3) is greater than nev (2) here, and equal to N (3).

Am I making some stupid mistake or is there a better way to compute eigenvectors of a sparse matrix in R?


Update

This error is apparently due to a bug in the arpack() function with uppercase / lowercase NCV and NEV.

Any suggestions to solve the bug (I tried to have a look at the package code but it is far too complex for me to understand) or compute the eigenvectors in an other way are welcome.

2条回答
Anthone
2楼-- · 2020-06-23 07:36

Well it might be kinda irritating, but it works when you change nev=2, ncv=3 to NEV=3, NCV=2. R is case sensitive, that might have caused the problem.

查看更多
趁早两清
3楼-- · 2020-06-23 07:37

There are actually no bugs here, but you made a mistake putting sym=TRUE into the ARPACK option list, but sym is an argument of the arpack() function. I.e. the correct call is:

ev <- arpack(func, options=list(n=3, nev=2, ncv=3, which="LM", maxiter=200), 
             sym=TRUE, complex = FALSE)
ev$values
# [1] 3 2
ev$vectors
#               [,1]          [,2]
# [1,] -6.000000e-01 -8.000000e-01
# [2,]  8.000000e-01 -6.000000e-01
# [3,]  2.220446e-16 -9.714451e-17

If you are interested in the details, what happens is that instead of the symmetric, the general non-symmetric eigensolver is called and for that NCV-NEV >= 2 is also required. From the ARPACK source (dnaupd.f):

...
c          NOTE: 2 <= NCV-NEV in order that complex conjugate pairs of Ritz 
c          values are kept together. (See remark 4 below)
...

Some more comments, only loosely related to your question. arpack() can be quite slow. The problem with it is that you need to call back to R from the C code in each iteration. See this thread: http://lists.gnu.org/archive/html/igraph-help/2012-02/msg00029.html The bottom line is that arpack() only helps if your matrix-vector product callback is fast and you don't need many iterations, the latter being related to the eigenstructure of the matrix.

I created an issue in the igraph issue tracker, to see if it would be possible to optionally use C callback, using Rcpp, instead of the R callback: https://github.com/igraph/igraph/issues/491 You can follow this issue if you are interested.

查看更多
登录 后发表回答