Efficient low-rank approximation in MATLAB

Posted 2019-02-05 00:51

I'd like to compute a low-rank approximation to a matrix that is optimal under the Frobenius norm. The trivial way to do this is to compute the SVD of the matrix, set the smallest singular values to zero, and reassemble the low-rank matrix by multiplying the factors. Is there a simpler and more efficient way to do this in MATLAB?
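For concreteness, here is a minimal sketch of that trivial approach (the matrix size and target rank are arbitrary illustrative values):

% truncated SVD: compute the factorization, then keep only the top r terms
A = rand(1000, 900);                      % example matrix (arbitrary size)
r = 50;                                   % target rank (arbitrary)
[U, S, V] = svd(A, 'econ');               % economy-size SVD
Ar = U(:,1:r) * S(1:r,1:r) * V(:,1:r)';   % best rank-r approximation in Frobenius norm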

2 Answers
Ridiculous
#2 · 2019-02-05 01:28

If your matrix is sparse, use svds.

Assuming it is not sparse but is large, you can use random projections for a fast low-rank approximation.

From a tutorial:

An optimal low-rank approximation can be easily computed using the SVD of A in O(mn^2). Using random projections we show how to achieve an "almost optimal" low-rank approximation in O(mn log(n)).

MATLAB code from a blog:

clear
% Preparing the problem:
% trying to find a low-rank approximation to A, an m x n matrix
% where m >= n.
m = 1000;
n = 900;
% First let's produce an example A.
A = rand(m,n);
%
% Beginning of the algorithm designed to find a low-rank approximation of A;
% let us define that rank to be equal to k.
k = 50;
% R is an m x l matrix with entries drawn from N(0,1),
% where l is such that l > c*log(n)/epsilon^2.
%
l = 100;
% timing the randomized algorithm
trand = cputime;
R = randn(m,l);
B = 1/sqrt(l) * R' * A;          % l x n sketch of A
[a,s,b] = svd(B);                % columns of b approximate A's right singular vectors
Ak = A * b(:,1:k) * b(:,1:k)';   % project A onto the approximate top-k subspace
trandend = cputime - trand;
% now timing the standard SVD algorithm
tsvd = cputime;
% doing it the normal SVD way
[U,S,V] = svd(A,0);              % economy-size SVD
Aksvd = U(:,1:k) * S(1:k,1:k) * V(:,1:k)';
tsvdend = cputime - tsvd;
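
To see how close the randomized result comes to the optimal truncation, you can compare the Frobenius-norm errors of the two approximations (a small addition to the script above, reusing its variables):

% compare approximation quality and timing (variables from the script above)
err_rand = norm(A - Ak, 'fro');
err_svd  = norm(A - Aksvd, 'fro');
fprintf('randomized: %.2f s, error %.4f\n', trandend, err_rand);
fprintf('full SVD:   %.2f s, error %.4f\n', tsvdend, err_svd);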

Also, remember the 'econ' option of svd.
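For a tall m x n matrix with m >= n, it avoids forming the full m x m factor U, which is usually the dominant cost (a one-line illustration):

[U, S, V] = svd(A, 'econ');   % U is m x n and S is n x n, instead of m x m and m x n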

女痞
#3 · 2019-02-05 01:41

You can rapidly compute a low-rank approximation based on SVD, using the svds function.

[U,S,V] = svds(A,r); %# only the first r singular values are computed
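
Multiplying the factors back together then gives the rank-r matrix itself:

Ar = U*S*V';   %# rank-r approximation of A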

svds uses eigs to compute a subset of the singular values; it will be especially fast for large, sparse matrices. See the documentation: you can set the tolerance and maximum number of iterations, or choose to compute the smallest singular values instead of the largest.
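In recent MATLAB releases those options look like the following (older releases pass an options struct instead, so check the documentation for your version):

[U, S, V] = svds(A, r, 'smallest');                          %# r smallest singular values
[U, S, V] = svds(A, r, 'largest', ...
                 'Tolerance', 1e-10, 'MaxIterations', 300);  %# convergence controls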

I thought svds and eigs could be faster than svd and eig for dense matrices, but then I did some benchmarking. They are only faster for large matrices when sufficiently few values are requested:

 n     k       svds          svd         eigs          eig            comment
10     1     4.6941e-03   8.8188e-05   2.8311e-03   7.1699e-05    random matrices
100    1     8.9591e-03   7.5931e-03   4.7711e-03   1.5964e-02     (uniform dist)
1000   1     3.6464e-01   1.8024e+00   3.9019e-02   3.4057e+00
       2     1.7184e+00   1.8302e+00   2.3294e+00   3.4592e+00
       3     1.4665e+00   1.8429e+00   2.3943e+00   3.5064e+00
       4     1.5920e+00   1.8208e+00   1.0100e+00   3.4189e+00
4000   1     7.5255e+00   8.5846e+01   5.1709e-01   1.2287e+02
       2     3.8368e+01   8.6006e+01   1.0966e+02   1.2243e+02
       3     4.1639e+01   8.4399e+01   6.0963e+01   1.2297e+02
       4     4.2523e+01   8.4211e+01   8.3964e+01   1.2251e+02


10     1      4.4501e-03   1.2028e-04   2.8001e-03   8.0108e-05   random pos. def.
100    1      3.0927e-02   7.1261e-03   1.7364e-02   1.2342e-02    (uniform dist)
1000   1      3.3647e+00   1.8096e+00   4.5111e-01   3.2644e+00
       2      4.2939e+00   1.8379e+00   2.6098e+00   3.4405e+00
       3      4.3249e+00   1.8245e+00   6.9845e-01   3.7606e+00
       4      3.1962e+00   1.9782e+00   7.8082e-01   3.3626e+00
4000   1      1.4272e+02   8.5545e+01   1.1795e+01   1.4214e+02
       2      1.7096e+02   8.4905e+01   1.0411e+02   1.4322e+02
       3      2.7061e+02   8.5045e+01   4.6654e+01   1.4283e+02
       4      1.7161e+02   8.5358e+01   3.0066e+01   1.4262e+02

Here n is the size of the square test matrices, k is the number of singular/eigenvalues requested, and runtimes are in seconds. I used Steve Eddins' timeit function from the File Exchange for benchmarking, which tries to account for overhead and run-to-run variation.

svds and eigs are faster if you want just a few values from a very large matrix. It also depends on the properties of the matrix in question (typing edit svds to inspect its implementation should give you some idea why).
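
For reference, a measurement along these lines can be reproduced with timeit (the sizes below are illustrative, not the exact benchmark code):

n = 1000; k = 1;
A = rand(n);                        % dense test matrix, uniform entries
t_svds = timeit(@() svds(A, k));    % iterative, computes only k singular values
t_svd  = timeit(@() svd(A));        % full decomposition
fprintf('svds: %.4e s   svd: %.4e s\n', t_svds, t_svd)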
