I am using the Numeric Library Bindings for Boost uBLAS to solve a simple linear system. The following works fine, except that it is limited to handling matrices A(m x m) for relatively small m. In practice I have a much larger matrix with dimension m = 10^6 (up to 10^7). Is there an existing C++ approach for solving Ax = b that uses memory efficiently?
#include <iostream>
#include <boost/numeric/ublas/matrix.hpp>
#include <boost/numeric/ublas/io.hpp>
#include <boost/numeric/bindings/traits/ublas_matrix.hpp>
#include <boost/numeric/bindings/lapack/gesv.hpp>
#include <boost/numeric/bindings/traits/ublas_vector2.hpp>

// compilable with this command:
// g++ -I/home/foolb/.boost/include/boost-1_38 -I/home/foolb/.boostnumbind/include/boost-numeric-bindings solve_Axb_byhand.cc -o solve_Axb_byhand -llapack

namespace ublas  = boost::numeric::ublas;
namespace lapack = boost::numeric::bindings::lapack;

int main()
{
    ublas::matrix<float, ublas::column_major> A(3,3);
    ublas::vector<float> b(3);

    for (unsigned i = 0; i < A.size1(); i++)
        for (unsigned j = 0; j < A.size2(); j++)
        {
            std::cout << "enter element " << i << j << std::endl;
            std::cin >> A(i,j);
        }

    std::cout << A << std::endl;

    b(0) = 21; b(1) = 1; b(2) = 17;

    lapack::gesv(A,b);
    std::cout << b << std::endl;

    return 0;
}
Short answer: Don't use Boost's LAPACK bindings; they were designed for dense matrices, not sparse matrices. Use UMFPACK instead.

Long answer: UMFPACK is one of the best libraries for solving Ax = b when A is large and sparse.

- http://www.cise.ufl.edu/research/sparse/umfpack/
- http://www.cise.ufl.edu/research/sparse/umfpack/UMFPACK/Doc/QuickStart.pdf

Below is sample code (based on umfpack_simple.c) that generates a simple A and b and solves Ax = b.
#include <stdlib.h>
#include <stdio.h>
#include "umfpack.h"

int    *Ap;
int    *Ai;
double *Ax;
double *b;
double *x;

/* Generates a sparse matrix problem:
   A is an n x n tridiagonal matrix with
       A(i,i-1) = -1;
       A(i,i)   =  3;
       A(i,i+1) = -1;
*/
void generate_sparse_matrix_problem(int n){
    int i;                       /* row index */
    int nz;                      /* nonzero index */
    int nnz = 2 + 3*(n-2) + 2;   /* number of nonzeros */
    int *Ti;                     /* row indices */
    int *Tj;                     /* col indices */
    double *Tx;                  /* values */

    /* Allocate memory for triplet form */
    Ti = malloc(sizeof(int)*nnz);
    Tj = malloc(sizeof(int)*nnz);
    Tx = malloc(sizeof(double)*nnz);

    /* Allocate memory for compressed sparse column form */
    Ap = malloc(sizeof(int)*(n+1));
    Ai = malloc(sizeof(int)*nnz);
    Ax = malloc(sizeof(double)*nnz);

    /* Allocate memory for rhs and solution vector */
    x = malloc(sizeof(double)*n);
    b = malloc(sizeof(double)*n);

    /* Construct the matrix A */
    nz = 0;
    for (i = 0; i < n; i++){
        if (i > 0){
            Ti[nz] = i;
            Tj[nz] = i-1;
            Tx[nz] = -1;
            nz++;
        }

        Ti[nz] = i;
        Tj[nz] = i;
        Tx[nz] = 3;
        nz++;

        if (i < n-1){
            Ti[nz] = i;
            Tj[nz] = i+1;
            Tx[nz] = -1;
            nz++;
        }
        b[i] = 0;
    }
    b[0] = 21; b[1] = 1; b[2] = 17;

    /* Convert Triplet to Compressed Sparse Column format */
    (void) umfpack_di_triplet_to_col(n, n, nnz, Ti, Tj, Tx, Ap, Ai, Ax, NULL);

    /* Free triplet format */
    free(Ti); free(Tj); free(Tx);
}

int main (void)
{
    double *null = (double *) NULL;
    int i, n;
    void *Symbolic, *Numeric;

    n = 500000;
    generate_sparse_matrix_problem(n);

    (void) umfpack_di_symbolic (n, n, Ap, Ai, Ax, &Symbolic, null, null);
    (void) umfpack_di_numeric (Ap, Ai, Ax, Symbolic, &Numeric, null, null);
    umfpack_di_free_symbolic (&Symbolic);
    (void) umfpack_di_solve (UMFPACK_A, Ap, Ai, Ax, x, b, Numeric, null, null);
    umfpack_di_free_numeric (&Numeric);

    for (i = 0; i < 10; i++) printf ("x [%d] = %g\n", i, x[i]);

    free(b); free(x); free(Ax); free(Ai); free(Ap);
    return (0);
}
The function generate_sparse_matrix_problem creates the matrix A and the right-hand side b. The matrix is first constructed in triplet form; the vectors Ti, Tj, and Tx fully describe A. Triplet form is easy to create, but efficient sparse matrix methods require Compressed Sparse Column format. Conversion is performed with umfpack_di_triplet_to_col.
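To make the difference between the two formats concrete, here is a small illustration (my own, not part of the program above) of what the triplet and CSC arrays look like for the 3 x 3 version of this tridiagonal matrix:

// Illustration only: the 3 x 3 tridiagonal matrix
//     [ 3 -1  0 ]
//     [-1  3 -1 ]
//     [ 0 -1  3 ]
// Triplet form: one (row, col, value) entry per nonzero, in any order
// (here in the row-by-row order produced by the generator above).
int    Ti[] = { 0, 0,  1, 1, 1,  2, 2 };
int    Tj[] = { 0, 1,  0, 1, 2,  1, 2 };
double Tx[] = { 3, -1, -1, 3, -1, -1, 3 };

// Compressed Sparse Column form of the same matrix:
// Ap[j] .. Ap[j+1]-1 index the nonzeros of column j; Ai holds their row
// indices (sorted), Ax their values. Ap has n+1 entries, the last one = nnz.
int    Ap[] = { 0, 2, 5, 7 };
int    Ai[] = { 0, 1,   0, 1, 2,   1, 2 };
double Ax[] = { 3, -1,  -1, 3, -1, -1, 3 };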
A symbolic factorization is performed with umfpack_di_symbolic. A sparse LU decomposition of A is performed with umfpack_di_numeric. The lower and upper triangular solves are performed with umfpack_di_solve.
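One detail the sample code glosses over: every umfpack_di_* routine returns a status code, and for systems this large it is worth checking it instead of casting the result to (void). A minimal sketch of the pattern, meant to slot into main above (the early return is my own choice):

int status = umfpack_di_symbolic(n, n, Ap, Ai, Ax, &Symbolic, null, null);
if (status != UMFPACK_OK) {
    /* Prints a human-readable description of the status code;
       passing NULL for Control uses the default settings. */
    umfpack_di_report_status(null, status);
    return 1;
}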
With n set to 500,000, the entire program takes about a second to run on my machine. Valgrind reports that 369,239,649 bytes (just a little over 352 MB) were allocated.
Note: this page discusses Boost's support for sparse matrices in Triplet (Coordinate) and Compressed format. If you like, you can write routines to convert these Boost objects to the simple arrays UMFPACK requires as input.
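For example, assuming the matrix already lives in a uBLAS compressed_matrix (this is my own sketch, not code from the bindings, and it assumes UMFPACK's int/double "di" interface, zero-based indices, a square matrix, and at least one nonzero), one way is to walk the stored entries into triplet arrays and let umfpack_di_triplet_to_col build the compressed-column arrays:

#include <vector>
#include <boost/numeric/ublas/matrix_sparse.hpp>
#include "umfpack.h"

namespace ublas = boost::numeric::ublas;
typedef ublas::compressed_matrix<double> SparseMat;

// Sketch: copy a uBLAS sparse matrix into UMFPACK's compressed-column
// arrays. Error checking omitted.
void to_umfpack_csc(const SparseMat& A,
                    std::vector<int>& Ap, std::vector<int>& Ai,
                    std::vector<double>& Ax)
{
    const int n   = static_cast<int>(A.size1());
    const int nnz = static_cast<int>(A.nnz());

    std::vector<int>    Ti, Tj;   // triplet row/column indices
    std::vector<double> Tx;       // triplet values
    Ti.reserve(nnz); Tj.reserve(nnz); Tx.reserve(nnz);

    // The nested sparse iterators visit only the stored (non-zero) entries.
    for (SparseMat::const_iterator1 it1 = A.begin1(); it1 != A.end1(); ++it1)
        for (SparseMat::const_iterator2 it2 = it1.begin(); it2 != it1.end(); ++it2) {
            Ti.push_back(static_cast<int>(it2.index1()));
            Tj.push_back(static_cast<int>(it2.index2()));
            Tx.push_back(*it2);
        }

    Ap.resize(n + 1); Ai.resize(nnz); Ax.resize(nnz);
    umfpack_di_triplet_to_col(n, n, nnz, &Ti[0], &Tj[0], &Tx[0],
                              &Ap[0], &Ai[0], &Ax[0], NULL);
}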
Assuming your huge matrices are sparse, which I hope they are at that size, have a look at the PARDISO project, a sparse linear solver; this is what you'll need if you want to handle matrices as big as you describe. It stores only the non-zero values efficiently, and is much faster than solving the same system with dense matrices.
I assume that your matrix is dense. If it is sparse, you can find numerous specialised algorithms as already mentioned by DeusAduro and duffymo.
If you don't have a (large enough) cluster at your disposal, you want to look at out-of-core algorithms. ScaLAPACK has a few out-of-core solvers as part of its prototype package, see the documentation here and Google for more details. Searching the web for "out-of-core LU / (matrix) solvers / packages" will give you links to a wealth of further algorithms and tools. I am not an expert on those.
For this problem, most people would use a cluster, however. The package you will find on almost any cluster is ScaLAPACK, again. In addition, there are usually numerous other packages on the typical cluster, so you can pick and choose what suits your problem (examples here and here).
Before you start coding, you probably want to quickly check how long it will take to solve your problem. A typical dense solver takes roughly 3*N^3 flops (N is the dimension of the matrix). If N = 100000, you are hence looking at about 3*10^15 flops, i.e. 3,000,000 Gflop. Assuming that your in-memory solver does 10 Gflops/s per core, you are looking at about 3 1/2 days on a single core. As the algorithms scale well, increasing the number of cores should reduce the time close to linearly. On top of that comes the I/O.
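As a back-of-the-envelope check, that estimate is easy to reproduce; the sketch below simply restates the same assumptions (3*N^3 flops, 10 Gflops/s per core, linear scaling) in code:

#include <cstdio>

// Rough runtime estimate for a dense solve under the assumptions above.
int main()
{
    const double N       = 1e5;                     // matrix dimension
    const double flops   = 3.0 * N * N * N;         // ~3e15 flops
    const double gflops  = 10.0;                    // per-core rate, Gflops/s
    const int    cores   = 1;

    const double seconds = flops / (gflops * 1e9 * cores);
    std::printf("estimated time: %.1f days\n", seconds / 86400.0);
    return 0;
}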
Not sure about C++ implementations, but there are several things you can do if memory is an issue, depending on the type of matrix you're dealing with:
- If your matrix is sparse or banded, you can use a sparse or banded solver; these don't store the zero elements outside the band (see the sketch after this list).
- You can use a wavefront solver, which stores the matrix on disk and only brings in the matrix wavefront for decomposition.
- You can avoid solving the matrix altogether and use iterative methods.
- You can try Monte Carlo methods of solution.
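To illustrate the banded case from the first point: LAPACK ships dedicated banded and tridiagonal solvers. Below is a minimal sketch using LAPACKE's dgtsv (assuming the C LAPACKE interface is available); for a tridiagonal system only the three diagonals are stored, i.e. O(n) memory instead of O(n^2):

#include <vector>
#include <cstdio>
#include <lapacke.h>

int main()
{
    const lapack_int n = 1000000;            // only ~4n doubles are stored

    // The same tridiagonal matrix used elsewhere on this page:
    // sub-diagonal = -1, diagonal = 3, super-diagonal = -1.
    std::vector<double> dl(n - 1, -1.0);     // sub-diagonal
    std::vector<double> d (n,      3.0);     // main diagonal
    std::vector<double> du(n - 1, -1.0);     // super-diagonal
    std::vector<double> b (n,      0.0);     // right-hand side, overwritten with x
    b[0] = 21.0; b[1] = 1.0; b[2] = 17.0;

    // Solve A x = b; on exit b holds the solution.
    lapack_int info = LAPACKE_dgtsv(LAPACK_COL_MAJOR, n, 1,
                                    dl.data(), d.data(), du.data(),
                                    b.data(), n);
    if (info != 0) { std::printf("dgtsv failed: %d\n", (int)info); return 1; }

    for (int i = 0; i < 10; ++i) std::printf("x[%d] = %g\n", i, b[i]);
    return 0;
}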
Have a look at the list of freely available software for the solution of linear algebra problems, compiled by Jack Dongarra and Hatem Ltaief.
I think that for the problem size you're looking at, you probably need an iterative algorithm. If you don't want to store the matrix A in a sparse format, you can use a matrix-free implementation. Iterative algorithms typically do not need to access individual entries of the matrix A; they only need to compute matrix-vector products Av (and sometimes A^T v, the product of the transposed matrix with a vector). So if the library is well designed, it should be enough if you pass it a class that knows how to do matrix-vector products.
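A minimal sketch of that idea (my own illustration, not taken from any particular library): a hand-rolled conjugate gradient whose only access to A is an apply() function, here the same tridiagonal stencil used in the UMFPACK example, so the matrix itself is never stored. CG requires A to be symmetric positive definite, which this stencil is; for general matrices you would reach for GMRES or BiCGSTAB from a library instead.

#include <cstdio>
#include <cmath>
#include <vector>

// Matrix-free operator: y = A*x for the tridiagonal stencil
// A(i,i-1) = -1, A(i,i) = 3, A(i,i+1) = -1. No matrix is stored.
void apply(const std::vector<double>& x, std::vector<double>& y)
{
    const std::size_t n = x.size();
    for (std::size_t i = 0; i < n; ++i) {
        double v = 3.0 * x[i];
        if (i > 0)     v -= x[i - 1];
        if (i + 1 < n) v -= x[i + 1];
        y[i] = v;
    }
}

double dot(const std::vector<double>& a, const std::vector<double>& b)
{
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

int main()
{
    const std::size_t n = 1000000;            // only a handful of n-vectors in memory
    std::vector<double> b(n, 0.0), x(n, 0.0), r(n), p(n), q(n);
    b[0] = 21.0; b[1] = 1.0; b[2] = 17.0;

    // Conjugate gradient; valid because this A is symmetric positive definite.
    r = b;                                    // r = b - A*x with x = 0
    p = r;
    double rs = dot(r, r);
    for (int it = 0; it < 1000 && std::sqrt(rs) > 1e-10; ++it) {
        apply(p, q);                          // q = A*p
        const double alpha = rs / dot(p, q);
        for (std::size_t i = 0; i < n; ++i) { x[i] += alpha * p[i]; r[i] -= alpha * q[i]; }
        const double rs_new = dot(r, r);
        for (std::size_t i = 0; i < n; ++i) p[i] = r[i] + (rs_new / rs) * p[i];
        rs = rs_new;
    }

    for (int i = 0; i < 10; ++i) std::printf("x[%d] = %g\n", i, x[i]);
    return 0;
}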
As the accepted answer suggests, there is UMFPACK. But if you are using Boost, you can still keep the data in Boost's compressed (sparse) matrices and use UMFPACK to solve the system. There is a binding which makes this easy:
http://mathema.tician.de/software/boost-numeric-bindings
It's about two years old, but it's just a binding (along with a few others).
See the related question:
UMFPACK and BOOST's uBLAS Sparse Matrix