How to solve a non-square linear system with R : A X = B
?
(in the case the system has no solution or infinitely many solutions)
Example :
A=matrix(c(0,1,-2,3,5,-3,1,-2,5,-2,-1,1),3,4,T)
B=matrix(c(-17,28,11),3,1,T)
A
[,1] [,2] [,3] [,4]
[1,] 0 1 -2 3
[2,] 5 -3 1 -2
[3,] 5 -2 -1 1
B
[,1]
[1,] -17
[2,] 28
[3,] 11
If the matrix A has more rows than columns, then you should use least squares fit.
If the matrix A has fewer rows than columns, then you should perform singular value decomposition. Each algorithm does the best it can to give you a solution by using assumptions.
Here's a link that shows how to use SVD as a solver:
http://www.ecse.rpi.edu/~qji/CV/svd_review.pdf
Let's apply it to your problem and see if it works:
Your input matrix A
and known RHS vector B
:
> A=matrix(c(0,1,-2,3,5,-3,1,-2,5,-2,-1,1),3,4,T)
> B=matrix(c(-17,28,11),3,1,T)
> A
[,1] [,2] [,3] [,4]
[1,] 0 1 -2 3
[2,] 5 -3 1 -2
[3,] 5 -2 -1 1
> B
[,1]
[1,] -17
[2,] 28
[3,] 11
Let's decompose your A
matrix:
> asvd = svd(A)
> asvd
$d
[1] 8.007081e+00 4.459446e+00 4.022656e-16
$u
[,1] [,2] [,3]
[1,] -0.1295469 -0.8061540 0.5773503
[2,] 0.7629233 0.2908861 0.5773503
[3,] 0.6333764 -0.5152679 -0.5773503
$v
[,1] [,2] [,3]
[1,] 0.87191556 -0.2515803 -0.1764323
[2,] -0.46022634 -0.1453716 -0.4694190
[3,] 0.04853711 0.5423235 0.6394484
[4,] -0.15999723 -0.7883272 0.5827720
> adiag = diag(1/asvd$d)
> adiag
[,1] [,2] [,3]
[1,] 0.1248895 0.0000000 0.00000e+00
[2,] 0.0000000 0.2242431 0.00000e+00
[3,] 0.0000000 0.0000000 2.48592e+15
Here's the key: the third eigenvalue in d
is very small; conversely, the diagonal element in adiag
is very large. Before solving, set it equal to zero:
> adiag[3,3] = 0
> adiag
[,1] [,2] [,3]
[1,] 0.1248895 0.0000000 0
[2,] 0.0000000 0.2242431 0
[3,] 0.0000000 0.0000000 0
Now let's compute the solution (see slide 16 in the link I gave you above):
> solution = asvd$v %*% adiag %*% t(asvd$u) %*% B
> solution
[,1]
[1,] 2.411765
[2,] -2.282353
[3,] 2.152941
[4,] -3.470588
Now that we have a solution, let's substitute it back to see if it gives us the same B
:
> check = A %*% solution
> check
[,1]
[1,] -17
[2,] 28
[3,] 11
That's the B
side you started with, so I think we're good.
Here's another nice SVD discussion from AMS:
http://www.ams.org/samplings/feature-column/fcarc-svd
Aim is to solve Ax = b
where A is p by q, x is q by 1 and b is p by 1 for x given A and b.
Approach 1: Generalized Inverse: Moore-Penrose
https://en.wikipedia.org/wiki/Generalized_inverse
Multiplying both sides of the equation, we get
A'Ax = A' b
where A' is the transpose of A. Note that A'A is q by q matrix now. One way to solve this now multiply both sides of the equation by the inverse of A'A. Which gives,
x = (A'A)^{-1} A' b
This is the theory behind generalized inverse. Here G = (A'A)^{-1} A' is pseudo-inverse of A.
library(MASS)
ginv(A) %*% B
# [,1]
#[1,] 2.411765
#[2,] -2.282353
#[3,] 2.152941
#[4,] -3.470588
Approach 2: Generalized Inverse using SVD.
@duffymo used SVD to obtain a pseudoinverse of A.
However, last elements of svd(A)$d
may not be quite as small as in this example. So, probably one shouldn't use that method as is. Here's an example where none of the last elements of A is close to zero.
A <- as.matrix(iris[11:13, -5])
A
# Sepal.Length Sepal.Width Petal.Length Petal.Width
# 11 5.4 3.7 1.5 0.2
# 12 4.8 3.4 1.6 0.2
# 13 4.8 3.0 1.4 0.1
svd(A)$d
# [1] 10.7820526 0.2630862 0.1677126
One option would be to look as the singular values in cor(A)
svd(cor(A))$d
# [1] 2.904194e+00 1.095806e+00 1.876146e-16 1.155796e-17
Now, it is clear there is only two large singular values are present. So, one now can apply svd on A to get pseudo-inverse as below.
svda <- svd(A)
G = svda$v[, 1:2] %*% diag(1/svda$d[1:2]) %*% t(svda$u[, 1:2])
# to get x
G %*% B