I prepare a matrix of random numbers, calculate its inverse and matrix-multiply it with the original matrix. In theory, this gives the unit matrix. How can I let numpy do that for me?
import numpy

A = numpy.zeros((100, 100))
E = numpy.zeros((100, 100))
size = 100

for i in range(size):
    for j in range(size):
        A[i][j] += numpy.random.randint(10)
        if i == j:
            E[i][j] += 1

A_inv = numpy.linalg.linalg.inv(A)
print numpy.dot(A, A_inv)
Running the code produces
[me]machine @ numeric $ python rand_diag.py
[[ 1.00000000e+00 -7.99360578e-15 -1.14491749e-16 ..., 3.81639165e-17
-4.42701431e-15 1.17961196e-15]
[ -5.55111512e-16 1.00000000e+00 -2.22044605e-16 ..., -3.88578059e-16
1.33226763e-15 -8.32667268e-16]
It's evident the result is a unit matrix, but not precisely one, so print numpy.dot(A, A_inv) == E evidently gives False. I'm doing this to practice linear algebra and to find the matrix size at which my machine reaches its limits. Getting a True would be didactically appealing.
Edit:
Setting size = 10000, I run out of memory:
[me]machine @ numeric $ Python(794) malloc:
*** mmap(size=800002048) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
Traceback (most recent call last):
File "rand_diag.py", line 14, in <module> A_inv = numpy.linalg.linalg.inv(A)
File "/Library/Frameworks/Python.framework/Versions/7.2/lib/python2.7/site-packages/numpy/linalg/linalg.py", line 445, in inv
return wrap(solve(a, identity(a.shape[0], dtype=a.dtype)))
File "/Library/Frameworks/Python.framework/Versions/7.2/lib/python2.7/site-packages/numpy/linalg/linalg.py", line 323, in solve
a, b = _fastCopyAndTranspose(t, a, b)
File "/Library/Frameworks/Python.framework/Versions/7.2/lib/python2.7/site-packages/numpy/linalg/linalg.py", line 143, in _fastCopyAndTranspose
cast_arrays = cast_arrays + (_fastCT(a),)
MemoryError
[1]+ Exit 1 python rand_diag.py
How can I allocate more memory and how can I run this in parallel (I have 4 cores)?
While getting True would be didactically appealing, it would also be divorced from the realities of floating-point computations.
When dealing with floating point, one necessarily has to be prepared not only for inexact results, but for all manner of other numerical issues that arise.
I highly recommend reading What Every Computer Scientist Should Know About Floating-Point Arithmetic.
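As a quick illustration of why exact equality fails, here is a standard floating-point example (nothing numpy-specific):

# Neither 0.1, 0.2 nor 0.3 is exactly representable in binary floating point
print(0.1 + 0.2 == 0.3)  # False
print(repr(0.1 + 0.2))   # '0.30000000000000004'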
In your particular case, to ensure that A * inv(A) is close enough to the identity matrix, you could compute a matrix norm of numpy.dot(A, A_inv) - E and ensure that it is small enough.
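A minimal sketch of that check, assuming the Frobenius norm (numpy.linalg.norm's default) and an arbitrary 1e-8 tolerance, both my choices rather than anything prescribed:

import numpy

size = 100
A = numpy.random.randint(0, 10, (size, size))
A_inv = numpy.linalg.inv(A)
E = numpy.eye(size)

# Frobenius norm of the residual A * inv(A) - I; near zero for a good inverse
err = numpy.linalg.norm(numpy.dot(A, A_inv) - E)
print(err < 1e-8)  # tolerance chosen arbitrarily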
As a side note, you don't have to use a loop to populate A and E. Instead, you could just use

A = numpy.random.randint(0, 10, (size, size))
E = numpy.eye(size)
Agreeing with most of the points already made. However, I would suggest that rather than looking at the individual off-diagonal elements, you take their RMS sum; this reflects, in some sense, the "energy" that leaked into the off-diagonal terms as a result of imperfect calculations. If you then divide this RMS number by the sum of the diagonal terms, you get a metric of just how well the inverse worked. For example, the following code:
import numpy
import matplotlib.pyplot as plt
from numpy import mean, sqrt

N = 1000
R = numpy.zeros(N)

for size in range(50, N, 50):
    # Build a random integer matrix A and the matching identity E
    A = numpy.zeros((size, size))
    E = numpy.zeros((size, size))
    for i in range(size):
        for j in range(size):
            A[i][j] += numpy.random.randint(10)
            if i == j:
                E[i][j] = 1
    A_inv = numpy.linalg.inv(A)
    # RMS of the residual, normalized by the sum of the diagonal terms (= size)
    D = numpy.dot(A, A_inv) - E
    S = sqrt(mean(D**2))
    R[size] = S/size
    print "size: ", size, "; rms is ", S/size

plt.plot(range(50, N, 50), R[range(50, N, 50)])
plt.ylabel('RMS fraction')
plt.show()
shows that the RMS error is pretty stable with the size of the array, all the way up to 950x950 (it does slow down a bit, though...). However, it's never "exact", and there are some outliers (presumably when the matrix is more nearly singular, which can happen with random matrices).
Example plot (every time you run it, it will look a bit different):
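If you want to probe the near-singularity hypothesis for a given matrix, one option (my addition, not part of the code above) is the condition number via numpy.linalg.cond; outliers in the RMS metric should coincide with large condition numbers:

import numpy

size = 500
A = numpy.random.randint(0, 10, (size, size)).astype(float)

# The condition number measures how much inversion amplifies rounding error;
# very large values flag a nearly singular matrix.
print(numpy.linalg.cond(A))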
In the end, if you want to kill off the tiny numbers, you can round your answer with

m = numpy.round(m, decimals=10)

or check whether the two matrices are very different:

numpy.abs(A * A.I - E).mean() < 1e-10
I would implement this with the numpy.matrix class.
import numpy

size = 100
A = numpy.matrix(numpy.random.randint(0, 10, (size,)*2))
E = numpy.eye(size)

print A * A.I
print numpy.abs(A * A.I - E).mean() < 1e-10
Your problem can be reduced to a common float-comparison problem. The correct way to compare such arrays would be:

EPS = 1e-8  # for example
(numpy.abs(numpy.dot(A, A_inv) - E) < EPS).all()
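numpy also ships a helper for exactly this kind of comparison: numpy.allclose combines a relative and an absolute tolerance (the values below are its documented defaults, written out for clarity):

import numpy

size = 100
A = numpy.random.randint(0, 10, (size, size))
A_inv = numpy.linalg.inv(A)
E = numpy.eye(size)

# True when |x - y| <= atol + rtol * |y| holds element-wise
print(numpy.allclose(numpy.dot(A, A_inv), E, rtol=1e-05, atol=1e-08))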