I've been porting MATLAB code over to Python and, after quite a lot of work, I have stuff that works. The downside, however, is that Python is running my code more slowly than MATLAB did. I understand that using optimised ATLAS libraries will speed things up, but actually implementing this is confusing me. Here's what's going on:
I start an ipython session with no BLAS installed:
import numpy.distutils.system_info as sysinfo
import time
In [11]: sysinfo.get_info('atlas')
Out[11]: {}
timeit( eig(randn(1E2,1E2)) )
100 loops, best of 3: 13.4 ms per loop
The same code in Matlab runs twice as fast
tic,eig(randn(1E2));toc*1000
6.5650
I install the non-optimised ATAS deb from the Ubuntu repository. Re-start ipython and now I get:
In [2]: sysinfo.get_info('atlas')
...
Out[2]:
{'define_macros': [('ATLAS_INFO', '"\\"3.8.4\\""')],
'include_dirs': ['/usr/include/atlas'],
'language': 'f77',
'libraries': ['lapack', 'f77blas', 'cblas', 'atlas'],
'library_dirs': ['/usr/lib/atlas-base/atlas', '/usr/lib/atlas-base']}
And the test code:
In [4]: timeit( eig(randn(1E2,1E2)) )
100 loops, best of 3: 16.8 ms per loop
So no faster. If anything a touch slower. But I haven't yet switched to the optimised BLAS. I follow these instructions: http://danielnouri.org/notes/category/python/ I build the libraries and overwrite the non-optimised versions with these. I re-start ipython but there's no change:
In [4]: timeit( eig(randn(1E2,1E2)) )
100 loops, best of 3: 15.3 ms per loop
Can't it get better than this? MATLAB is still twice as fast in this simple example. In a real-world example where I'm doing image registration in the Fourier domain, the Matlab equivalent is 4 to 5 times faster than the Python version. Has anyone managed to get Numpy working at MATLAB speeds?
Simple example
Numpy is calculating both the eigenvectors and eigenvalues, so it will take roughly twice longer, which is consistent with your slowdown (use
np.linalg.eigvals
to compute only the eigenvalues).In the end,
np.linalg.eig
is a tiny wrapper around dgeev, and likely the same thing happens in Matlab, which is using MKL.To get virtually the same speed in linear algebra, you could build Numpy against MKL or OpenBLAS. There are some commercial offers (maybe free for academics) from Continuum or Enthought. You could also get MKL and build Numpy yourself.
Real-world example
4x slower seems like too much (I have rewritten some Matlab code in Numpy and both programs performed in a very similar way). Take into account that recent Matlab versions come with a simple JIT, so loops aren't as bad as in the usual Python implementation. If you're doing many FFT, you could benefit from using a FFTW wrapper (pyFFTW seems nice, but I haven't used it).