Theano fails due to NumPy Fortran mixup under Ubun

Posted 2019-02-08 11:42

Question:

I installed Theano on my machine, but the nosetests fail with a NumPy/Fortran-related error message. It looks to me as if NumPy was compiled with a different Fortran version than Theano. I have already reinstalled Theano (sudo pip uninstall theano + sudo pip install --upgrade --no-deps theano) and NumPy/SciPy (apt-get install --reinstall python-numpy python-scipy), but this did not help.

What steps would you recommend?

Complete error message:

ImportError: ('/home/Nick/.theano/compiledir_Linux-2.6.35-31-generic-x86_64-with-Ubuntu-10.10-maverick--2.6.6/tmpIhWJaI/0c99c52c82f7ddc775109a06ca04b360.so: undefined symbol: _gfortran_st_write_done'

My research:

The Installing SciPy / BuildingGeneral page says the following about the undefined symbol: _gfortran_st_write_done error:

If you see an error message

ImportError: /usr/lib/atlas/libblas.so.3gf: undefined symbol: _gfortran_st_write_done

when building SciPy, it means that NumPy picked up the wrong Fortran compiler during build (e.g. ifort).

Recompile NumPy using:

python setup.py build --fcompiler=gnu95

or whichever is appropriate (see python setup.py build --help-fcompiler).

But:

Nick@some-serv2:/usr/local/lib/python2.6/dist-packages/numpy$ python setup.py build --help-fcompiler
This is the wrong setup.py file to run

Used software versions:

  • scipy 0.10.1 (scipy.test() works)
  • NumPy 1.6.2 (numpy.test() works)
  • theano 0.5.0 (several tests fail with undefined symbol: _gfortran_st_write_done)
  • python 2.6.6
  • Ubuntu 10.10

[UPDATE]

So I removed numpy and scipy from my system with apt-get remove, then deleted what was left using find -name XXX -delete.

Then I installed numpy and scipy from the GitHub sources with sudo python setup.py install.

Afterwards I again ran sudo pip uninstall theano and sudo pip install --upgrade --no-deps theano.

Error persists :/

I also tried the apt-get source ... + apt-get build-dep ... approach, but on my old Ubuntu (10.10) it installs versions of numpy and scipy that are too old for Theano: ValueError: numpy >= 1.4 is required (detected 1.3.0 from /usr/local/lib/python2.6/dist-packages/numpy/__init__.pyc)

Answer 1:

I had the same problem, and after reviewing the source code, user212658's answer seemed like it would work (I have not tried it). I then looked for a way to deploy user212658's hack without modifying the source code.

Put these lines in your .theanorc file:

[blas]
ldflags = -lblas -lgfortran

This worked for me.
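
The same setting can also be written from a script. This is only a minimal sketch, assuming Python 3's configparser and a config file at ~/.theanorc; the helper name add_blas_ldflags is hypothetical and not part of Theano.

```python
import configparser


def add_blas_ldflags(path, flags="-lblas -lgfortran"):
    """Set [blas] ldflags in the config file at `path`.

    Hypothetical helper, not a Theano API. Other sections and options
    are kept, but comments in an existing file are lost on rewrite.
    """
    config = configparser.ConfigParser()
    config.read(path)  # a missing file is silently skipped
    if not config.has_section("blas"):
        config.add_section("blas")
    config.set("blas", "ldflags", flags)
    with open(path, "w") as handle:
        config.write(handle)
```

Calling add_blas_ldflags(os.path.expanduser("~/.theanorc")) (after import os) would apply it to the per-user file. Editing the file by hand is just as good; the sketch is mainly useful when provisioning several machines.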



Answer 2:

Have you tried to recompile NumPy from the sources?

I'm not familiar with the Ubuntu package system, so I can't check what's in your dist-packages/numpy. With a clean archive of the NumPy sources, you should have a setup.py at the same level as the directories numpy, tools and benchmarks (among others). I'm pretty sure that's the one you want to use for a python setup.py build.

[EDIT]

Now that you have recompiled numpy with the proper --fcompiler option, perhaps you could try the same with Theano, that is, compiling directly from the sources without relying on apt-get or even pip. That way you should have better control over the build process, which will make debugging and finding a solution easier.



Answer 3:

I had the same problem. The solution I found is to add a hack in theano/gof/cmodule.py to link against gfortran whenever 'blas' is in the libs. That fixed it.

class GCC_compiler(object):
    ...
    @staticmethod
    def compile_str(module_name, src_code, location=None,
                    include_dirs=None, lib_dirs=None, libs=None,
                    preargs=None):
        ...
        cmd.extend(['-l%s' % l for l in libs])
        # Hack: also link against gfortran whenever BLAS is linked.
        if 'blas' in libs:
            cmd.append('-lgfortran')
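
To show the idea in isolation, here is a standalone sketch of the same flag logic. The function name build_link_flags is made up for illustration and is not part of Theano, and the extra check that avoids adding gfortran twice is my own addition to the hack above.

```python
def build_link_flags(libs):
    """Return -l flags for `libs`, adding gfortran when BLAS is linked.

    Hypothetical stand-in for the patched part of compile_str.
    """
    flags = ["-l%s" % lib for lib in libs]
    # Assumption: skip the extra flag if gfortran is already requested.
    if "blas" in libs and "gfortran" not in libs:
        flags.append("-lgfortran")
    return flags
```

For example, build_link_flags(["blas"]) gives ["-lblas", "-lgfortran"], while a list without "blas" is passed through unchanged.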


Answer 4:

A better fix is to remove atlas and install openblas. openblas is faster than atlas. Also, openblas doesn't require gfortran and is the one numpy was linked with, so it will work out of the box.
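
To see which BLAS a compiled module actually links against, one option is to filter the output of ldd. This is a rough sketch, assuming a Linux system with ldd on the PATH; the function names and the keyword list are my own choices, not anything provided by NumPy or Theano.

```python
import subprocess

# Assumption: substrings that identify BLAS-related libraries.
KEYWORDS = ("blas", "atlas", "lapack", "gfortran")


def filter_linked_libs(ldd_output, keywords=KEYWORDS):
    """Return the lines of `ldd` output that mention any of `keywords`."""
    return [line.strip() for line in ldd_output.splitlines()
            if any(word in line.lower() for word in keywords)]


def linked_blas_libs(shared_object):
    """Run `ldd` on a shared object and keep the BLAS-related lines."""
    output = subprocess.check_output(["ldd", shared_object])
    return filter_linked_libs(output.decode("utf-8", "replace"))
```

Pass linked_blas_libs the path of any compiled extension, for example one of the .so files in the numpy package directory or in the Theano compiledir. If atlas and gfortran show up for one module but not for the other, that points to the mixup described in the question.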