
Issues installing mxnet GPU R package for Amazon d

Posted 2019-07-24 10:48

Question:

I am having trouble installing mxnet GPU for R on the Amazon Deep Learning Linux AMI. The environment variables are such a mess that it’s a nightmare for anyone who isn’t an expert sysadmin to figure out.

Step 1: install the ridiculous amount of missing/broken programs and R packages

sudo yum install R
sudo yum install libxml2-devel   
sudo yum install cairo-devel
sudo yum install giflib-devel
sudo yum install libXt-devel
sudo R
install.packages("devtools")
library(devtools)
install_github("igraph/rigraph")
install.packages("DiagrammeR")
install.packages("roxygen2")
install.packages("rgexf")
install.packages("influenceR")
install.packages("Cairo")
install.packages("imager")

Step 2: edit the config.mk file

cd /src/mxnet
cp make/config.mk .
echo "USE_BLAS=openblas" >>config.mk
echo "ADD_CFLAGS += -I/usr/include/openblas" >>config.mk
echo "ADD_LDFLAGS += -lopencv_core -lopencv_imgproc -lopencv_imgcodecs" >>config.mk
echo "USE_CUDA=1" >>config.mk
echo "USE_CUDA_PATH=/usr/local/cuda" >>config.mk
echo "USE_CUDNN=1" >>config.mk

Note: even though USE_CUDA_PATH is set, the build STILL cannot find libcudart.so, so its location has to be passed on the make command line (shown later).
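Because Step 2 appends to a copy of make/config.mk that may already define the same variables, it can help to confirm which value make will actually use: for plain `=` assignments, the last one in the file wins. The following is a minimal sketch, not part of mxnet; `mk_last_value` is a hypothetical helper name, and it deliberately ignores `+=` and `:=` lines.

```shell
# Print the last value assigned with "=" to a variable in a make-style file.
# Accepts both "VAR=value" and "VAR = value"; skips "+=" and ":=" lines.
mk_last_value() {
    var="$1"
    file="$2"
    awk -v var="$var" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)
            if (index(line, var) != 1) next          # line does not start with the name
            rest = substr(line, length(var) + 1)
            sub(/^[ \t]+/, "", rest)
            if (substr(rest, 1, 1) != "=") next      # longer name, "+=" or ":="
            rest = substr(rest, 2)
            sub(/^[ \t]+/, "", rest)
            sub(/[ \t]+$/, "", rest)
            value = rest
            found = 1
        }
        END { if (found) print value }
    ' "$file"
}

# Usage (run from the mxnet source directory):
#   mk_last_value USE_CUDA config.mk
#   mk_last_value USE_CUDA_PATH config.mk
```

If `USE_CUDA` does not print `1` here, the appended lines are not the ones taking effect.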

Step 3: create a new config file so the make command can find libcudart.so

Create the file /etc/ld.so.conf.d/cuda.conf containing this line:

/usr/local/cuda-8.0/lib64

Then refresh the linker cache:

sudo ldconfig

Note: this was posted by NVIDIA, but it does absolutely nothing to help make rpkg.

Step 4: set up R directories

Rscript -e "install.packages('devtools', repos = 'https://cran.rstudio.com')"
cd R-package
Rscript -e "library(devtools); library(methods); options(repos=c(CRAN='https://cran.rstudio.com')); install_deps(dependencies = TRUE)"
cd ..

Step 5: make

cd /src/mxnet
sudo make -j8

Result:

make CXX=g++ DEPS_PATH=/home/ec2-user/src/mxnet/deps -C /home/ec2-user/src/mxnet/ps-lite ps
cd /home/ec2-user/src/mxnet/dmlc-core; make libdmlc.a USE_SSE=1 config=/home/ec2-user/src/mxnet/config.mk; cd /home/ec2-user/src/mxnet
make[1]: Entering directory `/home/ec2-user/src/mxnet/dmlc-core'
make[1]: `libdmlc.a' is up to date.
make[1]: Leaving directory `/home/ec2-user/src/mxnet/dmlc-core'
make[1]: Entering directory `/home/ec2-user/src/mxnet/ps-lite'
make[1]: Nothing to be done for `ps'.
make[1]: Leaving directory `/home/ec2-user/src/mxnet/ps-lite'
ar crv lib/libmxnet.a

Note: even after changing the config.mk file, the make command always reports that there is nothing to update.

Step 6: attempt to make rpkg

cd /src/mxnet
sudo make rpkg

Error:

Error: package or namespace load failed for ‘mxnet’:
 .onLoad failed in loadNamespace() for 'mxnet', details:
  call: dyn.load(file, DLLpath = DLLpath, ...)
  error: unable to load shared object '/usr/lib64/R/library/mxnet/libs/libmxnet.so':
  libcudart.so.8.0: cannot open shared object file: No such file or directory
Error: loading failed
Execution halted
ERROR: loading failed

So it’s looking in a location that doesn’t exist, /usr/lib64/R/library/mxnet/libs/, when the file actually lives at /home/ec2-user/src/mxnet/R-package/inst/libs/libmxnet.so or /home/ec2-user/src/mxnet/lib/libmxnet.so.
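One reading of the error, offered as an assumption rather than a confirmed diagnosis: the message names libcudart.so.8.0 as the file that cannot be opened, which suggests libmxnet.so itself was found and one of its dependencies was not. `ldd` lists each dependency of a shared object and marks unresolved ones as `not found`; the sketch below filters for those. `filter_missing` is a hypothetical helper name.

```shell
# Read `ldd` output on stdin and print only the libraries the loader
# could not resolve (lines of the form "libfoo.so => not found").
filter_missing() {
    awk '/=> not found/ { print $1 }'
}

# Usage, with the library path taken from the question:
#   ldd /home/ec2-user/src/mxnet/lib/libmxnet.so | filter_missing
# Run it in the same environment (same user, same LD_LIBRARY_PATH) that fails,
# since the result depends on where the loader is allowed to look.
```

Every name printed is a library whose directory still has to be made visible to the loader.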

What I’ve tried so far:

sudo LD_LIBRARY_PATH=/usr/local/cuda/lib64 make rpkg

This fixes the missing libcudart.so.8.0 issue, but it is simply replaced by ‘libmklml_intel.so: cannot open shared object file: No such file or directory’, as well as the original ‘cannot find libmxnet.so’.
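LD_LIBRARY_PATH accepts several directories separated by colons, so the CUDA directory and the directory holding libmklml_intel.so can be supplied together. A minimal sketch, assuming libmklml_intel.so lives in /usr/local/lib (the path used in the symlink attempts in this post); `join_paths` is a hypothetical helper name.

```shell
# Join the given directories into one colon-separated search path.
join_paths() {
    out=""
    for dir in "$@"; do
        out="${out:+$out:}$dir"   # add a colon only between entries
    done
    printf '%s\n' "$out"
}

# Usage:
#   sudo LD_LIBRARY_PATH="$(join_paths /usr/local/cuda/lib64 /usr/local/lib)" make rpkg
```

With the two directories above, `join_paths` prints `/usr/local/cuda/lib64:/usr/local/lib`.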

Also tried:

  1. actually creating the directories (/usr/lib64/R/library/mxnet/libs/) and then copying libmxnet.so there. Result: same error

  2. adding /home/ec2-user/src/mxnet/R-package/inst/libs/ to the make command:

    sudo LD_LIBRARY_PATH=/home/ec2-user/src/mxnet/R-package/inst/libs make rpkg

    Result: same error

  3. a ridiculous number of environment settings, all of which failed:

    export MXNET_HOME=/usr/lib64/R/library/mxnet/libs/
    export MXNET_HOME=/usr/lib64/R/library/mxnet/libs/libmxnet.so
    sudo ldconfig /usr/local/cuda/lib64
    sudo ln -s /usr/lib64/R/library/mxnet/libs /usr/lib
    sudo ln -s /usr/lib64/R/library/mxnet/libs/libmxnet.so /usr/lib
    sudo ln -s /usr/local/lib/libmklml_intel.so /usr/lib
    sudo ln -s /usr/local/lib/libiomp5.so /usr/lib
    sudo ln -s /usr/local /usr/lib
    export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64/libcudart.so.8.0
    export LD_LIBRARY_PATH=/usr/lib64/R/library/mxnet/libs/libmxnet.so /usr/lib
    export LD_LIBRARY_PATH=/usr/local/cuda-8.0/targets/x86_64-linux/lib/:$LD_LIBRARY_PATH
    export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64/libcudart.so.8.0
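Two things stand out in the exports listed in this attempt: several point LD_LIBRARY_PATH at a file (for example libcudart.so.8.0) rather than at a directory, and each plain `export LD_LIBRARY_PATH=...` replaces the previous value instead of adding to it. The loader treats every entry as a directory to search. A small sketch for spotting entries that are not directories; `check_ld_path` is a hypothetical helper name.

```shell
# Walk a colon-separated search path and report entries that are not
# existing directories. Returns 0 if all entries are fine, 1 otherwise.
check_ld_path() {
    status=0
    old_ifs=$IFS
    IFS=:
    for entry in $1; do              # unquoted on purpose: split on ":"
        if [ -n "$entry" ] && [ ! -d "$entry" ]; then
            printf 'not a directory: %s\n' "$entry"
            status=1
        fi
    done
    IFS=$old_ifs
    return $status
}

# Usage:
#   check_ld_path "$LD_LIBRARY_PATH"
```

An entry such as /usr/local/cuda-8.0/lib64/libcudart.so.8.0 would be reported, while /usr/local/cuda-8.0/lib64 would pass if that directory exists.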

In all, ONE of these worked, because I briefly got the mxnet R package working before it fell apart again. I’ve dropped 50+ hours into this installation, which, frankly, is ridiculous. It’s tougher to install the software than it is to program an actual net....

I don’t have 5+ years of Linux sysadmin knowledge, so if you’d like to help, please be a bit more specific than ‘fix environment variables.’ I can tell that’s obviously what’s wrong, yet I have no idea what ‘fix environment variables’ entails.

To top it off, even after a successful install of the R package, it STILL won’t work until RStudio Server’s config file is set to: rsession-ld-library-path=/opt/local/lib:/usr/local/cuda/lib64

Answer 1:

Did you try the following when running sudo commands?

sudo -E make -j8

This means that it will preserve the env variables when running as superuser. You shouldn't have to add a new config file for the make to find the libraries. Just preserving the env variables using the above command should be enough.
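A caveat worth checking on your own system: with the default `env_reset` policy sudo starts commands in a reset environment, and the sudoers manual notes that the dynamic linker on most systems strips `LD_*` variables from setuid programs before sudo even runs, so `-E` may still not carry LD_LIBRARY_PATH through. Passing the assignment on the sudo command line, as the question already does with `sudo LD_LIBRARY_PATH=/usr/local/cuda/lib64 make rpkg`, sidesteps that. The sketch below only illustrates the mechanism; `env -i` (start with an empty environment) is used as a stand-in for sudo's reset and is not what sudo literally does.

```shell
cuda_lib=/usr/local/cuda/lib64   # CUDA path taken from the question

# Set in the caller's environment, then wiped by the reset:
# the child process does not see it.
LD_LIBRARY_PATH="$cuda_lib" env -i /bin/sh -c 'echo "child sees: ${LD_LIBRARY_PATH:-<unset>}"'

# Passed as an explicit assignment after the reset: the child sees it.
env -i LD_LIBRARY_PATH="$cuda_lib" /bin/sh -c 'echo "child sees: $LD_LIBRARY_PATH"'
```

The first command prints `child sees: <unset>` and the second prints `child sees: /usr/local/cuda/lib64`.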