LibreOffice gives “Application Error” when called from R

Posted 2020-07-09 08:09

Question:

Inside a Docker container I am trying to convert an XLSX file to PDF using LibreOffice. The relevant command works on the command line but fails with "Application Error" when called from R. I use this Dockerfile, which adds an XLSX file (in my experience, any XLSX file will do):

FROM rocker/r-ver:3.4.3

RUN apt-get update \
 && apt-get install --yes --no-install-recommends \
    default-jre-headless libreoffice-calc \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/* \
 && echo /usr/lib/libreoffice/program > /etc/ld.so.conf.d/libreoffice.conf \
 && ldconfig

COPY foo.xlsx /tmp

(The ldconfig trick comes from a related question about a shared library issue with the system function in R.)

On the command line I can then convert the XLSX file to PDF:

root@b395caeba33b:/# loffice --headless --convert-to pdf /tmp/foo.xlsx 
convert /tmp/foo.xlsx -> //foo.pdf using filter : calc_pdf_Export

However, this fails from R:

> system("loffice --version")
LibreOffice 5.2.7.2 20m0(Build:2)

> system("loffice --headless --convert-to pdf /tmp/foo.xlsx")
convert /tmp/foo.xlsx -> //foo.pdf using filter : calc_pdf_Export
Application Error

If I change the base image from rocker/r-ver:3.4.3 to rocker/r-base, which uses R 3.4.4 and Debian testing/sid, the result changes only marginally:

> system("loffice --version")
LibreOffice 6.0.2.1.0 00m0(Build:1)

> system("loffice --headless --convert-to pdf /tmp/foo.xlsx")
Application Error

How can I get LibreOffice to convert XLSX files to PDF when called from R?

Answer 1:

I have found a workaround, but I am still interested in a proper explanation. Here is what I found:

  • Start the Docker container with the option --security-opt seccomp:unconfined and install strace.
  • Within R, call:

    system("strace -f -o R.trace loffice --headless --convert-to pdf /tmp/foo.xlsx")
    
  • The resulting trace file shows an error loading libsal_textenclo.so. It is strange that it searches for the library in /usr/lib/x86_64-linux-gnu even though ldconfig knows where to find it:

    root@1519f52c05e0:/# grep libsal R.trace 
    257   open("/usr/lib/x86_64-linux-gnu/libsal_textenclo.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
    root@1519f52c05e0:/# ldconfig -p | grep libsal
        libsal_textenclo.so (libc6,x86-64) => /usr/lib/libreoffice/program/libsal_textenclo.so
    
  • Setting LD_LIBRARY_PATH to include /usr/lib/libreoffice/program does not solve the issue.

    root@4a235dfa08e3:~# export LD_LIBRARY_PATH=/usr/lib/libreoffice/program
    root@4a235dfa08e3:~# Rscript -e 'system("loffice --headless --convert-to pdf /tmp/foo.xlsx")'
    Application Error
    
  • My current workaround is to set LD_LIBRARY_PATH within the R session:

    > Sys.setenv(LD_LIBRARY_PATH="/usr/lib/libreoffice/program")
    > system("loffice --headless --convert-to pdf /tmp/foo.xlsx")
    convert /tmp/foo.xlsx -> //foo.pdf using filter : calc_pdf_Export
    Overwriting: //foo.pdf
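
The workaround above changes LD_LIBRARY_PATH for the whole R session. A narrower variant, sketched here under the assumption that only the LibreOffice child process needs the different value, is to set the variable for that one command through env. The helper name with_ld_path is hypothetical and not part of R or LibreOffice; the demo line runs a plain sh instead of loffice so that it works anywhere.

```shell
#!/bin/sh
# Hypothetical helper: run a command with LD_LIBRARY_PATH set to the given
# directory for that command only. The caller's own LD_LIBRARY_PATH is
# left untouched because env only alters the child's environment.
with_ld_path() {
    ld_dir=$1
    shift
    env LD_LIBRARY_PATH="$ld_dir" "$@"
}

# Demo with a harmless child process instead of loffice:
with_ld_path /usr/lib/libreoffice/program \
    sh -c 'echo "child sees: $LD_LIBRARY_PATH"'
```

Because system() hands its string to a shell, the same env LD_LIBRARY_PATH=... prefix can also be written directly in front of loffice inside the command string, which avoids the Sys.setenv call.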
    


Answer 2:

The issue is caused by a difference in the environment. When you run the env command through system, you get:

> system('env')
R_UNZIPCMD=/usr/bin/unzip
HOSTNAME=da4d504ddcb1
LD_LIBRARY_PATH=/usr/local/lib/R/lib:/usr/local/lib:/usr/lib/x86_64-linux-gnu:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server
SHLVL=0
HOME=/root
R_LIBS_SITE=
R_BROWSER=xdg-open
PAGER=/usr/bin/pager
R_VERSION=3.4.3
BUILD_DATE=
R_SYSTEM_ABI=linux,gcc,gxx,gfortran,?
TAR=/bin/tar
R_LIBS_USER=/usr/local/lib/R/site-library
TERM=xterm
COLUMNS=200
R_ARCH=
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
R_BZIPCMD=/bin/bzip2
R_INCLUDE_DIR=/usr/local/lib/R/include
R_SESSION_TMPDIR=/tmp/RtmpJsaXba
LANG=en_US.UTF-8
R_GZIPCMD=/bin/gzip
SED=/bin/sed
LN_S=ln -s
R_PDFVIEWER=/usr/bin/xdg-open
R_TEXI2DVICMD=/usr/bin/texi2dvi
R_HOME=/usr/local/lib/R
R_PRINTCMD=/usr/bin/lpr
R_DOC_DIR=/usr/local/lib/R/doc
R_LIBS=/usr/local/lib/R/site-library:/usr/local/lib/R/library:/usr/lib/R/library
LC_ALL=en_US.UTF-8
R_SHARE_DIR=/usr/local/lib/R/share
PWD=/
R_ZIPCMD=/usr/bin/zip
R_PLATFORM=x86_64-pc-linux-gnu
R_PAPERSIZE=letter
LINES=50
MAKE=make
R_RD4PDF=times,inconsolata,hyper
EDITOR=vi

You can see that R sets a number of environment variables by default, and one of them is LD_LIBRARY_PATH.

> system('loffice --headless --convert-to pdf /tmp/foo.xlsx')
Application Error
> system('LD_LIBRARY_PATH= loffice --headless --convert-to pdf /tmp/foo.xlsx')
convert /tmp/foo.xlsx -> //foo.pdf using filter : calc_pdf_Export

Blank it out and it works. The reason it works in bash is that the default set of environment variables is small and does not include LD_LIBRARY_PATH:

root@5c5bbcfcebf2:/# env
LC_ALL=en_US.UTF-8
LANG=en_US.UTF-8
HOSTNAME=5c5bbcfcebf2
PWD=/
HOME=/root
R_VERSION=3.4.3
BUILD_DATE=
TERM=xterm
SHLVL=1
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
_=/usr/bin/env
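
Spotting the one relevant variable by eye in two long listings is error-prone. As a rough sketch, assuming each env dump has been saved to a file with one NAME=value entry per line, the entries that exist only in the second dump can be listed with sort and comm. The function name env_only_in_second and the file names in the usage comment are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the NAME=value lines that occur in the second
# env dump but not in the first. comm needs sorted input, so both files
# are sorted into temporary copies first.
env_only_in_second() {
    first_sorted=$(mktemp)
    second_sorted=$(mktemp)
    sort "$1" > "$first_sorted"
    sort "$2" > "$second_sorted"
    # -1 drops lines unique to the first file, -3 drops common lines.
    comm -13 "$first_sorted" "$second_sorted"
    rm -f "$first_sorted" "$second_sorted"
}

# Possible usage inside the container (not executed here):
#   env > /tmp/bash.env
#   Rscript -e 'system("env")' > /tmp/r.env
#   env_only_in_second /tmp/bash.env /tmp/r.env
```

Variables whose value differs between the two dumps also show up, since the comparison is on whole lines. Values that contain newlines would be split across lines and are not handled by this sketch.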

Also, when you launch r (littler) instead of R from bash, LD_LIBRARY_PATH is not set either:

root@5c5bbcfcebf2:/# r -i
system('env')
R_UNZIPCMD=/usr/bin/unzip
HOSTNAME=5c5bbcfcebf2
SHLVL=1
R_INSTALL_PKG=littler
HOME=/root
R_ENVIRON=
R_LIBS_SITE=
R_BROWSER=xdg-open
PAGER=/usr/bin/pager
R_VERSION=3.4.3
BUILD_DATE=
R_SYSTEM_ABI=linux,gcc,gxx,gfortran,?
R_PROFILE_USER=
TAR=/bin/tar
_=/usr/local/bin/r
R_LIBS_USER=/usr/local/lib/R/site-library
TERM=xterm
R_ARCH=
R_PAPERSIZE_USER=letter
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
R_BZIPCMD=/bin/bzip2
R_INCLUDE_DIR=/usr/local/lib/R/include
R_SESSION_TMPDIR=/tmp
R_OSTYPE=unix
LANG=en_US.UTF-8
R_CMD=/usr/local/lib/R/bin/Rcmd
R_DEFAULT_PACKAGES=NULL
R_PACKAGE_NAME=littler
R_GZIPCMD=/bin/gzip
LN_S=ln -s
SED=/bin/sed
R_PDFVIEWER=/usr/bin/xdg-open
R_PROFILE=
R_ENVIRON_USER=
R_TEXI2DVICMD=/usr/bin/texi2dvi
R_HOME=/usr/local/lib/R
R_PRINTCMD=/usr/bin/lpr
R_DOC_DIR=/usr/local/lib/R/doc
R_LIBS=/usr/local/lib/R/site-library:/usr/local/lib/R/library:/usr/lib/R/library
LC_ALL=en_US.UTF-8
PWD=/
R_SHARE_DIR=/usr/local/lib/R/share
R_ZIPCMD=/usr/bin/zip
R_PAPERSIZE=letter
R_PLATFORM=x86_64-pc-linux-gnu
MAKE=make
R_RD4PDF=times,inconsolata,hyper
EDITOR=vi

In the interactive session launched from bash using r -i, both variants work:

system('LD_LIBRARY_PATH= loffice --headless --convert-to pdf /tmp/foo.xlsx')
convert /tmp/foo.xlsx -> //foo.pdf using filter : calc_pdf_Export

system('loffice --headless --convert-to pdf /tmp/foo.xlsx')
convert /tmp/foo.xlsx -> //foo.pdf using filter : calc_pdf_Export
Overwriting: //foo.pdf

In your case, the issue is caused by the environment that the child process inherits from its parent R process.
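
The inheritance behaviour itself can be reproduced without R or LibreOffice. The following is a minimal sketch that uses a stand-in variable, DEMO_PATH (a hypothetical name chosen so that the demo does not touch the real LD_LIBRARY_PATH), to show that an exported variable reaches child processes and that a VAR= prefix blanks it for a single child only.

```shell
#!/bin/sh
# The parent exports a value, much like the R session above carries
# LD_LIBRARY_PATH in its environment. DEMO_PATH is a stand-in variable.
DEMO_PATH=/from/parent
export DEMO_PATH

# 1. A child process inherits the exported value.
inherited=$(sh -c 'printf %s "$DEMO_PATH"')

# 2. A VAR= prefix replaces the value for that one child only.
blanked=$(DEMO_PATH= sh -c 'printf %s "$DEMO_PATH"')

# 3. The parent's own value is unchanged afterwards.
echo "inherited=[$inherited] blanked=[$blanked] parent=[$DEMO_PATH]"
```

This is the same mechanism as the LD_LIBRARY_PATH= prefix in the system() call above: the assignment applies to the loffice process alone and leaves the R session's environment as it was.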