I have spent a good amount of time trying to work out exactly what is going wrong with the code I am using to convert PDF to DOCX (and DOC to DOCX) with LibreOffice.
I have tried the relevant commands both from the Windows Run dialog and from Python, and neither works.
I have LibreOffice v6.0.2 installed on Windows. I have been using variations of this code to try to convert some PDFs to DOCX (the specific PDF file is not really relevant):
import subprocess

lowriter = 'C://Program Files/LibreOffice/program/swriter.exe'
subprocess.run('{} --invisible --convert-to docx --outdir "{}" "{}"'
               .format(lowriter, 'dir', 'filepath.pdf'), shell=True)
I have tried this, again, both in the Windows Run dialog and through Python using the above code, with no luck. I have also tried without --outdir, in case I was writing that incorrectly, but I always get a return code of 1:
CompletedProcess(args='C://Program Files/LibreOffice/program/swriter.exe
--invisible --convert-to docx --outdir "{dir}"
{filepath.pdf}"', returncode=1)
The dir and filepath.pdf are placeholders I have put in.
I have a similar problem with the doc to docx conversion.
There are a number of problems here. You should first get the --convert-to call to work from the command line, as @CristiFati commented, and then implement it in Python.

Here is the code that works on my system. No // in the path, and quotes are needed. Also, the folder is LibreOffice 5 on my system.

Finally, it looks like converting from PDF to DOCX is not supported. LibreOffice Draw can open a PDF file and save as ODG format.
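The path and quoting points above can be illustrated with a minimal sketch for the DOC to DOCX case. It is not necessarily the exact code referred to above: instead of quoting the executable inside a shell string, it passes the arguments as a list, so the space in "Program Files" needs no extra quotes. The install path, the helper names build_convert_command and convert, and the use of --headless in place of --invisible are assumptions for illustration.

```python
import subprocess

# Assumed default install path on Windows; older installs may use a versioned
# folder such as "LibreOffice 5". Single forward slashes, no "//".
LOWRITER = "C:/Program Files/LibreOffice/program/swriter.exe"


def build_convert_command(src, outdir, target="docx", exe=LOWRITER):
    """Build the argument list for a --convert-to call (hypothetical helper)."""
    return [exe, "--headless", "--convert-to", target,
            "--outdir", str(outdir), str(src)]


def convert(src, outdir, target="docx", exe=LOWRITER):
    """Run the conversion and return (returncode, stderr text) for inspection."""
    result = subprocess.run(
        build_convert_command(src, outdir, target, exe),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return result.returncode, result.stderr.decode(errors="replace")
```

Calling convert("filepath.doc", "dir") with real paths should leave a .docx file in the output folder if the command also works from the command line. Printing the returned stderr text is usually the quickest way to see why a return code of 1 came back.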
EDIT:
Here is working code to convert from PDF. I upgraded to LO 6, so the version number ("LibreOffice 5") is no longer required in the path.
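A minimal sketch of what such a call might look like, assuming that the missing piece is forcing the Writer PDF import filter with --infilter=writer_pdf_import so the PDF is opened in Writer rather than Draw. The install path and the helper names build_pdf_convert_command and convert_pdf are hypothetical, and this may differ from the exact code that was tested.

```python
import subprocess

# Assumed LibreOffice 6 install path on Windows (no version number in the folder).
LOWRITER = "C:/Program Files/LibreOffice/program/swriter.exe"


def build_pdf_convert_command(pdf_path, outdir, exe=LOWRITER):
    """Build the argument list for a PDF -> DOCX conversion (hypothetical helper)."""
    return [
        exe,
        "--headless",
        # Assumption: import the PDF through Writer's PDF filter instead of Draw.
        "--infilter=writer_pdf_import",
        "--convert-to", "docx",
        "--outdir", str(outdir),
        str(pdf_path),
    ]


def convert_pdf(pdf_path, outdir, exe=LOWRITER):
    """Run the conversion and return the process return code."""
    return subprocess.run(build_pdf_convert_command(pdf_path, outdir, exe)).returncode
```

A return code of 0 from convert_pdf("filepath.pdf", "dir") would suggest the conversion ran. The layout of the resulting DOCX can differ noticeably from the PDF, so it is worth opening the output to check.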