Converting docx into pdf in java

2019-03-19 20:42发布

问题:

I am trying to convert a docx file which contains table and images into a pdf format file.

I have been searching everywhere but did not get proper solution, request to give proper and correct solution:

here what i have tried :

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import org.apache.poi.xwpf.converter.pdf.PdfConverter;
import org.apache.poi.xwpf.converter.pdf.PdfOptions;
import org.apache.poi.xwpf.usermodel.XWPFDocument;

public class TestCon {

    public static void main(String[] args) {
        TestCon cwoWord = new TestCon();
        System.out.println("Start");
        cwoWord.ConvertToPDF("D:\\Test.docx", "D:\\Test1.pdf");
    }

    public void ConvertToPDF(String docPath, String pdfPath) {
        try {
            InputStream doc = new FileInputStream(new File(docPath));
            XWPFDocument document = new XWPFDocument(doc);
            PdfOptions options = PdfOptions.create();
            OutputStream out = new FileOutputStream(new File(pdfPath));
            PdfConverter.getInstance().convert(document, out, options);
            System.out.println("Done");
        } catch (FileNotFoundException ex) {
            System.out.println(ex.getMessage());
        } catch (IOException ex) {

            System.out.println(ex.getMessage());
        }
    }

}

Exception:

Exception in thread "main" java.lang.IllegalAccessError: tried to access method org.apache.poi.util.POILogger.log(ILjava/lang/Object;)V from class org.apache.poi.openxml4j.opc.PackageRelationshipCollection
at org.apache.poi.openxml4j.opc.PackageRelationshipCollection.parseRelationshipsPart(PackageRelationshipCollection.java:313)
at org.apache.poi.openxml4j.opc.PackageRelationshipCollection.<init>(PackageRelationshipCollection.java:162)
at org.apache.poi.openxml4j.opc.PackageRelationshipCollection.<init>(PackageRelationshipCollection.java:130)
at org.apache.poi.openxml4j.opc.PackagePart.loadRelationships(PackagePart.java:559)
at org.apache.poi.openxml4j.opc.PackagePart.<init>(PackagePart.java:112)
at org.apache.poi.openxml4j.opc.PackagePart.<init>(PackagePart.java:83)
at org.apache.poi.openxml4j.opc.PackagePart.<init>(PackagePart.java:128)
at org.apache.poi.openxml4j.opc.ZipPackagePart.<init>(ZipPackagePart.java:78)
at org.apache.poi.openxml4j.opc.ZipPackage.getPartsImpl(ZipPackage.java:239)
at org.apache.poi.openxml4j.opc.OPCPackage.getParts(OPCPackage.java:665)
at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:274)
at org.apache.poi.util.PackageHelper.open(PackageHelper.java:39)
at org.apache.poi.xwpf.usermodel.XWPFDocument.<init>(XWPFDocument.java:121)
at test.TestCon.ConvertToPDF(TestCon.java:31)
at test.TestCon.main(TestCon.java:25)

My requirement is to create a java code to convert existing docx into pdf with proper format and alignment.

Please suggest.

Jars Used:

回答1:

You are missing some libraries.

I am able to run your code by adding the following libraries:

    Apache POI 3.15
    org.apache.poi.xwpf.converter.core-1.0.6.jar
    org.apache.poi.xwpf.converter.pdf-1.0.6.jar
    fr.opensagres.xdocreport.itext.extension-2.0.0.jar
    itext-2.1.7.jar
    ooxml-schemas-1.3.jar

I have successfully converted a 6 pages long Word document (.docx) with tables, images and various formatting.



回答2:

In addition to the VivekRatanSinha answer , i would i like to post full code and required jars for the people who need it in future.

Code:

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.poi.xwpf.converter.pdf.PdfConverter;
import org.apache.poi.xwpf.converter.pdf.PdfOptions;
import org.apache.poi.xwpf.usermodel.XWPFDocument;

public class WordConvertPDF {
    public static void main(String[] args) {
        WordConvertPDF cwoWord = new WordConvertPDF();
        cwoWord.ConvertToPDF("D:/Test.docx", "D:/Test.pdf");
    }

    public void ConvertToPDF(String docPath, String pdfPath) {
        try {
            InputStream doc = new FileInputStream(new File(docPath));
            XWPFDocument document = new XWPFDocument(doc);
            PdfOptions options = PdfOptions.create();
            OutputStream out = new FileOutputStream(new File(pdfPath));
            PdfConverter.getInstance().convert(document, out, options);
        } catch (IOException ex) {
            System.out.println(ex.getMessage());
        }
    }
}

and JARS:

Enjoy :)



回答3:

I use this code.

private byte[] toPdf(ByteArrayOutputStream docx) {
    InputStream isFromFirstData = new ByteArrayInputStream(docx.toByteArray());

    XWPFDocument document = new XWPFDocument(isFromFirstData);
    PdfOptions options = PdfOptions.create();

    //make new file in c:\temp\
    OutputStream out = new FileOutputStream(new File("c:\\tmp\\HelloWord.pdf"));
    PdfConverter.getInstance().convert(document, out, options);

    //return byte array for return in http request.
    ByteArrayOutputStream pdf = new ByteArrayOutputStream();
    PdfConverter.getInstance().convert(document, pdf, options);

    document.write(pdf);
    document.close();
    return pdf.toByteArray();
}


回答4:

You need to add these maven dependencies

  <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi</artifactId>
        <version>4.0.1</version>
    </dependency>

    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi-ooxml</artifactId>
        <version>4.0.1</version>
    </dependency>

    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi-scratchpad</artifactId>
        <version>4.0.1</version>
    </dependency>

    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi-ooxml-schemas</artifactId>
        <version>4.0.1</version>
    </dependency>

    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi-excelant</artifactId>
        <version>4.0.1</version>
    </dependency>

    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi-examples</artifactId>
        <version>4.0.1</version>
    </dependency>

    <dependency>
        <groupId>fr.opensagres.xdocreport</groupId>
        <artifactId>org.apache.poi.xwpf.converter.core</artifactId>
        <version>1.0.6</version>
    </dependency>

    <dependency>
        <groupId>fr.opensagres.xdocreport</groupId>
        <artifactId>org.apache.poi.xwpf.converter.pdf</artifactId>
        <version>1.0.6</version>
    </dependency>


回答5:

I have done lot of research and found Documents4j is best free api for convert docx to pdf. Alignment, font everthing documents4j doing good job.

Maven Dependencies:

<dependency>
    <groupId>com.documents4j</groupId>
    <artifactId>documents4j-local</artifactId>
    <version>1.0.3</version>
</dependency>
<dependency>
    <groupId>com.documents4j</groupId>
    <artifactId>documents4j-transformer-msoffice-word</artifactId>
    <version>1.0.3</version>
</dependency>

Use the below code for convert docx to pdf.

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

import com.documents4j.api.DocumentType;
import com.documents4j.api.IConverter;
import com.documents4j.job.LocalConverter;

public class Document4jApp {

    public static void main(String[] args) {

        File inputWord = new File("Tests.docx");
        File outputFile = new File("Test_out.pdf");
        try  {
            InputStream docxInputStream = new FileInputStream(inputWord);
            OutputStream outputStream = new FileOutputStream(outputFile);
            IConverter converter = LocalConverter.builder().build();
            converter.convert(docxInputStream).as(DocumentType.DOCX).to(outputStream).as(DocumentType.PDF).execute();
            outputStream.close();
            System.out.println("success");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}