I'm using Apache PDFBox (http://pdfbox.apache.org/) to create PDFs from an arbitrary number of files, including images and other PDFs. Now I need to add MS Office documents (Word, Excel and Outlook MSGs) to the PDF. The files can come from nearly any Office version, so there is no guarantee whether a given file is in a new format (e.g. docx) or an old one (e.g. doc).
Is there any way to do this with free tools only? My first idea is to read the content of every file with Apache POI (http://poi.apache.org/) and recreate it as a new PDF page, but this could become very costly, as this PDF creation runs on a server used by more than fifty people.
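Install OpenOffice on your server; it can convert "docx" and "doc" documents to ".pdf". The example below uses JODConverter to connect to an OpenOffice instance that is already running and listening on port 8100: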
package naveed.workingfiles;

import java.io.*;
import com.artofsolving.jodconverter.openoffice.connection.*;
import com.artofsolving.jodconverter.openoffice.converter.*;
import com.artofsolving.jodconverter.*;

public class DocToPdf {
    public static void main(String[] args) throws Exception {
        // Create the connection object for the OpenOffice instance on port 8100
        OpenOfficeConnection con = new SocketOpenOfficeConnection(8100);
        // Open the connection to the OpenOffice server
        con.connect();
        try {
            // Source document to convert
            File inFile = new File("sample.docx");
            // Final converted PDF file
            File outFile = new File("sample.pdf");
            // The converter infers the target format from the output file extension
            DocumentConverter converter = new OpenOfficeDocumentConverter(con);
            converter.convert(inFile, outFile);
        } finally {
            // Release the connection even if the conversion fails
            con.disconnect();
        }
    }
}
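The example above only works if OpenOffice is already running and accepting socket connections on port 8100. As a rough sketch of how that listener could be started from Java, the helper below builds the launch command. `OfficeListener` and `buildCommand` are hypothetical names, not part of JODConverter, and the sketch assumes the `soffice` executable is on the PATH. The single-dash flags follow the form commonly used with OpenOffice 3.x; newer releases may expect double dashes instead.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of JODConverter): builds the command line that
// starts OpenOffice headless and lets it accept socket connections on a port.
class OfficeListener {

    // officeBinary is an assumption: "soffice" if it is on the PATH, otherwise
    // the full path to the executable inside the OpenOffice program directory.
    static List<String> buildCommand(String officeBinary, int port) {
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        List<String> cmd = new ArrayList<>();
        cmd.add(officeBinary);
        cmd.add("-headless");
        // No shell quoting is needed here: ProcessBuilder passes each list
        // element to the process as a single argument.
        cmd.add("-accept=socket,host=127.0.0.1,port=" + port + ";urp;");
        cmd.add("-nofirststartwizard");
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = buildCommand("soffice", 8100);
        System.out.println(String.join(" ", cmd));
        // To actually launch the listener, something like:
        // Process office = new ProcessBuilder(cmd).inheritIO().start();
    }
}
```

Starting the listener once and keeping it running avoids paying the OpenOffice startup cost on every conversion, which matters on a server shared by many users.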