I'm trying to batch convert a bunch of assorted iWork files (Numbers, Pages, Keynote) to PDF on the command line.
I've been trying cups-filter but there's no MIME type filter for the iWork types. I then looked into using qlmanage to generate the preview image and use that, but this doesn't seem to work for multi file Keynote documents as they generate as HTML rather than PDF.
Any suggestions? I'd rather not resort to AppleScript.
Well... you need something that
- understand the iWork file formats,
- can render the documents to then create the PDF.
Unless you want to re-invent the iWork suite... Sounds simpler to just tell the iWork apps what you want from them.
You would do that via the Scripting Bridge
I would use Applescript, but perhaps you can use Ruby and Python with the Scripting Bridge to accomplish what you need
With Scripting Bridge, RubyCocoa and PyObjC scripts can do what AppleScript scripts can do: control scriptable applications and exchange data with them.
I haven't used the Scripting Bridge in a while, but I believe you can tell applications to print documents. And any application that can print in OS X can send it to PDF instead.
Here's what I ended up going with, since I really wanted to avoid, AppleScript.
When saving an iWork document there's a "Include Preview In Document" checkbox. Checking this creates a "QuickLook/Preview.pdf" inside the iWork document bundle (which is actually a zip file). Luckily I had this checked for most of the zip files, so it was simply a case of unzipping to NSTemporaryDirectory and grabbing that file.
For those that didn't I put together a script to run qlmanage to create the document preview. For some that creates the PDF, for others it creates an HTML file. You can then use http://code.google.com/p/wkhtmltopdf/ to convert this HTML to a PDF.
Here are a couple of commands to help those who want to get this working without much thought. It worked for me with a ppt file.
Make sure to get wkhtmltopdf from here.
qlmanage -p -o /tmp /path/of/file.ppt
wkhtmltopdf /tmp/file.ppt.qlpreview/Preview.html /output/to/file.pdf
You may have to fiddle with sizes if you want the original pages to stay consistent, for the ppt I was using the following parameters did the job:
wkhtmltopdf --page-width 200 --page-height 145 Preview.html file.pdf
Edit: I have written a Python script to do a batch conversion. Hopefully people can contribute to make it more robust:
https://github.com/matthewfitch23/DocToPdf
I created an .applescript
script that converts all .pages
files within a folder to .docx
. .pdf
support can be easily added. In pages2docx.applescript
you just need to replace Microsoft Word
with PDF
.