I often get a PDF from our designer (built in Adobe InDesign) which is supposed to be sent out to thousands of people.
I've got the list with all the people, and it's easy to do a mail merge in OpenOffice.org. However, OpenOffice.org doesn't support the advanced PDF. I just want to output some text onto each page and print it out.
Here's how I do it now: print out 6,000 copies of the PDF, then feed all of them into the printer again and print the name, address and other information on top. But that's expensive.
Sadly, I can't convert the PDF to an image and use that in OpenOffice.org, because that grinds the computer to a halt. It also takes an extremely long time to send the job to the printer.
So, is there an easy way to do this mail merge (preferably in Python) without paying for third-party closed-source solutions?
Probably the best way would be to generate another PDF with the missing text, and overlay one PDF over the other. A quick Google found this link showing how to do it in Acrobat, and I'm sure there are other methods as well.
http://forums.macrumors.com/showthread.php?t=508226
Now I've made an account. I fixed it by using the ingenious pdftk.
In my quest I had totally overlooked the "background" and "overlay" features. My solution was this:
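The original command was not preserved here, so the following is only a minimal sketch of how pdftk's background operation could be driven from Python. It assumes pdftk is installed and on the PATH; the function names and the file names fancy.pdf and final.pdf are hypothetical.

```python
import subprocess


def pdftk_background_cmd(text_pdf, background_pdf, output_pdf):
    """Build the pdftk command that puts background_pdf behind the pages of text_pdf."""
    # "background" applies the first page of background_pdf to every page of
    # text_pdf; pdftk also offers "multibackground" to pair pages one to one.
    return ["pdftk", text_pdf, "background", background_pdf, "output", output_pdf]


def apply_background(text_pdf, background_pdf, output_pdf):
    """Run pdftk and raise CalledProcessError if it reports a failure."""
    cmd = pdftk_background_cmd(text_pdf, background_pdf, output_pdf)
    subprocess.run(cmd, check=True)


# Example call (hypothetical file names):
# apply_background("names.pdf", "fancy.pdf", "final.pdf")
```

Here names.pdf holds one page of text per recipient and the designer's PDF is used as the background for each of those pages.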
You can easily create names.pdf with Python's reportlab or a similar PDF-creation script. It's best to use code for that: creating 6k pages took several hours in LibreOffice/OpenOffice, while it took just a few seconds using Python.

Someone asked for specifics. I didn't want to sully my top answer with them, because you can do it however you like (and just knowing pdftk is up to it should give people the idea).
But here are some scripts I used ages ago:
csv_to_pdf.py
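The script itself did not survive in this copy, so what follows is a rough sketch of what such a csv_to_pdf.py might look like with reportlab, not the original code. The column names (name, address, city), the A4 page size and the position of the text block are all assumptions to adapt to your own list and layout.

```python
import csv


def address_lines(row, fields=("name", "address", "city")):
    """Return the non-empty values of the chosen columns, in order."""
    values = ((row.get(field) or "").strip() for field in fields)
    return [value for value in values if value]


def csv_to_pdf(csv_path, pdf_path, fields=("name", "address", "city")):
    """Write one page per CSV row, containing only that row's text."""
    # Imported here so address_lines() can be used without reportlab installed.
    from reportlab.lib.pagesizes import A4
    from reportlab.lib.units import mm
    from reportlab.pdfgen import canvas

    _, page_height = A4
    pdf = canvas.Canvas(pdf_path, pagesize=A4)
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            pdf.setFont("Helvetica", 11)  # font state resets on each new page
            y = page_height - 50 * mm  # assumed top of the address block
            for line in address_lines(row, fields):
                pdf.drawString(25 * mm, y, line)
                y -= 5 * mm
            pdf.showPage()  # finish this recipient's page
    pdf.save()


# Example call (hypothetical file names):
# csv_to_pdf("people.csv", "names.pdf")
```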
Once you've run this, you have a file with thousands of pages, each with only a name on it. This is when you can put the fancy PDF behind all of them using pdftk's background feature.
For a no-mess, no-fuss solution, use iText to simply add the text to the pdf. For example, you can do the following to add text to a pdf document once loaded:
From there on, save it as a different file, and print it.
However, I've found that form fields are the way to go with pdf document generation from templates.
If you have a template with form fields (added with Adobe Acrobat), you have one of two choices :
A sample FDF file looks like this (stolen from Planet PDF) :
Because the FDF format is simple and the files are small, this is the preferred approach, and it should work well in any language.
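To illustrate how little is needed, here is a small Python sketch that builds a minimal FDF document from a dict of field names and values. It is not the Planet PDF sample; the function names are hypothetical, and it assumes plain ASCII values (other text would need the PDF string encodings handled properly).

```python
def fdf_escape(text):
    """Escape backslashes and parentheses for a PDF literal string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def build_fdf(fields):
    """Return minimal FDF text for a dict mapping field names to values."""
    entries = "\n".join(
        f"<< /T ({fdf_escape(name)}) /V ({fdf_escape(value)}) >>"
        for name, value in fields.items()
    )
    return (
        "%FDF-1.2\n"
        "1 0 obj\n"
        f"<< /FDF << /Fields [\n{entries}\n] >> >>\n"
        "endobj\n"
        "trailer\n"
        "<< /Root 1 0 R >>\n"
        "%%EOF\n"
    )


# Example call (hypothetical field names):
# print(build_fdf({"name": "Ada Lovelace", "city": "London"}))
```

Each /T entry must match the name of a form field in the template, and /V carries the value to put into it.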
As for filling the fields programmatically, you can use iText in the following way :
You could probably look at a PDF library like iText. If you have some programming knowledge and a bit of time, you could write some code that adds the contact information to the PDFs.
One easy way would be to create a fillable PDF form from the original document in Acrobat and do a mail merge with the form and a CSV.
PDF mail merges are relatively easy to do with Python and pdftk. fdfgen (pip install fdfgen) is a Python library that creates an FDF from a Python array, so you can save the Excel grid as a CSV, make sure the CSV headers match the names of the PDF form fields you want to fill from each column, and then pass each row's FDF to pdftk to fill the form.

I've encountered this problem often enough to write my own free solution, PdfZero. PdfZero has a mail merge feature that merges spreadsheets with PDF forms. You will still need to create a PDF form, but you can upload the form and CSV to PdfZero, select which form fields should be filled from which columns, create a naming convention for each filled PDF using the CSV data if needed, and batch generate the filled PDFs.
DISCLAIMER: I wrote PdfZero
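For the fdfgen and pdftk route described above, a minimal sketch could look like the following. It assumes pdftk is on the PATH, fdfgen is installed, and the CSV headers match the form field names; the function names and output file patterns are hypothetical.

```python
import csv
import subprocess


def rows_to_fields(csv_path):
    """Yield one list of (field_name, value) pairs per CSV row."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            yield list(row.items())


def fill_form_cmd(form_pdf, fdf_path, output_pdf):
    """Build the pdftk command that fills form_pdf from an FDF file."""
    # "flatten" merges the filled fields into the page so they can't be edited.
    return ["pdftk", form_pdf, "fill_form", fdf_path, "output", output_pdf, "flatten"]


def merge(csv_path, form_pdf, out_pattern="filled_{:05d}.pdf"):
    """Create one filled, flattened PDF per CSV row."""
    from fdfgen import forge_fdf  # third-party: pip install fdfgen

    for index, fields in enumerate(rows_to_fields(csv_path), start=1):
        fdf_path = f"row_{index:05d}.fdf"
        with open(fdf_path, "wb") as fdf_file:
            fdf_file.write(forge_fdf("", fields, [], [], []))
        cmd = fill_form_cmd(form_pdf, fdf_path, out_pattern.format(index))
        subprocess.run(cmd, check=True)


# Example call (hypothetical file names):
# merge("people.csv", "form.pdf")
```

This writes one output file per recipient; the results can be concatenated afterwards with pdftk if a single print job is preferred.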