Using iTextSharp, how can I merge multiple PDFs into one PDF without losing the form fields and their properties in each individual PDF?
(I would prefer an example using streams from a database, but the file system is OK as well.)
I found this code that works but it flattens out my PDFs so I can't use it.
UPDATE
@Mark Storer - This is the code I am using now, based on your feedback (see below), but it gives me a corrupt document after the save. I tested each part of the code separately, and the failure seems to be in the MergePdfForms function shown below. I obviously don't want to use the renameFields part of your example because I need the field names to remain "as is".
Public Sub MergePdfForms(ByVal pdfFiles As ArrayList, ByVal outputPath As String)
    Dim ms As New IO.MemoryStream()
    Dim copier As New PdfCopyFields(ms)
    For Each pfile As String In pdfFiles
        Dim reader As New PdfReader(pfile)
        copier.AddDocument(reader)
    Next
    SaveMemoryStream(ms, outputPath)
    copier.Close()
End Sub

Public Sub SaveMemoryStream(ms As IO.MemoryStream, FileName As String)
    Dim outStream As IO.FileStream = IO.File.OpenWrite(FileName)
    ms.WriteTo(outStream)
    outStream.Flush()
    outStream.Close()
End Sub
Fields in PDFs have an Unusual Property: All fields with the same name are the same field. They share a value. This is handy when the form refers to the same person and you have a nice naming scheme across forms. It's Not Handy when you want to put 20 instances of a single form into a single PDF.
This makes merging multiple forms challenging, to say the least. The most common option (thanks to iText) is to flatten the forms prior to merging them, at which point you're no longer merging forms, and the problem Goes Away.
The other option is to rename your fields prior to merging them. This can make data extraction difficult later, can break scripts, and is generally a PITA. That's why flattening is so much more popular.
There's a class in iText called PdfCopyFields, and it will correctly copy fields from one document to another... it will also merge fields with the same name correctly, such that they really share a single value and Acrobat/Reader doesn't have to do a bunch of extra work on the file to get it that way before displaying it to a user.
However, PdfCopyFields will not rename fields for you. To do that, you need to get the AcroFields object from the PdfReader in question, and call renameField(String, String) on Each And Every Field prior to merging the documents with PdfCopyFields.
All this is for "AcroForm"-based PDF forms. If you're dealing with XFA forms (forms from LiveCycle Designer), all bets are off. You have to muck with the XML, A Lot.
And heaven help you if you have to combine forms from both.
So ass-u-me-ing that you're working with AcroForm fields, the code might look something like this (forgive my Java):
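As a minimal, library-free sketch of just the renaming idea, the plain-Java model below gives every input document its own prefix so that identically named fields stop colliding. The class FieldRenamePlan, its methods, and the form0_ style prefix are assumptions made purely for illustration and are not iText API; in real code each old/new pair would presumably be handed to renameField(String, String) on the reader's AcroFields, as described above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: lists of strings stand in for the field names of each PDF.
class FieldRenamePlan {

    // Build one old-name -> new-name map per input document.
    // Document i gets the prefix "form<i>_" (a hypothetical naming scheme).
    static List<Map<String, String>> plan(List<List<String>> docs) {
        List<Map<String, String>> result = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            String prepend = "form" + i + "_";
            Map<String, String> renames = new LinkedHashMap<>();
            for (String oldName : docs.get(i)) {
                renames.put(oldName, prepend + oldName);
            }
            result.add(renames);
        }
        return result;
    }

    // Distinct field names left after a merge. Same-named fields are
    // counted once, mirroring the "same name, same field" rule.
    static Set<String> mergedNames(List<List<String>> docs, boolean rename) {
        List<Map<String, String>> renames = plan(docs);
        Set<String> merged = new LinkedHashSet<>();
        for (int i = 0; i < docs.size(); i++) {
            for (String name : docs.get(i)) {
                merged.add(rename ? renames.get(i).get(name) : name);
            }
        }
        return merged;
    }
}
```

For two copies of a form that each contain the fields name and date, mergedNames reports two distinct fields without renaming (the copies share values) and four with renaming. When applying such a plan to a live field map, it is probably safest to loop over a copy of the field names rather than the map itself, since renaming changes the map while it is being walked.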
Ideally, renameFields would also create a generic field object named prepend's-value and make all the other fields in the document its children. This would make Acrobat/Reader's life easier and avoid an apparently unnecessary "save changes?" request when closing the resulting PDF from Acrobat.
Yes, that's why Acrobat will sometimes ask you to save changes when You Didn't Do Anything! Acrobat did something behind the scenes.
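The parent-field idea rests on how AcroForm field names are put together: a field's fully qualified name is its own partial name preceded by the partial names of its ancestors, joined with periods. The plain-Java sketch below models only that naming rule; the FormField class and the form0 parent name are hypothetical and not part of iText.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of a node in an AcroForm field tree; not an iText class.
class FormField {
    final String partialName; // the field's own name (its /T entry)
    final FormField parent;   // null for a top-level field

    FormField(String partialName, FormField parent) {
        this.partialName = partialName;
        this.parent = parent;
    }

    // Walk up to the root, collecting partial names root-first,
    // then join them with periods.
    String fullyQualifiedName() {
        Deque<String> parts = new ArrayDeque<>();
        for (FormField f = this; f != null; f = f.parent) {
            parts.addFirst(f.partialName);
        }
        return String.join(".", parts);
    }
}
```

A field called name placed under a parent called form0 gets the fully qualified name form0.name, while the same field under a second parent form1 becomes form1.name, so the two copies no longer share a value.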
You can also use this code. It will merge all the PDF files without losing the field values.