iText How to create multi-page document from a fil

Posted 2019-06-14 08:58

Question:

I am trying to create a multi-page PDF document in iText with filled forms, one for each person. I have looked up examples of how to do this on the internet and used those examples in my solution.

The PDF template is one created with Adobe Acrobat Pro.

I have been able to successfully fill in and return a single-page PDF document from my template using iText, but the multi-document process doesn't seem to work right.

This is my program that demonstrates what I am trying to do:

import com.itextpdf.text.pdf.AcroFields;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.PdfStamper;
import com.itextpdf.text.Document;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.pdf.PdfSmartCopy;

import java.util.Date;
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.text.NumberFormat;
import java.io.IOException;
import java.io.ByteArrayOutputStream;
import java.io.FileOutputStream;

public class ITextTest
{
    public static final String TEMPLATE =
    "C:\\RAD7_5\\iTextTest\\iTextTest\\input\\LS213_1.pdf";

    public static void main(String[] args)
    {
        ITextTest iTextTest = new ITextTest();
        iTextTest.doItextTest();
    }

    public void doItextTest()
    {
        try
        {
            PdfReader pdfReader;
            PdfStamper pdfStamper;
            ByteArrayOutputStream baos;

            Document document = new Document();
            PdfSmartCopy pdfSmartCopy = new PdfSmartCopy(document,
                    new FileOutputStream("C:\\RAD7_5\\iTextTest\\iTextTest\\output\\LS213_1MultiTest.pdf"));

            DateFormat dateFormat = new SimpleDateFormat("MM/dd/yyyy");
            Date currDate = new Date();
            NumberFormat numberFormat = NumberFormat.getCurrencyInstance();
            double amount = 4127.29d;

            document.open();

            for(int i = 1; i <= 5; i++)
            {
                pdfReader = new PdfReader(TEMPLATE);
                baos = new ByteArrayOutputStream();
                pdfStamper = new PdfStamper(pdfReader, baos);

                AcroFields acroFields = pdfStamper.getAcroFields();

                //key statement 1
                acroFields.setGenerateAppearances(true);

                //acroFields.setExtraMargin(5, 5);
                acroFields.setField("Name and Address", "John Doe\n123 Anywhere St.\nAnytown, USA 12345");
                acroFields.setField("Case Number", "123456789");
                acroFields.setField("Employer", "Employer Co., Inc.\n456 Anyhow ln.\nAnyville, USA 67890");
                acroFields.setField("Date", dateFormat.format(currDate));
                acroFields.setField("Name", "John Doe");
                acroFields.setField("restitution check No", "65432" + i);
                acroFields.setField("in the sum of", numberFormat.format(amount));

                //key statement 2
                pdfStamper.setFormFlattening(false);

                pdfStamper.close();
                pdfReader.close();

                pdfReader = new PdfReader(baos.toByteArray());
                pdfSmartCopy.addPage(pdfSmartCopy.getImportedPage(pdfReader, 1));
                pdfSmartCopy.freeReader(pdfReader);
                pdfReader.close();
            }

            document.close();
        }
        catch(DocumentException dex)
        {
            dex.printStackTrace();
            System.exit(1);
        }
        catch(IOException ex)
        {
            ex.printStackTrace();
            System.exit(1);
        }
    }
}

In the code above, you can see two key statements that affect the result of the filled template:

acroFields.setGenerateAppearances(true);
pdfStamper.setFormFlattening(false);

With the above two statements, if I set the first one to true and the second one to false, it fills the fields, but they are misaligned with the labels. Also, after the first template copy, each copy after that has some unfilled fields for some reason.

If I set them both to true:

acroFields.setGenerateAppearances(true);
pdfStamper.setFormFlattening(true);

it sets all of the fields in all of the template copies. This is the most successful result for me so far, but the filled fields are still misaligned with the labels and setting form flattening to true no longer allows a user to correct a field manually afterwards if the data in the application is wrong.

If I set the first one to false and the second one to true:

acroFields.setGenerateAppearances(false);
pdfStamper.setFormFlattening(true);

all of the fields are completely blank (worst result).

If I set them both to false:

acroFields.setGenerateAppearances(false);
pdfStamper.setFormFlattening(false);

then the fields are filled and correctly aligned with the labels, but they appear blank for some reason until you click on them. The problem of some fields being wiped out on subsequent pages also occurs, as in the true/false scenario (the first scenario mentioned).

I am wondering if it is possible to get this to work without misaligned field values, without flattening the fields, and without lost fields on subsequent pages.

I know you can adjust margins afterwards using

acroFields.setExtraMargin(extraMarginLeft, extraMarginTop)

but using

acroFields.setGenerateAppearances(false)

works perfectly for a single form without having to adjust margins and I want it to work for a multi-page document as well.

Also, using

acroFields.setGenerateAppearances(true)

causes the text to move and be displaced a little bit in the textbox when you click on it. This happens for both single-page documents and multi-page documents. There seems to be a bug in either iText or PDF templates created with Adobe Pro when setting fields with setGenerateAppearances(true).

I am currently using iText 5.5.8.

Any help with this issue would be greatly appreciated. Thanks for taking the time to read this.

Answer 1:

It's a very long question, and I guess that makes it difficult for people to answer it. I don't have a conclusive answer either, because I can't reproduce the problem. However, I can clarify a couple of things.

1. In PDF, one field can correspond with more than one widget annotation. One field can have only one value.

Suppose that you have a PDF form with a field named "name". It is possible for that field to appear on different places in the document. For instance: if a form has multiple pages, the field "name" could correspond with a widget annotation on every page (e.g. in the header).

The field "name" can only have a single value, for instance: "Charles Carrington." If the field corresponds with different widget annotations, then each of these visualizations should show the same name.

It is impossible for the field "name" to have the value "Charles Carrington" on one page, and "Bruno Lowagie" on another page.

Why is this important for you?

You experimented with setFormFlattening().

If you use this method with the value false, then you are doing a couple of things wrong:

  1. You don't tell PdfSmartCopy that you are merging forms: How to merge forms from different files into one PDF?
  2. You are violating the rule "1 field = 1 value" because you first fill out fields in different forms with different values (e.g. form 1: name = Charles Carrington; form 2: name = Bruno Lowagie), then you merge these forms into a form where you suddenly expect one field to have different values (e.g. merged form: name = Charles Carrington on page 1; name = Bruno Lowagie on page 2). This is in violation of ISO-32000-1.

You can avoid this problem by:

  • renaming the fields if it's important to you that the interactivity is preserved.
  • flattening the fields: in this case, all fields are removed. Instead you add the appearance of the values. All interactivity is lost.
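To make the renaming option concrete, here is a minimal plain-Java sketch of the "1 field = 1 value" rule. It is a model, not iText code: the merged form is represented as a simple name-to-value map, the `_1`, `_2` suffix scheme is a hypothetical naming convention, and "the last value wins" is only an assumption of the model, not a statement about which value a real merge would keep.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class FieldRenameSketch {

    // Hypothetical naming scheme: append the copy index to the template's field name.
    static String uniqueName(String fieldName, int copyIndex) {
        return fieldName + "_" + copyIndex;
    }

    // Models a merged form as name -> value. A name that occurs twice can hold only one value.
    static Map<String, String> merge(String fieldName, String[] values, boolean rename) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (int i = 0; i < values.length; i++) {
            String name = rename ? uniqueName(fieldName, i + 1) : fieldName;
            // With an unchanged name, the later value replaces the earlier one (model assumption).
            merged.put(name, values[i]);
        }
        return merged;
    }

    public static void main(String[] args) {
        String[] names = {"Charles Carrington", "Bruno Lowagie"};
        System.out.println(merge("name", names, false)); // one entry survives
        System.out.println(merge("name", names, true));  // both entries survive
    }
}
```

In iText 5 the renaming itself would presumably be done with `AcroFields.renameField(oldName, newName)` on each stamped copy before it is closed, so that every copy enters the merge with its own field names.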

2. In PDF, a field has a value, but it can also have one or more appearances.

Suppose that you have a PDF form with a field named "birthdate". The value of that field is "1970-06-10" (and that's also how it's stored in a database). However, when you fill out the field in a PDF document, you want it to show as "June 10, 1970".

This is possible. The value of the /V key of the field dictionary will be the PDF string 1970-06-10. However, the /AP key (the appearance dictionary of the widget annotation) will define an appearance that shows "June 10, 1970".

It is even possible to have a field with a single value (1970-06-10) corresponding with different widget annotations that have a different appearance: "June 10, 1970", "10 juin 1970", "10 juni 1970", and so on.
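The difference between a value and its appearances can be illustrated outside of PDF with a small plain-Java sketch. This is only an analogy using `java.time`, not iText or PDF code; the class and method names are made up for the illustration, and the display patterns are assumptions chosen to reproduce the three renderings mentioned above.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

final class ValueVersusAppearance {

    // Renders the stored value with a display pattern and locale.
    // The stored string itself is never modified.
    static String appearance(String storedValue, String pattern, Locale locale) {
        LocalDate date = LocalDate.parse(storedValue); // ISO-8601 input such as 1970-06-10
        return date.format(DateTimeFormatter.ofPattern(pattern, locale));
    }

    public static void main(String[] args) {
        String value = "1970-06-10"; // the single stored value
        System.out.println(appearance(value, "MMMM d, yyyy", Locale.ENGLISH));
        System.out.println(appearance(value, "d MMMM yyyy", Locale.FRENCH));
        System.out.println(appearance(value, "d MMMM yyyy", Locale.forLanguageTag("nl")));
    }
}
```

One stored string, three renderings: that is the relationship between a field's value and the appearances of its widget annotations.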

Why is this important for you?

You have been experimenting with setGenerateAppearances().

When you use this method with the value false, you instruct iText to omit the /AP entry: no appearance is created. When you flatten the form, the fields are empty. When you don't flatten the form, the PDF viewer will create the appearance. Since Adobe and other vendors haven't always been consistent in the way they render PDF, it is very hard to predict what that appearance will look like. One viewer will show the value at one position within the rectangle defined for the widget annotation; another viewer will show it with a different offset.

When you use this method with the value true, you instruct iText to create the appearance in a consistent way.

However: if you don't flatten the form, you can have the effect described in my answer to the question Why does iText enter a cross symbol when CheckType style is check mark? In this example, you see that the appearance of a check box depends on whether or not the field is highlighted. The same goes for text fields: the appearance can differ depending on whether or not the field is selected. Also: when you change the value, you get the appearance as created by the viewer. E.g. when you click on "June 10, 1970" because you want to change it, you will suddenly see "1970-06-10" because that's the value that is stored for the field and that appearance is generated by the viewer.

If you flatten the form, then iText creates the appearance and it removes all interactivity. In this case, the viewer doesn't create any appearance: there are no more fields in the form.
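The four combinations of the two settings, as explained above, can be summarized in a small lookup. This is plain Java written for this answer, not part of the iText API; the boolean parameters simply mirror the arguments passed to `setGenerateAppearances()` and `setFormFlattening()`, and the returned strings paraphrase the explanation rather than anything iText reports.

```java
final class AppearanceOutcome {

    // Summarizes who draws the field values for each combination of the two settings.
    static String describe(boolean generateAppearances, boolean flatten) {
        if (flatten) {
            return generateAppearances
                    ? "iText draws the values; fields are removed, no interactivity"
                    : "no appearance is drawn; the flattened fields look empty";
        }
        return generateAppearances
                ? "iText draws the values; the viewer redraws a field when it is edited"
                : "viewer draws the values; the position depends on the viewer";
    }

    public static void main(String[] args) {
        boolean[] options = {true, false};
        for (boolean generate : options) {
            for (boolean flatten : options) {
                System.out.println("generate=" + generate + ", flatten=" + flatten
                        + ": " + describe(generate, flatten));
            }
        }
    }
}
```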

3. iText always creates the flattened appearance in the same way.

This is the mystery that remains after reading your question. You claim that the appearance when you fill and flatten a single form is different from the appearance when you fill and flatten many forms and then concatenate them. I can't reproduce that problem. The only answer I can give you to that question is: It works for me. (If you don't believe me, then please watch this tutorial.)

Please adapt your example based on the information given in 1. and 2., then post a new, shorter question if the problem persists.