How to read PDF form data using iTextSharp?

Posted 2019-01-07 22:00

Question:

I am trying to find out whether it is possible to read PDF form data (values that were filled in and saved with the form) using iTextSharp. How can I do this?

Answer 1:

You would have to find out the field names in the PDF form. Get the fields and then read their value.

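string pdfTemplate = "my.pdf";
PdfReader pdfReader = new PdfReader(pdfTemplate);
// AcroFields exposes the form; its Fields property is a dictionary, not an AcroFields object.
AcroFields fields = pdfReader.AcroFields;
// Returns the field's value as a string.
string val = fields.GetField("fieldname");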

In the code above, "fieldname" is the name of the PDF form field, and the GetField method returns that field's value as a string. Here is an article with example code that you could probably use. It shows how you can both read and write form fields using iTextSharp.



Answer 2:

Maybe the iTextSharp library has changed recently but I wasn't able to get the accepted answer to work. Here is my solution:

var pdf_filename = "pdf2read.pdf";
using (var reader = new PdfReader(pdf_filename))
{
    var fields = reader.AcroFields.Fields;

    foreach (var key in fields.Keys)
    {
        var value = reader.AcroFields.GetField(key);
        Console.WriteLine(key + " : " + value);
    }
}

The difference is subtle: reader.AcroFields.Fields returns an IDictionary of the form fields rather than an AcroFields object, so GetField has to be called on reader.AcroFields itself.
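As a minimal sketch of the same idea, the loop can be wrapped in a small helper that returns every field name and value as a dictionary. This assumes iTextSharp 5.x is referenced by the project; the class name PdfFormDump and the method ReadAllFields are hypothetical names chosen for illustration, not part of the library.

```csharp
using System;
using System.Collections.Generic;
using iTextSharp.text.pdf; // assumes the iTextSharp 5.x assembly is referenced

static class PdfFormDump
{
    // Hypothetical helper (not part of iTextSharp): copies each form field
    // name and its string value into a dictionary.
    public static Dictionary<string, string> ReadAllFields(PdfReader reader)
    {
        var result = new Dictionary<string, string>();
        AcroFields form = reader.AcroFields;

        // Fields is a dictionary keyed by field name; values are read with GetField.
        foreach (string name in form.Fields.Keys)
        {
            result[name] = form.GetField(name);
        }
        return result;
    }

    static void Main(string[] args)
    {
        // Path from the command line, falling back to the file name used in the answer.
        string path = args.Length > 0 ? args[0] : "pdf2read.pdf";
        using (var reader = new PdfReader(path))
        {
            foreach (var pair in ReadAllFields(reader))
            {
                Console.WriteLine(pair.Key + " : " + pair.Value);
            }
        }
    }
}
```

Returning a plain dictionary keeps the rest of the program independent of the iTextSharp types, which may make it easier to log or serialize the values.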



Answer 3:

If you are using PowerShell, the discovery code for fields is:

    Add-Type -Path C:\Users\Micah\Desktop\PDF_Test\itextsharp.dll
    $MyPDF = "C:\Users\Micah\Desktop\PDF_Test\something_important.pdf"
    $PDFDoc = New-Object iTextSharp.text.pdf.PdfReader -ArgumentList $MyPDF
    $PDFDoc.AcroFields.Fields

That code will give you the names of all the fields on the PDF Document, "something_important.pdf".

This is how you access each field once you know the name of the field:

    $PDFDoc.AcroFields.GetField("Name of the field here")


Answer 4:

This worked for me! Note the last two parameters when constructing the stamper: '\0' (keep the original PDF version) and true (append mode).

            string TempFilename = Path.GetTempFileName();

            PdfReader pdfReader = new PdfReader(FileName);
            //PdfStamper stamper = new PdfStamper(pdfReader, new FileStream(TempFilename, FileMode.Create));
            PdfStamper stamper = new PdfStamper(pdfReader, new FileStream(TempFilename, FileMode.Create), '\0', true);

            AcroFields fields = stamper.AcroFields;
            AcroFields pdfFormFields = pdfReader.AcroFields;

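            // GetXMLNode and XMLFile are the answerer's own helper and input (not shown); they are not part of iTextSharp.
            foreach (KeyValuePair<string, AcroFields.Item> kvp in fields.Fields)
            {
                string FieldValue = GetXMLNode(XMLFile, kvp.Key);
                // Only overwrite fields for which a value was found.
                if (FieldValue != "")
                {
                    fields.SetField(kvp.Key, FieldValue);
                }
            }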

            stamper.FormFlattening = false;
            stamper.Close();
            pdfReader.Close();


Answer 5:

The PDF name is "report.pdf".

The data field to be read into TextBox1 is "TextField25" in the PDF.

        Dim pdf As String = "report.pdf"
        Dim reader As New PdfReader(pdf)
        Dim fields As AcroFields = reader.AcroFields
        TextBox1.Text = fields.GetField("TextField25")

Important note: this works only if the PDF was not flattened when it was created with iTextSharp, meaning the form fields must still be editable.

That is, the PDF must have been saved with:

       pdfStamper.FormFlattening = False

This is very simple, and it works like a charm. :)
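Since reading fails silently on a flattened form, it can help to check for fields before asking for a value. The following is a rough sketch in C# (not VB.NET), assuming iTextSharp 5.x; the class name PdfFormCheck and the method HasFormFields are hypothetical names used only for illustration. A count of zero does not distinguish a flattened form from a PDF that never had a form.

```csharp
using System;
using iTextSharp.text.pdf; // assumes the iTextSharp 5.x assembly is referenced

static class PdfFormCheck
{
    // Hypothetical helper (not part of iTextSharp): true when the document
    // still exposes at least one form field.
    public static bool HasFormFields(PdfReader reader)
    {
        return reader.AcroFields.Fields.Count > 0;
    }

    static void Main(string[] args)
    {
        // Path from the command line, falling back to the file name used in the answer.
        string path = args.Length > 0 ? args[0] : "report.pdf";
        using (var reader = new PdfReader(path))
        {
            if (!HasFormFields(reader))
            {
                Console.WriteLine("No fillable fields found; the form may have been flattened.");
                return;
            }
            // "TextField25" is the field name from the answer.
            Console.WriteLine(reader.AcroFields.GetField("TextField25"));
        }
    }
}
```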



Tags: c# forms pdf itext