How to create and apply redactions?

Posted 2019-01-20 05:04

Question:

Is there any way to implement PDF redaction using iText? Working with the Acrobat SDK API I found that redactions also just seem to be annotations with the subtype "Redact". So I was wondering if it's possible to create those in iTextSharp as well?

With the Acrobat SDK the code would simply look like this:

AcroPDAnnot annot = page.AddNewAnnot(-1, "Redact", rect) as AcroPDAnnot;

(I haven't been able to apply them, though, as annot.Perform(avDoc) does not seem to work. Ideas?)

In iTextSharp I can create simple text annotations like this:

PdfAnnotation annotation = PdfAnnotation.CreateText(stamper.Writer, rect, "Title", "Content", false, null);

The only other option I have found so far is to create black rectangles as explained here, but that doesn't remove the text (it can still be selected). I want to create redaction annotations and eventually apply the redaction.

// Update:

As I finally got around to creating a working example, I wanted to share it here. It does not apply the redactions in the end, but it creates valid redaction annotations that are shown properly in Acrobat and can then be applied manually.

using (Stream stream = new FileStream(fileName, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
{
    PdfReader pdfReader = new PdfReader(stream);
    // Create a stamper; FileMode.Create truncates an existing output file,
    // whereas OpenOrCreate could leave stale bytes behind the new content
    using (PdfStamper stamper = new PdfStamper(pdfReader, new FileStream(newFileName, FileMode.Create)))
    {
        // Add the annotations
        int page = 1;
        iTextSharp.text.Rectangle rect = new iTextSharp.text.Rectangle(500, 50, 200, 300);

        PdfAnnotation annotation = new PdfAnnotation(stamper.Writer, rect);
        annotation.Put(PdfName.SUBTYPE, new PdfName("Redact"));

        annotation.Title = "My Author"; // Title = author
        // Redaction "Subject". When created in Acrobat, this is always set to "Redact".
        // Subj is a text string in the PDF specification, hence PdfString rather than PdfName.
        annotation.Put(new PdfName("Subj"), new PdfString("Redact"));

        float[] fillColor = { 0, 0, 0 }; // Black
        annotation.Put(new PdfName("IC"), new PdfArray(fillColor)); // Interior color

        float[] fillColorRed = { 1, 0, 0 }; // Red
        annotation.Put(new PdfName("OC"), new PdfArray(fillColorRed)); // Outline color

        stamper.AddAnnotation(annotation, page);
    }
    pdfReader.Close();
}

Reply 1:

Answer 1: Creating redaction annotations

iText is a toolbox that gives you the power to create any object you want. You are using a convenience method to create a Text annotation. That's scratching the surface.

You can use iText to create any type of annotation you want, because the PdfAnnotation class extends the PdfDictionary class.

This is explained in chapter 7 of my book "iText in Action - Second edition". GenericAnnotations is the example that illustrates this functionality.

If we port this example to C#, we have:

PdfAnnotation annotation = new PdfAnnotation(writer, rect);
annotation.Title = "Text annotation";
annotation.Put(PdfName.SUBTYPE, PdfName.TEXT);
annotation.Put(PdfName.OPEN, PdfBoolean.PDFFALSE);
annotation.Put(PdfName.CONTENTS,
  new PdfString(string.Format("Icon: {0}", text))
);
annotation.Put(PdfName.NAME, new PdfName(text));
writer.AddAnnotation(annotation);

This is a manual way to create a text annotation. You want a Redact annotation, so you'll need something like this:

PdfAnnotation annotation = new PdfAnnotation(writer, rect);
annotation.Put(PdfName.SUBTYPE, new PdfName("Redact"));
writer.AddAnnotation(annotation);

You can use the Put() method to add all the other keys you need for the annotation.
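To make "all the other keys" more concrete, here is a minimal sketch in plain Java, with no iText dependency, of the dictionary a redaction annotation boils down to. The class name RedactDictSketch and its build() helper are hypothetical and exist only for illustration; the keys (Subtype, Rect, T, IC, OC) are the ones used in the question's update. With iText, each entry would be added through the annotation's put method instead.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class RedactDictSketch {

    // Returns the PDF-syntax text of a redaction annotation dictionary.
    // Assumption: the author string contains no parentheses or backslashes,
    // which would need escaping in a real PDF string.
    public static String build(float llx, float lly, float urx, float ury, String author) {
        Map<String, String> dict = new LinkedHashMap<>();
        dict.put("Type", "/Annot");
        dict.put("Subtype", "/Redact");   // what makes it a redaction annotation
        dict.put("Rect", String.format(Locale.ROOT, "[%.0f %.0f %.0f %.0f]", llx, lly, urx, ury));
        dict.put("T", "(" + author + ")"); // title, shown as the author
        dict.put("IC", "[0 0 0]");         // interior color: black
        dict.put("OC", "[1 0 0]");         // outline color: red

        StringBuilder sb = new StringBuilder("<<");
        for (Map.Entry<String, String> entry : dict.entrySet()) {
            sb.append(" /").append(entry.getKey()).append(' ').append(entry.getValue());
        }
        return sb.append(" >>").toString();
    }

    public static void main(String[] args) {
        System.out.println(build(200, 50, 500, 300, "My Author"));
    }
}
```

Seen this way, PdfAnnotation is just a typed wrapper around such a dictionary, which is why any annotation subtype can be built by hand.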

Answer 2: How to "apply" a redaction annotation

The second question requires itext-xtra.jar (an extra jar shipped with iText), and you need at least iText 5.5.4. Adding opaque rectangles doesn't apply a redaction: the content underneath is merely covered, not removed, so the text can still be selected and copied. If you're not careful, you risk ending up with a so-called PDF Blackout Folly. See for instance the NSA / AT&T scandal.
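The difference between covering and removing can be shown without any PDF library. The following is a toy Java sketch, not iText code: the class name, the one-line content stream, and the regex-based remove() are all illustrative assumptions. It only demonstrates that painting a rectangle appends operators to the page content and leaves the earlier text operator in place.

```java
public class CoverVersusRemoveSketch {

    // Hypothetical, heavily simplified page content: one text-showing operation.
    public static final String PAGE = "BT /F1 12 Tf 100 700 Td (secret) Tj ET\n";

    // Black filled rectangle drawn over the area where the text sits.
    private static final String BLACK_BOX = "0 0 0 rg 95 695 60 16 re f\n";

    // "Covering": the rectangle is painted on top, the text operators stay.
    public static String cover(String content) {
        return content + BLACK_BOX;
    }

    // "Removing": a toy stand-in for real redaction that drops the whole
    // text object before painting the rectangle.
    public static String remove(String content) {
        String withoutText = content.replaceAll("(?s)BT.*?ET\\n?", "");
        return withoutText + BLACK_BOX;
    }

    public static void main(String[] args) {
        System.out.println("covered still has text: " + cover(PAGE).contains("(secret)"));
        System.out.println("removed still has text: " + remove(PAGE).contains("(secret)"));
    }
}
```

The first line reports true and the second false, which is the gap between a black rectangle and an applied redaction.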

Suppose that you have a file to which you added some redaction annotations: page229_redacted.pdf

We can now use this code to remove the content marked by the redaction annotations:

public void manipulatePdf(String src, String dest) throws IOException, DocumentException {
    PdfReader reader = new PdfReader(src);
    PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
    PdfCleanUpProcessor cleaner = new PdfCleanUpProcessor(stamper);
    cleaner.cleanUp();
    stamper.close();
    reader.close();
}

This results in the following PDF: page229_apply_redacted.pdf

As you can see, the red rectangle borders are replaced by filled black rectangles. If you try to select the original text, you'll notice that it is no longer present.
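One way to picture what applying a redaction involves is a filter over positioned text: anything whose bounding box overlaps the redaction rectangle is dropped. The sketch below is plain Java and rests on that simplifying assumption; it is not how PdfCleanUpProcessor is implemented, and the Box, Chunk, and keep() names are hypothetical. A real implementation also has to deal with images, paths, and text that is only partly covered.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RedactionFilterSketch {

    // Axis-aligned box in page coordinates.
    public static final class Box {
        final float llx, lly, urx, ury;

        public Box(float x1, float y1, float x2, float y2) {
            // Normalize so the order in which the corners are given does not matter.
            llx = Math.min(x1, x2);
            lly = Math.min(y1, y2);
            urx = Math.max(x1, x2);
            ury = Math.max(y1, y2);
        }

        public boolean intersects(Box other) {
            return llx < other.urx && other.llx < urx
                && lly < other.ury && other.lly < ury;
        }
    }

    // A piece of text together with where it is drawn on the page.
    public static final class Chunk {
        final String text;
        final Box bounds;

        public Chunk(String text, Box bounds) {
            this.text = text;
            this.bounds = bounds;
        }
    }

    // Keeps only the text that lies entirely outside the redaction area.
    public static List<String> keep(List<Chunk> chunks, Box redaction) {
        List<String> kept = new ArrayList<>();
        for (Chunk chunk : chunks) {
            if (!chunk.bounds.intersects(redaction)) {
                kept.add(chunk.text);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Box redaction = new Box(500, 50, 200, 300);
        List<Chunk> chunks = Arrays.asList(
            new Chunk("public", new Box(50, 400, 150, 412)),
            new Chunk("secret", new Box(250, 100, 300, 112)));
        System.out.println(keep(chunks, redaction));
    }
}
```

With these made-up coordinates only "public" survives, because the "secret" chunk falls inside the redaction area.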