iText - crop out a part of pdf file

Posted 2019-05-20 06:49

Question:

I have a small problem that I have been trying to solve for some time. Long story short, I have to remove the top part of each page of a PDF with iText. I managed to do this with the CROPBOX, but cropping makes the pages smaller, because the top part is cut off.

Can someone help me implement this so that the page size remains the same? My idea was to cover the top of each page with a white rectangle, but after many attempts I haven't managed to do it.

This is the code I'm currently using to crop the pages.

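// PdfRectangle takes (llx, lly, urx, ury) in user units
PdfRectangle rect = new PdfRectangle(55, 0, 1000, 1000);
PdfDictionary pageDict;
// The loop starts at page 2, so the first page is left untouched
for (int currentPage = 2; currentPage <= pdfReader.getNumberOfPages(); currentPage++) {
    pageDict = pdfReader.getPageN(currentPage);
    pageDict.put(PdfName.CROPBOX, rect);
}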

Answer 1:

In your code sample, you are cropping the pages. This reduces the visible size of the page.

Based on your description, you don't want cropping. Instead you want clipping.

I've written an example that clips the content of all pages of a PDF by introducing a margin of 200 user units (that's quite a margin). The example is called ClipPdf and you can see a clipped page here: hero_clipped.pdf (the iText superhero has lost arms, feet and part of his head in the clipping process.)

public void manipulatePdf(String src, String dest) throws IOException, DocumentException {
    PdfReader reader = new PdfReader(src);
    PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
    int n = reader.getNumberOfPages();
    PdfDictionary page;
    PdfArray media;
    for (int p = 1; p <= n; p++) {
        page = reader.getPageN(p);
        media = page.getAsArray(PdfName.CROPBOX);
        if (media == null) {
            media = page.getAsArray(PdfName.MEDIABOX);
        }
        float llx = media.getAsNumber(0).floatValue() + 200;
        float lly = media.getAsNumber(1).floatValue() + 200;
        float w = media.getAsNumber(2).floatValue() - media.getAsNumber(0).floatValue() - 400;
        float h = media.getAsNumber(3).floatValue() - media.getAsNumber(1).floatValue() - 400;
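        // Locale.ROOT (java.util.Locale) keeps '.' as the decimal separator;
        // a default locale that uses ',' would produce invalid PDF syntax.
        String command = String.format(java.util.Locale.ROOT,
                "\nq %.2f %.2f %.2f %.2f re W n\nq\n",
                llx, lly, w, h);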
        stamper.getUnderContent(p).setLiteral(command);
        stamper.getOverContent(p).setLiteral("\nQ\nQ\n");
    }
    stamper.close();
    reader.close();
}

Obviously, you need to study this code before using it. Once you understand this code, you'll know that this code will only work for pages that aren't rotated. If you understand the code well, you should have no problem adapting the example for rotated pages.
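The example above clips 200 user units on every side, whereas the question only asks to remove the top of each page. As a rough sketch of that adaptation (the class name TopClip, the method prefix and the parameter cut are hypothetical names, not part of iText), the clip rectangle can keep the full width and simply stop short of the top edge. Like the example, it assumes unrotated pages:

```java
import java.util.Locale;

// Hypothetical helper: builds the literal content-stream fragments used by the
// example above, but clips only the top `cut` user units of the page box.
final class TopClip {

    // Closes the two graphics states opened by prefix().
    static final String SUFFIX = "\nQ\nQ\n";

    // The page box is given as [llx, lly, urx, ury], as stored in CROPBOX or MEDIABOX.
    static String prefix(float llx, float lly, float urx, float ury, float cut) {
        float w = urx - llx;        // keep the full width
        float h = ury - lly - cut;  // drop `cut` units from the top
        // Locale.ROOT guarantees '.' as the decimal separator.
        return String.format(Locale.ROOT,
                "\nq %.2f %.2f %.2f %.2f re W n\nq\n", llx, lly, w, h);
    }

    public static void main(String[] args) {
        // An A4-sized box with the top 100 units clipped away.
        System.out.print(prefix(0f, 0f, 595f, 842f, 100f));
    }
}
```

The returned string would be passed to stamper.getUnderContent(p).setLiteral(...) and SUFFIX to stamper.getOverContent(p).setLiteral(...), exactly as in the loop above.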

Update

The re operator constructs a rectangle. It takes four parameters (the values preceding the operator) that define a rectangle: the x coordinate of the lower-left corner, the y coordinate of the lower-left corner, the width and the height.

The W operator sets the clipping path: it intersects the current clipping path with the path we have just constructed. That rectangle will be used to clip the content that follows.

The n operator ends the path without filling or stroking it. In this case, it ensures that the rectangle we have constructed (and that we use as the clipping path) is not actually painted.

The q and Q operators save the current graphics state on the graphics state stack and restore it again, but that's rather obvious.
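To see why the over content closes with two Q operators, it can help to count how the q/Q pairs nest. The following is only a naive sketch: GraphicsStateDepth and depthAfter are hypothetical names, and splitting on whitespace is an assumption that works for these short literal fragments but not for arbitrary content streams, which need a real parser.

```java
// Hypothetical helper: counts how many graphics states are still saved after
// scanning a content-stream fragment token by token.
final class GraphicsStateDepth {

    static int depthAfter(String stream) {
        int depth = 0;
        for (String token : stream.trim().split("\\s+")) {
            if (token.equals("q")) {
                depth++;            // q pushes the graphics state
            } else if (token.equals("Q")) {
                depth--;            // Q pops it again
            }
        }
        return depth;
    }

    public static void main(String[] args) {
        String prefix = "\nq 200.00 200.00 195.00 442.00 re W n\nq\n";
        String suffix = "\nQ\nQ\n";
        System.out.println(depthAfter(prefix));          // prints 2
        System.out.println(depthAfter(prefix + suffix)); // prints 0
    }
}
```

The fragment written to the under content leaves two states saved, and the fragment written to the over content restores both, so the clipping path does not leak beyond the page's own content.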

All of this is explained in ISO-32000-1 (available online if you Google well) and in the book The ABC of PDF.