Verifying digital signatures in PDF documents

Posted 2019-05-07 08:17

Question:

I'm trying to verify the digital signatures of PDF documents.

I know that when a PDF is signed, a byte range is defined, the certificates get embedded, and from what I've read, the signed message digest and the timestamp are also stored in the PDF.

I can already extract the certificates and validate them. Now I'm trying to validate the PDF's integrity, and my problem is that I don't know where the signed message digest is located.

In this sample signed PDF from Adobe (http://blogs.adobe.com/security/SampleSignedPDFDocument.pdf), I can clearly identify the digest, since it sits just below the embedded certificates: /DigestMethod/MD5/DigestValue/ (line 1520).

But that PDF sample seems to be from 2009, and I suspect the message digest is stored differently now, because I signed a PDF with Adobe Reader and also with iText, and I can't find any message digest field like the previous one. Can someone tell me whether the digests are now stored in a different way? Where are they located?

Anyway, for now I'm using that sample document from Adobe and trying to verify its integrity. I'm taking the document's signed bytes according to the specified byte range and digesting them with the MD5 algorithm, but the digest value I get doesn't match the one in the message digest field. Am I doing something wrong? Is the digest also signed with the signer's private key?

I appreciate any help.

Answer 1:

There are numerous details to get right when calculating the hash for integrated PDF signatures, among them:

  • Extract the correct bytes from the PDF to hash. The ByteRange tells you exactly which byte ranges are signed. To be accepted in modern signing contexts, the ranges must cover the whole PDF file revision with the exception of the value of Contents.

    Beware, the value of Contents includes the leading '<' and the trailing '>' brackets.

  • Don't use a regular text editor or text processing instructions (like readln or writeln) to process PDFs. PDFs are binary in nature, even if they look textual to the naked eye. Copying parts of a PDF with such text-related operations most likely alters them in small details, which definitively breaks the signature hash value.
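
As a rough illustration of the two points above, the following Python sketch concatenates the byte ranges named in ByteRange and hashes them. The helper names (find_byte_range, signed_bytes, contents_gap, digest_signed_ranges) are hypothetical. The regular-expression scan for /ByteRange is a shortcut that assumes a single signature and a plainly written array, so a real PDF parser would be more reliable. SHA-256 is only a default here; the actual digest algorithm has to be taken from the signature itself.

```python
import hashlib
import re

# Naive pattern for a four-number ByteRange array (assumption: one signature,
# array written as plain integers). A real PDF parser is more robust.
BYTE_RANGE_RE = re.compile(
    rb"/ByteRange\s*\[\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*\]"
)


def find_byte_range(pdf: bytes) -> list[int]:
    """Return the four integers of the first /ByteRange array in the file."""
    match = BYTE_RANGE_RE.search(pdf)
    if match is None:
        raise ValueError("no /ByteRange array found")
    return [int(n) for n in match.groups()]


def signed_bytes(pdf: bytes, byte_range: list[int]) -> bytes:
    """Concatenate the (offset, length) pairs listed in ByteRange."""
    if len(byte_range) % 2 != 0:
        raise ValueError("ByteRange must hold offset/length pairs")
    pairs = zip(byte_range[0::2], byte_range[1::2])
    return b"".join(pdf[offset:offset + length] for offset, length in pairs)


def contents_gap(pdf: bytes, byte_range: list[int]) -> bytes:
    """Return the unsigned gap between the first two ranges.

    For a well-formed signature this is exactly the Contents value,
    including the leading '<' and the trailing '>'.
    """
    gap_start = byte_range[0] + byte_range[1]
    gap_end = byte_range[2]
    return pdf[gap_start:gap_end]


def digest_signed_ranges(pdf: bytes, byte_range: list[int],
                         algorithm: str = "sha256") -> bytes:
    """Hash the signed bytes with the given hashlib algorithm name."""
    return hashlib.new(algorithm, signed_bytes(pdf, byte_range)).digest()
```

To use it on a real file, read the document in binary mode, for example with open(path, "rb"), so that no newline or encoding translation touches the bytes. Checking that contents_gap starts with '<' and ends with '>' is a cheap sanity test that the ranges were interpreted correctly. Comparing the resulting digest with the value carried by the signature container in Contents depends on the signature type and is not shown here.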

When in doubt, don't guess but read the specification. A copy of ISO 32000-1 has been made available by Adobe here, and much of what you need to know about the PDF format to start processing such files can be found there and in the other public standards it references. A very short introduction to integrated PDF signatures can be found in this answer and in the documents referenced from there.