This question is related to this one, but a bit more specific. I suspect I am not computing the hash of my PDF properly.
I would like to compute the SHA256 hash of a signed PDF.
According to PDF32000 I should:
- Get the /ByteRange values
- Concatenate the two chunks
- Compute the SHA256
Here is what I did:
$ grep -aPo 'ByteRange\[\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*\]' dummy-signed.pdf
ByteRange[ 0 59718 72772 5058]
$ dd if=dummy-signed.pdf of=head.bin bs=1 skip=0 count=59718
59718 bytes (60 kB, 58 KiB) copied, 0.630196 s, 94.8 kB/s
$ dd if=dummy-signed.pdf of=tail.bin bs=1 skip=72772 count=5058
5058 bytes (5.1 kB, 4.9 KiB) copied, 0.064317 s, 78.6 kB/s
$ cat head.bin tail.bin > whole.bin
$ sha256sum whole.bin
04b69f55f12fa5cc7923f4307154f2702efde43b32e4a8d9dbb0507a56fcecd3 whole.bin
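The same computation can also be done without temporary files. This is only a minimal sketch, assuming bash and GNU coreutils; the function name byterange_sha256 is hypothetical, and the four numbers passed to it are the /ByteRange values found above.

```shell
#!/bin/bash
# Hash the two signed byte ranges of a file without temporary files.
# Usage: byterange_sha256 FILE OFF1 LEN1 OFF2 LEN2
# (byterange_sha256 is an illustrative name, not part of any tool.)
byterange_sha256() {
    local file=$1 off1=$2 len1=$3 off2=$4 len2=$5
    {
        # Keep the first OFF+LEN bytes, then the last LEN bytes of those.
        head -c "$((off1 + len1))" "$file" | tail -c "$len1"
        head -c "$((off2 + len2))" "$file" | tail -c "$len2"
    } | sha256sum | cut -d' ' -f1
}
```

Called as byterange_sha256 dummy-signed.pdf 0 59718 72772 5058, it should print the same digest as sha256sum whole.bin, since it hashes the same two chunks in the same order.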
I checked that I am not including the < and > chars:
$ hexdump -C head.bin | tail -n3
0000e930 20 20 20 20 20 20 20 20 20 20 20 20 20 2f 43 6f | /Co|
0000e940 6e 74 65 6e 74 73 |ntents|
0000e946
$ hexdump -C tail.bin | head -n3
00000000 2f 46 69 6c 74 65 72 2f 41 64 6f 62 65 2e 50 50 |/Filter/Adobe.PP|
00000010 4b 4c 69 74 65 2f 4d 28 44 3a 32 30 31 39 30 31 |KLite/M(D:201901|
00000020 32 38 31 33 34 30 35 38 2b 30 31 27 30 30 27 29 |28134058+01'00')|
Unfortunately my result seems to be wrong. After decoding the PKCS7 signature I double-checked that the signature algorithm is sha256WithRSAEncryption, but after verifying the signature I get a different hash from the one I computed.
My /SubFilter is:
$ grep -aPo '/SubFilter.*?(?=>)' dummy-signed.pdf
/SubFilter/adbe.pkcs7.detached/Type/Sig
And my PDF version is:
$ grep -aPo '%PDF-\d\.\d' dummy-signed.pdf
%PDF-1.6
So from PDF32000, with adbe.pkcs7.detached and PDF 1.6, the hash should be SHA256, which is compatible with what I found in the PKCS7.
Just for the record, the hash I get from the signature is:
#!/bin/bash
PKCS7='out.pkcs7'
# Extract Digest (SHA256)
OFFSET=$(openssl asn1parse -inform der -in "$PKCS7" | \
    perl -ne 'print $1 + $2 if /(\d+):d=\d\s+hl=(\d).*?256 prim.*HEX DUMP/m')
dd if="$PKCS7" of=signed-sha256.bin bs=1 skip="$OFFSET" count=256
# Extract Public key
openssl pkcs7 -print_certs -inform der -in "$PKCS7" | \
    tac | sed '/-----BEGIN/q' | tac > client.pem
openssl x509 -in client.pem -pubkey -noout > client.pub.pem
# Verify the signature
openssl rsautl -verify -pubin -inkey client.pub.pem < signed-sha256.bin > verified.bin
# Get Hash and compare with the computed hash from the PDF
openssl asn1parse -inform der -in verified.bin | grep -Po '\[HEX DUMP\]:\K\w+$' | tr A-F a-f
$ ./verify-signature.sh
256+0 records in
256+0 records out
256 bytes copied, 0.029548 s, 8.7 kB/s
2a3f629f7bdce750321da7f219ec5759dc9ed14818acbd3cd0b6092d5371c03a
You can access the test PDF file dummy-signed.pdf from my gist:
curl https://gist.githubusercontent.com/nowox/94dd54e484df877e1232c18bd7b91c97/raw/d249f3757137e9b665e895c900f08b1156f1bc4f/dummy-signed.pdf.base64 | base64 --decode > dummy-signed.pdf
In short
You are trying to extract the wrong hash value from the signature container.
In detail
I didn't recognize this earlier because I'm not really an openssl expert. Analyzing the example PDF, though, the cause of the confusion became clear.
In a PKCS#7 / CMS signature container there usually are (at least) two hash values of interest:
- the hash of the signed document data, stored in the messageDigest signed attribute, and
- the hash of the signed attributes, which is the value actually covered by the signature bytes.
The messageDigest signed attribute in the signature container in your example document contains the hash value you calculated (appearances might differ if you asn1-dump it in openssl, but the value should be recognizable nonetheless).
You on the other hand try to extract the signed hash value from the decrypted signature bytes which is not the hash of the document but instead the hash of the signed attributes!
Additionally, something appears to go wrong in that extraction step: the value you should retrieve is not the 2a3f629f7bdce750321da7f219ec5759dc9ed14818acbd3cd0b6092d5371c03a you got.
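To compare against the right value, the messageDigest signed attribute can be read directly from the container. The following is only a sketch, assuming a DER-encoded container and the usual openssl asn1parse output layout (an OBJECT line naming messageDigest, followed by a SET holding one OCTET STRING); the function name message_digest_attr is hypothetical.

```shell
#!/bin/bash
# Print the messageDigest signed attribute of a DER-encoded
# PKCS#7 / CMS container as lowercase hex.
# (message_digest_attr is an illustrative name, not an openssl command.)
message_digest_attr() {
    openssl asn1parse -inform DER -in "$1" |
        # The attribute value follows the OBJECT line within two lines.
        grep -A2 ':messageDigest' |
        sed -n 's/.*OCTET STRING.*\[HEX DUMP\]:\([0-9A-Fa-f]*\).*/\1/p' |
        # Keep only the first match in case the attribute occurs again.
        head -n1 | tr A-F a-f
}
```

Running message_digest_attr out.pkcs7 should, per the analysis above, print the hash computed from the two byte ranges. As a cross-check, openssl smime -verify -noverify -binary -inform DER -in out.pkcs7 -content whole.bin should also accept the detached signature over whole.bin, although that has not been tried against this particular file.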