
PDF File header sequence: Why '25 e2 e3 cf d3'?

Posted 2020-08-01 06:08

Question:

I know that this sequence tells a reader whether the PDF contains binary data.

But why is it "25 e2 e3 cf d3" rather than random binary bytes? So many documents contain exactly this sequence.

Is it just because so many tools use the same PDF library?

Refs:

PDF format. function of %-started sequence

comp.text.pdf>pdf format

Answer 1:

Looking through the PDFs I have here it looks like a number of PDF processors use these very letters "%âãÏÓ", among them Adobe products.
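To make the link between the hex bytes in the question and the letters above concrete, here is a small Python sketch. It assumes the bytes are read as Latin-1 (ISO 8859-1), which is only one possible way to display them; the variable names are illustrative.

```python
# The five bytes from the question, written as hex.
marker = bytes.fromhex("25 e2 e3 cf d3")

# Latin-1 maps every byte value to the code point with the same number,
# so this decoding is an assumption about display, not part of the PDF format.
text = marker.decode("latin-1")
print(text)

# 0x25 is '%', which starts a PDF comment.
# The remaining four bytes all have codes of 128 or greater.
high_bytes = [b for b in marker[1:] if b >= 128]
print(len(high_bytes))
```

The first byte is the comment character, and the other four are the "binary characters"; nothing about their specific values matters beyond being 128 or greater.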

Not all of those processors use the same basic PDF library, so a shared library cannot explain the use of the same letters.

Most likely it is due to the fact that Adobe software creates PDFs with that second line comment. For many years developers of other software used example files produced by Adobe software as templates for the PDFs they created.

Yes, the specification ISO 32000-1 merely requires:

If a PDF file contains binary data, as most do (see 7.2, "Lexical Conventions"), the header line shall be immediately followed by a comment line containing at least four binary characters—that is, characters whose codes are 128 or greater.

(and the earlier PDF references also recommend the same), so there is no need to use the same binary characters.
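The requirement quoted above can be sketched as a small check in Python. This is a simplified illustration, not a validator: the function name `has_binary_marker` is made up here, and the sketch assumes the `%PDF-` header sits at the very start of the data, which real-world files do not always honour.

```python
def has_binary_marker(data: bytes, minimum: int = 4) -> bool:
    """Return True if the line after the header is a comment
    containing at least `minimum` bytes with codes >= 128."""
    # Only the start of the file matters; PDF lines may end in CR, LF, or CRLF.
    head = data[:1024].replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    lines = head.split(b"\n", 2)
    if len(lines) < 2 or not lines[0].startswith(b"%PDF-"):
        return False
    second = lines[1]
    if not second.startswith(b"%"):
        return False
    return sum(1 for b in second if b >= 128) >= minimum


# The familiar Adobe-style bytes pass, and so do any other four high bytes.
print(has_binary_marker(b"%PDF-1.7\n%\xe2\xe3\xcf\xd3\n1 0 obj\n"))
print(has_binary_marker(b"%PDF-1.4\n%\x80\x81\x82\x83\n1 0 obj\n"))
```

For a file on disk, one could pass the first kilobyte read in binary mode. The check treats "%âãÏÓ" and any other comment with four high bytes identically, which is the point made above.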

But there also is no reason not to use them. Why deviate from the working example files produced by Adobe software in this regard?

Especially in the years before the ISO specification, when there were only the PDF references, one tended to make the document structure as Adobe-like as possible, because Adobe did not consider the PDF references normative. Thus, even if your document was valid according to the references, Adobe viewers could still reject it without that counting as a bug...