How to find whether PDF has landscape orientation

Posted 2020-06-05 02:07

Question:

Are there tools to determine whether a PDF has landscape orientation or portrait?

I have looked at PDFBox and iText for this but could not find it. Please tell me whether they support this.

Extracting the page information with Origami shows that the PDF pages have a rotation of some degrees. Here is what Origami reports:

{:Parent=>#<PDF::Reader::Reference:0x872349c @id=8, @gen=0>, :Type=>:Page,
 :Contents=>#<PDF::Reader::Reference:0x8722f24 @id=4, @gen=0>, :Resources=>#<PDF::Reader::Reference:0x870dbd8 @id=2, @gen=0>,
 :MediaBox=>[0, 0, 612, 792], :Rotate=>270}

Rotate : 270

What does the 'rotation' actually mean?

Answer 1:

The pdfinfo command-line utility lets you see the page size info and the MediaBox, CropBox, BleedBox, ArtBox and TrimBox values for each and every page. Here I ask for the values for pages 2 to 4 of a specific document:

pdfinfo -box -f 2 -l 4 sample.pdf
  Creator:        FrameMaker 6.0
  Producer:       Acrobat Distiller 5.0.5 (Windows)
  CreationDate:   Thu Aug 17 16:43:06 2006
  ModDate:        Tue Aug 22 12:20:24 2006
  Tagged:         no
  Form:           AcroForm
  Pages:          146
  Encrypted:      no
  Page    2 size: 419.535 x 297.644 pts
  Page    2 rot:  90
  Page    3 size: 297.646 x 419.524 pts
  Page    3 rot:  0
  Page    4 size: 297.646 x 419.524 pts
  Page    4 rot:  0
  Page    2 MediaBox:     0.00     0.00   595.00   842.00
  Page    2 CropBox:     87.25   430.36   506.79   728.00
  Page    2 BleedBox:    87.25   430.36   506.79   728.00
  Page    2 TrimBox:     87.25   430.36   506.79   728.00
  Page    2 ArtBox:      87.25   430.36   506.79   728.00
  Page    3 MediaBox:     0.00     0.00   595.00   842.00
  Page    3 CropBox:    148.17   210.76   445.81   630.28
  Page    3 BleedBox:   148.17   210.76   445.81   630.28
  Page    3 TrimBox:    148.17   210.76   445.81   630.28
  Page    3 ArtBox:     148.17   210.76   445.81   630.28
  Page    4 MediaBox:     0.00     0.00   595.00   842.00
  Page    4 CropBox:    148.17   210.76   445.81   630.28
  Page    4 BleedBox:   148.17   210.76   445.81   630.28
  Page    4 TrimBox:    148.17   210.76   445.81   630.28
  Page    4 ArtBox:     148.17   210.76   445.81   630.28
  File size:      6888764 bytes
  Optimized:      yes
  PDF version:    1.4

Note the following:

  • *Box values: these are 4 numbers whose units are PostScript points: the first pair represents the coordinates of the lower left corner, the second pair represents coordinates of the upper right corner.

  • MediaBox: Is a required setting for each page inside the PDF.

  • TrimBox: Is an optional setting. If it is not explicitly defined, it defaults to the CropBox, which in turn defaults to the MediaBox. The CropBox is the one that tells PDF viewers (and printer drivers) to render and display only that particular part of the full page; the TrimBox describes the intended size of the finished page after trimming.

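  • Page size: This info is computed from the distances set up by the CropBox value (identical to the TrimBox in this sample): width is the upper right x minus the lower left x, height is the upper right y minus the lower left y. For page 2 above, 506.79 - 87.25 ≈ 419.54 and 728.00 - 430.36 = 297.64.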

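  • rot: This gives the value of the page rotation, i.e. the page's /Rotate entry: the number of degrees by which the page is rotated clockwise when displayed or printed. May be 0, 90, 180 or 270 degrees.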
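The box arithmetic from the notes above can be sketched in a few lines of Python. This is only an illustration: the helper names `parse_box_line` and `box_size` are made up here, and the regular expression assumes the `Page N XxxBox: llx lly urx ury` layout seen in the sample output, which may differ between pdfinfo versions.

```python
import re

# Assumed line layout, taken from the sample output above, e.g.
# "  Page    2 CropBox:     87.25   430.36   506.79   728.00"
BOX_LINE = re.compile(
    r"^\s*Page\s+(\d+)\s+(\w+Box):\s+"
    r"(-?[\d.]+)\s+(-?[\d.]+)\s+(-?[\d.]+)\s+(-?[\d.]+)\s*$"
)


def parse_box_line(line):
    """Return (page_number, box_name, (llx, lly, urx, ury)), or None if the
    line is not a *Box line."""
    match = BOX_LINE.match(line)
    if match is None:
        return None
    coords = tuple(float(value) for value in match.groups()[2:])
    return int(match.group(1)), match.group(2), coords


def box_size(box):
    """Width and height in points of a (llx, lly, urx, ury) box."""
    llx, lly, urx, ury = box
    return urx - llx, ury - lly


page, name, box = parse_box_line(
    "  Page    2 CropBox:     87.25   430.36   506.79   728.00"
)
width, height = box_size(box)
print(page, name, round(width, 2), round(height, 2))
```

For the page 2 line this yields roughly 419.54 x 297.64 points, which agrees with the `Page 2 size` line up to the rounding of the printed box values.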

Now, 'landscape' and 'portrait' are defined for a page like this:

  • It is regarded as 'landscape' if the width is greater than the height.
  • It is regarded as 'portrait' if the height is greater than the width.
  • It is undetermined if width and height have the same value.
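Expressed as code, these three rules are a plain comparison. A minimal sketch, with the function name `classify_shape` chosen here for illustration only:

```python
def classify_shape(width, height):
    """Classify a page shape by comparing width and height (same units)."""
    if width > height:
        return "landscape"
    if height > width:
        return "portrait"
    return "undetermined"  # width and height are equal


# Sizes taken from the pdfinfo output above (pages 2 and 3).
print(classify_shape(419.535, 297.644))
print(classify_shape(297.646, 419.524))
```

With the sizes reported above, page 2 comes out as 'landscape' and page 3 as 'portrait'. This looks at the shape only; it does not take any rotation into account.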

But...

  • ...you can put a non-zero /Rotate value into your PDF source code (which pdfinfo will show as rot: info), so that a 'portrait' PDF page will display as 'landscape' and vice versa;

  • ...you could define a 'landscape' shaped /TrimBox inside a 'portrait' shaped /MediaBox or vice versa, as well as mix it with a non-zero rotation, so that the 'landscape' shaped content will appear in 'portrait' (or upside-down) look...
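One way to fold the rotation into the check is to swap width and height whenever the rotation is 90 or 270 degrees. The sketch below is an illustration under stated assumptions, not what any particular viewer or library does: the names `displayed_size` and `displayed_orientation` are made up here, and the caller is assumed to pass whichever box is actually shown (for example the CropBox, or the MediaBox when no CropBox is set).

```python
def displayed_size(box, rotate=0):
    """Width and height of a (llx, lly, urx, ury) box after applying /Rotate.

    Assumption: `box` is the box that is actually displayed, and `rotate`
    is a multiple of 90 degrees.
    """
    if rotate % 90 != 0:
        raise ValueError("rotation must be a multiple of 90 degrees")
    llx, lly, urx, ury = box
    width, height = urx - llx, ury - lly
    if rotate % 360 in (90, 270):
        # A quarter turn exchanges the horizontal and vertical extents.
        width, height = height, width
    return width, height


def displayed_orientation(box, rotate=0):
    """'landscape', 'portrait' or 'undetermined' as the page appears on screen."""
    width, height = displayed_size(box, rotate)
    if width > height:
        return "landscape"
    if height > width:
        return "portrait"
    return "undetermined"


# The page from the question: MediaBox [0, 0, 612, 792] with Rotate 270.
print(displayed_orientation((0, 0, 612, 792), rotate=270))
```

For the page in the question (a 612 x 792 MediaBox with a rotation of 270) this reports 'landscape', while the same box with a rotation of 0 or 180 would be 'portrait'. It cannot tell an upright page from an upside-down one, since both have the same shape.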

Confused about this? Don't worry, many are. Fact is, 'landscape' and 'portrait' aren't clearly and unambiguously defined technical terms. They are just conventions to describe what we see...