可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

at the moment I am using itext to read the page count of a pdf. This takes quite long because the lib seems to scan the whole file.

Is the page information somewhere in the header of the pdf, or is a full filescan needed?

回答1:

That's correct. iText parses quite a bit of a PDF when it is opened (it doesn't read the contents of stream objects, but that's about it)...

UNLESS you use the PdfReader(RandomAccessFileOrArray) constructor, in which case it will only read the xrefs (mostly required), but not parse anything until you start requesting specific objects (directly or via various calls).

The first PDF program I ever wrote did exactly this. It opened up a PDF and doing the bare minimum amount of work necessary, read the number of pages. It didn't even parse the xrefs it didn't have to. Haven't thought about that program in years...

So while not perfectly efficient, it'll be vastly more efficient to use a RandomAccessFileOrArray:

int efficientPDFPageCount(String path) {
  RandomAccessFileOrArray file = new RandomAccessFileOrArray(path, false, true );
  PdfReader reader = new PdfReader(file);
  int ret = reader.getNumberOfPages();
  reader.close();
  return ret;
}

Update:

The itext API underwent a little overhaul. Now (in version 5.4.x) the correct way to use it is to pass through java.io.RandomAccessFile:

int efficientPDFPageCount(File file) {
     RandomAccessFile raf = new RandomAccessFile(file, "r");
     RandomAccessFileOrArray pdfFile = new RandomAccessFileOrArray(
          new RandomAccessSourceFactory().createSource(raf));
     PdfReader reader = new PdfReader(pdfFile, new byte[0]);
     int pages = reader.getNumberOfPages();
     reader.close();
     return pages;
  }

回答2:

You just need to read the Page tree (Catalogue, Pages, Kids) and count the Page entries.

回答3:

Lars Vogel uses the following code:

PdfReader reader = new PdfReader(INPUTFILE);
int n = reader.getNumberOfPages();

I'd be surprised if the implementation of getNumberOfPages is slower than any other solution.

Section F.3.3 says there is a header-field called N described as follows:

N     integer (Required)      The number of pages in the document.

回答4:

PdfReader document = new PdfReader(new FileInputStream(new File("filename")));  
int noPages = document.getNumberOfPages();

回答5:

PdfReader document = new PdfReader(new FileInputStream(new File("filename")));   
int noPages = document.getNumberOfPages();

above is the process for counting the pdf pages

Page count of Pdf with Java

问题:

回答1:

回答2:

回答3:

回答4:

回答5:

收藏的人(0)

Page count of Pdf with Java

问题:

回答1:

回答2:

回答3:

回答4:

回答5:

收藏的人(0)

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮