Determine number of pages in a PDF file [closed]

Posted 2019-01-06 13:31

Question:

I need to determine the number of pages in a specified PDF file using C# code (.NET 2.0). The PDF file will be read from the file system, and not from a URL. Does anyone have any pointers on how this could be done? Note: Adobe Acrobat Reader is installed on the PC where this check will be carried out.

Answer 1:

You'll need a PDF API for C#. iTextSharp is one possible API, though better ones might exist.

iTextSharp Example

You must add iTextSharp.dll to your project as a reference. Download iTextSharp from SourceForge.net. This is a complete working console application.

using System;
using iTextSharp.text.pdf;

namespace GetPages_PDF
{
    class Program
    {
        static void Main(string[] args)
        {
            // Replace this with the location of YOUR PDF file
            string ppath = "C:\\aworking\\Hawkins.pdf";
            PdfReader pdfReader = new PdfReader(ppath);
            try
            {
                int numberOfPages = pdfReader.NumberOfPages;
                Console.WriteLine(numberOfPages);
            }
            finally
            {
                // Release the file held by the reader
                pdfReader.Close();
            }
            // Keep the console window open until Enter is pressed
            Console.ReadLine();
        }
    }
}


Answer 2:

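This should do the trick for simple PDFs. It counts page objects in the raw file text, so it can miscount files that keep their page objects in compressed object streams:

using System.IO;
using System.Text.RegularExpressions;

public int getNumberOfPdfPages(string fileName)
{
    using (StreamReader sr = new StreamReader(File.OpenRead(fileName)))
    {
        // Match "/Type /Page" but not "/Type /Pages" (the page-tree node)
        Regex regex = new Regex(@"/Type\s*/Page[^s]");
        MatchCollection matches = regex.Matches(sr.ReadToEnd());

        return matches.Count;
    }
}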

From Rachael's answer and this one too.
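
To see what that pattern actually counts, here is a minimal sketch that separates the matching from the file I/O. The `PageCountSketch` class, its method names, and the sample string are made up for illustration, and decoding the file as Latin-1 is my own assumption (it maps every byte to exactly one character, so binary stream data cannot trip up the decoder). It is still only a heuristic and not a real PDF parser.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of any library.
static class PageCountSketch
{
    // Matches "/Type /Page" but not "/Type /Pages" (the page-tree node).
    private static readonly Regex PageObject = new Regex(@"/Type\s*/Page[^s]");

    // Counts page-object markers in already-decoded PDF text.
    public static int CountPageObjects(string pdfText)
    {
        return PageObject.Matches(pdfText).Count;
    }

    // Assumption: read the file as Latin-1 so each byte becomes one char.
    public static int CountPagesInFile(string fileName)
    {
        string text = File.ReadAllText(fileName, Encoding.GetEncoding("ISO-8859-1"));
        return CountPageObjects(text);
    }

    static void Main()
    {
        // Made-up fragment: one page-tree node and two page objects.
        string sample =
            "1 0 obj << /Type /Pages /Kids [2 0 R 3 0 R] /Count 2 >> endobj\n" +
            "2 0 obj << /Type /Page /Parent 1 0 R >> endobj\n" +
            "3 0 obj << /Type/Page /Parent 1 0 R >> endobj\n";
        Console.WriteLine(CountPageObjects(sample)); // prints 2
    }
}
```

Note that the `[^s]` part needs one more character after `/Page`, so a marker at the very end of the text is not counted.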



Answer 3:

I found a way at http://www.dotnetspider.com/resources/21866-Count-pages-PDF-file.aspx. It does not require purchasing a PDF library.



Answer 4:

I have used pdflib for this.

    p = new pdflib();

    /* Open the input PDF */
    indoc = p.open_pdi_document("myTestFile.pdf", "");
    pageCount = (int) p.pcos_get_number(indoc, "length:pages");


Answer 5:

Docotic.Pdf library may be used to accomplish the task.

Here is sample code:

PdfDocument document = new PdfDocument();
document.Open("file.pdf");
int pageCount = document.PageCount;

The library parses as little of the file as possible, so performance should be OK.

Disclaimer: I work for Bit Miracle.



Answer 6:

One line:

int pdfPageCount = System.IO.File.ReadAllText("example.pdf").Split(new string[] { "/Type /Page" }, StringSplitOptions.None).Length - 2;

Recommended: iTextSharp



Answer 7:

PDFsharp

This one should be better =)



Answer 8:

I have had good success using CeTe Dynamic PDF products. They're not free, but they are well documented. They did the job for me.

http://www.dynamicpdf.com/



Answer 9:

I've used the regex code above and it works, but it's quite slow because it reads the entire file to determine the number of pages.

I used it in a web app where pages would sometimes list 20 or 30 PDFs at a time. In that situation, the page counting method pushed the load time from a couple of seconds to almost a minute.

I don't know whether the third-party libraries are much better. I would hope that they are, and I've used pdflib in other scenarios with success.
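
One way to soften that cost, assuming the same files are listed over and over, is to remember each count and only recount when the file changes. This is a rough sketch and not something I have measured: `PageCounter` and `PageCountCache` are hypothetical names, and any of the counting methods in this thread can be plugged in as the delegate.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical delegate: wraps whichever counting method you already use.
delegate int PageCounter(string fileName);

// Hypothetical cache keyed by path, invalidated by the file's last-write time.
class PageCountCache
{
    private struct Entry
    {
        public DateTime Stamp;
        public int Pages;
    }

    private readonly Dictionary<string, Entry> entries = new Dictionary<string, Entry>();
    private readonly object gate = new object();
    private readonly PageCounter counter;

    public PageCountCache(PageCounter counter)
    {
        this.counter = counter;
    }

    public int GetPageCount(string fileName)
    {
        DateTime stamp = File.GetLastWriteTimeUtc(fileName);
        lock (gate)
        {
            Entry cached;
            if (entries.TryGetValue(fileName, out cached) && cached.Stamp == stamp)
            {
                return cached.Pages; // file unchanged: skip the expensive count
            }
        }

        // Slow path, done outside the lock so other requests are not blocked.
        int pages = counter(fileName);

        Entry fresh = new Entry();
        fresh.Stamp = stamp;
        fresh.Pages = pages;
        lock (gate)
        {
            entries[fileName] = fresh;
        }
        return pages;
    }
}
```

The first listing still pays the full price for every file. Only repeat views of unchanged files get cheaper.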



Tags: c# pdf .net-2.0