I need to determine the number of pages in a specified PDF file using C# code (.NET 2.0). The PDF file will be read from the file system, and not from a URL. Does anyone have any pointers on how this could be done? Note: Adobe Acrobat Reader is installed on the PC where this check will be carried out.
问题:
回答1:
You'll need a PDF API for C#. iTextSharp is one possible API, though better ones might exist.
iTextSharp Example
You must install iTextSharp.dll as a reference. Download iTextsharp from SourceForge.net This is a complete working program using a console application.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using iTextSharp.text.pdf;
using iTextSharp.text.xml;
namespace GetPages_PDF
{
class Program
{
static void Main(string[] args)
{
// Right side of equation is location of YOUR pdf file
string ppath = "C:\\aworking\\Hawkins.pdf";
PdfReader pdfReader = new PdfReader(ppath);
int numberOfPages = pdfReader.NumberOfPages;
Console.WriteLine(numberOfPages);
Console.ReadLine();
}
}
}
回答2:
This should do the trick:
public int getNumberOfPdfPages(string fileName)
{
using (StreamReader sr = new StreamReader(File.OpenRead(fileName)))
{
Regex regex = new Regex(@"/Type\s*/Page[^s]");
MatchCollection matches = regex.Matches(sr.ReadToEnd());
return matches.Count;
}
}
From Rachael's answer and this one too.
回答3:
found a way at http://www.dotnetspider.com/resources/21866-Count-pages-PDF-file.aspx this does not require purchase of a pdf library
回答4:
I have used pdflib for this.
p = new pdflib();
/* Open the input PDF */
indoc = p.open_pdi_document("myTestFile.pdf", "");
pageCount = (int) p.pcos_get_number(indoc, "length:pages");
回答5:
Docotic.Pdf library may be used to accomplish the task.
Here is sample code:
PdfDocument document = new PdfDocument();
document.Open("file.pdf");
int pageCount = document.PageCount;
The library will parse as little as possible so performance should be ok.
Disclaimer: I work for Bit Miracle.
回答6:
One Line:
int pdfPageCount = System.IO.File.ReadAllText("example.pdf").Split(new string[] { "/Type /Page" }, StringSplitOptions.None).Count()-2;
Recommended: ITEXTSHARP
回答7:
PDFsharp
this one should be better =)
回答8:
I have good success using CeTe Dynamic PDF products. They're not free, but are well documented. They did the job for me.
http://www.dynamicpdf.com/
回答9:
I've used the code above that solves the problem using regex and it works, but it's quite slow. It reads the entire file to determine the number of pages.
I used it in a web app and pages would sometimes list 20 or 30 PDFs at a time and in that circumstance the load time for the page went from a couple seconds to almost a minute due to the page counting method.
I don't know if the 3rd party libraries are much better, I would hope that they are and I've used pdflib in other scenarios with success.