Question:
I'm using a PDF converter to access the graphical data within a PDF. Everything works fine, except that I can't get a list of the bookmarks. Is there a command-line app or a C# component that can read a PDF's bookmarks? I found the iText and SharpPDF libraries and I'm currently looking through them. Has anyone done this before?
Answer 1:
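Try the following code:
using System.Collections.Generic;
using System.Linq;
using System.Windows.Forms;
using iTextSharp.text.pdf;
PdfReader pdfReader = new PdfReader(filename);
// GetBookmark returns the top-level bookmarks; it can return null when the PDF has none.
IList<Dictionary<string, object>> bookmarks = SimpleBookmark.GetBookmark(pdfReader);
for (int i = 0; bookmarks != null && i < bookmarks.Count; i++)
{
    // Show the first value stored for this bookmark (normally its title).
    MessageBox.Show(bookmarks[i].Values.ToArray().GetValue(0).ToString());
    if (bookmarks[i].Count > 3)
    {
        // For entries with more than three keys, show how many keys they carry.
        MessageBox.Show(bookmarks[i].ToList().Count.ToString());
    }
}
pdfReader.Close();
Note: Don't forget to add a reference to the iTextSharp DLL in your project.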
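The loop above only looks at top-level entries. Here is a minimal sketch of walking nested bookmarks as well. It assumes that each dictionary keeps its caption under the "Title" key and its children under the "Kids" key, in the same list-of-dictionaries shape as the top level. BookmarkDump and Flatten are hypothetical names used only for illustration, and the helper works on plain collections, so it needs no library reference itself.

```csharp
using System;
using System.Collections.Generic;

static class BookmarkDump
{
    // Flattens a bookmark tree into indented lines, one line per bookmark.
    // Assumption: captions live under "Title" and children under "Kids".
    public static List<string> Flatten(IList<Dictionary<string, object>> bookmarks, int depth = 0)
    {
        var lines = new List<string>();
        if (bookmarks == null)
        {
            return lines; // no outline at all, or a "Kids" value of another shape
        }

        foreach (Dictionary<string, object> bookmark in bookmarks)
        {
            object title;
            bookmark.TryGetValue("Title", out title);
            // Indent by two spaces per nesting level.
            lines.Add(new string(' ', depth * 2) + (title ?? "(untitled)"));

            object kids;
            if (bookmark.TryGetValue("Kids", out kids))
            {
                // Recurse into the children one level deeper.
                lines.AddRange(Flatten(kids as IList<Dictionary<string, object>>, depth + 1));
            }
        }
        return lines;
    }
}
```

With the reader from the answer above, this could be called as BookmarkDump.Flatten(SimpleBookmark.GetBookmark(pdfReader)) and the returned lines written to the console or a list box.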
Answer 2:
You might try the Docotic.Pdf library for this task if you are fine with a commercial solution.
Here is sample code that lists all top-level bookmark items along with some of their properties.
using (PdfDocument doc = new PdfDocument("file.pdf"))
{
    PdfOutlineItem root = doc.OutlineRoot;
    foreach (PdfOutlineItem item in root.Children)
    {
        Console.WriteLine("{0} ({1} child nodes, points to page {2})",
            item.Title, item.ChildCount, item.PageIndex);
    }
}
The PdfOutlineItem class also provides properties related to outline item styles and more.
Disclaimer: I work for the vendor of the library.
Answer 3:
If a commercial library is an option for you, you could give Amyuni PDF Creator .Net a try.
Use Amyuni.PDFCreator.IacDocument.RootBookmark to retrieve the root of the bookmark tree, then use the properties of IacBookmark to access each tree element, navigate through the tree, and add, edit or remove elements if needed.
The usual disclaimer applies.
Answer 4:
You can use the PDFsharp library. It is published under the MIT License so it can be used even in corporate development. Here is an untested example.
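using System;
using PdfSharp.Pdf;
using PdfSharp.Pdf.IO;
using (PdfDocument document = PdfReader.Open("bookmarked.pdf", PdfDocumentOpenMode.Import))
{
    // The catalog's /Outlines entry is the root of the bookmark tree; it is missing when the PDF has no bookmarks.
    PdfDictionary outline = document.Internals.Catalog.Elements.GetDictionary("/Outlines");
    if (outline != null)
    {
        PrintBookmark(outline);
    }
}
void PrintBookmark(PdfDictionary bookmark)
{
    // The root /Outlines dictionary normally has no /Title, so its line is usually blank.
    Console.WriteLine(bookmark.Elements.GetString("/Title"));
    // Children form a linked list: /First is the first child, /Next is the following sibling.
    for (PdfDictionary child = bookmark.Elements.GetDictionary("/First"); child != null; child = child.Elements.GetDictionary("/Next"))
    {
        PrintBookmark(child);
    }
}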
Gotchas:
- PDFsharp doesn't handle opening PDFs above version 1.6 very well. It throws: "cannot handle iref streams. the current implementation of pdfsharp cannot handle this pdf feature introduced with acrobat 6"
- There are many types of strings in PDFs, and PDFsharp returns them as is, including UTF-16BE strings (see section 7.9.2.1 of ISO 32000:2008).
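To illustrate the second gotcha, here is a rough sketch of decoding a title that comes back as raw UTF-16BE. It assumes the returned string carries one byte per char (values 0 to 255) and starts with the byte order mark FE FF. Whether a given PDFsharp version hands strings back in this form is an assumption, and PdfTextString.Decode is a hypothetical helper, not part of PDFsharp. Strings without the mark are returned unchanged, so PDFDocEncoding is not handled here.

```csharp
using System;
using System.Text;

static class PdfTextString
{
    // Decodes a raw PDF text string whose chars each hold one byte.
    // Strings that do not start with the UTF-16BE byte order mark are returned as is.
    public static string Decode(string raw)
    {
        if (string.IsNullOrEmpty(raw) || raw.Length < 2 || raw[0] != '\u00FE' || raw[1] != '\u00FF')
        {
            return raw;
        }

        // Copy everything after the two-byte mark into a byte array.
        var bytes = new byte[raw.Length - 2];
        for (int i = 0; i < bytes.Length; i++)
        {
            char c = raw[i + 2];
            if (c > 0xFF)
            {
                return raw; // not a byte-per-char string after all
            }
            bytes[i] = (byte)c;
        }
        return Encoding.BigEndianUnicode.GetString(bytes);
    }
}
```

In the PrintBookmark method above, the title could then be printed with Console.WriteLine(PdfTextString.Decode(bookmark.Elements.GetString("/Title"))).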