iText XML to PDF using latest version

Posted 2020-02-05 09:13

Question:

I found a few examples showing how to convert XML to PDF using an iText XML document, but they are all for the older 4.x version. Are there any examples for version 5.x, or can someone post the updated code required to do the same thing there?

All the examples refer to code like this, but I cannot find what replaces the ITextHandler class in the new version.
http://www.ridgway.co.za/archive/2005/07/31/itextsharpxmltopdfexample.aspx

Document document = new Document();
PdfWriter.GetInstance(document, new FileStream("ExampleDoc.pdf", FileMode.Create));
ITextHandler xmlHandler = new ITextHandler(document);
xmlHandler.Parse("ExampleDoc.xml");

Also, I am not trying to go from HTML to PDF. The CSS styling never comes out as expected.

Editing to bump it up, really need some help here. Anyone at all?

Answer 1:

iText's processing of XML files using a proprietary syntax was removed a very long time ago. See this and this for direct answers from the author. Instead, you are encouraged to use a globally recognized XML standard: XHTML.

I know you said that you don't want to use HTML because it never comes out correctly but maybe you could post some samples of what you're trying and we could help. Also, please make sure that you are using the XMLWorker and not the HTMLWorker. See these links for additional help/info when using it.

  • List of supported CSS properties
  • Controlling fonts in HTML processing
  • Adding base64 encoded images
  • Changing the default image root path for relative images

EDIT

This edit is in answer to @JohnC's comment

I can't speak for the iText team and their reasons but I can guess at things. PDF doesn't have "paragraphs", "words", "tables", etc. Instead, PDF has text, drawings (lines, patterns) and images. If you want to do these things manually you can use the raw PdfContentByte objects. You are encouraged, however, to use iText's abstractions like Paragraph and PdfPTable which use the PdfContentByte on your behalf.

For iText to support an XML format, it would first need to create its own proprietary DTD and/or XML Schema. If any features got added, it would then need to version the schema properly, which can cause problems and confusion for consumers. Then it would need to build and maintain a parser that turned the XML abstractions into either iText abstractions or raw PDF commands. For the former, you have an abstraction talking to an abstraction, which is just begging to break. For the latter, you now have two abstraction implementations that will eventually run into feature parity issues.

Further, what would the XML represent? Paragraphs, chunks of text, images and tables? That sounds like HTML already, so there's no need to repeat that kind of schema. Or would it be "put content Z at coordinate X,Y using Font ABC"? That's where the PdfContentByte comes in. True, there could be a native parser, but I'm guessing there just aren't many people asking for one. Or would the XML be your own format based on your own data, with things like <book> and <inventory>? If that's the case, then iText would really have no idea how to style that either. You could, however, use .NET/Java and XSLT to transform your XML into XHTML that it does know.
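
To make the XSLT suggestion a bit more concrete, here is a minimal sketch in Java using only the JDK's built-in javax.xml.transform API. The class name, the sample <inventory>/<book> document and the stylesheet are all made up for illustration; your own XML and styling rules will differ. In .NET, System.Xml.Xsl.XslCompiledTransform plays the same role.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: turns a custom XML document into XHTML via XSLT.
class XmlToXhtml {
    // Illustrative input only; element names are assumptions, not an iText format.
    static final String SAMPLE_XML =
        "<inventory>"
      + "<book><title>iText in Action</title><price>39.99</price></book>"
      + "<book><title>XSLT Cookbook</title><price>29.95</price></book>"
      + "</inventory>";

    // Illustrative stylesheet: one table row per <book>.
    static final String SAMPLE_XSLT =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:template match=\"/inventory\">"
      + "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body><table>"
      + "<xsl:for-each select=\"book\">"
      + "<tr><td><xsl:value-of select=\"title\"/></td>"
      + "<td><xsl:value-of select=\"price\"/></td></tr>"
      + "</xsl:for-each>"
      + "</table></body></html>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the XML and returns the result as a string.
    static String transform(String xml, String xslt) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            // Force well-formed XML output rather than the HTML output method.
            t.setOutputProperty(OutputKeys.METHOD, "xml");
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(SAMPLE_XML, SAMPLE_XSLT));
    }
}
```

The resulting XHTML string is what you would then hand to XMLWorker instead of the old iText XML format. Keeping the transformation in a stylesheet means the mapping from your data to markup can change without touching the PDF code.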