iText partial HTML rendering

Posted 2019-08-14 20:32

Question:

I am using the iText PDF library for Java to generate a PDF. I want to render HTML content in only one part of the document, not in the whole document. Here is the section where I want the content rendered as HTML:

waterIndexTrendTable.getDefaultCell().setHorizontalAlignment(Element.ALIGN_LEFT); 
waterIndexTrendTable.addCell(new Phrase(weit.getUnit(), smallFont));  

waterIndexTrendTable is a PdfPTable. weit.getUnit() returns content with HTML tags. I want to render HTML to the PDF.

Answer 1:

If weit.getUnit() returns HTML, then the snippet in your question will show the raw HTML code in your cell, tags included.

To avoid this, you need to render the HTML to a list of iText objects. This is shown in the first part of the ParseHtmlObjects example:

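import java.io.FileInputStream;

import com.itextpdf.tool.xml.ElementList;
import com.itextpdf.tool.xml.XMLWorker;
import com.itextpdf.tool.xml.XMLWorkerHelper;
import com.itextpdf.tool.xml.html.Tags;
import com.itextpdf.tool.xml.parser.XMLParser;
import com.itextpdf.tool.xml.pipeline.css.CSSResolver;
import com.itextpdf.tool.xml.pipeline.css.CssResolverPipeline;
import com.itextpdf.tool.xml.pipeline.end.ElementHandlerPipeline;
import com.itextpdf.tool.xml.pipeline.html.HtmlPipeline;
import com.itextpdf.tool.xml.pipeline.html.HtmlPipelineContext;

// CSS: resolve styles with the default style sheet
CSSResolver cssResolver =
        XMLWorkerHelper.getInstance().getDefaultCssResolver(true);
// HTML: map HTML tags to iText objects
HtmlPipelineContext htmlContext = new HtmlPipelineContext(null);
htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
htmlContext.autoBookmark(false);
// Pipelines: collect the result in a list instead of writing to a document
ElementList elements = new ElementList();
ElementHandlerPipeline end = new ElementHandlerPipeline(elements, null);
HtmlPipeline html = new HtmlPipeline(htmlContext, end);
CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);
// XML Worker: HTML is the path to the HTML file in the example
XMLWorker worker = new XMLWorker(css, true);
XMLParser p = new XMLParser(worker);
p.parse(new FileInputStream(HTML));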

Now you have an ElementList named elements that holds iText objects, which you can add to a cell:

PdfPCell cell = new PdfPCell();
for (Element e : elements) {
    cell.addElement(e);
}
// Add the populated cell to the table from the question
waterIndexTrendTable.addCell(cell);
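
The parsing example above reads from a file, whereas weit.getUnit() returns a String. Since p.parse(...) accepts an InputStream, one way to bridge the gap is to wrap the String in an in-memory stream. This is only a sketch: the HtmlInput class and the asStream method are hypothetical names, and the choice of UTF-8 is an assumption that should match whatever encoding the parser is told to expect.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

public class HtmlInput {
    /**
     * Wraps an HTML string in an in-memory stream so that it can be passed
     * to an API that expects an InputStream.
     * Assumption: the consumer decodes the bytes as UTF-8.
     */
    public static ByteArrayInputStream asStream(String html) {
        return new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8));
    }
}
```

With this helper, the last line of the parsing example would become p.parse(HtmlInput.asStream(weit.getUnit())).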

If the HTML returned by weit.getUnit() contains more data than you need, iText has no way to read your mind and work out which part you want to keep and which part you want to throw away.

Maybe you are only interested in specific element types. In that case, you can examine whether e is a Paragraph, or a List, or any other of the types that are available in iText.
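
Checking the element type can be sketched with a small generic helper. The ElementFilter class and the keepOnly method are hypothetical names, not part of iText; the helper uses only the standard library and works on any Iterable.

```java
import java.util.ArrayList;
import java.util.List;

public class ElementFilter {
    /**
     * Returns the items that are instances of the given type,
     * in their original order. Subclasses of the type match too.
     */
    public static <T> List<T> keepOnly(Iterable<?> items, Class<T> type) {
        List<T> kept = new ArrayList<>();
        for (Object item : items) {
            if (type.isInstance(item)) {
                kept.add(type.cast(item));
            }
        }
        return kept;
    }
}
```

Applied to the list from the parsing step, ElementFilter.keepOnly(elements, Paragraph.class) would return only the Paragraph objects, which can then be added to the cell with cell.addElement(...). Because subclasses also match, filtering on a broad type keeps more than filtering on a narrow one.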

Or maybe you can reduce the HTML to the part that needs to be rendered up-front.
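
Reducing the HTML before parsing could look like the sketch below. It assumes the part you need is delimited by a known start and end marker; HtmlTrimmer and between are hypothetical names. The helper does a plain string search and is not HTML-aware, so nested elements with the same tag name will cut the fragment short. A real HTML parser is the more robust choice for anything beyond simple input.

```java
public class HtmlTrimmer {
    /**
     * Returns the first fragment of html that starts with the start marker
     * and ends with the next end marker, markers included.
     * Returns the input unchanged if either marker is missing.
     */
    public static String between(String html, String start, String end) {
        int from = html.indexOf(start);
        if (from < 0) {
            return html;
        }
        int to = html.indexOf(end, from + start.length());
        if (to < 0) {
            return html;
        }
        return html.substring(from, to + end.length());
    }
}
```

For example, HtmlTrimmer.between(weit.getUnit(), "<p>", "</p>") would keep only the first paragraph, and that fragment is then what you pass to the parser.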

In any case: you should not expect that a computer can guess which parts of some HTML are important to you and which parts aren't ;-)



Tags: java html itext