We have a Java-based system that reads data from a database, merges individual data fields with preset XSL-FO
tags and converts the result to PDF
with Apache FOP
.
In XSL-FO
format it looks like this:
<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE Html [
<!ENTITY nbsp " ">
<!-- all other entities -->
]>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format">
<xsl:output method="xml" indent="yes" />
<xsl:template match="/">
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format" xmlns:svg="http://www.w3.org/2000/svg" font-family="..." font-size="...">
<fo:layout-master-set>
<fo:simple-page-master master-name="Letter Page" page-width="8.500in" page-height="11.000in">
<!-- appropriate settings -->
</fo:simple-page-master>
</fo:layout-master-set>
<fo:page-sequence master-reference="Letter Page">
<!-- some static content -->
<fo:flow flow-name="xsl-region-body">
<fo:block>
<fo:table ...>
<fo:table-column ... />
<fo:table-body>
<fo:table-row>
<fo:table-cell ...>
<fo:block text-align="...">
<fo:inline font-size="..." font-weight="...">
<!-- Header / Title -->
</fo:inline>
</fo:block>
</fo:table-cell>
</fo:table-row>
</fo:table-body>
</fo:table>
</fo:block>
<fo:block>
<fo:table ...>
<fo:table-column ... />
<fo:table-body>
<fo:table-row>
<fo:table-cell>
<fo:block ...>
<!-- Field A -->
</fo:block>
</fo:table-cell>
</fo:table-row>
</fo:table-body>
</fo:table>
<!-- Other fields in a very similar fashion as the above "Field A" -->
</fo:block>
</fo:flow>
</fo:page-sequence>
</fo:root>
</xsl:template>
</xsl:stylesheet>
Now I am looking for a way to allow some of the fields to contain static HTML-formatted content. This content will be generated by our HTML-enabled editor (something along the lines of CLEditor
, CKEditor
, etc.) or pasted from outside.
My plan is to follow the recipe from this JavaWorld article:
- use
JTidy
to convert HTML-formatted string to proper XHTML - further modify xhtml2fo.xsl from Antenna House to remove all document-wide and page-wide transformations
- apply this modified XSLT to my XHTML string (javax.xml.transform)
- extract all the nodes under the root with XPath (javax.xml.xpath)
- feed the result directly into existing XSL-FO document
I have a bare-bone version of such code and got the following error:
(Location of error unknown)org.apache.fop1.fo.ValidationException: "{http://www.w3.org/1999/XSL/Format}table-body" is not a valid child of "fo:block"! (No context info available)
My questions:
- What would be the way to troubleshoot this issue?
- Can
<fo:block>
serve as a generic container with other objects (including tables) nested inside? - Is this an overall reasonable approach to solving the task?
If someone already "been there done that", please share your experience.