I have XML data that was extracted from a legacy Lotus Notes application and that has embedded richtext formatting. I am trying to format each block of text based on attributes that appear in a previous sibling. I have XSLT inspired by this response from @Jayvee but it's not working.
This is the XML:
<?xml version="1.0" encoding="UTF-8"?>
<document>
<item name="Unordered list">
<richtext>
<pardef/>
<par def="20">
<run>This is the first </run><run>paragraph of the preamble.</run>
</par>
<par>
<run>This is the second paragraph of the </run><run>preamble.</run>
</par>
<pardef id="21" list="unordered"/>
<par def="21">
<run>This is the </run><run>first bullet.</run>
</par>
<par>
<run>This is the second </run><run>bullet.</run>
</par>
<par def="20">
<run>This is the first </run><run>paragraph of the conclusion.</run>
</par>
<par>
<run>This is the second paragraph of the </run><run>conclusion.</run>
</par>
</richtext>
</item>
<item name="Ordered list">
<richtext>
<pardef/>
<par def="20">
<run>This is the first </run><run>paragraph of the preamble.</run>
</par>
<par>
<run>This is the second paragraph of the </run><run>preamble.</run>
</par>
<pardef id="46" list="ordered"/>
<par def="46">
<run>This is the </run><run>first numbered item.</run>
</par>
<par>
<run>This is the another </run><run>numbered item.</run>
</par>
<par def="20">
<run>This is the first </run><run>paragraph of the conclusion.</run>
</par>
<par>
<run>This is the second paragraph of the </run><run>conclusion.</run>
</par>
</richtext>
</item>
</document>
This is the desired output:
<html>
<body>
<table border="1">
<tr>
<td>Unordered list</td>
<td>
<p>This is the first paragraph of the preamble.</p>
<p>This is the second paragraph of the preamble.</p>
<ul>
<li>This is the first bullet.</li>
<li>This is the second bullet.</li>
</ul>
<p>This is the first paragraph of the conclusion.</p>
<p>This is the second paragraph of the conclusion.</p>
</td>
</tr>
<tr>
<td>Ordered list</td>
<td>
<p>This is the first paragraph of the preamble.</p>
<p>This is the second paragraph of the preamble.</p>
<ol>
<li>This is the first numbered item.</li>
<li>This is the another numbered item.</li>
</ol>
<p>This is the first paragraph of the conclusion.</p>
<p>This is the second paragraph of the conclusion.</p>
</td>
</tr>
</table>
</body>
</html>
This is the XSLT:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<xsl:output indent="yes"/>
<xsl:template match="/*">
<html>
<body>
<table border="1">
<xsl:apply-templates/>
</table>
</body>
</html>
</xsl:template>
<xsl:template match="item">
<tr>
<td><xsl:value-of select="@name"/></td>
<td>
<xsl:apply-templates/>
</td>
</tr>
</xsl:template>
<xsl:template match="par">
<xsl:choose>
<xsl:when test="preceding-sibling::pardef[@list] = 'unordered' and preceding-sibling::par[@def][1][@def] != preceding-sibling::pardef[@id]"><xsl:text disable-output-escaping="yes">&lt;/ul&gt;</xsl:text></xsl:when>
<xsl:when test="preceding-sibling::pardef[@list] = 'ordered' and preceding-sibling::par[@def][1][@def] != preceding-sibling::pardef[@id]"><xsl:text disable-output-escaping="yes">&lt;/ol&gt;</xsl:text></xsl:when>
</xsl:choose>
<xsl:choose>
<xsl:when test="@def=preceding-sibling::pardef[@id] or (not(@def) and preceding-sibling::par[@def][1][@def=preceding-sibling::pardef[@id]])">
<xsl:choose>
<xsl:when test="preceding-sibling::pardef[@list] = 'unordered' and preceding-sibling::par[@def][1][@def] = preceding-sibling::pardef[@id]"><xsl:text disable-output-escaping="yes">&lt;ul&gt;</xsl:text></xsl:when>
<xsl:when test="preceding-sibling::pardef[@list] = 'ordered' and preceding-sibling::par[@def][1][@def] = preceding-sibling::pardef[@id]"><xsl:text disable-output-escaping="yes">&lt;ol&gt;</xsl:text></xsl:when>
</xsl:choose>
<li>
<xsl:for-each select="run">
<xsl:value-of select="text()" separator=""/>
</xsl:for-each>
</li>
</xsl:when>
<xsl:when test="not(@def=preceding-sibling::pardef[@id])">
<p>
<xsl:for-each select="run">
<xsl:value-of select="text()" separator=""/>
</xsl:for-each>
</p>
</xsl:when>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
The approach in the previous question uses `disable-output-escaping` to output the start and end tags, which is not an ideal approach.

Instead, consider using a key to group `par` elements by the first preceding `par` element that has a `def` attribute. Then, assuming you are matched on a `par` element with a `def` attribute, you can use the key to get the rest of the `par` elements in its group.

To work out whether to wrap the group in a `ul` or `ol` tag, you can possibly get the list type from the `list` attribute of the `pardef` element that the `def` attribute refers to. You can then test this to determine whether to wrap the group in a list tag.
Try this XSLT
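A minimal sketch of a stylesheet along these lines, using the key-based grouping described above. The key name `followers`, the variable names `group` and `listType`, and the modes `listItem` and `paragraph` are made up for illustration. It assumes that a `par` whose `def` does not point to a `pardef` with a `list` attribute (such as `def="20"` in the sample) is a plain paragraph, and it sticks to constructs that are also available in XSLT 1.0.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output method="html" indent="yes"/>
  <!-- Drop whitespace-only text nodes so line breaks between elements
       do not leak into the output. -->
  <xsl:strip-space elements="*"/>

  <!-- Index every par WITHOUT a def attribute under the nearest preceding
       sibling par that HAS one; that par is the head of the group. -->
  <xsl:key name="followers" match="par[not(@def)]"
           use="generate-id(preceding-sibling::par[@def][1])"/>

  <xsl:template match="/*">
    <html>
      <body>
        <table border="1">
          <xsl:apply-templates select="item"/>
        </table>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="item">
    <tr>
      <td><xsl:value-of select="@name"/></td>
      <!-- Select only the group heads; each head emits its whole group. -->
      <td><xsl:apply-templates select="richtext/par[@def]"/></td>
    </tr>
  </xsl:template>

  <xsl:template match="par[@def]">
    <!-- The head itself plus every par the key files under it. -->
    <xsl:variable name="group" select=". | key('followers', generate-id())"/>
    <!-- List type of the pardef this head points to; empty when no such
         pardef exists or when it has no list attribute. -->
    <xsl:variable name="listType" select="../pardef[@id = current()/@def]/@list"/>
    <xsl:choose>
      <xsl:when test="$listType = 'unordered'">
        <ul><xsl:apply-templates select="$group" mode="listItem"/></ul>
      </xsl:when>
      <xsl:when test="$listType = 'ordered'">
        <ol><xsl:apply-templates select="$group" mode="listItem"/></ol>
      </xsl:when>
      <xsl:otherwise>
        <xsl:apply-templates select="$group" mode="paragraph"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- The built-in rules copy the text of each run, so the runs of one
       par are concatenated in document order. -->
  <xsl:template match="par" mode="listItem">
    <li><xsl:apply-templates select="run"/></li>
  </xsl:template>

  <xsl:template match="par" mode="paragraph">
    <p><xsl:apply-templates select="run"/></p>
  </xsl:template>
</xsl:stylesheet>
```

Because the `ul` and `ol` elements are created as real result elements around the whole group, the output stays well-formed without any `disable-output-escaping`. If the real data contains `par` elements that appear before the first `par` with a `def` attribute, they are not selected by this sketch and would need a separate rule.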