I am transforming some generated DocBook XML (from Doxygen) into my company's XML, which is really a subset of DocBook. There is a para element like the following:
<para>some text.....
<literallayout>
</literallayout>
more text....
<table>
...
</table>
even more text
<table>...</table>
<literallayout>text also look here</literlayout>
more text <link xlink:href="http://someurl.com">
</para>
Our subset of DocBook does NOT allow block elements such as table or figure within a para, so I would like to parse this element and put new para elements around the pieces of text, giving something like this:
<para>some text.....
</para>
<literallayout>
</literallayout>
<para>
more text....
</para>
<table>
...
</table>
<para>
even more text
</para>
<table>...</table>
<literallayout>text also look here </literlayout>
<para> more text</para>
<para> <link xlink:href="http://someurl.com"></para>
Previously, thinking I would never see anything this complex, I was moving tables outside of a para element like this:
<xsl:when test="( child::figure | child::table ) and (./text())">
  <Para>
    <xsl:value-of select="./text()"/>
  </Para>
  <xsl:apply-templates select="*"/>
</xsl:when>
But that ended up catching only the first text node and messing other things up.
Can anyone suggest a way, hopefully an elegant one, to handle para elements that are this messy?
Thanks,
Russ
Update: I neglected to mention a corner case. I've edited the original source above; check the link element. The current solution removes the containing para element from the source.
I had to correct your XML example a bit so that it was well-formed. With that done, the following:
<xsl:template match="para">
  <xsl:for-each select="node()">
    <xsl:choose>
      <xsl:when test="self::text() and normalize-space(.)!=''">
        <xsl:element name="para">
          <xsl:apply-templates select="."/>
        </xsl:element>
      </xsl:when>
      <xsl:otherwise>
        <xsl:apply-templates select="."/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:for-each>
</xsl:template>
<xsl:template match="text()">
  <xsl:copy-of select="."/>
</xsl:template>
<xsl:template match="literallayout">
  <xsl:copy-of select="."/>
</xsl:template>
<xsl:template match="table">
  <xsl:copy-of select="."/>
</xsl:template>
Outputs:
<para>some text..... </para>
<literallayout>
</literallayout>
<para> more text.... </para>
<table> ... </table>
<para> even more text </para>
<table>...</table>
<literallayout>text also look here <link xlink:href="http://someurl.com"/></literallayout>
<para> more text. </para>
I hope that helps.
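As for why the original xsl:when only caught the first text node: in XSLT 1.0, xsl:value-of applied to a node-set outputs the string value of the first node only, so select="./text()" never reaches the later text children. A minimal sketch of the difference, assuming an XSLT 1.0 processor and no-namespace DocBook elements; the result, first-only and each element names are made up purely for illustration:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="para">
    <!-- "result", "first-only" and "each" are hypothetical wrapper names -->
    <result>
      <!-- XSLT 1.0: only the FIRST text() child is written here -->
      <first-only><xsl:value-of select="text()"/></first-only>
      <!-- Visiting the text() children one by one reaches all of them;
           whitespace-only nodes are filtered out by the predicate -->
      <xsl:for-each select="text()[normalize-space()]">
        <each><xsl:value-of select="normalize-space(.)"/></each>
      </xsl:for-each>
    </result>
  </xsl:template>
</xsl:stylesheet>
```

This is why the stylesheet above walks node() instead of relying on a single value-of.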
You can turn every text node within a para element into its own para using something like
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()" /></xsl:copy>
  </xsl:template>
  <xsl:template match="para">
    <xsl:apply-templates />
  </xsl:template>
  <xsl:template match="para/text()">
    <para><xsl:value-of select="." /></para>
  </xsl:template>
</xsl:stylesheet>
but this may not be sufficient if you only want to break up the para at certain child elements and not others.
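If the para should break only at block children, one option is to group adjacent nodes by whether they are block elements. The following is a rough sketch, not a drop-in solution. It assumes an XSLT 2.0 processor (xsl:for-each-group is not available in 1.0), no-namespace element names as in the question, and that table, figure and literallayout are the only block elements that matter; adjust that list for your subset.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <!-- Identity template: copy everything that is not handled below -->
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>

  <!-- Only para elements that actually contain a block child are split;
       note that attributes on such a para are not carried over -->
  <xsl:template match="para[table or figure or literallayout]">
    <xsl:for-each-group select="node()"
        group-adjacent="boolean(self::table or self::figure or self::literallayout)">
      <xsl:choose>
        <!-- A run of block elements: emit without a para wrapper -->
        <xsl:when test="current-grouping-key()">
          <xsl:apply-templates select="current-group()"/>
        </xsl:when>
        <!-- A run of text and inline elements with real content:
             wrap the whole run in a single para -->
        <xsl:when test="current-group()[self::* or normalize-space()]">
          <para><xsl:apply-templates select="current-group()"/></para>
        </xsl:when>
        <!-- Whitespace-only runs between blocks are dropped -->
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:template>
</xsl:stylesheet>
```

With this approach an inline element such as link stays in the same para as the text next to it, which differs slightly from the desired output in the question, where the link gets a para of its own.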
I would use these templates:
<xsl:template match="para">
  <xsl:apply-templates select="node()" mode="flat" />
</xsl:template>
<xsl:template match="*" mode="flat">
  <xsl:copy-of select="." />
</xsl:template>
<xsl:template match="text()[normalize-space()!='']" mode="flat">
  <para>
    <xsl:value-of select="."/>
  </para>
</xsl:template>
<xsl:template match="text()[normalize-space()='']" mode="flat" />
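For the link corner case from the update, the same mode="flat" idea can probably be extended so that only known block elements are copied bare, while any other child element gets a para of its own, as in the desired output in the question. This is a sketch under assumptions: XSLT 1.0, no-namespace element names, and table, figure and literallayout as the complete list of block elements. The identity template is added only to make the stylesheet self-contained.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Identity template for everything outside a para -->
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>

  <!-- Drop the para wrapper and process its children in "flat" mode -->
  <xsl:template match="para">
    <xsl:apply-templates select="node()" mode="flat"/>
  </xsl:template>

  <!-- Block elements (assumed list): copied as they are, outside any para -->
  <xsl:template match="table | figure | literallayout" mode="flat">
    <xsl:copy-of select="."/>
  </xsl:template>

  <!-- Any other element, e.g. link: wrapped in its own para.
       The bare "*" pattern has lower default priority than the named
       patterns above, so block elements never end up here -->
  <xsl:template match="*" mode="flat">
    <para><xsl:copy-of select="."/></para>
  </xsl:template>

  <!-- Non-blank text: wrapped in its own para -->
  <xsl:template match="text()[normalize-space()!='']" mode="flat">
    <para><xsl:value-of select="."/></para>
  </xsl:template>

  <!-- Whitespace-only text between children: dropped -->
  <xsl:template match="text()[normalize-space()='']" mode="flat"/>
</xsl:stylesheet>
```

The trade-off is that every inline element is separated from the text around it, so a sentence with a link in the middle would be split into three para elements.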