I have some XML that looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<root>
  <issue xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <comment text="&lt;div class=&quot;wiki text&quot;&gt;&lt;h4&gt;Tom Fenech&lt;/h4&gt;Here is a comment&lt;/div&gt; "/>
  </issue>
</root>
As you can see, the text attribute of the comment node contains escaped HTML. I would like to get the contents of the attribute as XHTML, which I currently do inside a template using:
<xsl:value-of select="@text" disable-output-escaping="yes" />
That gets me the HTML in the final output:
<div class="wiki text"><h4>Tom Fenech</h4>Here is a comment</div>
But I want to be able to extract the contents of the <h4> tag to use elsewhere. In general, it would be nice to be able to manipulate the contents of this attribute once it has been unescaped.
How do I apply further templates to the output of the <xsl:value-of />?
I am currently using the PHP built-in XSLT processor, which supports XSLT version 1.0, although I would be willing to consider using an alternative processor if features from newer versions make this possible.
Here's one way you could do it, by calling into a PHP function from XSLT:
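// Parse an HTML string into a DOMDocument. Returning a DOM node lets
// the XSLT processor treat the result as a node-set.
function parseHTMLString($html)
{
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    return $doc;
}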
$xml = <<<EOB
<root>
  <issue xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <comment text="&lt;div class=&quot;wiki text&quot;&gt;&lt;h4&gt;Tom Fenech&lt;/h4&gt;Here is a comment&lt;/div&gt; "/>
  </issue>
</root>
EOB;
$xsl = <<<EOB
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:php="http://php.net/xsl"
    exclude-result-prefixes="php">
  <xsl:output method="html" encoding="utf-8" indent="yes"/>
  <xsl:template match="comment">
    <xsl:apply-templates select="php:functionString('parseHTMLString', @text)//div/h4"/>
  </xsl:template>
  <xsl:template match="div/h4">
    <h2><xsl:apply-templates/></h2>
  </xsl:template>
</xsl:stylesheet>
EOB;
// Load the source document and the stylesheet
$xmldoc = new DOMDocument();
$xmldoc->loadXML($xml);
$xsldoc = new DOMDocument();
$xsldoc->loadXML($xsl);
// Expose parseHTMLString to the stylesheet, then run the transformation
$proc = new XSLTProcessor();
$proc->registerPHPFunctions('parseHTMLString');
$proc->importStylesheet($xsldoc);
echo $proc->transformToXML($xmldoc);
You cannot apply templates to unparsed (escaped or CDATA) text. See some previous answers that may be relevant to you:
Parsing html with xslt
XSLT: Reading a param that's an xml document passed as a string
how to parse the xml inside CDATA of another xml using xslt?
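Since the text has to be parsed before templates can see it, one option is to do the parsing in PHP before the transformation runs. The following is only a sketch, assuming the input structure from the question. The helper name expandEscapedHtml and the wrapper element name parsed are hypothetical choices made for illustration, not part of any library.

```php
<?php
// Hypothetical helper: parses the escaped HTML held in each comment/@text
// and appends the resulting nodes to that comment inside a <parsed> wrapper
// element, so that ordinary XSLT 1.0 templates can match them.
function expandEscapedHtml(DOMDocument $doc): void
{
    $xpath = new DOMXPath($doc);
    // Copy the matches into an array before modifying the tree
    foreach (iterator_to_array($xpath->query('//comment[@text]')) as $comment) {
        $text = $comment->getAttribute('text');
        if (trim($text) === '') {
            continue; // nothing to parse
        }

        // Collect HTML parser warnings instead of printing them
        $previous = libxml_use_internal_errors(true);
        $html = new DOMDocument();
        $html->loadHTML($text);
        libxml_clear_errors();
        libxml_use_internal_errors($previous);

        // loadHTML wraps fragments in html/body; copy only the body's children
        $wrapper = $doc->createElement('parsed');
        $body = $html->getElementsByTagName('body')->item(0);
        if ($body !== null) {
            foreach ($body->childNodes as $child) {
                $wrapper->appendChild($doc->importNode($child, true));
            }
        }
        $comment->appendChild($wrapper);
    }
}

// Usage with the document from the question
$doc = new DOMDocument();
$doc->loadXML('<root><issue><comment text="&lt;div class=&quot;wiki text&quot;&gt;&lt;h4&gt;Tom Fenech&lt;/h4&gt;Here is a comment&lt;/div&gt; "/></issue></root>');
expandEscapedHtml($doc);

$check = new DOMXPath($doc);
echo $check->evaluate('string(//comment/parsed/div/h4)'), "\n"; // prints "Tom Fenech"
```

After this step the stylesheet needs no extension functions: a template matching comment can use something like <xsl:apply-templates select="parsed//h4"/>. The trade-off is that the source document is modified before the transformation rather than during it.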