Apply transforms to XML attribute containing escap

Posted 2019-03-06 00:55

Question:

I have some XML that looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <issue xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
        <comment text="&lt;div class=&quot;wiki text&quot;&gt;&lt;h4&gt;Tom Fenech&lt;/h4&gt;Here is a comment&lt;/div&gt;&#10;"/>
    </issue>
</root>

As you can see, the text attribute of the comment node contains escaped HTML. I would like to get the contents of the attribute as XHTML, which I currently do inside a template using:

<xsl:value-of select="@text" disable-output-escaping="yes" />

That gets me the HTML in the final output:

<div class="wiki text"><h4>Tom Fenech</h4>Here is a comment</div>

But I want to be able to extract the contents of the <h4> tag to use elsewhere. In general, it would be nice to be able to manipulate the contents of the attribute once it has been unescaped.

How do I apply further templates to the output of the <xsl:value-of />?

I am currently using the PHP built-in XSLT processor, which supports XSLT version 1.0, although I would be willing to consider using an alternative processor if features from newer versions make this possible.

Answer 1:

Here's one way you could do it, by calling into a PHP function from XSLT:

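<?php
// Parse an HTML string into a DOMDocument so XSLT can treat it as a node tree.
function parseHTMLString($html)
{
    $doc = new DOMDocument();
    // loadHTML() tolerates non-XML markup and wraps the fragment in <html><body>.
    $doc->loadHTML($html);
    return $doc;
}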

$xml = <<<EOB
<root>
    <issue xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
        <comment text="&lt;div class=&quot;wiki text&quot;&gt;&lt;h4&gt;Tom Fenech&lt;/h4&gt;Here is a comment&lt;/div&gt;&#10;"/>
    </issue>
</root>
EOB;

$xsl = <<<EOB
<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     xmlns:php="http://php.net/xsl"
     exclude-result-prefixes="php">
<xsl:output method="html" encoding="utf-8" indent="yes"/>
 <xsl:template match="comment">
   <xsl:apply-templates select="php:functionString('parseHTMLString', @text)//div/h4"/>
 </xsl:template>

 <xsl:template match="div/h4">
   <h2><xsl:apply-templates/></h2>
 </xsl:template>
</xsl:stylesheet>
EOB;

$xmldoc = new DOMDocument();
$xmldoc->loadXML($xml);

$xsldoc = new DOMDocument();
$xsldoc->loadXML($xsl);

$proc = new XSLTProcessor();
$proc->registerPHPFunctions('parseHTMLString');
$proc->importStyleSheet($xsldoc);
echo $proc->transformToXML($xmldoc);


Answer 2:

You cannot apply templates to unparsed (escaped or CDATA) text. See some previous answers that may be relevant to you:

Parsing html with xslt

XSLT: Reading a param that's an xml document passed as a string

how to parse the xml inside CDATA of another xml using xslt?
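
To illustrate the point: the escaped markup has to be parsed in a separate step before XPath or templates can see it as nodes. Here is a minimal sketch in plain PHP (no XSLT extension functions), assuming the embedded markup happens to be well-formed XML, as it is in the question's sample. The `wrapper` element name is made up for this example; it only gives the fragment a single root.

```php
<?php
// Outer document, copied from the question (nowdoc, so nothing is interpolated).
$xml = <<<'XML'
<root>
    <issue>
        <comment text="&lt;div class=&quot;wiki text&quot;&gt;&lt;h4&gt;Tom Fenech&lt;/h4&gt;Here is a comment&lt;/div&gt;&#10;"/>
    </issue>
</root>
XML;

// First parse: the XML parser unescapes the attribute value for us.
$outer = new DOMDocument();
$outer->loadXML($xml);
$outerXpath = new DOMXPath($outer);
$text = $outerXpath->evaluate('string(//comment/@text)');

// Second parse: turn the unescaped string into a node tree.
// Assumption: the string is well-formed XML; "wrapper" is a hypothetical
// root element added so the fragment (plus trailing newline) parses cleanly.
$inner = new DOMDocument();
$inner->loadXML('<wrapper>' . $text . '</wrapper>');

// The <h4> is now an ordinary element that XPath can reach.
$innerXpath = new DOMXPath($inner);
$heading = $innerXpath->evaluate('string(//h4)');

echo $heading, "\n";
```

The `$inner` document could equally be passed to `XSLTProcessor::transformToXML()` with a second stylesheet. If the embedded markup is not guaranteed to be well-formed XML, `loadHTML()` is the more forgiving choice for the second parse.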



Tags: xml xslt