How to replace a tag within CDATA in XSLT

Posted 2019-07-25 18:34

Question:

I have a requirement to replace a particular tag inside a CDATA section. For example,

<MASTER_COMMENTS>
<![CDATA[<pre> Nice Work done </pre>]]>
</MASTER_COMMENTS>

to

<MASTER_COMMENTS>
<![CDATA[<span> Nice Work done </span>]]>
</MASTER_COMMENTS>

using an XSLT sub-template.

Can you please help me write this?

I tried the following, but it does not work:

<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
 <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" cdata-section-elements="//MASTER_COMMENTS"/>

 <xsl:template match="pre">
      <span><xsl:value-of select="."/></span>
 </xsl:template>

Answer 1:

<xsl:template match="pre">

will not match anything in your input, because a CDATA section contains purely textual data, not XML markup.
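To see this for yourself, a small diagnostic stylesheet can count what the XSLT processor actually finds. This is only a sketch to run against the sample input from the question. For that input it should report zero pre elements, and a single text node inside MASTER_COMMENTS that holds the markup as plain characters.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="text"/>

    <xsl:template match="/">
        <!-- Element nodes named pre anywhere in the document -->
        <xsl:text>pre elements: </xsl:text>
        <xsl:value-of select="count(//pre)"/>
        <!-- Text nodes that are direct children of MASTER_COMMENTS -->
        <xsl:text>&#10;text nodes in MASTER_COMMENTS: </xsl:text>
        <xsl:value-of select="count(//MASTER_COMMENTS/text())"/>
        <xsl:text>&#10;</xsl:text>
    </xsl:template>

</xsl:stylesheet>
```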

If you can, do the transformation in two passes: first, disable output escaping on MASTER_COMMENTS and save the result to a file; then process the resulting file as "normal" XML.
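A minimal sketch of the first pass could look like the stylesheet below. It assumes that the processor supports disable-output-escaping (an optional feature in XSLT 1.0), that the processor itself serializes the result to a file, and that the text inside MASTER_COMMENTS is well-formed markup; otherwise the second pass will fail to parse the file.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="xml" encoding="UTF-8"/>

    <!-- Identity template: copy everything else unchanged -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- Write the text of MASTER_COMMENTS without escaping, so the
         serialized file contains a real pre element -->
    <xsl:template match="MASTER_COMMENTS">
        <xsl:copy>
            <xsl:value-of select="." disable-output-escaping="yes"/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
```

The second pass can then read the saved file and use an ordinary match="pre" template, because pre is now an element rather than text.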

Alternatively, you could try and process the contents using string functions, for example:

<xsl:template match="MASTER_COMMENTS">
    <xsl:copy>
        <!-- text before the opening tag -->
        <xsl:value-of select="substring-before(., '&lt;pre&gt;')" />
        <xsl:text>&lt;span&gt;</xsl:text>
        <!-- text between the opening and closing tags -->
        <xsl:value-of select="substring-before(substring-after(., '&lt;pre&gt;'), '&lt;/pre&gt;')" />
        <xsl:text>&lt;/span&gt;</xsl:text>
        <!-- text after the closing tag -->
        <xsl:value-of select="substring-after(., '&lt;/pre&gt;')" />
    </xsl:copy>
</xsl:template>

Note that this example assumes there is exactly one pre "element" in the processed text.
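If the text may contain no pre "element" or several of them, a recursive named template is one possible way to lift that restriction in XSLT 1.0. The sketch below is not part of the original answer: the template name replace-all and its parameters text, from and to are made up for illustration. It only handles tags written exactly as <pre> and </pre>, without attributes or extra whitespace.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="xml" cdata-section-elements="MASTER_COMMENTS"/>

    <!-- Identity template: copy everything else unchanged -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="MASTER_COMMENTS">
        <xsl:copy>
            <!-- First replace every opening tag... -->
            <xsl:variable name="opened">
                <xsl:call-template name="replace-all">
                    <xsl:with-param name="text" select="string(.)"/>
                    <xsl:with-param name="from" select="'&lt;pre&gt;'"/>
                    <xsl:with-param name="to" select="'&lt;span&gt;'"/>
                </xsl:call-template>
            </xsl:variable>
            <!-- ...then every closing tag in the intermediate result -->
            <xsl:call-template name="replace-all">
                <xsl:with-param name="text" select="string($opened)"/>
                <xsl:with-param name="from" select="'&lt;/pre&gt;'"/>
                <xsl:with-param name="to" select="'&lt;/span&gt;'"/>
            </xsl:call-template>
        </xsl:copy>
    </xsl:template>

    <!-- Hypothetical helper: replaces all occurrences of $from in $text with $to -->
    <xsl:template name="replace-all">
        <xsl:param name="text"/>
        <xsl:param name="from"/>
        <xsl:param name="to"/>
        <xsl:choose>
            <xsl:when test="contains($text, $from)">
                <xsl:value-of select="substring-before($text, $from)"/>
                <xsl:value-of select="$to"/>
                <!-- Recurse on the remainder after the first match -->
                <xsl:call-template name="replace-all">
                    <xsl:with-param name="text" select="substring-after($text, $from)"/>
                    <xsl:with-param name="from" select="$from"/>
                    <xsl:with-param name="to" select="$to"/>
                </xsl:call-template>
            </xsl:when>
            <xsl:otherwise>
                <!-- No match left: emit the text as it is -->
                <xsl:value-of select="$text"/>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>

</xsl:stylesheet>
```

Text without any pre tag passes through unchanged. Very long texts with many occurrences recurse once per match, which some processors may limit.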



Answer 2:

Here is an XSLT 3.0 stylesheet that uses both parse-xml and serialize to implement the requirement. It works fine for me with Saxon 9.7 HE:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:math="http://www.w3.org/2005/xpath-functions/math"
    exclude-result-prefixes="xs math"
    version="3.0">

    <xsl:output cdata-section-elements="MASTER_COMMENTS"/>

    <xsl:template match="MASTER_COMMENTS">
        <xsl:copy>
            <xsl:variable name="content">
                <xsl:apply-templates select="parse-xml(.)"/>
            </xsl:variable>
            <xsl:variable name="ser-params">
                <output:serialization-parameters xmlns:output="http://www.w3.org/2010/xslt-xquery-serialization">
                    <output:omit-xml-declaration value="yes"/>
                </output:serialization-parameters>
            </xsl:variable>
            <xsl:value-of select="serialize($content, $ser-params/*)"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="pre">
        <span>
            <xsl:apply-templates/>
        </span>
    </xsl:template>

</xsl:stylesheet>

The output is:

<?xml version="1.0" encoding="UTF-8"?><MASTER_COMMENTS><![CDATA[<span> Nice Work done </span>]]></MASTER_COMMENTS>


Answer 3:

Well, you can, but you have to do it with text replacements instead of template matching. Note that this gets really hard when the text may contain no <pre> tag, or more than one, to be replaced. If that is what your stylesheet is mainly about, I'd suggest using a text transform

<xsl:transform version="1.0" 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="text" />

<xsl:template match="/">
    <xsl:text>&lt;TOP></xsl:text>
        <xsl:apply-templates/>
    <xsl:text>&lt;/TOP></xsl:text>
</xsl:template>

<xsl:template match="MASTER_COMMENTS">
    <xsl:text>&lt;MASTER_COMMENTS></xsl:text>
        <xsl:value-of select="."/>
    <xsl:text>&lt;/MASTER_COMMENTS></xsl:text>
</xsl:template>

</xsl:transform>

to make the contents available as "text", and then use that text as XML input to your transform, where normal template matching works on what was previously inside the CDATA sections.

For the textual approach, see the answer of michael.hor257k.
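If an XSLT 2.0 or later processor is available (an assumption; the question uses version="1.0"), the textual approach gets much shorter with the regex function replace(), which also copes with zero or several tags. A sketch, again limited to tags written exactly as <pre> and </pre>:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="xml" cdata-section-elements="MASTER_COMMENTS"/>

    <!-- Identity template: copy everything else unchanged -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="MASTER_COMMENTS">
        <xsl:copy>
            <!-- Group 1 captures the optional slash, so one pattern
                 covers both the opening and the closing tag -->
            <xsl:value-of select="replace(., '&lt;(/?)pre&gt;', '&lt;$1span&gt;')"/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
```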



Answer 4:

If a node containing XML markup has been incorrectly labelled as CDATA, then the XML parser will simply return character data, and to extract the tags you will need to put this character data through a second phase of parsing. You can do this in XSLT 3.0 by calling the parse-xml() function; in other XSLT processors you may be able to do the same thing with an extension function.



Tags: xml xslt