Converting XML to escaped text in XSLT

2019-01-11 06:44发布

How can I convert the following XML to an escaped text using XSLT?

Source:

<?xml version="1.0" encoding="utf-8"?>
<abc>
  <def ghi="jkl">
    mnop
  </def>
</abc>

Output:

<TestElement>&lt;?xml version="1.0" encoding="utf-8"?&gt;&lt;abc&gt;&lt;def ghi="jkl"&gt;
    mnop
  &lt;/def&gt;&lt;/abc&gt;</TestElement>

Currently, I'm trying the following XSLT and it doesn't seem to work properly:

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="utf-8" />
  <xsl:template match="/">
    <xsl:variable name="testVar">
      <xsl:copy>
        <xsl:apply-templates select="@* | node()"/>
      </xsl:copy>
    </xsl:variable>

    <TestElement>
      <xsl:value-of select="$testVar"/>
    </TestElement>
  </xsl:template>
</xsl:stylesheet>

Output of XSLT statement by the .NET XslCompiledTransform comes out as the following:

<?xml version="1.0" encoding="utf-8"?><TestElement>

    mnop

</TestElement>

标签: xml xslt
8条回答
一纸荒年 Trace。
2楼-- · 2019-01-11 06:56

Why can't you just run

<xsl:template match="/">
  <TestElement>
  <xsl:copy-of select="." />
  </TestElement>
</xsl:template>
查看更多
劫难
3楼-- · 2019-01-11 06:59

Do you need to use XSLT? Because, for reasons explained by Pavel Minaev, it would be much simpler to use another tool. An example with xmlstartlet:

% xmlstarlet escape
<?xml version="1.0" encoding="utf-8"?>
<abc>
  <def ghi="jkl">
    mnop
  </def>
</abc>
[Control-D]
&lt;?xml version="1.0" encoding="utf-8"?&gt;
&lt;abc&gt;
  &lt;def ghi="jkl"&gt;
    mnop
  &lt;/def&gt;
&lt;/abc&gt;
查看更多
一纸荒年 Trace。
4楼-- · 2019-01-11 07:04

Your code works the way it does because xsl:value-of retrieves the string-value of the node set.

To do what you want, I'm afraid that you'll have to code it explicitly:

    <xsl:template match="/">
        <TestElement>
            <xsl:apply-templates mode="escape"/>
        </TestElement>
    </xsl:template>

    <xsl:template match="*" mode="escape">
        <!-- Begin opening tag -->
        <xsl:text>&lt;</xsl:text>
        <xsl:value-of select="name()"/>

        <!-- Namespaces -->
        <xsl:for-each select="namespace::*">
            <xsl:text> xmlns</xsl:text>
            <xsl:if test="name() != ''">
                <xsl:text>:</xsl:text>
                <xsl:value-of select="name()"/>
            </xsl:if>
            <xsl:text>='</xsl:text>
            <xsl:call-template name="escape-xml">
                <xsl:with-param name="text" select="."/>
            </xsl:call-template>
            <xsl:text>'</xsl:text>
        </xsl:for-each>

        <!-- Attributes -->
        <xsl:for-each select="@*">
            <xsl:text> </xsl:text>
            <xsl:value-of select="name()"/>
            <xsl:text>='</xsl:text>
            <xsl:call-template name="escape-xml">
                <xsl:with-param name="text" select="."/>
            </xsl:call-template>
            <xsl:text>'</xsl:text>
        </xsl:for-each>

        <!-- End opening tag -->
        <xsl:text>&gt;</xsl:text>

        <!-- Content (child elements, text nodes, and PIs) -->
        <xsl:apply-templates select="node()" mode="escape" />

        <!-- Closing tag -->
        <xsl:text>&lt;/</xsl:text>
        <xsl:value-of select="name()"/>
        <xsl:text>&gt;</xsl:text>
    </xsl:template>

    <xsl:template match="text()" mode="escape">
        <xsl:call-template name="escape-xml">
            <xsl:with-param name="text" select="."/>
        </xsl:call-template>
    </xsl:template>

    <xsl:template match="processing-instruction()" mode="escape">
        <xsl:text>&lt;?</xsl:text>
        <xsl:value-of select="name()"/>
        <xsl:text> </xsl:text>
        <xsl:call-template name="escape-xml">
            <xsl:with-param name="text" select="."/>
        </xsl:call-template>
        <xsl:text>?&gt;</xsl:text>
    </xsl:template>

    <xsl:template name="escape-xml">
        <xsl:param name="text"/>
        <xsl:if test="$text != ''">
            <xsl:variable name="head" select="substring($text, 1, 1)"/>
            <xsl:variable name="tail" select="substring($text, 2)"/>
            <xsl:choose>
                <xsl:when test="$head = '&amp;'">&amp;amp;</xsl:when>
                <xsl:when test="$head = '&lt;'">&amp;lt;</xsl:when>
                <xsl:when test="$head = '&gt;'">&amp;gt;</xsl:when>
                <xsl:when test="$head = '&quot;'">&amp;quot;</xsl:when>
                <xsl:when test="$head = &quot;&apos;&quot;">&amp;apos;</xsl:when>
                <xsl:otherwise><xsl:value-of select="$head"/></xsl:otherwise>
            </xsl:choose>
            <xsl:call-template name="escape-xml">
                <xsl:with-param name="text" select="$tail"/>
            </xsl:call-template>
        </xsl:if>
    </xsl:template>

Note that this solution ignores comment nodes, and inserts unneccessary namespace nodes (as namespace:: axis will include all nodes inherited from parent). Regarding namespaces, however, the resulting quoted XML will be semantically equivalent to the example that you provided in your reply (since those repeated redeclarations don't really change anything).

Also, this won't escape the <?xml ... ?> declaration, simply because it is not present in XPath 1.0 data model (it's not a processing instruction). If you actually need it in the output, you'll have to insert it manually (and make sure that encoding it specifies is consistent with serialization encoding of your XSLT processor).

查看更多
仙女界的扛把子
5楼-- · 2019-01-11 07:07

Anyone who is concerned about licensing ambiguity when reusing code snippets from stack overflow may be interested in the following 3-clause BSD-licensed code, which appears to do what is requested by the original poster:

http://lenzconsulting.com/xml-to-string/

查看更多
Anthone
6楼-- · 2019-01-11 07:09

instead of escaping you can add the text inside a CDATA section. Text inside a CDATA section will be ignored by the parser, similar to if it was escaped.

your example would look like this

<TestElement>
<![CDATA[
<abc>
  <def ghi="jkl">
    mnop
  </def>
</abc>
]]>
</TestElement>

using following XSLT snippet:

<xsl:text disable-output-escaping="yes">&lt;![CDATA[</xsl:text>
 <xsl:copy-of select="/"/>
 <xsl:text disable-output-escaping="yes">]]</xsl:text>
 <xsl:text disable-output-escaping="yes">&gt;</xsl:text>
查看更多
ら.Afraid
7楼-- · 2019-01-11 07:09

If you have access to it, I would recommend the Saxon extention serialize. It does exactly what you want it to do. If you don't want to do that, you'd have to manually insert the entity references as you build the document. It'd be brittle, but it would work for most documents:

<xsl:template match="/">
    <TestElement>
        <xsl:apply-templates/>
    </TestElement>
</xsl:template>
<xsl:template match="*">
    <xsl:text>&lt;</xsl:text>
    <xsl:value-of select="name()"/>
    <xsl:apply-templates select="@*"/>
    <xsl:text>&gt;</xsl:text>
    <xsl:apply-templates select="node()"/>
    <xsl:text>&lt;/</xsl:text>
    <xsl:value-of select="name()"/>
    <xsl:text>&gt;</xsl:text>
</xsl:template>
<xsl:template match="@*">
    <xsl:text>&#32;</xsl:text>
    <xsl:value-of select="name()"/>
    <xsl:text>="</xsl:text>
    <xsl:value-of select="."/>
    <xsl:text>"</xsl:text>
</xsl:template>
<xsl:template match="text()">
    <xsl:value-of select="."/>
</xsl:template>

Most notably, this will probably break if your attributes have the double-quote character. It's really better to use saxon, or to use a user-written extention that uses a proper serializer if you can't.

查看更多
登录 后发表回答