HTML inside XML CDATA being converted to &lt; and &gt;

Posted 2019-05-24 23:15

Question:

I have some sample XML:

<sample><![CDATA[Line 1<br />Line 2<br />Line 3<br />]]></sample>

I'm using ASP to output this XML using a stylesheet like so:

' Fetch the XML document over HTTP (synchronous request)
Set xmlHttp = Server.CreateObject("Microsoft.XMLHTTP")
xmlHttp.open "GET", URLxml, false
xmlHttp.send()

' Fetch the XSL stylesheet the same way
Set xslHttp = Server.CreateObject("Microsoft.XMLHTTP")
xslHttp.open "GET", xXsl, false
xslHttp.send()

' Load both responses into DOM documents
Set xmlDoc = Server.CreateObject("MICROSOFT.XMLDOM")
Set xslDoc = Server.CreateObject("MICROSOFT.XMLDOM")
xmlDoc.async = false
xslDoc.async = false
xmlDoc.Load xmlHttp.responseXML
xslDoc.Load xslHttp.responseXML

' Apply the stylesheet and write the result to the response
Response.Write xmlDoc.transformNode(xslDoc)

However, when the result is written, the HTML output shows up as:

Line 1&lt;br /&gt;Line 2&lt;br /&gt;Line 3

I can see that ASP is converting the brackets in the code, but I'm not sure why. Any thoughts?

Answer 1:

I have some sample XML:

<sample><![CDATA[Line 1<br />Line 2<br />Line 3<br />]]></sample>

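This is a sample element with a single text node child. The CDATA section is only an alternative way of writing that text; the parser does not treat its content as markup.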

Suppose you apply an identity transform. Then the result will be:

<sample>Line 1&lt;br /&gt;Line 2&lt;br /&gt;Line 3&lt;br /&gt;</sample>

Why? Because when a text node or attribute value is serialized, the special characters &, < and > are escaped as entity references (&amp;, &lt; and &gt;).
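The same behaviour can be reproduced outside ASP. The following is a minimal sketch that uses Python's standard-library ElementTree as a stand-in for the XML parser and serializer; it is an illustration under that assumption, not the MSXML code path from the question.

```python
import xml.etree.ElementTree as ET

# The sample document from the question.
SOURCE = "<sample><![CDATA[Line 1<br />Line 2<br />Line 3<br />]]></sample>"

root = ET.fromstring(SOURCE)

# The CDATA section produces plain text and no child elements.
text = root.text
child_count = len(root)

# Writing the tree back out escapes the markup characters in that text.
serialized = ET.tostring(root, encoding="unicode")

print(repr(text))
print(child_count)
print(serialized)
```

The parsed tree contains no br elements at all, so the serializer has nothing to emit except escaped text.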

EDIT: Of course, you could use disable-output-escaping (DOE). But besides being an optional feature that a processor is not required to support, it still leaves you with a text node in the result tree; that node is merely written out without escaping. To get real elements you need another parsing phase. DOE can be useful when you output an encoded HTML fragment into an (X)HTML document, as with feeds, at the risk of malformed output.

This stylesheet:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="sample">
        <p>
            <xsl:value-of select="." disable-output-escaping="yes"/>
        </p>
    </xsl:template>
</xsl:stylesheet>

Output:

<p>Line 1<br />Line 2<br />Line 3<br /></p>

Rendered as (actual markup):

Line 1
Line 2
Line 3
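The extra parsing phase mentioned in the EDIT can be sketched as follows. This again uses Python's ElementTree rather than an XSLT processor, and `reparse_fragment` is a hypothetical helper introduced only for this example. It also shows the risk: a fragment that is not well-formed XML fails the second parse.

```python
import xml.etree.ElementTree as ET


def reparse_fragment(fragment, wrapper="p"):
    """Parse a markup string as the content of a wrapper element.

    Hypothetical helper for illustration; raises ET.ParseError when the
    fragment is not well-formed XML.
    """
    return ET.fromstring("<{0}>{1}</{0}>".format(wrapper, fragment))


sample = ET.fromstring(
    "<sample><![CDATA[Line 1<br />Line 2<br />Line 3<br />]]></sample>"
)

# Second phase: treat the text content as markup and parse it again.
paragraph = reparse_fragment(sample.text)
tags = [child.tag for child in paragraph]

# An HTML-style fragment with an unclosed <br> is rejected.
try:
    reparse_fragment("Line 1<br>Line 2")
    malformed_rejected = False
except ET.ParseError:
    malformed_rejected = True

print(tags, malformed_rejected)
```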



Answer 2:

In addition to @Alejandro's explanation, here is the best possible solution:

Never put markup in a text (CDATA) node.

Instead of:

<sample><![CDATA[Line 1<br />Line 2<br />Line 3<br />]]></sample>

always create:

<sample>Line 1<br />Line 2<br />Line 3<br /></sample>
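To see the difference this makes, here is a small sketch using Python's ElementTree (an assumption made for illustration; the thread itself uses MSXML). With the markup written as real elements, the parser builds br nodes, and serializing the tree keeps them as tags.

```python
import xml.etree.ElementTree as ET

# Markup written directly in the document instead of inside CDATA.
root = ET.fromstring("<sample>Line 1<br />Line 2<br />Line 3<br /></sample>")

# The br elements are now real children of sample.
child_tags = [child.tag for child in root]

# Text before the first child, then the text following each br.
pieces = [root.text] + [child.tail for child in root]

# Serializing keeps the elements as elements; nothing is escaped.
serialized = ET.tostring(root, encoding="unicode")

print(child_tags)
print(pieces)
print(serialized)
```

A stylesheet can then copy or match the br elements directly, with no need for disable-output-escaping.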

Remember: markup placed inside CDATA stops being markup; it is just text.



Answer 3:

I think it's the XSL transformation that's causing you problems. You should be able to edit your .xsl document to correct this, like so:

<xsl:template match="/">
  <xsl:value-of select="." disable-output-escaping="yes" />
  <!-- ... other XSL business here ... -->
</xsl:template>

I'm borrowing this from a page about disable-output-escaping.

For the record I hate XML/XSL - a solution in search of a problem. Generally speaking if you need to deal with markup I've found XML/XSL only introduces problems because frequently you want to deal with markup fragments, which are often not valid XML, so you wrap CDATA around it and then hilarity ensues as you're experiencing.

Update

OK, so the above didn't work. Of course, I didn't know what the XSL looked like until a comment was added to the question. The following does work (idea from this forum thread):

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text" />
    <xsl:template match="/">
        <xsl:value-of select="sample"  />
    </xsl:template>
</xsl:stylesheet>

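The key is the <xsl:output method="text" />: with the text output method, the processor writes the string value of the result without escaping any characters.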
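As a rough analogy, Python's ElementTree also offers a text serialization method. The sketch below uses it in place of an XSLT processor (an assumption for illustration, not MSXML behaviour) to show how the same tree serializes differently as XML and as text.

```python
import xml.etree.ElementTree as ET

sample = ET.fromstring(
    "<sample><![CDATA[Line 1<br />Line 2<br />Line 3<br />]]></sample>"
)

# Default XML serialization: markup characters in the text are escaped.
as_xml = ET.tostring(sample, encoding="unicode")

# Text serialization: only the text content is written, unescaped.
as_text = ET.tostring(sample, encoding="unicode", method="text")

print(as_xml)
print(as_text)
```

Note that the text method drops all element tags from the result, so it only suits cases where the whole output is meant to be raw text.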

Also, to whoever downvoted: please comment why.