Transform XML with XSLT and preserve CDATA (in Ruby)

I am trying to convert a document with content like the following into another document, leaving the CDATA exactly as it was in the first document, but I haven't figured out how to preserve the CDATA with XSLT.

Initial XML:

<node>
    <subNode>
        <![CDATA[ HI THERE ]]>
    </subNode>
    <subNode>
        <![CDATA[ SOME TEXT ]]>
    </subNode>
</node>

Final XML:

<newDoc>
    <data>
        <text>
            <![CDATA[ HI THERE ]]>
        </text>
        <text>
            <![CDATA[ SOME TEXT ]]>
        </text>
    </data>
</newDoc>

I've tried something like this, but no luck; everything gets jumbled:

<xsl:element name="subNode">
    <xsl:value-of select="." disable-output-escaping="yes"/>
</xsl:element>

Any ideas how to preserve the CDATA?

Thanks! Lance

Using ruby/nokogiri

Update: Here's something that works.

<xsl:text disable-output-escaping="yes">&lt;![CDATA[</xsl:text>
<xsl:value-of select="normalize-space(text())" disable-output-escaping="yes"/>
<xsl:text disable-output-escaping="yes">]]&gt;</xsl:text>

That will wrap all text() nodes in CDATA, which works for what I need, and it will preserve HTML tags inside the text.

Tags: xml xslt parsing nokogiri cdata

3 Answers

啃猪蹄的小仙女

#2 · 2019-04-07 09:03

Sorry to post an answer to my own question, but I found something that works:


<xsl:text disable-output-escaping="yes">&lt;![CDATA[</xsl:text>
<xsl:value-of select="normalize-space(text())" disable-output-escaping="yes"/>
<xsl:text disable-output-escaping="yes">]]&gt;</xsl:text>

That will wrap all text() nodes in CDATA, which works for what I need, and it will preserve HTML tags inside the text.


forever°为你锁心

#3 · 2019-04-07 09:08

You cannot preserve the precise sequence of CDATA nodes if they're mixed with plain text nodes. At best, you can force all content of a particular element in the output to be CDATA, by listing that element name in xsl:output/@cdata-section-elements:

<xsl:output cdata-section-elements="text"/>


Animai°情兽

#4 · 2019-04-07 09:20

I found this article while trying to solve a similar problem (using an XSL transform to take one XML file and create a partial/subset copy of some of the nodes in it, as a second XML file). In my case the first XML files have some elements whose values are entirely wrapped in CDATA blocks, because they happen to be JSON and they carry some HTML formatting markup.

What I found was that rather than using xsl:value-of, I could use xsl:copy-of, and just as @Pavel Minaev points out, I could keep the original CDATA intact by listing every relevant element name in the xsl:output declaration. This might be an approach that would work for the OP.

XML to be copied (sample):

<text_item>
  <id>100</id>
  <stem_text><![CDATA[(any string of text, including HTML)]]></stem_text>
  <answerOptions><![CDATA[{"choices":[{"label":"Atmospheric O<sub>2</sub>",
   "value":"A"},{"label":"Released CO<sub>2</sub>",
   "value":"B"}]}]]></answerOptions>
 ...
</text_item>

Relevant stylesheet lines:

<xsl:output method="xml" indent="yes" cdata-section-elements="stem_text answerOptions" />
...
<xsl:apply-templates select="//text_item" />
...
<xsl:template match="text_item">
    <xsl:element name="text_item" >
        <xsl:copy-of select="node()"  />
    </xsl:element>
</xsl:template>

The cdata-section-elements attribute tells the serializer to write the text content of the named elements as CDATA sections, so the CDATA blocks in the XML copied from come out as CDATA again in the output XML file when the transform runs. The attribute takes a whitespace-separated list of element names, so you can name as many elements as you want.

In the OP's example, I believe he would select on //node/subNode and then build an element named text, inside newDoc/data of course. His cdata-section-elements attribute would be simply ="text", exactly as Pavel has it.
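As a rough sketch of what that could look like for the OP's sample document (the template layout here is my own assumption, not something the OP posted, and I have only reasoned it through against the sample input):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Ask the serializer to write the text content of every <text>
       element in the output as a CDATA section. -->
  <xsl:output method="xml" indent="yes" cdata-section-elements="text"/>

  <!-- Build the newDoc/data wrapper once, then process each subNode. -->
  <xsl:template match="/node">
    <newDoc>
      <data>
        <xsl:apply-templates select="subNode"/>
      </data>
    </newDoc>
  </xsl:template>

  <!-- Each subNode becomes a <text> element with the same string value.
       No disable-output-escaping is needed here. -->
  <xsl:template match="subNode">
    <text>
      <xsl:value-of select="."/>
    </text>
  </xsl:template>

</xsl:stylesheet>
```

Note that the whitespace around the original CDATA block is part of the subNode's string value, so it ends up inside the CDATA section as well. Swapping select="." for select="normalize-space(.)" would trim it, at the cost of also collapsing runs of whitespace inside the text.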
