I am new to XSLT. I have an XML file that contains the following information:
<bio>
<published>Tue, 7 Oct 2008 14:47:26 +0000</published>
<summary><![CDATA[
Dream Theater is an American <a
href="http://www.last.fm/tag/progressive%20metal" class="bbcode_tag"
rel="tag">progressive metal</a> band formed in 1985 under the name
"<a href="http://www.last.fm/music/Majesty"
class="bbcode_artist">Majesty</a>" by <a
href="http://www.last.fm/music/John+Myung"
class="bbcode_artist">John Myung</a>,
<a href="http://www.last.fm/music/John+Petrucci"
class="bbcode_artist">John Petrucci</a>
]]>
</summary>
</bio>
And my xsl file contains this:
<h3><xsl:value-of select="lfm/artist/bio/published"/></h3>
<p>
<xsl:value-of select="lfm/artist/bio/summary" disable-output-escaping="yes"/>
</p>
<html>
<body>
<xsl:value-of select="lfm/artist/bio/content"/>
</body>
</html>
What I'm trying to do now is extract the tag-structured data from the <summary><![CDATA[]]></summary>
and show it in the browser, as in this example:
<a href="http://www.last.fm/tag/progressive%20metal" class="bbcode_tag" rel="tag">progressive metal</a>
<a href="http://www.last.fm/music/Majesty" class="bbcode_artist">Majesty</a>
<a href="http://www.last.fm/music/John+Myung" class="bbcode_artist">John Myung</a>
<a href="http://www.last.fm/music/John+Petrucci" class="bbcode_artist">John Petrucci</a>
At the moment, when I open the XML page, it shows all the CDATA
content as plain text, including the HTML tags. I want those tags to be rendered as real HTML.
Sorry for the poor description; I hope you can see what I mean.
The CDATA section is just (part of) a text node. What looks like markup inside it is plain, one-dimensional text (badly destroyed markup), so it cannot be turned back into nodes in XSLT 1.0 or XSLT 2.0 without calling an extension function, such as the my:parse() used here:
<p><xsl:copy-of select="my:parse(lfm/artist/bio/summary)"/></p>
XSLT 3.0 (through the XPath 3.0 function library) provides the standard functions parse-xml() and parse-xml-fragment(), which do exactly this.
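For readers who can run an XSLT 3.0 processor, here is a minimal sketch of that route. It assumes the lfm/artist/bio paths from the question's stylesheet, and it assumes the CDATA content is well-formed XML (HTML such as an unclosed <br> would make the call fail). It uses parse-xml-fragment() rather than parse-xml() because the summary holds text mixed with several elements and has no single root element:

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <xsl:template match="/">
    <html>
      <body>
        <h3><xsl:value-of select="lfm/artist/bio/published"/></h3>
        <p>
          <!-- Parse the text of summary into a document node and copy
               its children (text and a elements) into the output. -->
          <xsl:copy-of select="parse-xml-fragment(lfm/artist/bio/summary)"/>
        </p>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

No extension function or msxsl:script block is needed in this variant, but it will not run on an XSLT 1.0 processor such as XslCompiledTransform.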
Update:
Here is a complete code example, assuming you are using XslCompiledTransform in .NET:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt"
xmlns:my="my:my">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="summary/text()">
<xsl:copy-of select="my:parse(.)/*/*"/>
</xsl:template>
<msxsl:script language="c#" implements-prefix="my">
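// Wrap the text in a dummy root element (t) so that a fragment with
// several top-level nodes loads as one well-formed document.
// The less-than signs are escaped because this code sits inside XML.
public XmlDocument parse(string text)
{
XmlDocument doc = new XmlDocument();
doc.LoadXml("&lt;t>" + text + "&lt;/t>");
return doc;
}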
</msxsl:script>
</xsl:stylesheet>
When this transformation is applied on the provided XML document:
<bio>
<published>Tue, 7 Oct 2008 14:47:26 +0000</published>
<summary><![CDATA[Dream Theater is an American <a href="http://www.last.fm/tag/progressive%20metal" class="bbcode_tag" rel="tag">progressive metal</a> band formed in 1985 under the name "<a href="http://www.last.fm/music/Majesty" class="bbcode_artist">Majesty</a>" by <a href="http://www.last.fm/music/John+Myung" class="bbcode_artist">John Myung</a>, <a href="http://www.last.fm/music/John+Petrucci" class="bbcode_artist">John Petrucci</a>]]>
</summary>
</bio>
the wanted, correct result is produced: the CDATA section is replaced by the reconstituted markup (the /*/* step selects only the elements, so the text between the links is not copied):
<bio>
<published>Tue, 7 Oct 2008 14:47:26 +0000</published>
<summary>
<a href="http://www.last.fm/tag/progressive%20metal" class="bbcode_tag" rel="tag">progressive metal</a>
<a href="http://www.last.fm/music/Majesty" class="bbcode_artist">Majesty</a>
<a href="http://www.last.fm/music/John+Myung" class="bbcode_artist">John Myung</a>
<a href="http://www.last.fm/music/John+Petrucci" class="bbcode_artist">John Petrucci</a>
</summary>
</bio>