I have a bit of XML as follows:
<section>
<description>
<![CDATA[
This is a "description"
that I have formatted
]]>
</description>
</section>
I'm accessing it using curXmlNode.SelectSingleNode("description").InnerText
but the value returns
\r\n This is a "description"\r\n that I have formatted
instead of
This is a "description" that I have formatted.
Is there a simple way to get that sort of output from a CDATA section? Leaving the actual CDATA tag out seems to have it return the same way.
CDATA blocks are effectively verbatim. Any whitespace inside CDATA is significant, by definition, according to the XML spec. Therefore, you get that whitespace when you retrieve the node value. If you want to strip it using your own rules (since the XML spec doesn't specify any standard way of stripping whitespace in CDATA), you have to do it yourself, using String.Replace, Regex.Replace, etc. as needed.
You can use LINQ to read CDATA.
It's very easy to get the Value this way.
Here's a good overview on MSDN: http://msdn.microsoft.com/en-us/library/bb308960.aspx
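As a rough illustration of the LINQ to XML route, here is a minimal sketch. It assumes the XML from the question is available as a string; the class and method names are hypothetical and not taken from the answer.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class LinqCDataExample
{
    // The XML from the question, embedded as a string for this sketch.
    public const string Xml =
        "<section>\n<description>\n<![CDATA[\n  This is a \"description\"\n  that I have formatted\n]]>\n</description>\n</section>";

    // Returns the raw content of the first CDATA section under <description>.
    public static string ReadCData(string xml)
    {
        XElement description = XElement.Parse(xml).Element("description");

        // XCData derives from XText; selecting it directly skips any
        // whitespace-only text nodes that surround the CDATA section.
        XCData cdata = description.Nodes().OfType<XCData>().First();
        return cdata.Value;
    }

    public static void Main()
    {
        // The whitespace inside the CDATA section is still present here;
        // Trim() only removes it from the two ends.
        Console.WriteLine(ReadCData(Xml).Trim());
    }
}
```

Note that this still returns the inner line break and indentation, because the whitespace inside the CDATA section is part of its value.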
For .NET 2.0, you probably just have to pass it through Regex:
That trims your node value, replaces newlines with an empty string, and replaces runs of one or more whitespace characters with a single space. I don't think there's any other way to do it, considering the CDATA is returning significant whitespace.
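A minimal sketch of this kind of normalization follows. It is not the answer's original snippet: the class and method names are hypothetical, and it collapses newlines in the same pass as other whitespace rather than removing them in a separate step.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitespaceNormalizer
{
    // Trims the ends, then collapses every run of whitespace
    // (spaces, tabs, CR, LF) into a single space.
    public static string Normalize(string raw)
    {
        return Regex.Replace(raw.Trim(), @"\s+", " ");
    }

    public static void Main()
    {
        string raw = "\r\n  This is a \"description\"\r\n  that I have formatted\r\n";
        Console.WriteLine(Normalize(raw));
    }
}
```

It could be applied to the question's value as `WhitespaceNormalizer.Normalize(curXmlNode.SelectSingleNode("description").InnerText)`.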
I think the best way is...
A simpler form of @Franky's solution:
The Value property is equivalent to the Data property of the cast XmlCDataSection type.
Actually, I think it is pretty simple. The CDATA section is loaded into the XmlDocument like any other XmlNode; the difference is that this node has the property NodeType = CDATA. That means that if you have XmlNode node = doc.SelectSingleNode("section/description"); that node will have a child node whose InnerText property holds the pure data. If you want to remove the leading and trailing whitespace characters, just use Trim() and you will have the data. The code will look like
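A minimal sketch of that approach, assuming the XML from the question is available as a string; the class and method names are hypothetical:

```csharp
using System;
using System.Xml;

public static class XmlDocumentCDataExample
{
    // Finds the CDATA child of section/description and returns its trimmed content.
    public static string ReadTrimmedCData(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        XmlNode node = doc.SelectSingleNode("section/description");
        foreach (XmlNode child in node.ChildNodes)
        {
            if (child.NodeType == XmlNodeType.CDATA)
            {
                // For a CDATA node, Value holds the raw section content
                // (the same string as XmlCDataSection.Data).
                return child.Value.Trim();
            }
        }

        // No CDATA child found: fall back to the element's text.
        return node.InnerText.Trim();
    }

    public static void Main()
    {
        string xml =
            "<section><description><![CDATA[\n  This is a \"description\"\n  that I have formatted\n]]></description></section>";
        Console.WriteLine(ReadTrimmedCData(xml));
    }
}
```

Trim() only strips whitespace at the two ends, so the line break and indentation between the two lines of text remain in the result.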
Thanks
XOnDaRocks