I have the following XML (test example):
<?xml version="1.0" encoding="UTF-8"?><?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" >
<Style ss:ID="s21"><NumberFormat ss:Format="@"/></Style>
<Worksheet ss:Name="--">
<Table ss:ExpandedColumnCount="1" ss:ExpandedRowCount="1" x:FullColumns="1" x:FullRows="1" ss:StyleID="s21">
<Column ss:StyleID="s21" ss:Width="184"/>
<Cell><ss:Data ss:Type="String">42</Data></Cell>
When I try to read the file using DataSet.ReadXml(), the following exception is thrown:
The 'ss:Data' start tag on line 12 position 14 does not match the end tag of 'Data'. Line 12, position 43.
While all examples in the W3C documentation show namespace-qualified end tags, MS Excel opens such a file without any warnings.
Setting DataSet.Namespace = "ss"; doesn't change anything.
What can be done to read such a file, preferably without adding extra libraries?
Yes, XML end tags must match XML start tags exactly, including any namespace prefixes.
From your question: "The 'ss:Data' start tag on line 12 position 14 does not match the end tag of 'Data'."
The XML must be repaired to be well-formed if it's to be parsed successfully using compliant XML tools. In particular, you must change the end tag from </Data> to </ss:Data>, as @jdweng suggests in the comments.
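As a minimal illustration, this is how the Cell line from the sample in the question would look once the end tag carries the same ss: prefix as the start tag. Only the end tag changes; the rest of the sample is assumed to stay as posted.

```xml
<!-- The end tag now repeats the start tag's qualified name, prefix included -->
<Cell><ss:Data ss:Type="String">42</ss:Data></Cell>
```

Because the sample declares the same namespace URI both as the default namespace and as ss:, dropping the prefix from the start tag instead (so that both tags read Data) should make the pair match as well.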
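Per the W3C XML Recommendation, section 3.1 (well-formedness constraint "Element Type Match"): "The Name in an element's end-tag MUST match the element type in the start-tag."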
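From your question: "While all examples in the W3C documentation show namespace-qualified end tags, MS Excel opens such a file without any warnings."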
Then MS Excel isn't processing the XML in a compliant manner and may well be missing other issues.
See also How to parse invalid (bad / not well-formed) XML?