I have a text stream that contains segments of both arbitrary plain text and well-formed xml elements. How can I read it and extract the xml elements only? XmlReader with ConformanceLevel set to Fragment still throws an exception when it encounters plain text, which to it is malformed xml.
Any ideas? Thanks
Here's my code so far:
XmlReaderSettings settings = new XmlReaderSettings();
settings.ConformanceLevel = ConformanceLevel.Fragment;
using (XmlReader reader = XmlReader.Create(stream, settings))
while (!reader.EOF)
{
reader.MoveToContent();
XmlDocument doc = new XmlDocument();
doc.Load(reader.ReadSubtree());
reader.ReadEndElement();
}
Here's a sample stream content and I have no control over it by the way:
Found two objects:
Object a
<object>
<name>a</name>
<description></description>
</object>
Object b
<object>
<name>b</name>
<description></description>
</object>
Provided that this is a hack, if you wrap your mixed document with a "fake" xml root node, you should be able to do what you need getting only the nodes of type element (i.e. skipping the text nodes) among the children of the root element: