Reading from a stream with mixed XML and plain text

I have a text stream that contains segments of both arbitrary plain text and well-formed XML elements. How can I read it and extract only the XML elements? XmlReader with ConformanceLevel set to Fragment still throws an exception when it encounters the plain text, which it treats as malformed XML.

Any ideas? Thanks

Here's my code so far:

XmlReaderSettings settings = new XmlReaderSettings();
settings.ConformanceLevel = ConformanceLevel.Fragment;

using (XmlReader reader = XmlReader.Create(stream, settings))
    while (!reader.EOF)
    {
        reader.MoveToContent();
        XmlDocument doc = new XmlDocument();
        doc.Load(reader.ReadSubtree());
        reader.ReadEndElement();
    }

Here's a sample of the stream content (I have no control over it, by the way):

Found two objects:
Object a
<object>
    <name>a</name>
    <description></description>
</object>
Object b
<object>
    <name>b</name>
    <description></description>
</object>

Tags: c# .net xml stream

1 answer

聊天终结者

Post #2 · 2019-05-23 15:03

This is admittedly a hack, but if you wrap the mixed document in a "fake" XML root node, you should be able to get what you need by taking only the children of the root element whose node type is Element (i.e. skipping the text nodes). Note that this only works as long as the plain text itself contains no '<' or '&' characters:

using System;
using System.Linq;
using System.Xml;

static class Program {

    static void Main(string[] args) {

        string mixed = @"
Found two objects:
Object a
<object>
    <name>a</name>
    <description></description>
</object>
Object b
<object>
    <name>b</name>
    <description></description>
</object>
";
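        // Wrap the mixed content in a dummy root so it parses as a single XML document.
        string xml = "<FOO>" + mixed + "</FOO>";
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        // Keep only the element children of the root; the text nodes between them are skipped.
        var xmlFragments = from XmlNode node in doc.DocumentElement.ChildNodes
                           where node.NodeType == XmlNodeType.Element
                           select node;
        foreach (var fragment in xmlFragments) {
            Console.WriteLine(fragment.OuterXml);
        }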

    }

}
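
Since the question starts from a Stream rather than a string, here is a rough sketch of how the same wrapping trick might be applied to one. The class MixedXml and the method ExtractElements are hypothetical names made up for this example, and it assumes the stream is UTF-8, fits in memory, and that the plain text contains no '<' or '&' characters:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;

static class MixedXml {

    // Hypothetical helper, not a framework API. Assumes UTF-8 input that fits in
    // memory and plain text that contains no '<' or '&' characters.
    public static IEnumerable<string> ExtractElements(Stream stream) {
        string mixed;
        using (var text = new StreamReader(stream, Encoding.UTF8)) {
            mixed = text.ReadToEnd();
        }

        // Wrap the content in a dummy root so it parses as one document.
        var wrapped = new StringReader("<FOO>" + mixed + "</FOO>");
        using (XmlReader reader = XmlReader.Create(wrapped)) {
            reader.MoveToContent();   // position on <FOO>
            reader.Read();            // step to its first child node
            while (!reader.EOF) {
                if (reader.NodeType == XmlNodeType.Element) {
                    // ReadOuterXml returns the element's markup and moves
                    // the reader to the node that follows the element.
                    yield return reader.ReadOuterXml();
                } else {
                    reader.Read();    // skip text, whitespace and </FOO>
                }
            }
        }
    }
}
```

It could be used as foreach (string fragment in MixedXml.ExtractElements(stream)) Console.WriteLine(fragment); and, because it goes through XmlReader instead of XmlDocument, each fragment is handed back as soon as it has been read rather than after a full DOM has been built.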
