Parsing XML with inner nodes

Posted 2019-07-13 19:14

Question:

I'm trying to parse the xml given below:

<Item status="SUCCESS" message="">
   <ItemDate>12/21/2012
      <ItemType>MyType1
         <ItemUrl title="ItemTitle">http://www.itemurl1.com</ItemUrl>
      </ItemType>
   </ItemDate>
   <ItemDate>12/22/2012
      <ItemType>MyType2
         <ItemUrl title="Item2Title">http://www.itemurl2.com</ItemUrl>
      </ItemType>
   </ItemDate>
</Item>

As you can see, I'm not sure whether we can even call this XML, but this is what I get out of a legacy service. What I'm after is to parse it and load it into an object graph. My object model is as below:

public class Item
{
    public string Date { get; set; }
    public string Type { get; set; }
    public string Url { get; set; }
    public string Title { get; set; }
}

So basically, when I'm done parsing the above XML string, I should have a collection of Item objects. Could you please suggest how to achieve this, ideally with a code snippet?

I tried XDocument, but I was not able to do it given the peculiar structure of the XML.

Thanks, -Mike

Answer 1:

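// Requires: using System.Linq; using System.Xml.Linq;
XDocument xdoc = XDocument.Load(path_to_xml);
var query = from date in xdoc.Descendants("ItemDate")
            let type = date.Element("ItemType")
            let url = type.Element("ItemUrl")
            select new Item()
            {
                // FirstNode is the text that precedes the nested element;
                // Trim() drops the newline and indentation that follow it.
                Date = ((XText)date.FirstNode).Value.Trim(),
                Type = ((XText)type.FirstNode).Value.Trim(),
                Url = (string)url,
                Title = (string)url.Attribute("title"),
            };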


Answer 2:

As an alternative to lazyberezovsky's LINQ to XML projection, you might also consider flattening the structure with an XSLT transform before loading the XML.

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0"
                >
    <xsl:output omit-xml-declaration="yes" method="xml" version="1.0" indent="yes" />

    <xsl:template match="/">
        <Items>
            <xsl:apply-templates select="Item/ItemDate" />
        </Items>
    </xsl:template>

    <xsl:template match="ItemDate">
        <Item>
            <xsl:attribute name="ItemDate">
                <xsl:value-of select="normalize-space(./text()[1])" />
            </xsl:attribute>
            <xsl:attribute name="ItemType">
                <xsl:value-of select="normalize-space(ItemType/text()[1])" />
            </xsl:attribute>
            <xsl:attribute name="ItemUrl">
                <xsl:value-of select="normalize-space(ItemType/ItemUrl/text()[1])" />
            </xsl:attribute>
            <xsl:attribute name="ItemTitle">
                <xsl:value-of select="normalize-space(ItemType/ItemUrl/@title)" />
            </xsl:attribute>
        </Item>
    </xsl:template>
</xsl:stylesheet>
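
In case it helps, here is a minimal sketch of how the stylesheet could be applied from C# with XslCompiledTransform. The XmlFlattener class and its Transform method are names made up for this illustration, and the sketch assumes both the legacy XML and the stylesheet are already available as strings.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical helper, not part of the original answer.
public static class XmlFlattener
{
    // Applies the stylesheet text to the input XML text and
    // returns the transformed document as a string.
    public static string Transform(string xml, string xslt)
    {
        var transform = new XslCompiledTransform();

        // Compile the stylesheet from its text.
        using (var xsltReader = XmlReader.Create(new StringReader(xslt)))
        {
            transform.Load(xsltReader);
        }

        using (var input = XmlReader.Create(new StringReader(xml)))
        using (var output = new StringWriter())
        {
            // null argument list: the stylesheet takes no parameters.
            transform.Transform(input, null, output);
            return output.ToString();
        }
    }
}
```

Calling XmlFlattener.Transform(legacyXml, stylesheetText) with the stylesheet above should return the flattened document as a string, ready to be loaded or deserialized.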

This produces the following XML, which is straightforward to deserialize, e.g. by using the [XmlAttribute] attribute with XmlSerializer.

<Items>
  <Item ItemDate="12/21/2012" ItemType="MyType1" ItemUrl="http://www.itemurl1.com" ItemTitle="ItemTitle" />
  <Item ItemDate="12/22/2012" ItemType="MyType2" ItemUrl="http://www.itemurl2.com" ItemTitle="Item2Title" />
</Items>
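
A rough sketch of that deserialization step, assuming the flattened XML is held in a string. FlatItem, FlatItemList and FlatItemReader are hypothetical names chosen for this example; the same attributes could be placed on the Item class from the question instead.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical model mirroring the question's Item class,
// mapped onto the attribute names the stylesheet emits.
public class FlatItem
{
    [XmlAttribute("ItemDate")]
    public string Date { get; set; }

    [XmlAttribute("ItemType")]
    public string Type { get; set; }

    [XmlAttribute("ItemUrl")]
    public string Url { get; set; }

    [XmlAttribute("ItemTitle")]
    public string Title { get; set; }
}

// Wrapper for the <Items> root element.
[XmlRoot("Items")]
public class FlatItemList
{
    [XmlElement("Item")]
    public List<FlatItem> Items { get; set; } = new List<FlatItem>();
}

public static class FlatItemReader
{
    // Deserializes the flattened XML text into a list of FlatItem objects.
    public static List<FlatItem> Read(string flattenedXml)
    {
        var serializer = new XmlSerializer(typeof(FlatItemList));
        using (var reader = new StringReader(flattenedXml))
        {
            var list = (FlatItemList)serializer.Deserialize(reader);
            return list.Items;
        }
    }
}
```

FlatItemReader.Read(flattenedXml) then gives one object per Item element, with no manual walking of the tree.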


Answer 3:

Because the Item node occurs only once in the XML you were sent, you get only one Item from lazyberezovsky's code, and that is correct. I suppose you want a list of items built from the ItemDate nodes instead. To do so, use the following modified code:

// Requires: using System.IO; using System.Linq; using System.Xml.Linq;
XDocument xdoc = XDocument.Load(new StringReader(xml));
var query = from date in xdoc.Descendants("ItemDate")
            let type = date.Element("ItemType")
            let url = type.Element("ItemUrl")
            select new Item()
            {
                // Trim() removes the newline and indentation that follow the leading text.
                Date = ((XText)date.FirstNode).Value.Trim(),
                Type = ((XText)type.FirstNode).Value.Trim(),
                Url = (string)url,
                Title = (string)url.Attribute("title"),
            };
// Materialize the query into a List<Item>.
var items = query.ToList();