
c# XMLReader skips nodes after using ReadElementCo

Posted 2019-08-31 17:35

Question:

I am trying to parse an XML file and extract data from its elements. The problem is that every time I call one of the ReadElementContentAsX methods, the reader skips the next element. I don't know why. What am I missing?

while (reader.Read() && fileValid)
{
    if (reader.IsStartElement())
    {
        Console.WriteLine(reader.Name);
        switch (reader.Name)
        {
            case "ID":
                if (reader.ReadElementContentAsString() != ID)
                {
                    fileValid = false;
                }
                break;
            case "Size":
                if (reader.ReadElementContentAsInt() != EEPROM_SIZE)
                {
                    fileValid = false;
                }
                break;
            case "Data":
                if (reader.ReadElementContentAsBase64(eeprom_primary, 0, EEPROM_SIZE) != EEPROM_SIZE)
                {
                    fileValid = false;
                }
                break;
            default:
                break;
        }
    }
}

The XML structure is as follows:

-ParrentNode
--ID: String
--Date: TimeDate
--SoftwareVersionMajor: int
--SoftwareVersionMinor: int
--Size: int
--Data: encodedBase64

So in my case I read the element content for the ID and Size elements, and the reader then skips the Date and Data elements. I checked: if I remove the ReadElementContentAs calls, the next node is not skipped.

Answer 1:

From MSDN (ReadElementContentAs):

"This method reads the start tag, the contents of the element, and moves the reader past the end element tag."

The method is designed to leave the reader positioned on the node after the element, so the next Read() call in your loop moves past that node.

EDIT:

You could try it with XmlDocument:

  XmlDocument xmlDoc = new XmlDocument();
  xmlDoc.Load("inputfile.xml"); // XmlDocument.Load reads the whole file into memory

You can then easily work through the XML file with XPath expressions and foreach loops:

  foreach (XmlNode node in xmlDoc.SelectNodes("/src"))
  {
    // do anything with node
  }


Answer 2:

From the documentation of ReadElementContentAsString:

This method reads the start tag, the contents of the element, and moves the reader past the end element tag.

So you end up at the start of the next element. You then call Read() again, which moves past the start of that element, either skipping the whole element or moving "into" it so that IsStartElement() returns false. So basically, you don't want to call Read() at the start of your loop if you've used ReadElementContentAs*.
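
One way to arrange that, as a minimal sketch: loop on !reader.EOF and call Read() only when the current node was not already consumed by a ReadElementContentAs* call. The class and method names here are made up for illustration, and the sample XML is an assumption modeled on the structure described in the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class ReaderLoopSketch
{
    // Returns the names of all start elements the loop visits, in order.
    public static List<string> VisitElements(string xml)
    {
        var visited = new List<string>();
        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.MoveToContent(); // position the reader on the root element
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    visited.Add(reader.Name);
                    switch (reader.Name)
                    {
                        case "ID":
                            reader.ReadElementContentAsString();
                            continue; // reader already sits on the next node: no Read()
                        case "Size":
                            reader.ReadElementContentAsInt();
                            continue; // same here
                    }
                }
                reader.Read(); // nothing was consumed, so advance by one node
            }
        }
        return visited;
    }

    public static void Main()
    {
        // Compact XML (no whitespace between elements), assumed for illustration.
        const string xml =
            "<ParrentNode><ID>abc</ID><Date>2019-08-31</Date>" +
            "<Size>4</Size><Data>AQIDBA==</Data></ParrentNode>";
        // Prints: ParrentNode, ID, Date, Size, Data
        Console.WriteLine(string.Join(", ", VisitElements(xml)));
    }
}
```

With indented XML the node after an end tag is usually a whitespace node rather than the next element, so the extra Read() happens to be harmless there; the skipping shows up when elements directly follow each other. The loop above checks NodeType on every pass, so it should behave the same in both cases.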

This sort of thing is why I hate XmlReader. Unless you really need to use it, I'd strongly recommend reading the whole document into memory using LINQ to XML. Even if you do need to use XmlReader, you can still read one element at a time into LINQ to XML, in a sort of "somewhat streaming" fashion which minimizes your exposure to the reader part.
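
As a rough sketch of the LINQ to XML route for the file described in the question: load the whole document, then pick the elements out by name. The class name, the TryLoad helper and the EepromSize value are assumptions for illustration, not the asker's actual code.

```csharp
using System;
using System.Xml.Linq;

public static class LinqToXmlSketch
{
    public const int EepromSize = 4; // hypothetical size, stands in for EEPROM_SIZE

    // Validates ID and Size, then decodes the Base64 payload.
    public static bool TryLoad(string xml, string expectedId, out byte[] data)
    {
        data = null;
        XElement root = XDocument.Parse(xml).Root;

        // The explicit conversions return null when the element is missing.
        if ((string)root.Element("ID") != expectedId) return false;
        if ((int?)root.Element("Size") != EepromSize) return false;

        string encoded = (string)root.Element("Data");
        if (encoded == null) return false;

        // Throws FormatException if the text is not valid Base64.
        byte[] decoded = Convert.FromBase64String(encoded);
        if (decoded.Length != EepromSize) return false;

        data = decoded;
        return true;
    }

    public static void Main()
    {
        const string xml =
            "<ParrentNode><ID>abc</ID><Date>2019-08-31</Date>" +
            "<Size>4</Size><Data>AQIDBA==</Data></ParrentNode>";
        Console.WriteLine(TryLoad(xml, "abc", out byte[] data));
    }
}
```

Because elements are looked up by name, their order in the file and any whitespace between them no longer matter. For a file on disk, XDocument.Load(path) would take the place of XDocument.Parse.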



Answer 3:

Hmm, OK, figured it out myself. I just changed the reader.Read() in the while condition to !reader.EOF and changed a bit how I read the elements. It works now. Thanks for the help :)



Answer 4:

For anyone who wants to use XmlReader for whatever reason, XmlReader.ReadString() appears not to have the same issue as XmlReader.ReadElementContentAsString(). I just don't know whether it also expands entities and skips processing instructions and comments, since I don't have to worry about that for my purposes.
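
A small sketch of why the two calls differ, as far as I understand the reader's behavior: ReadString() stops on the element's end tag, while ReadElementContentAsString() moves past it, so a following Read() lands on different nodes. The class and method names are made up for illustration.

```csharp
using System;
using System.IO;
using System.Xml;

public static class ReadStringSketch
{
    // Reads the ID element with one of the two methods and reports
    // where the reader is positioned afterwards, as "NodeType:Name".
    public static string PositionAfter(string xml, bool useReadString)
    {
        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.ReadToFollowing("ID"); // move to the <ID> start tag
            if (useReadString)
            {
                reader.ReadString();
            }
            else
            {
                reader.ReadElementContentAsString();
            }
            return reader.NodeType + ":" + reader.Name;
        }
    }

    public static void Main()
    {
        // Compact sample XML, assumed for illustration.
        const string xml = "<ParrentNode><ID>abc</ID><Date>2019-08-31</Date></ParrentNode>";
        Console.WriteLine(PositionAfter(xml, true));  // EndElement:ID
        Console.WriteLine(PositionAfter(xml, false)); // Element:Date
    }
}
```

In the first case the next Read() moves from the ID end tag to Date; in the second it moves past the Date start tag, which is the skipping described in the question.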