Edit: I decided to take the recommended LINQ to XML approach (see the answer below), and everything works EXCEPT that I can't replace the changed records with the records from the incremental file. I got the program working by removing the node from the full file and then adding the incremental node. Is there a way to just swap them instead? Also, while this solution is very nice, is there any way to reduce memory usage without losing the LINQ code? The current solution may still be usable, but I would be willing to sacrifice run time for lower memory usage.
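On the swap question, LINQ to XML has `XNode.ReplaceWith`, which substitutes a node in place. Below is a minimal sketch, assuming both files are loaded as `XDocument`, that every `Person` has a unique `id`, and that only "chg" records are swapped. The names `SwapSketch` and `SwapChanged` are hypothetical, made up for illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class SwapSketch
{
    // Swaps each full-file Person for the incremental Person with the same id
    // when the incremental record is marked "chg".
    // Assumption: ids are present and unique in the incremental document.
    public static void SwapChanged(XDocument full, XDocument incremental)
    {
        var changes = incremental.Descendants("Person")
            .Where(p => (string)p.Attribute("recordaction") == "chg")
            .ToDictionary(p => (string)p.Attribute("id"));

        // ToList() takes a snapshot so the tree can be modified while looping.
        foreach (var person in full.Descendants("Person").ToList())
        {
            var id = (string)person.Attribute("id");
            XElement replacement;
            if (id != null && changes.TryGetValue(id, out replacement))
            {
                // The replacement already has a parent in the incremental
                // document, so LINQ to XML inserts a copy of it here.
                person.ReplaceWith(replacement);
            }
        }
    }
}
```

The replaced record keeps its position in the full file, so no separate remove-then-add step should be needed.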
I'm trying to take two XML files (a full file and an incremental file) and merge them together. The XML file looks like this:
<List>
  <Records>
    <Person id="001" recordaction="add">
      ...
    </Person>
  </Records>
</List>
The recordaction attribute can also be "chg" for changes or "del" for deletes. The basic logic of my program is:
1) Read the full file into an XmlDocument.
2) Read the incremental file into an XmlDocument, select the nodes using XmlDocument.SelectNodes(), and place those nodes into a dictionary for easier searching.
3) Select all the nodes in the full file, loop through and check each against the dictionary containing the incremental records. If recordaction="chg" or "del" add the node to a list, then delete all the nodes from the XmlNodeList that are in that list. Finally, add recordaction="chg" or "add" records from the incremental file into the full file.
4) Save the XML file.
I'm having some serious problems with step 3. Here's the code for that function:
private void ProcessChanges(XmlNodeList nodeList, Dictionary<string, XmlNode> dictNodes)
{
    XmlNode lastNode = null;
    XmlNode currentNode = null;
    List<XmlNode> nodesToBeDeleted = new List<XmlNode>();

    // If node from full file matches to incremental record and is change or delete,
    // mark full record to be deleted.
    foreach (XmlNode fullNode in fullDocument.SelectNodes("/List/Records/Person"))
    {
        dictNodes.TryGetValue(fullNode.Attributes[0].Value, out currentNode);
        if (currentNode != null)
        {
            if (currentNode.Attributes["recordaction"].Value == "chg"
                || currentNode.Attributes["recordaction"].Value == "del")
            {
                nodesToBeDeleted.Add(currentNode);
            }
        }
        lastNode = fullNode;
    }

    // Delete marked records
    for (int i = nodeList.Count - 1; i >= 0; i--)
    {
        if (nodesToBeDeleted.Contains(nodeList[i]))
        {
            nodeList[i].ParentNode.RemoveChild(nodesToBeDeleted[i]);
        }
    }

    // Add in the incremental records to the new full file for records marked add or change.
    foreach (XmlNode weeklyNode in nodeList)
    {
        if (weeklyNode.Attributes["recordaction"].Value == "add"
            || weeklyNode.Attributes["recordaction"].Value == "chg")
        {
            fullDocument.InsertAfter(weeklyNode, lastNode);
            lastNode = weeklyNode;
        }
    }
}
The XmlNodeList passed in holds all of the records selected from the incremental file, and the dictionary holds those same nodes keyed on id, so I don't have to loop through every incremental record each time. Right now the program dies at the "Delete marked records" stage with an index-out-of-bounds error, and I'm pretty sure the "Add in the incremental records" stage doesn't work either. Any ideas?
Suggestions on making this more efficient would also be welcome. Memory could become a problem: the full file is 250MB on disk and balloons to 750MB once loaded, so I was wondering if there is an easier way to go node-by-node through the full file. Thanks!
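One way to go node-by-node might be to stream the full file with `XmlReader`, materialize a single `Person` at a time with `XNode.ReadFrom`, and write the result straight to an `XmlWriter`. This also avoids the delete list entirely; the out-of-bounds error most likely comes from indexing `nodesToBeDeleted` with the counter that runs over `nodeList`. The sketch below rests on several assumptions: the incremental file is small enough to keep in a dictionary keyed on id, the full file contains only `Person` records under `/List/Records`, and no namespaces are involved. `StreamingMergeSketch` and `Merge` are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Xml;
using System.Xml.Linq;

public static class StreamingMergeSketch
{
    // Streams the full file and writes the merged result without ever
    // holding more than one full-file Person in memory.
    public static void Merge(XmlReader fullReader, XmlWriter writer,
                             IDictionary<string, XElement> incremental)
    {
        var swapped = new HashSet<string>();

        // Assumption: the output has the same List/Records wrapper as the sample.
        writer.WriteStartElement("List");
        writer.WriteStartElement("Records");

        fullReader.MoveToContent();
        while (!fullReader.EOF)
        {
            if (fullReader.NodeType != XmlNodeType.Element || fullReader.Name != "Person")
            {
                fullReader.Read();
                continue;
            }

            // ReadFrom loads this one Person and leaves the reader on the
            // node that follows it, so no extra Read() is needed here.
            var person = (XElement)XNode.ReadFrom(fullReader);
            var id = (string)person.Attribute("id");

            XElement change = null;
            if (id != null) incremental.TryGetValue(id, out change);
            var action = change == null ? null : (string)change.Attribute("recordaction");

            if (action == "del") continue;      // drop the record
            if (action == "chg")
            {
                change.WriteTo(writer);         // swap in the incremental record
                swapped.Add(id);
            }
            else
            {
                person.WriteTo(writer);         // keep the original record
            }
        }

        // Append "add" records, plus "chg" records that matched nothing.
        foreach (var record in incremental.Values)
        {
            var action = (string)record.Attribute("recordaction");
            var id = (string)record.Attribute("id");
            if (action == "add" || (action == "chg" && !swapped.Contains(id)))
            {
                record.WriteTo(writer);
            }
        }

        writer.WriteEndElement(); // Records
        writer.WriteEndElement(); // List
    }
}
```

The caller would create the reader and writer, for example with `XmlReader.Create` and `XmlWriter.Create` on the input and output paths, and dispose both when the merge finishes. Each record is still an `XElement`, so LINQ to XML queries can still be used on it.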