Question:
I'm looking for a way to import and export a list of changes to an XML data document (irregular structure; not naturally fitting a DataSet).
If I had a regular structure I would use a DataTable, and I could evaluate which records have been edited and then commit or cancel the changes, and I could also transmit a packet of the required changes.
How do I do this with XML data?
If a good answer isn't available, I'm thinking my best bet would be to use a DataTable with the schema [XPath, Value], despite the inefficient storage and the navigation difficulties.
I expect to make changes to the document (with XPath or LINQ or data-bound controls or whatever), then remember the changes and send only the changes over TCP.
Then I want to receive back another change list and apply it to the XML document. I don't want to send the entire document both for size and because I need to know and evaluate the changes being sent.
(Just to clarify: My program needs to send and receive document changes. The other end of the pipe is not based in .net, and is not part of this question.)
Answer 1:
Do you need to act on these changes, or just store them? If you only want to store the updated version, you can use a binary diff algorithm to compute the difference between the two XML files and then apply that difference to the stored version. A good algorithm for this is bsdiff.
The C# version can be found here.
Another approach is to use the XmlDiff class from Microsoft.
Answer 2:
- How do you plan to send only the changes?
- Do you expect numerous changes or just slight changes every time?
- What kind of changes do you have to consider?
- Are you trying to maintain two copies of the same document across process boundaries?
- How are you going to resolve conflicting changes?
- Are you going to lock the XML documents until changes are propagated?
- Are both copies independent, or is one the master copy?
If you used XmlDocument events such as NodeInserted, NodeRemoved and NodeChanged, you could build a list of such changes and then replay them on another copy. If the change list is longer than the document itself, you could send the document instead. Zipping the XML data also helps.
Other than that, I do not see any easy approach.
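As a rough sketch of the event-based idea above: the recorder below subscribes to the XmlDocument change events and appends one entry per event. The XmlChange and ChangeRecorder types and the PathOf helper are hypothetical names made up for this illustration, and the generated paths are deliberately simplified (no namespace handling; only elements, attributes and text nodes are addressed properly).

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical record type; the names are illustrative, not from any library.
public sealed class XmlChange
{
    public XmlNodeChangedAction Action; // Insert, Remove or Change
    public string Path;     // path of the changed node, or of its parent for Insert/Remove
    public string OldValue; // old value, or the removed node's markup
    public string NewValue; // new value, or the inserted node's markup
}

public sealed class ChangeRecorder
{
    public readonly List<XmlChange> Changes = new List<XmlChange>();

    // Attach after the document has been loaded so that only later edits are recorded.
    public ChangeRecorder(XmlDocument doc)
    {
        doc.NodeInserted += (sender, e) => Record(e, PathOf(e.NewParent), null, e.Node.OuterXml);
        doc.NodeRemoved += (sender, e) => Record(e, PathOf(e.OldParent), e.Node.OuterXml, null);
        doc.NodeChanged += (sender, e) => Record(e, PathOf(e.Node), e.OldValue, e.NewValue);
    }

    private void Record(XmlNodeChangedEventArgs e, string path, string oldValue, string newValue)
    {
        Changes.Add(new XmlChange { Action = e.Action, Path = path, OldValue = oldValue, NewValue = newValue });
    }

    // Builds a simplified positional path such as /root[1]/item[2]/text()[1].
    // An empty string stands for the document node itself.
    private static string PathOf(XmlNode node)
    {
        if (node == null || node.NodeType == XmlNodeType.Document) return "";
        var attr = node as XmlAttribute;
        if (attr != null) return PathOf(attr.OwnerElement) + "/@" + attr.Name;
        if (node.ParentNode is XmlAttribute) return PathOf(node.ParentNode); // text inside an attribute

        int index = 1; // 1-based position among siblings of the same kind and name
        for (XmlNode s = node.PreviousSibling; s != null; s = s.PreviousSibling)
            if (s.NodeType == node.NodeType && s.Name == node.Name) index++;

        string step = node.NodeType == XmlNodeType.Text ? "text()" : node.Name;
        return PathOf(node.ParentNode) + "/" + step + "[" + index + "]";
    }
}
```

The recorded list could then be serialized in whatever format the other end expects. Whether it beats sending the whole document depends on how many edits accumulate, as noted above.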
Answer 3:
When you get XML data with an irregular structure that does not naturally fit a DataSet, and you want an object model to work with the data easily, you can use the XML Schema Definition Tool (Xsd.exe) with the /classes option to generate C# or VB.NET classes from an XML file.
xsd.exe lives in:
C:\Program Files\Microsoft SDKs\Windows\v6.0A\bin\xsd.exe
C:\Program Files\Microsoft Visual Studio 8\SDK\v2.0\Bin\xsd.exe
You run xsd.exe from the Visual Studio Command Line.
-Start
-All Programs
-Visual Studio
-Tools
-Command Line
This is the command to view all the XSD command line parameters:
xsd /?
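To convert an irregular XML file (XmlResponseObject.xml) into classes, first infer a schema from the XML, then generate the classes from that schema:
xsd c:\Temp\XmlResponseObject.xml /out:c:\Temp\
xsd c:\Temp\XmlResponseObject.xsd /classes /language:CS /out:c:\Temp\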
This will generate a C# file with classes that represent the XML. You may want to refactor it into separate class files, taking care with duplicate classes in the single file that are disambiguated by namespace. Either way, the classes won't be the nicest looking with all the XML attributes, but the good part is that you can bind to them via XML. This is an example where I retrieve XML via a REST web service; XmlResponseObject is the object model of classes that fits the XML.
using System;
using System.IO;
using System.Net;
using System.Xml.Serialization;

public interface IYourWebService
{
    XmlResponseObject GetData(int dataId);
}

public class YourWebService : IYourWebService
{
    // Fetches the XML for dataId and deserializes it into the xsd.exe-generated class.
    // Returns null if the request or the deserialization fails.
    public XmlResponseObject GetData(int dataId)
    {
        XmlResponseObject xmlResponseObject = null;
        var url = "http://SomeSite.com/Service/GetData/" + dataId;
        try
        {
            var request = WebRequest.Create(url) as HttpWebRequest;
            if (request != null)
            {
                request.AllowAutoRedirect = true;
                request.KeepAlive = true;
                request.UserAgent = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET CLR 1.1.4322; InfoPath.2; .NET4.0C; .NET4.0E)";
                request.Credentials = CredentialCache.DefaultNetworkCredentials;
                request.CookieContainer = new CookieContainer();
                // Dispose the response and the reader so the connection is released.
                using (var response = request.GetResponse() as HttpWebResponse)
                {
                    if (request.HaveResponse && response != null)
                    {
                        using (var streamReader = new StreamReader(response.GetResponseStream()))
                        {
                            var xmlSerializer = new XmlSerializer(typeof(XmlResponseObject));
                            xmlResponseObject = (XmlResponseObject)xmlSerializer.Deserialize(streamReader);
                        }
                    }
                }
            }
        }
        catch (Exception ex)
        {
            // Log the failure and fall through; the caller receives null.
            string debugInfo = "\nURL: " + url;
            Console.Write(ex.Message + " " + debugInfo + " " + ex.StackTrace);
        }
        return xmlResponseObject;
    }
}
Given that you only want to send and receive document changes, you could add IsDirty flags to the classes. I'm sure that once you have the classes to work with, it will be dead easy to detect diffs.
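A minimal sketch of the IsDirty idea, assuming a hand-edited class in the style of xsd.exe output. The Customer class and its Name element are hypothetical stand-ins for whatever the generated model contains; the flag is marked [XmlIgnore] so it never ends up in the serialized XML.

```csharp
using System;
using System.Xml.Serialization;

// Hypothetical class standing in for one of the generated types.
public class Customer
{
    private string _name;

    // Not part of the XML; tracks whether any property changed since the last AcceptChanges().
    [XmlIgnore]
    public bool IsDirty { get; private set; }

    [XmlElement("Name")]
    public string Name
    {
        get { return _name; }
        set
        {
            if (_name != value)
            {
                _name = value;
                IsDirty = true; // only flag real changes
            }
        }
    }

    // Call after deserializing (the serializer uses the setters, which set the flag)
    // and again after the changes have been sent.
    public void AcceptChanges()
    {
        IsDirty = false;
    }
}
```

A flag per object only says that something changed. To send a precise change packet you would probably also need to keep the old values, or track dirtiness per property.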
Answer 4:
To load any XML data into a DataSet, you have to provide a corresponding schema.
See Deriving DataSet Relational Structure from XML Schema (XSD).
Besides, DataSet/DataTable don't work with XML documents. They can import data from, and export data to, XML.
Answer 5:
I haven't found any usable answers anywhere. It seems that back in 2003 MS was talking about creating an XPathDocument2 or something that implemented what I'm asking for (books talking about the coming release mention it), but it doesn't seem to have been carried out. So here's my attempt at a solution:
Use XPathDocument/XPathNavigator, and add event handlers for Change/Delete/Insert. For each of these events, put a record in a DataTable {XPath | OldValue | NewValue} indicating the change. When ready to commit, send the table across, then clear it. If cancelling instead, use the table info to undo the changes in the XPathDocument.
I haven't implemented this yet, but it seems like it might serve.
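The {XPath | OldValue | NewValue} table could look roughly like the sketch below. It makes two assumptions that differ from the description above: it uses XmlDocument, because XPathDocument is read-only, and it records changes through an explicit SetValue method rather than through event handlers. The XmlChangeLog class and its members are hypothetical names, and only value changes are covered, not inserts or deletes.

```csharp
using System;
using System.Data;
using System.IO;
using System.Xml;

public sealed class XmlChangeLog
{
    private readonly XmlDocument _doc;
    private readonly DataTable _table = new DataTable("Changes");

    public XmlChangeLog(XmlDocument doc)
    {
        _doc = doc;
        _table.Columns.Add("XPath", typeof(string));
        _table.Columns.Add("OldValue", typeof(string));
        _table.Columns.Add("NewValue", typeof(string));
    }

    public int PendingCount { get { return _table.Rows.Count; } }

    // Changes the text of the node at xpath and remembers the previous value.
    public void SetValue(string xpath, string newValue)
    {
        XmlNode node = _doc.SelectSingleNode(xpath);
        if (node == null) throw new ArgumentException("No node matches " + xpath);
        _table.Rows.Add(xpath, node.InnerText, newValue);
        node.InnerText = newValue;
    }

    // Commit: serializes the pending rows as XML for transmission, then clears the log.
    public string Commit()
    {
        using (var writer = new StringWriter())
        {
            _table.WriteXml(writer);
            _table.Clear();
            return writer.ToString();
        }
    }

    // Cancel: restores the old values in reverse order, then clears the log.
    public void Cancel()
    {
        for (int i = _table.Rows.Count - 1; i >= 0; i--)
        {
            DataRow row = _table.Rows[i];
            _doc.SelectSingleNode((string)row["XPath"]).InnerText = (string)row["OldValue"];
        }
        _table.Clear();
    }
}
```

Undoing in reverse order matters when the same node is edited more than once, since each row's OldValue is only valid relative to the edits recorded before it.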
Answer 6:
I have tried to find a free or open-source XML diff tool numerous times before, but never dug up anything that really fit the bill. Essentially, you're looking at tree diffing, which is a whole discipline of its own. The fact that you're using XML is subordinate to this, I guess, as it's nothing but a tree in another form. You "just" need to define what specifies a node.
Though the Decomposition Algorithm for Tree Edit Distance calculates the distance between two trees, I suspect you can adapt it to give you all the changes, as those are the basis for the distance measurement. How you communicate the changes after detection is completely up to you. That could range from XML to JSON. Note that the authors of the algorithm mention they created a Python version in a few dozen lines, so maybe if you drop them a line, they can be of assistance.
It looks like you could be the first one to publish a practical proof of concept if you can get this done :)
Answer 7:
The problem you have here is that XML is just a form of representing data; it's not necessarily the data itself. Is this some sort of XML editor you are using, or is XML just the transport?
If you are talking about XML as a transport, then when you talk about sending XML change descriptions, you probably want to generate those change descriptions at the point where you make the change itself, and there is every chance that the change descriptions won't be in the same schema as the original data.
In addition, the reason that datasets can do this is that each row in a dataset has a known unique key, so the change can be sent back for the row instead of the entire set.
XML doesn't work like that: each row doesn't have a unique key. XPath can be used as the change locator, but with enough edits that could be more inefficient than sending the entire document.
Why not simply treat the XML as text and use any one of the standard patching algorithms? (Look at the source for Git or Hg.)
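To illustrate the "treat it as text" idea, here is a deliberately crude sketch. It is not one of the standard patching algorithms used by Git or Hg; it only trims the common prefix and suffix of the two strings and sends the single differing middle span. The TextPatch and TextPatcher names are hypothetical.

```csharp
using System;

// Hypothetical patch record: replace RemovedLength characters at Offset with Inserted.
public struct TextPatch
{
    public int Offset;
    public int RemovedLength;
    public string Inserted;
}

public static class TextPatcher
{
    // Finds the single span that differs between oldText and newText.
    public static TextPatch Diff(string oldText, string newText)
    {
        int max = Math.Min(oldText.Length, newText.Length);

        int prefix = 0; // length of the common prefix
        while (prefix < max && oldText[prefix] == newText[prefix]) prefix++;

        int suffix = 0; // length of the common suffix, not overlapping the prefix
        while (suffix < max - prefix &&
               oldText[oldText.Length - 1 - suffix] == newText[newText.Length - 1 - suffix])
            suffix++;

        return new TextPatch
        {
            Offset = prefix,
            RemovedLength = oldText.Length - prefix - suffix,
            Inserted = newText.Substring(prefix, newText.Length - prefix - suffix)
        };
    }

    // Applies a patch produced by Diff to the same oldText.
    public static string Apply(string oldText, TextPatch patch)
    {
        return oldText.Substring(0, patch.Offset)
             + patch.Inserted
             + oldText.Substring(patch.Offset + patch.RemovedLength);
    }
}
```

This works well for one localized edit, but two edits far apart produce a patch that spans everything between them. A real diff algorithm would emit several small hunks instead.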