How to write (big) XML to a file in C#?

2020-05-23 08:17发布

Folks,

Please, what's a good way of writing really big XML documents (upto say 500 MB) in C# .NET 3.5? I've had a bit of search around, and can't seem to find anything which addresses this specific question.

My previous thread (What is the best way to parse (big) XML in C# Code?) covered reading similar magnitude Xml documents... With that solved I need to think about how to write the updated features (http://www.opengeospatial.org/standards/sfa) to an "update.xml" document.

My ideas: Obviously one big DOM is out, considering the maximum size of the document to be produced. I'm using XSD.EXE to generate binding classes form the schema... which works nicely with the XmlSerializer class, but I think it builds a DOM "under the hood". Is this correct?. I can't hold all the features (upto 50,000 of them) in memory at one time. I need to read a feature form the database, serialize it, and write it to file. So I'm thinking I should use the XmlSerializer to write a "doclet" for each individual feature to the file. I've got no idea (yet) if this is even possible/feasible.

What do you think?

Background: I'm porting an old VB6 MapInfo "client plugin" to C#. There is an existing J2EE "update service" (actually just a web-app) which this program (among others) must work with. I can't change the server; unless absapositively necessary; especially of that involves changing the other clients. The server accepts an XML document with a schema which does not specificy any namespaces... ie: there is only default namespace, and everything is in it.

My experience: I'm pretty much a C# and .NET newbie. I've been programming for about 10 year in various languages including Java, VB, C, and some C++.

Cheers all. Keith.

PS: It's dinner time, so I'll be AWOL for about half an hour.

标签: c# xml
3条回答
孤傲高冷的网名
2楼-- · 2020-05-23 09:02

For writing large xml, XmlWriter (directly) is your friend - but it is harder to use. The other option would be to use DOM/object-model approaches and combine them, which is probably doable if you seize control of theXmlWriterSettings and disable the xml marker, and get rid of the namespace declarations...

using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.Serialization;    
public class Foo {
    [XmlAttribute]
    public int Id { get; set; }
    public string Bar { get; set; }
}
static class Program {
    [STAThread]
    static void Main() {
        using (XmlWriter xw = XmlWriter.Create("out.xml")) {
            xw.WriteStartElement("xml");
            XmlSerializer ser = new XmlSerializer(typeof(Foo));
            XmlSerializerNamespaces ns = new XmlSerializerNamespaces();
            ns.Add("","");
            foreach (Foo foo in FooGenerator()) {
                ser.Serialize(xw, foo, ns);
            }
            xw.WriteEndElement();
        }
    }    
    // streaming approach; only have the smallest amount of program
    // data in memory at once - in this case, only a single `Foo` is
    // ever in use at a time
    static IEnumerable<Foo> FooGenerator() {
        for (int i = 0; i < 40; i++) {
            yield return new Foo { Id = i, Bar = "Foo " + i };
        }
    }
}
查看更多
小情绪 Triste *
3楼-- · 2020-05-23 09:10

Use a XmlWriter:

[...] a writer that provides a fast, non-cached, forward-only means of generating streams or files containing XML data.

查看更多
够拽才男人
4楼-- · 2020-05-23 09:13

Did you consider compressing it before writing it to disk? With XML you can reach more than 10 times compressing and even more. it will probably take you less time to compress the file and write the compressed version than to read the whole 500Mb version.

查看更多
登录 后发表回答