Binary object graph serialization

Posted 2019-04-28 20:31

Question:

I'm looking for advice on serialization in a .NET app. The app is a desktop/thick-client app, and the serialized output is its persisted document format. The requirements for the serializer are:

  • Must allow serializing fields, not public properties only.
  • Must not require parameterless constructors.
  • Must handle general object graphs, i.e. not only DAG but shared/bidirectional references.
  • Must work with framework classes (e.g. Serialize Dictionaries).

Currently we use the BinaryFormatter, which handles all of the above quite well, but size/performance and version tolerance are issues. We use the [OnDeserialized/ing] attributes to provide compatibility, but this does not allow for large refactorings (say, a namespace change) without complex use of surrogates and so on.

An ideal solution would be a drop-in replacement for BinaryFormatter that works with our existing [NonSerialized] annotations etc., but performs better, and produces a format that is smaller and easier to maintain.

I have looked at the different protobuf implementations, and even though it seems possible to serialize general object graphs/enums/structs these days, it does not appear trivial to serialize a complex graph with a lot of framework collection types etc. Also, even if we could make it work with fields rather than properties, I understand it would still mean adding parameterless constructors and protobuf annotations to all classes (the domain is around 1000 classes).

So the questions:

  • Are there any "alternative" binary formatters that provide a well-documented format and perform better?
  • Are protocol buffers ever suitable for persisting large general object graphs including framework types?

Answer 1:

Protocol buffers as a format has no official support for object graphs, but protobuf-net does provide this, and meets your other requirements. To take the points in turn:

  • Must allow serializing fields, not public properties only

Sure; protobuf-net can do that for both public and non-public fields; tell it about the fields either at runtime or via attributes.

  • Must not require parameterless constructors.

That is available in "v2" - again, you can tell it to skip the constructor at runtime or via attributes (SkipConstructor=true on the contract)

  • Must handle general object graphs, i.e. not only DAG but shared/bidirectional references.

Sure; mark AsReference=true on a member
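
To make the three points above concrete, here is a minimal attribute-based sketch, assuming protobuf-net v2 (later major versions may not support AsReference). The Node type and the Demo helper are hypothetical and exist only for illustration; they are not taken from the question's code base.

```csharp
using System.Collections.Generic;
using System.IO;
using ProtoBuf;

// Hypothetical document type; names are illustrative only.
[ProtoContract(SkipConstructor = true)]   // deserialize without calling a constructor
public class Node
{
    [ProtoMember(1)]
    private string name;                  // private field, no property needed

    [ProtoMember(2, AsReference = true)]
    private Node parent;                  // back-reference, written as a tracked reference

    [ProtoMember(3, AsReference = true)]
    private List<Node> children = new List<Node>();

    // No parameterless constructor is declared.
    public Node(string name, Node parent)
    {
        this.name = name;
        this.parent = parent;
        if (parent != null) parent.children.Add(this);
    }
}

public static class Demo
{
    // Serializes the graph to memory and reads it back.
    public static Node RoundTrip(Node root)
    {
        using (var stream = new MemoryStream())
        {
            Serializer.Serialize(stream, root);
            stream.Position = 0;
            return Serializer.Deserialize<Node>(stream);
        }
    }
}
```

One thing to watch for with SkipConstructor: field initializers do not run either, so a field such as children can be null after deserialization when nothing was written for it.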

  • Must work with framework classes (e.g. Serialize Dictionaries).

Standard lists and dictionaries work fine; however, I have an outstanding change request to support AsReference inside a dictionary. That means Dictionary<string, Foo> won't currently run the graph code for Foo, but I can probably find a few moments to look at this if it is causing you significant pain.

  • We use the [OnDeserialized/ing] attributes to provide compatibility

Serialization callbacks are fully supported

  • but it does not allow for large refactorings (say a namespace change) without complex use of surrogates and so on.

Namespaces etc are not at all interesting to protobuf-net (unless you are using the DynamicType options)

  • it would still mean having to add parameterless constructors and protobuf annotations to all classes

Not necessarily; if you can guarantee that you won't change the field names, you can ask it to infer the field numbers internally - and ultimately in "v2" everything can be specified at runtime, so you can often write a small configuration loop that runs at app-startup and uses reflection to configure the system. Then you do not need to change your existing code at all.
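
A rough sketch of such a startup loop follows, assuming protobuf-net v2 and assuming the existing classes are marked [Serializable] with [NonSerialized] on excluded fields, as BinaryFormatter requires. ProtoConfig and Configure are hypothetical names. The sketch ignores inheritance (subtypes would additionally need to be registered on their base type) and treats every field as a reference, which may not be what every field needs.

```csharp
using System;
using System.Linq;
using System.Reflection;
using ProtoBuf.Meta;

// Hypothetical helper, not part of protobuf-net.
public static class ProtoConfig
{
    public static void Configure(Assembly assembly)
    {
        RuntimeTypeModel model = RuntimeTypeModel.Default;
        const BindingFlags flags = BindingFlags.Instance | BindingFlags.Public |
                                   BindingFlags.NonPublic | BindingFlags.DeclaredOnly;

        var types = assembly.GetTypes()
            .Where(t => t.IsClass && t.IsSerializable &&
                        !typeof(Delegate).IsAssignableFrom(t));

        foreach (Type type in types)
        {
            // false: do not look for protobuf attributes, configure by hand instead.
            MetaType meta = model.Add(type, false);
            meta.UseConstructor = false;   // no parameterless constructor needed

            // Field numbers are assigned in alphabetical order of field name.
            // Adding, removing, or renaming a field shifts the numbers, so this
            // only stays compatible while the set of field names is stable.
            int fieldNumber = 1;
            foreach (FieldInfo field in type.GetFields(flags)
                         .Where(f => !f.IsNotSerialized)   // honor [NonSerialized]
                         .OrderBy(f => f.Name, StringComparer.Ordinal))
            {
                ValueMember member = meta.AddField(fieldNumber++, field.Name);
                // Assumption: reference-track class-typed fields other than string.
                member.AsReference = field.FieldType.IsClass &&
                                     field.FieldType != typeof(string);
            }
        }
    }
}
```

Calling ProtoConfig.Configure(typeof(SomeDocumentType).Assembly) once at startup, before the first Serialize or Deserialize call, would be the intended usage; SomeDocumentType stands in for any type from the domain assembly. For a persisted document format, a stored name-to-number mapping would be safer than the alphabetical numbering used here.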



Answer 2:

Try db4o. It's not really a serializer, but as far as I can tell it meets your requirements (complex types, deep graphs, inheritance?, dictionaries, etc.). You don't have to change anything on your objects, and the API is extremely easy to use.

It supports schema versioning/merging.