We use BinaryFormatter in a C# game to save user progress, game levels, etc. We are running into problems with backwards compatibility.
The aims:
- A level designer creates a campaign (levels and rules), we change the code, and the campaign should still work fine. This can happen every day during development before release.
- A user saves a game, we release a game patch, and the user should still be able to load the game.
- The invisible data-conversion process should work no matter how distant the two versions are. For example, a user can skip our first 5 minor updates and get the 6th directly. Their saved games should still load fine.
The solution needs to be completely invisible to users and level designers, and minimally burden coders who want to change something (e.g. rename a field because they thought of a better name).
Some object graphs we serialize are rooted in one class, some in others. Forward compatibility is not needed.
Potentially breaking changes (and what happens when we serialize the old version and deserialize into the new):
- add field (gets default-initialized)
- change field type (failure)
- rename field (equivalent to removing it and adding a new one)
- change property to field and back (equivalent to a rename)
- change auto-implemented property to use backing field (equivalent to a rename)
- add superclass (equivalent to adding its fields to the current class)
- interpret a field differently (e.g. was in degrees, now in radians)
- for types implementing ISerializable we may change our implementation of the ISerializable methods (e.g. start using compression within the ISerializable implementation for some really large type)
- rename a class or an enum value
I have read about:
- Version Tolerant Serialization
- IDeserializationCallback
- [OptionalField(VersionAdded)]
- [OnDeserializing], [OnDeserialized], [OnSerializing], [OnSerialized].
- [NonSerialized]
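To make the attribute-based approach concrete, here is a minimal sketch of how these attributes combine on a single class. PlayerState and its fields are hypothetical names used only for illustration; the attributes themselves come from System and System.Runtime.Serialization. The sketch assumes the Mana field was added in a second version, so older save files do not contain it.

```csharp
using System;
using System.Runtime.Serialization;

// Hypothetical save-game class, for illustration only.
[Serializable]
public class PlayerState
{
    public int Health = 100;

    // Assumed to be added in version 2; streams written by version 1 lack it,
    // and [OptionalField] stops the formatter from treating that as an error.
    [OptionalField(VersionAdded = 2)]
    public int Mana;

    // Never written to the stream; rebuilt after loading.
    [NonSerialized]
    public string CachedLabel;

    // Called before any field is read from the stream, so a value set here
    // survives only if the stream does not provide one.
    [OnDeserializing]
    internal void SetDefaults(StreamingContext context)
    {
        Mana = 50;
    }

    // Called after all fields of this object have been read.
    [OnDeserialized]
    internal void Rebuild(StreamingContext context)
    {
        CachedLabel = "HP " + Health + " / MP " + Mana;
    }
}
```

The formatter does not run constructors or field initializers for such a class when deserializing, which is why the default for Mana is set in the [OnDeserializing] callback rather than at the declaration.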
My current solution:
- We make as many changes as possible non-breaking, using mechanisms such as the OnDeserializing callback.
- We schedule breaking changes for once every 2 weeks, so there's less compatibility code to keep around.
- Every time, before we make a breaking change, we copy all the [Serializable] classes we use into a namespace/folder called OldClassVersions.VersionX (where X is the next ordinal number after the last one). We do this even if we aren't going to be making a release soon.
- When writing to file, what we serialize is an instance of this class: [Serializable] class SaveFileData { public int version; public object data; }
- When reading from file, we deserialize the SaveFileData and pass it to an iterative "update" routine that does something like this:

    for (int i = loadedData.version; i < CurrentVersion; i++)
    {
        // Update() takes an instance of OldClassVersions.VersionX.TheClass
        // and returns an instance of OldClassVersions.VersionXPlus1.TheClass
        loadedData.data = Update(loadedData.data, i);
    }

- For convenience, the Update() function can use a CopyOverlappingPart() function in its implementation. It uses reflection to copy as much data as possible from the old version to the new version. This way, the Update() function only needs to handle the things that actually changed.
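A minimal sketch of what one such update step and the reflection helper might look like. The two Foo classes and their fields are hypothetical stand-ins for two consecutive snapshots, and the degrees-to-radians change is just an example of a field that cannot be copied blindly. The helper matches fields by name only, which is an assumption of this sketch rather than a description of our real code.

```csharp
using System;
using System.Reflection;

// Hypothetical snapshots of the same class in two consecutive versions.
namespace OldClassVersions.Version1 { public class Foo { public int hp; public float angleDegrees; } }
namespace OldClassVersions.Version2 { public class Foo { public int hp; public float angleRadians; } }

public static class SaveMigration
{
    // Copies each instance field that exists under the same name on both
    // objects and whose value fits the target field's type. Private fields
    // declared on base classes are not visited in this sketch.
    public static void CopyOverlappingPart(object source, object target)
    {
        const BindingFlags flags =
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;

        foreach (FieldInfo from in source.GetType().GetFields(flags))
        {
            FieldInfo to = target.GetType().GetField(from.Name, flags);
            if (to == null) continue; // field was removed or renamed

            object value = from.GetValue(source);
            bool fits = value == null
                ? !to.FieldType.IsValueType
                : to.FieldType.IsInstanceOfType(value);
            if (fits) to.SetValue(target, value);
        }
    }

    // One step of the chain: Version1.Foo -> Version2.Foo.
    public static OldClassVersions.Version2.Foo Update(OldClassVersions.Version1.Foo old)
    {
        var updated = new OldClassVersions.Version2.Foo();
        CopyOverlappingPart(old, updated);                 // copies hp
        updated.angleRadians = old.angleDegrees * (float)(Math.PI / 180.0);
        return updated;
    }
}
```

Only the reinterpreted field needs explicit code here; hp is carried over by the helper because its name and type are unchanged.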
Some problems with that:
- the deserializer deserializes to class Foo rather than to class OldClassVersions.Version5.Foo - because class Foo is what was serialized.
- almost impossible to test or debug
- requires keeping old copies of a lot of classes around, which is error-prone, fragile and annoying
- I don't know what to do when we want to rename a class
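For the first and last points, the mechanism that looks most relevant is a custom SerializationBinder, which lets the deserializer map the type name recorded in the file to a different type. A minimal sketch, assuming a simple lookup table keyed by the recorded type name; RenamingBinder and the snapshot class below are hypothetical, and we have not verified this against our real object graphs:

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.Serialization;

// Hypothetical snapshot class that old "Foo" data should be loaded into.
namespace OldClassVersions.Version5 { [Serializable] public class Foo { public int hp; } }

// Redirects selected type names found in the stream to other types.
public sealed class RenamingBinder : SerializationBinder
{
    private readonly Dictionary<string, Type> redirects;

    public RenamingBinder(Dictionary<string, Type> redirects)
    {
        this.redirects = redirects;
    }

    public override Type BindToType(string assemblyName, string typeName)
    {
        Type target;
        if (redirects.TryGetValue(typeName, out target))
            return target; // renamed or versioned type

        // Everything else: resolve the recorded name as-is.
        return Type.GetType(typeName + ", " + assemblyName);
    }
}
```

Such a binder would be assigned to the formatter's Binder property before calling Deserialize. Type names of generic collections embed their argument types (for example a List of Foo), which this plain lookup does not rewrite, so that case would need extra handling.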
This should be a really common problem. How do people usually solve it?