I am investigating different options in the Java Serialization mechanism to allow flexibility in our class structures for version-tolerant storage (and advocating for a different mechanism, you don't need to tell me).
For instance, the default serialization mechanism can handle both adding and removing fields, if only backwards compatibility is required.
Renaming a class or moving it to a different package has proved to be much harder, though. I found in this question that I was able to do a simple class rename and/or move package, by subclassing ObjectInputStream and overriding readClassDescriptor():
if (resultClassDescriptor.getName().equals("package.OldClass"))
resultClassDescriptor = ObjectStreamClass.lookup(newpackage.NewClass.class);
That is fine for simple renames. But if you then try to add or delete a field, you get a java.io.StreamCorruptedException. Worse, this happens even if a field has been added or deleted, and then you rename the class, which could cause problems with multiple developers or multiple checkins.
Based on some reading I had done, I experimented a bit with also overriding resolveClass(), with the idea that we were correctly repointing the name to the new class, but not loading the old class itself and bombing on the field changes. But this comes from a very vague understanding of some the details of the Serialization mechanism, and I'm not sure if I'm even barking up the right tree.
So 2 precise questions:
- Why is repointing the class name using readClassDescriptor() causing deserialization to fail on normal, compatible class changes?
- Is there a way using resolveClass() or another mechanism to get around this and allow classes to both evolve (add and remove fields) and be renamed/repackaged?
I poked around and could not find an equivalent question on SO. By all means, point me to such a question if it exists, but please read the question carefully enough that you do not close me unless another question actually answers my precise question.
I haven't tinkered with class descriptors much enough, but if your problem is just about renaming and repackaging there is a much easier solution for that. You could just simply edit your serialized data file with a text editor and just replace your old names with new ones. it's there in a human-readable form. for example suppose we have this
OldClass
placed insideoldpackage
and containing anoldField
, like this:Now when we serialize an instance of this class and get something like this:
Now if we want to change class's name to
NewClass
and put insidenewpackage
and change its field's name tonewField
, I simply rename it the file, like this:and define the appropriate serialVersionUID for the new class.
That's all. No extending and overriding required.
The problem is that readClassDescriptor is supposed to tell the ObjectInputStream how to read the data which is currently in the stream you are reading. if you look inside a serialized data stream, you will see that it not only stores the data, but lots of metadata about exactly what fields are present. this is what allows serialization to handle simple field additions/removals. however, when you override that method and discard the info returned from the stream, you are discarding the info about what fields are in the serialized data.
i think the solution to the problem would be to take the value returned by super.readClassDescriptor() and create a new class descriptor which returns the new class name, but otherwise returns the info from the old descriptor. (although, in looking at ObjectStreamField, it may be more complicated than that, but that is the general idea).
I had same problems with flexibility like you and I found the way. So here my version of readClassDescriptor()
This is what writeReplace() and readResolve() are for. You're making it much more complicated than it really is. Note that you can define these methods either in the two objects concerned or in subclasses of thee Object stream classes.