I'm trying to implement an equivalent to String.intern(), but for other objects. My goal is the following: I have an object A which I will serialize and then deserialize. If there is another reference to A somewhere, I want the result of the deserialization to be the same reference.
Here is one example of what I would expect.
MyObject A = new MyObject();
A.data1 = 1;
A.data2 = 2;
byte[] serialized = serialize(A);
A.data1 = 3;
MyObject B = deserialize(serialized); // B!=A and B.data1=1, B.data2=2
MyObject C = B.intern(); // Here we should have C == A. Consequently C.data1=3 AND C.data2=2
Here is my implementation at the moment (the MyObject class extends InternableObject):
public abstract class InternableObject {
    private static final AtomicLong maxObjectId = new AtomicLong();
    private static final Map<Long, InternableObject> dataMap = new ConcurrentHashMap<>();
    private final long objectId;

    public InternableObject() {
        this.objectId = maxObjectId.incrementAndGet();
        dataMap.put(this.objectId, this);
    }

    @Override
    protected void finalize() throws Throwable {
        super.finalize();
        dataMap.remove(this.objectId);
    }

    public final InternableObject intern() {
        return intern(this);
    }

    public static InternableObject intern(InternableObject o) {
        InternableObject r = dataMap.get(o.objectId);
        if (r == null) {
            throw new IllegalStateException();
        } else {
            return r;
        }
    }
}
My unit test (which fails):
private static class MyData extends InternableObject implements Serializable {
    public int data;

    public MyData(int data) {
        this.data = data;
    }
}

@Test
public void testIntern() throws Exception {
    MyData data1 = new MyData(7);
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    ObjectOutputStream oos = new ObjectOutputStream(baos);
    oos.writeObject(data1);
    oos.flush();
    baos.flush();
    oos.close();
    baos.close();
    ByteArrayInputStream bais = new ByteArrayInputStream(baos.toByteArray());
    ObjectInputStream ois = new ObjectInputStream(bais);
    MyData data2 = (MyData) ois.readObject();
    Assert.assertTrue(data1 == data2.intern()); // Fails here
}
The failure is due to the fact that, when deserializing, the constructor of InternableObject is called, and thus objectId will be 2 (even if the serialized data contains "1").
Any idea about how to solve this particular problem, or another approach to handle the high-level problem?
Thanks guys
Do not use the constructor to create instances. Use a factory method that first checks whether a matching instance already exists, and only creates a new instance if there isn't one.
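A minimal sketch of such a factory method, assuming instances are identified by a numeric id. The class name Account, the method forId and the balance field are hypothetical and only serve as illustration; they are not taken from the question.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical example class: one canonical instance per id.
final class Account {
    // Registry of all instances created so far, keyed by id.
    private static final Map<Long, Account> INSTANCES = new ConcurrentHashMap<>();

    private final long id;
    int balance;

    // Private constructor: callers must go through forId().
    private Account(long id) {
        this.id = id;
    }

    // Returns the existing instance for this id, or creates and registers one.
    static Account forId(long id) {
        return INSTANCES.computeIfAbsent(id, key -> new Account(key));
    }

    long id() {
        return id;
    }
}
```

With this, Account.forId(1L) == Account.forId(1L) holds, because the second call finds the instance registered by the first. Note that the registry holds strong references, so instances are never garbage collected in this sketch.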
To get serialization to cooperate, your class will need to make use of readResolve() / writeReplace(). http://docs.oracle.com/javase/7/docs/platform/serialization/spec/serial-arch.html#4539
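As a rough illustration of the readResolve() part, here is a sketch under a few assumptions: the class itself is Serializable (so its id field is restored from the stream instead of being reassigned by a superclass constructor), and the id is supplied by the caller. The names Tracked, LIVE, create and roundTrip are made up for this example. writeReplace() is not needed for this simple case and is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical example class, not the InternableObject from the question.
final class Tracked implements Serializable {
    private static final long serialVersionUID = 1L;

    // Registry of live instances, keyed by id (strong references in this sketch).
    private static final Map<Long, Tracked> LIVE = new ConcurrentHashMap<>();

    private final long id;
    int data;

    private Tracked(long id, int data) {
        this.id = id;
        this.data = data;
    }

    // Factory: registers the new instance unless one with this id already exists.
    static Tracked create(long id, int data) {
        Tracked fresh = new Tracked(id, data);
        Tracked existing = LIVE.putIfAbsent(id, fresh);
        return existing != null ? existing : fresh;
    }

    // Invoked by ObjectInputStream after the fields have been restored.
    // Whatever is returned here replaces the freshly deserialized object.
    private Object readResolve() throws ObjectStreamException {
        Tracked existing = LIVE.putIfAbsent(id, this);
        return existing != null ? existing : this;
    }

    // Helper for the demonstration: serialize to bytes and read back.
    static Tracked roundTrip(Tracked original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Tracked) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In this sketch, Tracked.roundTrip(a) returns a itself when a is still registered, so a change made to a.data after serialization is visible through the deserialized reference, which matches the behaviour the question asks for.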
The way you implemented your constructor, you're leaking a reference during construction, which can lead to problems that are very hard to nail down. Also, your instance map isn't protected by any locks, so it's not thread-safe.
Typically intern() forms an aspect, and maybe should not be realized as a base class; that may restrict its usage too much in a more complex constellation.
There are two aspects:
1. Sharing the "same" object.
Internalizing an object only pays off when several objects can be "internalized" to the same object. So I think that InternableObject, with a new sequential number for each instance, is not really adequate. More important is that the class defines a fitting equals and hashCode.
Then you can do an identity Map<Object, Object>. An InternMap could be used for any class, but here we restrict it to Internalizable things.
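One possible shape for such a map, as a sketch only: each object is stored as both key and value, and lookup relies on equals and hashCode. InternMap and Internalizable are the names used in this answer; the Name class is a hypothetical example type, and a plain HashMap is assumed, so this version is not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;

// Marker interface: only types that opt in can be interned.
interface Internalizable { }

final class InternMap<T extends Internalizable> {
    // Maps each canonical object to itself; lookup uses equals/hashCode.
    private final Map<T, T> pool = new HashMap<>();

    // Returns the canonical instance equal to the candidate,
    // registering the candidate if no equal object is known yet.
    T intern(T candidate) {
        T existing = pool.putIfAbsent(candidate, candidate);
        return existing != null ? existing : candidate;
    }
}

// Hypothetical value class with a fitting equals and hashCode.
final class Name implements Internalizable {
    private final String value;

    Name(String value) {
        this.value = value;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof Name && ((Name) other).value.equals(value);
    }

    @Override
    public int hashCode() {
        return value.hashCode();
    }
}
```

Two distinct but equal Name objects passed to the same InternMap come back as one reference, namely the one that was interned first.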
2. Replacing a dynamically created non-shared object with its .intern().
In Java 8 this could be realised with a default method in an interface. The Class<T> parameter is needed because of type erasure. Concurrency is disregarded here.
Prior to Java 8, just use an empty interface Internalizable as a marker interface, and use a static InternMap.
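A sketch of how the Java 8 variant might look, assuming a single shared pool held in a static field. The field name POOL and the example class Tag are hypothetical. Unlike the description above, this sketch uses a ConcurrentHashMap so that the single lookup-or-insert step is atomic; that is a substitution on my part, not something the approach requires.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

interface Internalizable {
    // Returns the canonical instance equal to this object.
    // The Class<T> token is used to cast the pooled object back to T,
    // since the generic type is erased at runtime.
    default <T extends Internalizable> T intern(Class<T> type) {
        Internalizable canonical = InternMap.POOL.computeIfAbsent(this, self -> self);
        // Throws ClassCastException if an equal object of another type was pooled.
        return type.cast(canonical);
    }
}

final class InternMap {
    // Shared pool: each canonical object maps to itself.
    static final Map<Internalizable, Internalizable> POOL = new ConcurrentHashMap<>();

    private InternMap() { }
}

// Hypothetical value class with a fitting equals and hashCode.
final class Tag implements Internalizable {
    private final String label;

    Tag(String label) {
        this.label = label;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof Tag && ((Tag) other).label.equals(label);
    }

    @Override
    public int hashCode() {
        return label.hashCode();
    }
}
```

Usage would then be along the lines of Tag t = new Tag("red").intern(Tag.class); repeated calls with equal tags yield the same reference.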