I'm re-reading Java Concurrency In Practice, and I'm not sure I fully understand the chapter about immutability and safe publication.
What the book says is:
Immutable objects can be used safely by any thread without additional synchronization, even when synchronization is not used to publish them.
What I don't understand is, why would anyone (interested in making his code correct) publish some reference unsafely?
If the object is immutable, and it's published unsafely, I understand that any other thread obtaining a reference to the object would see its correct state, because of the guarantees offered by proper immutability (with final
fields, etc.).
But if the publication is unsafe, another thread might still see null
or the previous reference after the publication, instead of the reference to the immutable object, which seems to me like something no-one would like.
And if safe publication is used to make sure the new reference is seen by all the threads, then even if the object is just effectively immutable (no final
fields, but no way to mute them), then everything is safe again. As the book says :
Safely published effectively immutable objects can be used safely by any thread without additional synchronization.
So, why is immutability (vs. effective immutability) so important? In what case would an unsafe publication be wanted?
It is desirable to design objects that don't need synchronization for two reasons:
Because the above reasons are very important, it is better to learn the sometimes difficult rules and as a writer, make safe objects that don't require synchronization rather than hoping all the users of your code will remember to use it correctly.
Also remember that the author is not saying the object is unsafely published, it is safely published without synchronization.
As for your second question, I just checked, and the book does not promise you that another thread will always see the reference to the updated object, just that if it does, it will see a complete object. But I can imagine that if it is published through the constructor of another (
Runnable
?) object, it will be sweet. That does help with explaining all cases though.EDIT:
effectively immutable and immutable The difference between effectively immutable and immutable is that in the first case you still need to publish the objects in a safe way. For the truly immutable objects this isn't needed. So truly immutable objects are preferred because they are easier to publish for the reasons I stated above.
I had the exact same question as the original poster when finishing reading chapters 1-3 . I think the authors could have done a better job elaborating on this a bit more.
I think the difference lies therein that the internal state of effectively immutable objects can be observed to be in an inconsistent state when they are not safely published whereas the internal state of immutable objects can never be observed to be in an inconsistent state.
However I do think the reference to an immutable object can be observed to be out of date / stale if the reference is not safely published.
I think the main point is that truly immutable objects are harder to break later on. If you've declared a field
final
, then it's final, period. You would have to remove thefinal
in order to change that field, and that should ring an alarm. But if you've initially left thefinal
out, someone could carelessly just add some code that changes the field, and boom - you're screwed - with only some added code (possibly in a subclass), no modification to existing code.I would also assume that explicit immutability enables the (JIT) compiler to do some optimizations that would otherwise be hard or impossible to justify. For example, when using
volatile
fields, the runtime must guarantee a happens-before relation with writing and reading threads. In practice this may require memory barriers, disabling out-of-order execution optimizations, etc. - that is, a performance hit. But if the object is (deeply) immutable (contains onlyfinal
references to other immutable objects), the requirement can be relaxed without breaking anything: the happens-before relation needs to be guaranteed only with writing and reading the one single reference, not the whole object graph.So, explicit immutability makes the program simpler so that it's both easier for humans to reason and maintain and easier for the computer to execute optimally. These benefits grow exponentially as the object graph grows, i.e. objects contain objects that contain objects - it's all simple if everything is immutable. When mutability is needed, localizing it to strictly defined places and keeping everything else immutable still gives lots of these benefits.
"Unsafe publication" is often appropriate in cases where having other threads see the latest value written to a field would be desirable, but having threads see an earlier value would be relatively harmless. A prime example is the cached hash value for
String
. The first timehashCode()
is called on aString
, it will compute a value and cache it. If another thread which callshashCode()
on the same string can see the value computed by the first thread, it won't have to recompute the hash value (thus saving time), but nothing bad will happen if the second thread doesn't see the hash value. It will simply end up performing a redundant-but-harmless computation which could have been avoided. HavinghashCode()
publish the hash value safely would have been possible, but the occasional redundant hash computations are much cheaper than the synchronization required for safe publication. Indeed, except on rather long strings, synchronization costs would probably negate any benefit from caching.Unfortunately, I don't think the creators of Java imagined situations where code would write to a field and prefer that it should be visible to other threads, but not mind too much if it isn't, and where the reference stored to the field would in turn identify another object with a similar field. This leads to situations writing semantically-correct code is much more cumbersome and likely slower than code which would be likely to work but whose semantics would not be guaranteed. I don't know any really good remedy for that in some cases other than using some gratuitous
final
fields to ensure that things get properly "published".