safe publication and the advantage of being immuta

2019-01-17 10:39发布

I'm re-reading Java Concurrency In Practice, and I'm not sure I fully understand the chapter about immutability and safe publication.

What the book says is:

Immutable objects can be used safely by any thread without additional synchronization, even when synchronization is not used to publish them.

What I don't understand is, why would anyone (interested in making his code correct) publish some reference unsafely?

If the object is immutable, and it's published unsafely, I understand that any other thread obtaining a reference to the object would see its correct state, because of the guarantees offered by proper immutability (with final fields, etc.).

But if the publication is unsafe, another thread might still see null or the previous reference after the publication, instead of the reference to the immutable object, which seems to me like something no-one would like.

And if safe publication is used to make sure the new reference is seen by all the threads, then even if the object is just effectively immutable (no final fields, but no way to mute them), then everything is safe again. As the book says :

Safely published effectively immutable objects can be used safely by any thread without additional synchronization.

So, why is immutability (vs. effective immutability) so important? In what case would an unsafe publication be wanted?

4条回答
姐就是有狂的资本
2楼-- · 2019-01-17 11:15

It is desirable to design objects that don't need synchronization for two reasons:

  1. The users of your objects can forget to synchronize.
  2. Even though the overhead is very little, synchronization is not free, especially if your objects are not used often and by many different threads.

Because the above reasons are very important, it is better to learn the sometimes difficult rules and as a writer, make safe objects that don't require synchronization rather than hoping all the users of your code will remember to use it correctly.

Also remember that the author is not saying the object is unsafely published, it is safely published without synchronization.

As for your second question, I just checked, and the book does not promise you that another thread will always see the reference to the updated object, just that if it does, it will see a complete object. But I can imagine that if it is published through the constructor of another (Runnable?) object, it will be sweet. That does help with explaining all cases though.

EDIT:

effectively immutable and immutable The difference between effectively immutable and immutable is that in the first case you still need to publish the objects in a safe way. For the truly immutable objects this isn't needed. So truly immutable objects are preferred because they are easier to publish for the reasons I stated above.

查看更多
狗以群分
3楼-- · 2019-01-17 11:16

I had the exact same question as the original poster when finishing reading chapters 1-3 . I think the authors could have done a better job elaborating on this a bit more.

I think the difference lies therein that the internal state of effectively immutable objects can be observed to be in an inconsistent state when they are not safely published whereas the internal state of immutable objects can never be observed to be in an inconsistent state.

However I do think the reference to an immutable object can be observed to be out of date / stale if the reference is not safely published.

查看更多
劫难
4楼-- · 2019-01-17 11:21

So, why is immutability (vs. effective immutability) so important?

I think the main point is that truly immutable objects are harder to break later on. If you've declared a field final, then it's final, period. You would have to remove the final in order to change that field, and that should ring an alarm. But if you've initially left the final out, someone could carelessly just add some code that changes the field, and boom - you're screwed - with only some added code (possibly in a subclass), no modification to existing code.

I would also assume that explicit immutability enables the (JIT) compiler to do some optimizations that would otherwise be hard or impossible to justify. For example, when using volatile fields, the runtime must guarantee a happens-before relation with writing and reading threads. In practice this may require memory barriers, disabling out-of-order execution optimizations, etc. - that is, a performance hit. But if the object is (deeply) immutable (contains only final references to other immutable objects), the requirement can be relaxed without breaking anything: the happens-before relation needs to be guaranteed only with writing and reading the one single reference, not the whole object graph.

So, explicit immutability makes the program simpler so that it's both easier for humans to reason and maintain and easier for the computer to execute optimally. These benefits grow exponentially as the object graph grows, i.e. objects contain objects that contain objects - it's all simple if everything is immutable. When mutability is needed, localizing it to strictly defined places and keeping everything else immutable still gives lots of these benefits.

查看更多
成全新的幸福
5楼-- · 2019-01-17 11:35

"Unsafe publication" is often appropriate in cases where having other threads see the latest value written to a field would be desirable, but having threads see an earlier value would be relatively harmless. A prime example is the cached hash value for String. The first time hashCode() is called on a String, it will compute a value and cache it. If another thread which calls hashCode() on the same string can see the value computed by the first thread, it won't have to recompute the hash value (thus saving time), but nothing bad will happen if the second thread doesn't see the hash value. It will simply end up performing a redundant-but-harmless computation which could have been avoided. Having hashCode() publish the hash value safely would have been possible, but the occasional redundant hash computations are much cheaper than the synchronization required for safe publication. Indeed, except on rather long strings, synchronization costs would probably negate any benefit from caching.

Unfortunately, I don't think the creators of Java imagined situations where code would write to a field and prefer that it should be visible to other threads, but not mind too much if it isn't, and where the reference stored to the field would in turn identify another object with a similar field. This leads to situations writing semantically-correct code is much more cumbersome and likely slower than code which would be likely to work but whose semantics would not be guaranteed. I don't know any really good remedy for that in some cases other than using some gratuitous final fields to ensure that things get properly "published".

查看更多
登录 后发表回答