I'm trying to understand the Java memory model and threads. As far as I understand, each thread has a local copy of the "main" memory. So if one thread tries to change an int variable of some object, for example, it caches the int variable, and when it changes it, another thread might not see the change. But what if threads cache some object instead of an int? What do threads cache in this case? If a thread caches a reference to an object, are changes to the state of that object not visible to other threads? Why?
Thank you in advance
CPUs have multiple cache levels: L1, L2, L3. Every CPU (and usually every CPU core) has its own cache. These caches hold a small subset of main memory (RAM) for performance.
 _______________       _______________
|     CPU 1     |     |     CPU 2     |
|   _________   |     |   _________   |
|  | Level 1 |  |     |  | Level 1 |  |
|  | Cache   |  |     |  | Cache   |  |
|  |         |  |     |  |         |  |
|  |_________|  |     |  |_________|  |
|_______________|     |_______________|
    |      |              |      |
    |      |              |      |
 ___|______|______________|______|____
|                                     |
|             MAIN MEMORY             |
|_____________________________________|
Time   Command                       CPU 1 (Cache)   CPU 2 (Cache)   Main Memory
-----  ----------------------------  --------------  --------------  ------------
1      ---                           ---             ---             x = 10
2      Read x (on CPU 1)             x = 10          ---             x = 10
3      Write x <-- 20 (on CPU 1)     x = 20          ---             x = 10
4      Read x (on CPU 2)             x = 20          x = 10          x = 10
5      Put CPU 1 cache to main mem   x = 20          x = 10          x = 20
In the execution order above, the value of x seen by CPU 2 is stale: x has already been changed by CPU 1.
If the variable x is declared volatile, every write operation is reflected to main memory immediately, so other threads see the updated value.
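As a rough illustration (the class and field names here are invented for this sketch), a volatile stop flag makes a write on one thread reliably visible to another:

public class VolatileExample {
    // volatile: every write is made visible to other threads,
    // so they do not keep looping on a stale cached copy
    private static volatile boolean stopped = false;

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (!stopped) {
                // keep working until the flag is set
            }
            System.out.println("worker saw the volatile write and stopped");
        });
        worker.start();

        Thread.sleep(100);  // give the worker time to start looping
        stopped = true;     // volatile write: guaranteed visible to the worker
        worker.join();
    }
}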
A thread doesn't have a local copy of memory. Part of the memory the thread reads/writes could come from a cache instead of main memory. Caches do not need to be in sync with each other, or with main memory, and that is where you can observe inconsistencies.
So if one thread tries to change an int variable, for example of some object, it caches the int variable, and if it changes it, another thread might not see the change.
That is correct. The Java Memory Model is defined in terms of happens-before rules; e.g. there is a happens-before rule between a volatile write of field x and a subsequent volatile read of that same field x. So when a write is done, a subsequent read is guaranteed to see the value written.
Without such a happens-before relation, all bets are off (instruction reordering can also make life complicated when there is no happens-before rule).
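As a sketch of how such a happens-before edge is used (class and field names are made up here), writes made before a volatile write become visible to a thread that subsequently reads that volatile field:

public class Publication {
    static int data = 0;                   // plain (non-volatile) field
    static volatile boolean ready = false; // volatile flag used for publication

    public static void main(String[] args) {
        Thread writer = new Thread(() -> {
            data = 42;      // 1. plain write
            ready = true;   // 2. volatile write: happens-before the read below
        });

        Thread reader = new Thread(() -> {
            while (!ready) {
                // 3. volatile read; once it observes true ...
            }
            // ... 4. the earlier write to 'data' is guaranteed to be visible
            System.out.println("data = " + data);   // always prints 42
        });

        reader.start();
        writer.start();
    }
}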
If a thread caches a reference to an object, are changes to the state of the object also not visible to other threads? Why?
It could be visible, or it could not be. Without a happens-before rule, all bets are off. The reason is that otherwise a lot of optimizations, both hardware tricks to speed things up and compiler tricks, would not be allowed. And of course, always keeping the caches in sync with main memory would reduce performance.
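For example (a sketch with invented names), changes made to an object's fields through a shared reference become reliably visible only when both threads use some happens-before mechanism, such as synchronizing on the same lock:

public class SharedCounter {
    private int count = 0;   // plain field, mutated through a shared reference

    // Both methods synchronize on the same lock (this), so a write made
    // inside increment() happens-before a later read inside get().
    public synchronized void increment() {
        count++;
    }

    public synchronized int get() {
        return count;
    }
}

Without the synchronized keyword (or volatile, or some other happens-before mechanism), a thread calling get() could legally observe a stale value of count.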
CPUs have multiple caches. It is these hardware caches which might have inconsistent copies of the data. The reason they might be inconsistent is that keeping everything consistent can slow down your code by a factor of 10 and ruin any benefit you get from having multiple threads. To get decent performance you need to be selectively consistent. The Java Memory Model describes when it will ensure the data is consistent, but in the simplest case it doesn't.
Note: this is not just a CPU issue. A field which doesn't have to be consistent between threads can be inlined into the compiled code. This can mean that if one thread changes the value, another thread might NEVER see the change, because the old value has been burnt into the code.
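A minimal sketch of that effect (the class name is invented): with a plain, non-volatile flag the JIT is allowed to hoist the read out of the loop, effectively baking the old value into the compiled code, so the loop may never terminate:

public class HoistedRead {
    // NOT volatile: the JIT may read this once and never re-read it
    private static boolean running = true;

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (running) {
                // the compiled loop may test a cached/hoisted copy of
                // 'running', so this can spin forever on some JVMs
            }
            System.out.println("stopped");  // may never be reached
        });
        worker.start();

        Thread.sleep(100);
        running = false;   // without a happens-before edge the worker
                           // is not guaranteed to ever see this write
        worker.join();     // may block forever
    }
}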
"Before you can write decent multi-threaded code, however, you really need to study more on the complexities and subtleties of multi-threaded code.
When it comes to threads, very little is guaranteed.
Can you imagine the havoc that can occur when two different threads have access to a single instance of a class, an both threads invoke methods on that object... and those methods modify the state of the object? ... it's too scary to even visualize.", from Sun Certified Programmer for Java 6, chapter 9: Threads.
My friend,
In Java, threads don't cache any object or variable; they just hold a reference to an instance of an object. Talking about thread cache memory is more like talking about operating system threads. Java works the same way on every operating system, no matter how threads are managed internally, which differs a lot between operating systems.
Look at this code:
AccountDanger r = new AccountDanger();  // AccountDanger implements Runnable
Thread one = new Thread(r);
Thread two = new Thread(r);
As you can see, in this case both threads have access to the same instance, r. Then you will have synchronization problems for sure. It doesn't matter whether we talk about primitive or object members: threads one and two will have access to all members of r (if they are accessible via scope or getters/setters), and they will read the values directly from the r instance. This is true even if you don't notice it, and such problems can be really hard to spot.
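To make the danger concrete, here is a rough sketch of what such a shared Runnable might look like (the class name comes from the snippet above; the field and the logic are invented for illustration):

public class AccountDanger implements Runnable {
    private int balance = 100;   // shared state, no synchronization

    @Override
    public void run() {
        for (int i = 0; i < 10; i++) {
            if (balance >= 10) {
                // Another thread can withdraw between the check and the
                // update, so the balance can end up inconsistent (a race).
                balance = balance - 10;
            }
        }
        System.out.println(Thread.currentThread().getName()
                + " finished, balance = " + balance);
    }

    public static void main(String[] args) {
        AccountDanger r = new AccountDanger();
        new Thread(r, "one").start();
        new Thread(r, "two").start();
    }
}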
I recommend reading about Java scopes and Java synchronization if you want to code multi-threaded applications.
Regards,
Each Thread does not have a local copy of memory. If a variable is visible (because of its scope) to more than one thread, every thread will see the same value.
However, multi-threaded programs need to be careful about how they share variables (memory), because it is very easy to introduce race conditions.
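As an illustration of how easily a race condition appears (names are invented for this sketch), two threads incrementing a shared counter without synchronization will usually lose updates:

public class RaceCondition {
    static int counter = 0;   // shared, unsynchronized

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            for (int i = 0; i < 100_000; i++) {
                counter++;   // read-modify-write: not atomic, so updates race
            }
        };

        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        t1.join();
        t2.join();

        // Expected 200000, but usually prints less because of lost updates.
        System.out.println("counter = " + counter);
    }
}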