I have a snippet of code that change a counter in two threads. It's not thread safe because I didn't put any atomic variable or lock in the code. It gives the right result as I expected if the code only run once, but I want to run it for several times, so I put the code into a for loop. And the question is that only the first or the first two loops will generate the result I expect. For the rest of the loops, the results are always 0, which seems to be thread safe. Is there any inner operator in Java Virtual Machine resulting such thing?
I have tried change the number of loops, and the first one or two are always what I expect, but the others are 0 no matter how many loops there are.
Counter:
private static class Counter {
private int count;
public void increase() {
count++;
}
public void decrease() {
count--;
}
public int getCount() {
return count;
}
}
Person:
// This is just a thread to increase and decrease the counter for many times.
private static class Person extends Thread {
private Counter c;
public Person(Counter c) {
this.c = c;
}
@Override
public void run() {
for (int i = 0; i < 100000; i++) {
c.increase();
c.decrease();
}
}
}
Main method:
public static void main(String[] args) throws InterruptedException {
for (int i = 0; i < 10; i++) {
Counter c = new Counter();
Person p1 = new Person(c);
Person p2 = new Person(c);
p1.start();
p2.start();
p1.join();
p2.join();
System.out.println("run "+i+": "+c.getCount());
}
}
Output:
run 0: 243
run 1: 12
run 2: 0
run 3: 0
run 4: 0
run 5: 0
run 6: 0
run 7: 0
run 8: 0
run 9: 0
I don't know why the rest of the results are always 0. But I guess it's about the optimization of JVM. Is it right that the JVM optimizes the code when some loops have been done, and it omits the rest loops and always gives 0 as answer?
I think the JVM is optimizing here like you said.
I added some outputs with timings to your question, which clearly show, that optimization happens there.
public static void main(String[] args) throws InterruptedException {
for (int i = 0; i < 10; i++) {
final long startTime = System.currentTimeMillis();
Counter c = new Counter();
Person p1 = new Person(c);
Person p2 = new Person(c);
p1.start();
p2.start();
p1.join();
p2.join();
final long endTime = System.currentTimeMillis();
System.out.println(String.format("run %s: %s (%s ms)", i, c.getCount(), endTime - startTime));
}
}
Results:
run 0: 1107 (8 ms)
run 1: 1 (1 ms)
run 2: 0 (2 ms)
run 3: 0 (0 ms)
run 4: 0 (0 ms)
run 5: 0 (0 ms)
run 6: 0 (1 ms)
run 7: 0 (0 ms)
run 8: 0 (0 ms)
run 9: 0 (0 ms)
The first iterations the program needs a lot of time, wheras in later execution nearly no time at all is used.
Seems to be legit to suspect optimazation for this behaviour.
Using a volatile int count
:
run 0: 8680 (15 ms)
run 1: 6943 (12 ms)
run 2: 446 (7 ms)
run 3: -398 (7 ms)
run 4: 431 (8 ms)
run 5: -5489 (6 ms)
run 6: 237 (7 ms)
run 7: 122 (7 ms)
run 8: -87 (7 ms)
run 9: 112 (7 ms)
This took a surprising turn.
The first thing that one can say (relatively sure) is that the effect is caused by the JIT. I combined the code snippets into this MCVE:
public class CounterJitTest
{
private static class Counter
{
private int count;
public void increase()
{
count++;
}
public void decrease()
{
count--;
}
public int getCount()
{
return count;
}
}
private static class Person extends Thread
{
private Counter c;
public Person(Counter c)
{
this.c = c;
}
@Override
public void run()
{
for (int i = 0; i < 1000000; i++)
{
c.increase();
c.decrease();
}
}
}
public static void main(String[] args) throws InterruptedException
{
for (int i = 0; i < 10; i++)
{
Counter c = new Counter();
Person p1 = new Person(c);
Person p2 = new Person(c);
p1.start();
p2.start();
p1.join();
p2.join();
System.out.println("run " + i + ": " + c.getCount());
}
}
}
Running it with
java CounterJitTest
causes the output that was mentioned in the question:
run 0: 6703
run 1: 178
run 2: 1716
run 3: 0
run 4: 0
run 5: 0
run 6: 0
run 7: 0
run 8: 0
run 9: 0
Turning off the JIT with -Xint
(interpreted mode), that is, starting it as
java -Xint CounterJitTest
causes the following results:
run 0: 38735
run 1: 53174
run 2: 86770
run 3: 27244
run 4: 61885
run 5: 1746
run 6: 32458
run 7: 52864
run 8: 75978
run 9: 22824
In order to dive deeper into what the JIT actually does, I started the whole thing in a HotSpot disassembler VM, to have a look at the generated assembly. However, the execution time was so fast that I thought: Well, I'll just increase the counter in the for
-loop:
for (int i = 0; i < 1000000; i++)
But even increasing it to 100000000
caused the program to finish immediately. That already raised a suspicion. After generating the disassembly with
java -server -XX:+UnlockDiagnosticVMOptions -XX:+TraceClassLoading -XX:+LogCompilation -XX:+PrintAssembly -XX:+PrintInlining CounterJitTest
I looked at the compiled versions of the increase
and decrease
methods, but didn't find anything obvious. However, the run
method seemed to be the culprit here. Initially, the assembly of the run
method contained the expected code (only posting the most relevant parts here) :
Decoding compiled method 0x0000000002b32fd0:
Code:
[Entry Point]
[Constants]
# {method} {0x00000000246d0f00} 'run' '()V' in 'CounterJitTest$Person'
...
[Verified Entry Point]
...
0x0000000002b33198: je 0x0000000002b33338 ;*iconst_0
; - CounterJitTest$Person::run@0 (line 35)
0x0000000002b3319e: mov $0x0,%esi
0x0000000002b331a3: jmpq 0x0000000002b332bc ;*iload_1
; - CounterJitTest$Person::run@2 (line 35)
0x0000000002b331a8: mov 0x178(%rdx),%edi ; implicit exception: dispatches to 0x0000000002b3334f
0x0000000002b331ae: shl $0x3,%rdi ;*getfield c
; - CounterJitTest$Person::run@9 (line 37)
0x0000000002b331b2: cmp (%rdi),%rax ;*invokevirtual increase
; - CounterJitTest$Person::run@12 (line 37)
; implicit exception: dispatches to 0x0000000002b33354
...
0x0000000002b33207: je 0x0000000002b33359
0x0000000002b3320d: mov 0xc(%rdi),%ebx ;*getfield count
; - CounterJitTest$Counter::increase@2 (line 9)
; - CounterJitTest$Person::run@12 (line 37)
0x0000000002b33210: inc %ebx
0x0000000002b33212: mov %ebx,0xc(%rdi) ;*putfield count
; - CounterJitTest$Counter::increase@7 (line 9)
; - CounterJitTest$Person::run@12 (line 37)
...
0x0000000002b3326f: mov %ebx,0xc(%rdi) ;*putfield count
; - CounterJitTest$Counter::decrease@7 (line 14)
; - CounterJitTest$Person::run@19 (line 38)
...
I don't deeply "understand" this, admittedly, but one can see that it does a getfield c
, and some invocations of the (partially inlined?) increase
and decrease
methods.
However, the final compiled version of the run
method is this:
Decoding compiled method 0x0000000002b34590:
Code:
[Entry Point]
[Constants]
# {method} {0x00000000246d0f00} 'run' '()V' in 'CounterJitTest$Person'
# [sp+0x20] (sp of caller)
0x0000000002b346c0: mov 0x8(%rdx),%r10d
0x0000000002b346c4:
<writer thread='2060'/>
[Loaded java.lang.Shutdown from C:\Program Files\Java\jre1.8.0_131\lib\rt.jar]
<writer thread='5944'/>
shl $0x3,%r10
0x0000000002b346c8: cmp %r10,%rax
0x0000000002b346cb: jne 0x0000000002a65f60 ; {runtime_call}
0x0000000002b346d1: data32 xchg %ax,%ax
0x0000000002b346d4: nopw 0x0(%rax,%rax,1)
0x0000000002b346da: nopw 0x0(%rax,%rax,1)
[Verified Entry Point]
0x0000000002b346e0: mov %eax,-0x6000(%rsp)
0x0000000002b346e7: push %rbp
0x0000000002b346e8: sub $0x10,%rsp ;*synchronization entry
; - CounterJitTest$Person::run@-1 (line 35)
0x0000000002b346ec: cmp 0x178(%rdx),%r12d
0x0000000002b346f3: je 0x0000000002b34701
0x0000000002b346f5: add $0x10,%rsp
0x0000000002b346f9: pop %rbp
0x0000000002b346fa: test %eax,-0x1a24700(%rip) # 0x0000000001110000
; {poll_return}
0x0000000002b34700: retq
0x0000000002b34701: mov %rdx,%rbp
0x0000000002b34704: mov $0xffffff86,%edx
0x0000000002b34709: xchg %ax,%ax
0x0000000002b3470b: callq 0x0000000002a657a0 ; OopMap{rbp=Oop off=80}
;*aload_0
; - CounterJitTest$Person::run@8 (line 37)
; {runtime_call}
0x0000000002b34710: int3 ;*aload_0
; - CounterJitTest$Person::run@8 (line 37)
0x0000000002b34711: hlt
0x0000000002b34712: hlt
0x0000000002b34713: hlt
0x0000000002b34714: hlt
0x0000000002b34715: hlt
0x0000000002b34716: hlt
0x0000000002b34717: hlt
0x0000000002b34718: hlt
0x0000000002b34719: hlt
0x0000000002b3471a: hlt
0x0000000002b3471b: hlt
0x0000000002b3471c: hlt
0x0000000002b3471d: hlt
0x0000000002b3471e: hlt
0x0000000002b3471f: hlt
[Exception Handler]
[Stub Code]
0x0000000002b34720: jmpq 0x0000000002a8c9e0 ; {no_reloc}
[Deopt Handler Code]
0x0000000002b34725: callq 0x0000000002b3472a
0x0000000002b3472a: subq $0x5,(%rsp)
0x0000000002b3472f: jmpq 0x0000000002a67200 ; {runtime_call}
0x0000000002b34734: hlt
0x0000000002b34735: hlt
0x0000000002b34736: hlt
0x0000000002b34737: hlt
This is the complete assembly of the method! And it does ... well, basically nothing.
To confirm my suspicion, I explicitly disabled the inlining of the increase
method, by starting with
java -XX:CompileCommand=dontinline,CounterJitTest$Counter.increase CounterJitTest
And the output was again the expected one:
run 0: 3497
run 1: -71826
run 2: -22080
run 3: -20893
run 4: -17
run 5: -87781
run 6: -11
run 7: -380
run 8: -43354
run 9: -29719
So my conclusion is:
The JIT inlines the increase
and decrease
methods. They only increment and decrement the same value. And after inlining, the JIT is smart enough to figure out that the sequence of calls to
c.increase();
c.decrease();
is essentially a no-op, and hence, just does exactly that: Nothing.
You can't be sure that a multithread code incrementing and decrementing a variable will always give 0 as result.
TO be sure you can:
- Synchronize access to the
Counter
object
- Use inside the
Counter
object an AtomicInteger
Infact the code count++
or count--
is not thread safe.
Internally it is equivalent to something similar to the following:
load count - load count from ram to the registry
increment count - increment by 1
store count - save from the registry to ram
But this code can have this behaviour if called by two threads
first second ram
---------- -------- ------
count = 0
load count
load count
(here count in registry == 0) (here count in the second registry == 0)
increment count
increment count
(here count in registry == 1) (here count in the second registry == 1)
store count
store count
count == 1
Knowing that you can't assume nothing on the real behaviour of this not synchronized code.
It depends from many factor, for example:
- number of processors
- speed of execution of the increment and decrement code
- kind of processors (the behaviour can be different for a I7 machine and for an Atom processor)
- JVM Implementation (you can have different behaviours for Open JDK or Oracle JVM)
- Load of the CPU
- Absence or presence of execution of the GC process
You know that this code is thread unsafe. You can't try to predict any behaviour on that code that is reproducible on other pc or using a different configurations or also in the same machine with the same configuration because you can't control what happens outside the JVM (load of the CPU by other applications).
Additional note: microbenchmarks have a side effect related to the fact that some of the resources are not yet loaded. In your code the race condition can be more frequent on the first iterations because classes Counter
and Person
are not yet loaded (note that also the execution time for the first iteration is much longer than the others).