Is there any difference in performance between this:
    synchronized void x() {
        y();
    }
    synchronized void y() {
    }
and this:
    synchronized void x() {
        y();
    }
    void y() {
    }
Yes, there is an additional performance cost, unless and until the JVM inlines the call to y(), which a modern JIT compiler will do in fairly short order.
First, consider the case you've presented, in which y() is visible outside the class. In this case, the JVM must check on entering y() to ensure that it can enter the monitor on the object; this check will always succeed when the call is coming from x(), but it can't be skipped, because the call could be coming from a client outside the class. This additional check incurs a small cost.
Additionally, consider the case in which y() is private. In this case, the compiler still does not optimize away the synchronization; see the following disassembly of an empty y():
According to the spec's definition of synchronized, each entrance into a synchronized block or method performs a lock action on the object, and leaving performs an unlock action. No other thread can acquire that object's monitor until the lock counter goes down to zero. Presumably some sort of static analysis could demonstrate that a private synchronized method is only ever called from within other synchronized methods, but Java's multi-source-file support would make that fragile at best, even ignoring reflection. This means that the JVM must still increment the counter on entering y():
@AmolSonawane correctly notes that the JVM may optimize this code at runtime by performing lock coarsening, essentially inlining the y() method. In this case, after the JVM has decided to perform a JIT optimization, calls from x() to y() will not incur any additional performance overhead, but of course calls directly to y() from any other location will still need to acquire the monitor separately.
Results of a micro benchmark run with jmh => no statistical difference.
Looking at the generated assembly shows that lock coarsening has been performed and y_sync has been inlined in x_sync although it is synchronized.
Full results:
Why not test it!? I ran a quick benchmark. The benchmark() method is called in a loop for warm-up. This may not be super accurate, but it does show a consistent, interesting pattern.
Results (last 10)
Looks like the second variation is indeed slightly faster.
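For anyone who wants to repeat this kind of quick test, here is a minimal sketch of a hand-rolled harness. It is not the benchmark used above: the class name SyncBenchmark, the method names, and the iteration counts are all made up for illustration, and each y variant increments a counter so there is some work to measure, unlike the empty methods in the question. Timings from a loop like this are noisy, so treat the output as a rough indication only; a harness such as jmh is the more reliable option.

```java
// Hypothetical harness: names and iteration counts are illustrative only.
class SyncBenchmark {
    private long counter;

    // Variant 1: both methods synchronized, so x re-enters the monitor in y.
    synchronized void xSyncBoth() { ySync(); }
    synchronized void ySync() { counter++; }

    // Variant 2: only the outer method is synchronized.
    synchronized void xSyncOuter() { yPlain(); }
    void yPlain() { counter++; }

    long getCounter() { return counter; }

    // Runs the given variant `iterations` times and returns elapsed nanoseconds.
    static long time(Runnable variant, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            variant.run();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        SyncBenchmark b = new SyncBenchmark();
        int iterations = 1_000_000;
        // Warm-up rounds give the JIT a chance to compile and inline before measuring.
        for (int round = 0; round < 20; round++) {
            time(b::xSyncBoth, iterations);
            time(b::xSyncOuter, iterations);
        }
        // Measured rounds: print both variants side by side.
        for (int round = 0; round < 10; round++) {
            long both = time(b::xSyncBoth, iterations);
            long outer = time(b::xSyncOuter, iterations);
            System.out.printf("both synchronized: %d ns, outer only: %d ns%n", both, outer);
        }
    }
}
```

Swapping the order of the two variants inside each round is a cheap way to check whether any gap you see is just an ordering effect.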
The test can be found below (you have to guess what some methods do, but it's nothing complicated):
It tests them with 100 threads each and starts counting the averages after 70% of them have completed (as warm-up).
It prints the result once at the end.
MovingAverage.Cumulative add is basically (performed atomically): average = (average * (n) + number) / (++n);
MovingAverage.Converging you can look up, but it uses another formula.
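The cumulative formula above can be sketched roughly like this. CumulativeAverage is a hypothetical stand-in, not the actual MovingAverage.Cumulative class, and using synchronized to make the update atomic is an assumption on my part; the real class may do it differently.

```java
// Hypothetical stand-in for MovingAverage.Cumulative; the original class is not shown.
class CumulativeAverage {
    private double average;
    private long n;

    // synchronized (an assumption) makes the read-modify-write of average and n atomic.
    synchronized void add(double number) {
        // Same as: average = (average * (n) + number) / (++n);
        average = (average * n + number) / (n + 1);
        n++;
    }

    synchronized double get() {
        return average;
    }

    public static void main(String[] args) {
        CumulativeAverage avg = new CumulativeAverage();
        avg.add(10);
        avg.add(20);
        avg.add(30);
        System.out.println(avg.get()); // prints 20.0
    }
}
```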
The results after a 50 second warmup:
With: jiterations -> 1000000
Those are averages in nanoseconds. That's really nothing, and it even shows that the synced one takes less time.
With: jiterations -> original * 10 (takes much longer time)
As you can see, the results show it's really not a big difference. The synced one actually has a lower average time for the last 30% of completions.
With one thread each (iterations = 1) and jiterations = original * 100;
In a single-thread environment (removing the Threads.async calls)
With: jiterations -> original * 10
The synced one seems to be slower here, by roughly a factor of 10. The reason could be that the synced one always runs second, who knows. No energy to try the reverse order.
Last test run with:
Conclusion: there really is no difference.
In the case where both methods are synchronized, you would be locking the monitor twice, so the first approach has the additional overhead of the second lock. But the JVM can reduce the cost of locking through lock coarsening and may inline the call to y().
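To make "locking the monitor twice" visible, here is a small sketch that uses java.util.concurrent.locks.ReentrantLock as an analogy. An object's built-in monitor does not expose its counter, so the explicit lock stands in for it here; HoldCountDemo and its methods are made-up names for illustration.

```java
import java.util.concurrent.locks.ReentrantLock;

// Illustrative analogy only: ReentrantLock stands in for the object's intrinsic monitor.
class HoldCountDemo {
    private final ReentrantLock lock = new ReentrantLock();

    // Mirrors `synchronized void x()`: acquire, call y(), release.
    int x() {
        lock.lock();
        try {
            return y();
        } finally {
            lock.unlock();
        }
    }

    // Mirrors `synchronized void y()`: a further acquisition by the calling thread.
    int y() {
        lock.lock();
        try {
            // Number of holds the current thread has on the lock at this point.
            return lock.getHoldCount();
        } finally {
            lock.unlock();
        }
    }

    int holdCountNow() {
        return lock.getHoldCount();
    }

    public static void main(String[] args) {
        HoldCountDemo demo = new HoldCountDemo();
        System.out.println(demo.x());            // prints 2: held by x() and again by y()
        System.out.println(demo.y());            // prints 1: a direct call acquires once
        System.out.println(demo.holdCountNow()); // prints 0: everything has been released
    }
}
```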
There will be no difference, since threads contend for the lock only at x(). A thread that has acquired the lock in x() can acquire it again in y() without any contention, because it is the only thread that can reach that point at any given time. So putting synchronized on y() has no effect.
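A quick way to see that the thread already owns the monitor by the time it gets to y() is Thread.holdsLock. The sketch below uses a hypothetical HoldsLockDemo class with the second variation from the question (only x() synchronized). Note that the "no effect" argument applies to calls that arrive through x(); the last line shows what a direct call to an unsynchronized y() looks like.

```java
// Hypothetical demo class; only x() is synchronized, as in the second variation.
class HoldsLockDemo {
    // Takes the monitor on `this`, then calls y().
    synchronized boolean x() {
        return y();
    }

    // Not synchronized: reports whether the calling thread owns the monitor on `this`.
    boolean y() {
        return Thread.holdsLock(this);
    }

    public static void main(String[] args) {
        HoldsLockDemo demo = new HoldsLockDemo();
        System.out.println(demo.x()); // prints true: the monitor is already held inside y()
        System.out.println(demo.y()); // prints false: a direct call holds no monitor
    }
}
```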