Mutexes are pretty common in many programming languages, e.g. C/C++. I miss them in Java. However, there are multiple ways I could write my own class Mutex:
- Using a simple synchronized keyword on Mutex.
- Using a binary semaphore.
- Using atomic variables, as discussed here.
- ...?
What is the fastest (best runtime) way? I think synchronized is most common, but what about performance?
Mutexes are pretty common in many programming languages, like e.g. C/C++. I miss them in Java.
Not sure I follow you (especially because you give the answer in your question).
public class SomeClass {
    private final Object mutex = new Object();

    public void someMethodThatNeedsAMutex() {
        synchronized (mutex) {
            // here you hold the mutex
        }
    }
}
Alternatively, you can simply make the whole method synchronized, which is equivalent to using this as the mutex object:
public class SomeClass {
    public synchronized void someMethodThatNeedsAMutex() {
        // here you hold the mutex
    }
}
What is the fastest (best runtime) way?
Acquiring / releasing a monitor is not going to be a significant performance issue per se (you can read this blog post to see an analysis of the impact). But if you have many threads fighting for the lock, it will create contention and degrade performance.
In that case, the best strategy is to not use mutexes at all, either by using "lock-free" algorithms if you are mostly reading data (as pointed out by Marko in the comments, lock-free uses CAS operations, which may involve retrying writes many times if you have lots of writing threads, eventually leading to worse performance) or, even better, by avoiding sharing too much state across threads.
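To make the "lock-free with CAS" idea concrete, here is a minimal sketch (not code from the answer; the class name is invented for illustration) of a counter updated with compareAndSet instead of a mutex:

import java.util.concurrent.atomic.AtomicLong;

public class LockFreeCounter {
    private final AtomicLong value = new AtomicLong();

    // Lock-free increment: read the current value and retry the CAS
    // until no other thread has changed it in between.
    public long increment() {
        long current;
        do {
            current = value.get();
        } while (!value.compareAndSet(current, current + 1));
        return current + 1;
    }

    // Reads never block and never wait for a writer.
    public long get() {
        return value.get();
    }
}

Under heavy write contention the do/while loop is exactly where the retries mentioned above happen.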
The opposite is the case: the Java designers solved it so well that you don't even recognize it. You don't need a first-class Mutex object, just the synchronized modifier.
If you have a special case where you want to juggle your mutexes in a non-nesting fashion, there's always ReentrantLock, and java.util.concurrent offers a cornucopia of synchronization tools that go way beyond the crude mutex.
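For example, a minimal sketch (class and method names invented here) of non-nested locking that a synchronized block cannot express, because acquire and release happen in different methods:

import java.util.concurrent.locks.ReentrantLock;

public class Resource {
    private final ReentrantLock lock = new ReentrantLock();

    // Acquire the lock in one method ...
    public void beginUpdate() {
        lock.lock();
    }

    // ... and release it in another method; a synchronized block must
    // always release at the end of the same block.
    public void endUpdate() {
        lock.unlock();
    }
}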
In Java, each object can be used as a mutex. These objects are typically named "lock" or "mutex". You can create that object yourself, which is the preferred variant because it avoids external access to that lock:
// usually a field in the class
private final Object mutex = new Object();

// later, in methods
synchronized (mutex) {
    // mutually exclusive section for all code that synchronizes
    // on this mutex object
}
It is faster to avoid the mutex altogether, by thinking about what happens if another thread reads a stale value. In some situations this would produce wrong calculation results; in others it only causes a minimal delay (but is faster than synchronizing).
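A rough sketch of that tradeoff (class and field names invented here): an informational hit counter where a stale read or a lost increment only makes the number slightly inaccurate. Note that without synchronization or volatile the Java memory model gives no visibility guarantees, so this is only acceptable when approximate values are genuinely good enough:

public class HitStats {
    // No synchronization: other threads may read a slightly stale count,
    // and concurrent increments may occasionally be lost.
    private int roughHitCount;

    public void recordHit() {
        roughHitCount++;       // not atomic; a lost update just undercounts
    }

    public int approximateHits() {
        return roughHitCount;  // may lag behind the most recent writes
    }
}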
A detailed explanation can be found in the book Java Concurrency in Practice.
What is the fastest (best runtime) way?
That depends on many things. For example, ReentrantLock used to perform better under contention than synchronized, but that changed when a new HotSpot version, optimizing synchronized locking, was released. So there's nothing inherent in any way of locking that favors one flavor of mutexes over the other (from a performance point of view); in fact, the "best" solution can change with the data you're processing and the machine you're running on.
Also, why did the inventors of Java not solve this question for me?
They did, in several ways: synchronized, Locks, atomic variables, and a whole slew of other utilities in java.util.concurrent.
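To illustrate those options side by side, here is a minimal sketch (not from the answer; names invented) of the same guarded increment written with an intrinsic lock, an explicit Lock, and an atomic variable:

import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public class Counters {
    // 1. intrinsic lock via synchronized
    private int a;
    public synchronized void incA() {
        a++;
    }

    // 2. explicit lock from java.util.concurrent.locks
    private final ReentrantLock lock = new ReentrantLock();
    private int b;
    public void incB() {
        lock.lock();
        try {
            b++;
        } finally {
            lock.unlock();
        }
    }

    // 3. atomic variable, no lock at all
    private final AtomicInteger c = new AtomicInteger();
    public void incC() {
        c.incrementAndGet();
    }
}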
You can run micro benchmarks of each variant, like atomic, synchronized, locked. As others have pointed out, it depends a lot on the machine and number of threads in use. In my own experiments incrementing long integers, I found that with only one thread on a Xeon W3520, synchronized wins over atomic: Atomic/Sync/Lock: 8.4/6.2/21.8, in nanos per increment operation.
This is of course a borderline case, since there is never any contention. In that case we can also look at an unsynchronized single-threaded long increment, which comes out six times faster than atomic.
With 4 threads I get 21.8/40.2/57.3. Note that these are all increments across all threads, so we actually see a slowdown. It gets a bit better for locks with 64 threads: 22.2/45.1/45.9.
Another test on a 4-way/64T machine using Xeon E7-4820 yields for 1 thread: 9.1/7.8/29.1, 4 threads: 18.2/29.1/55.2 and 64 Threads: 53.7/402/420.
One more data point, this time a dual Xeon X5560, 1T: 6.6/5.8/17.8, 4T: 29.7/81.5/121, 64T: 31.2/73.4/71.6.
So, on a multi-socket machine, there is a heavy cache coherency tax.
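The answer does not include its benchmark code, but a very rough sketch of such a micro-benchmark might look like the following (the atomic variant is shown; thread and iteration counts are invented, and for serious measurements a harness such as JMH is preferable, since naive loops are easily distorted by JIT warm-up and dead-code elimination):

import java.util.concurrent.atomic.AtomicLong;

public class IncrementBench {
    private static final AtomicLong counter = new AtomicLong();
    private static final int THREADS = 4;
    private static final long OPS_PER_THREAD = 10_000_000L;

    public static void main(String[] args) throws InterruptedException {
        Thread[] threads = new Thread[THREADS];
        long start = System.nanoTime();
        for (int i = 0; i < THREADS; i++) {
            threads[i] = new Thread(() -> {
                for (long n = 0; n < OPS_PER_THREAD; n++) {
                    // swap this line for a synchronized or Lock-guarded
                    // increment to compare the three variants
                    counter.incrementAndGet();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        long elapsed = System.nanoTime() - start;
        System.out.printf("%.1f ns per increment%n",
                (double) elapsed / (THREADS * OPS_PER_THREAD));
    }
}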
You can use java.util.concurrent.locks.Lock in the same way as a mutex, or java.util.concurrent.Semaphore. But using the synchronized keyword is a better way :-)
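As a hedged sketch of the semaphore option (class name invented), a Semaphore with a single permit behaves like a non-reentrant mutex:

import java.util.concurrent.Semaphore;

public class SemaphoreMutex {
    // One permit: at most one thread can be inside the critical section.
    private final Semaphore mutex = new Semaphore(1);

    public void criticalSection() throws InterruptedException {
        mutex.acquire();
        try {
            // exclusive work goes here
        } finally {
            mutex.release();
        }
    }
}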