This question regards the implementation of ThreadLocalRandom in OpenJDK version 1.8.0.
ThreadLocalRandom provides a per-thread random number generator without the synchronization overhead imposed by Random. The most obvious implementation (IMO) would be something like this, which appears to preserve backward compatibility without much complexity:
    public class ThreadLocalRandom extends Random {
        private static final ThreadLocal<ThreadLocalRandom> tl =
            ThreadLocal.withInitial(ThreadLocalRandom::new);

        public static ThreadLocalRandom current() {
            return tl.get();
        }

        // Random methods moved here without synchronization
        // stream methods here
    }

    public class Random {
        private ThreadLocalRandom delegate = new ThreadLocalRandom();

        // methods synchronize and delegate for backward compatibility
    }
However, the actual implementation is totally different and quite bizarre:
ThreadLocalRandom duplicates some of the methods in Random verbatim and others with minor modifications; surely much of this code could have been reused.
Thread stores the seed and a probe variable used to initialize the ThreadLocalRandom, violating encapsulation;
ThreadLocalRandom uses Unsafe to access the variables in Thread, which I suppose is because the two classes are in different packages yet the state variables must be private in Thread - Unsafe is only necessary because of the encapsulation violation;
ThreadLocalRandom stores its next nextGaussian in a static ThreadLocal instead of in an instance variable as Random does.
Overall my cursory inspection seems to reveal an ugly copy of Random with no advantages over the simple implementation above. But the authors of the standard library are smart, so there must be some reason for this weird approach. Does anyone have any insight into why ThreadLocalRandom was implemented this way?
The key problem is that a lot of the code is legacy and can't (easily) be changed. Random was designed to be thread-safe: every method funnels through a single shared seed, which OpenJDK 8 updates atomically (nextGaussian is additionally synchronized). This works, in that instances of Random can be used across multiple threads, but it becomes a severe bottleneck under contention, since only one thread at a time can successfully advance the seed. A simple solution would be to construct a ThreadLocal<Random> object, thereby avoiding the contention, but this still isn't ideal. There's still some overhead to the thread-safe seed update even when uncontended, and constructing n Random instances is wasteful when they're all essentially doing the same job.
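For comparison, here is a minimal sketch of the two approaches from the caller's side. The class and method names (PerThreadRandomDemo, viaThreadLocal, viaTlr) are made up for illustration; only ThreadLocal, Random and ThreadLocalRandom are real JDK classes.

```java
import java.util.Random;
import java.util.concurrent.ThreadLocalRandom;

class PerThreadRandomDemo {
    // Workaround: one full Random instance per thread. Each instance still
    // pays for Random's thread-safe seed update on every call.
    private static final ThreadLocal<Random> PER_THREAD =
            ThreadLocal.withInitial(Random::new);

    static int viaThreadLocal(int bound) {
        return PER_THREAD.get().nextInt(bound);
    }

    // ThreadLocalRandom: nothing to manage, just ask for the current thread's generator.
    static int viaTlr(int bound) {
        return ThreadLocalRandom.current().nextInt(bound);
    }

    public static void main(String[] args) {
        System.out.println(viaThreadLocal(100));
        System.out.println(viaTlr(100));
    }
}
```

Both calls return a value in [0, bound); the difference is purely in what happens behind the call.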
So at a high level ThreadLocalRandom exists as a performance optimization, hence it makes sense that its implementation would be "bizarre", as the JDK devs have put time into optimizing it.
There are many classes in the JDK that, at first glance, are "ugly". Remember, however, that the JDK authors are solving a different problem than you. The code they write will be used by thousands if not millions of developers in countless ways. They regularly have to trade off best practices for efficiency because the code they're writing is so mission-critical.
Effective Java also addresses this issue (Item 55, "Optimize judiciously", in the second edition) - the key point being that optimization should be done as a last resort, by experts. The JDK devs are those experts.
To your specific questions:
> ThreadLocalRandom duplicates some of the methods in Random verbatim and others with minor modifications; surely much of this code could have been reused.
Unfortunately no, as the methods on Random are thread-safe: they all go through an atomically updated seed, and nextGaussian is synchronized. If they were invoked, ThreadLocalRandom would pull in Random's contention and synchronization overhead. TLR needs to override every method so that it works on its own unsynchronized per-thread state instead.
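To make the cost concrete, here is a rough sketch contrasting the two update styles. It uses the 48-bit linear congruential formula documented in the Random Javadoc; the class names (SeedUpdateSketch, SharedStyle, ConfinedStyle) are hypothetical. As far as I can tell, OpenJDK 8's Random advances its seed with a compare-and-set loop along these lines, while the real ThreadLocalRandom uses different arithmetic that this sketch does not reproduce. The point is only atomic versus plain updates.

```java
import java.util.concurrent.atomic.AtomicLong;

class SeedUpdateSketch {
    // Constants of the linear congruential generator documented for java.util.Random.
    static final long MULTIPLIER = 0x5DEECE66DL;
    static final long ADDEND = 0xBL;
    static final long MASK = (1L << 48) - 1;

    /** Shared-instance style: the seed is advanced with a compare-and-set retry loop. */
    static final class SharedStyle {
        private final AtomicLong seed;

        SharedStyle(long initialSeed) {
            seed = new AtomicLong((initialSeed ^ MULTIPLIER) & MASK);
        }

        int nextInt() {
            long oldSeed, nextSeed;
            do {
                oldSeed = seed.get();
                nextSeed = (oldSeed * MULTIPLIER + ADDEND) & MASK;
            } while (!seed.compareAndSet(oldSeed, nextSeed)); // retried under contention
            return (int) (nextSeed >>> 16);
        }
    }

    /** Thread-confined style: same arithmetic, but a plain field and no atomics. */
    static final class ConfinedStyle {
        private long seed; // must only ever be touched by a single thread

        ConfinedStyle(long initialSeed) {
            seed = (initialSeed ^ MULTIPLIER) & MASK;
        }

        int nextInt() {
            seed = (seed * MULTIPLIER + ADDEND) & MASK;
            return (int) (seed >>> 16);
        }
    }
}
```

Both variants produce the same sequence for the same seed; only the second avoids the atomic operation, which is why it cannot simply inherit the first one's methods.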
> Thread stores the seed and a probe variable used to initialize the ThreadLocalRandom, violating encapsulation;
First off, it's really not "violating encapsulation" since the fields are still package-private. They're encapsulated from users, which is the goal. I wouldn't get too hung up on this, as the decisions here were made to improve performance. Sometimes performance comes at the cost of normal good design. In practice this "violation" is undetectable. The behavior is simply encapsulated inside two classes instead of a single one.
Putting the seed inside Thread allows ThreadLocalRandom to be totally stateless (aside from the initialized field, which is largely irrelevant), and therefore only a single instance ever needs to exist across the whole application.
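The idea can be sketched outside the JDK, under some loud assumptions: since user code can't add fields to Thread, the hypothetical SeededThread subclass below stands in for it, and the Generator singleton, the GAMMA increment and the mix64 function are illustrative choices rather than a copy of the JDK source.

```java
class StatelessGeneratorSketch {
    /** Hypothetical stand-in for Thread that carries the generator state as a field. */
    static final class SeededThread extends Thread {
        long seed; // only ever read and written by this thread itself

        SeededThread(long seed, Runnable body) {
            super(body);
            this.seed = seed;
        }
    }

    /** Holds no per-thread data, so one shared instance can serve every thread. */
    static final class Generator {
        static final Generator INSTANCE = new Generator();
        private static final long GAMMA = 0x9e3779b97f4a7c15L; // odd seed increment

        private Generator() {}

        long nextLong() {
            // Assumes the caller is running on a SeededThread.
            SeededThread t = (SeededThread) Thread.currentThread();
            t.seed += GAMMA; // plain write: the state is confined to this thread
            return mix64(t.seed);
        }

        // Bit mixer in the style of the MurmurHash3 64-bit finalizer.
        private static long mix64(long z) {
            z = (z ^ (z >>> 33)) * 0xff51afd7ed558ccdL;
            z = (z ^ (z >>> 33)) * 0xc4ceb9fe1a85ec53L;
            return z ^ (z >>> 33);
        }
    }

    /** Runs a thread with the given seed and returns the first n values it draws. */
    static long[] draw(long seed, int n) {
        long[] out = new long[n];
        SeededThread t = new SeededThread(seed, () -> {
            for (int i = 0; i < n; i++) {
                out[i] = Generator.INSTANCE.nextLong();
            }
        });
        t.start();
        try {
            t.join(); // join makes the worker's writes to out visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return out;
    }
}
```

Every thread calls the same Generator.INSTANCE, yet each one advances only its own seed, so no ThreadLocal lookup and no per-thread generator object is needed.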
> ThreadLocalRandom uses Unsafe to access the variables in Thread, which I suppose is because the two classes are in different packages yet the state variables must be private in Thread - Unsafe is only necessary because of the encapsulation violation;
Many JDK classes use Unsafe. It's a tool, not a sin. Again, I just wouldn't get too stressed out about this fact. The class is called Unsafe to discourage lay-developers from misusing it. We trust/hope the JDK authors are smart enough to know when it's safe to use.
> ThreadLocalRandom stores its next nextGaussian in a static ThreadLocal instead of in an instance variable as Random does.
Since there will only ever be one instance of ThreadLocalRandom there's no need for this to be an instance variable. I suppose you could alternatively make the case that there's no need for it to be static either, but at that point you're just debating style. At a minimum, making it static more clearly leaves the class essentially stateless. As mentioned in the file, this field is not really necessary, but ensures similar behavior to Random.
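A minimal sketch of that arrangement, assuming the polar method described in the Random.nextGaussian Javadoc: each pair of uniform draws yields two Gaussian values, and the second is parked in a static ThreadLocal until the next call. The class and member names (GaussianHolderSketch, SPARE, hasSpare) are hypothetical.

```java
import java.util.concurrent.ThreadLocalRandom;

class GaussianHolderSketch {
    // Static holder: each thread keeps the spare value from its last generated pair.
    private static final ThreadLocal<Double> SPARE = new ThreadLocal<>();

    static double nextGaussian() {
        Double spare = SPARE.get();
        if (spare != null) {
            SPARE.set(null); // consume the parked value
            return spare;
        }
        ThreadLocalRandom rnd = ThreadLocalRandom.current();
        double v1, v2, s;
        do {
            v1 = 2 * rnd.nextDouble() - 1; // uniform in [-1, 1)
            v2 = 2 * rnd.nextDouble() - 1;
            s = v1 * v1 + v2 * v2;
        } while (s >= 1 || s == 0);        // reject points outside the unit circle
        double multiplier = StrictMath.sqrt(-2 * StrictMath.log(s) / s);
        SPARE.set(v2 * multiplier);        // park the second value for the next call
        return v1 * multiplier;
    }

    /** True if the calling thread currently has a parked value. */
    static boolean hasSpare() {
        return SPARE.get() != null;
    }
}
```

The holder is per-thread state kept outside the generator object, which is what lets the generator itself stay a stateless singleton.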