I really thought that after 200 or more Tomcat installs on various platforms I was ready for any kind of challenge, but this one is tricky.
I created a vanilla Ubuntu 14.04 image and installed Java 8 from the Oracle TGZ on it. I then added Tomcat 8 and started the vanilla server install.
Tomcat soon hung while deploying the default apps it ships with, so I took some thread dumps to see what was happening. This was the thread that prevented Tomcat from starting:
"localhost-startStop-1" #15 daemon prio=5 os_prio=0 tid=0x00007f37c8004800 nid=0x4d6 runnable [0x00007f37b38b3000]
java.lang.Thread.State: RUNNABLE
at java.io.FileInputStream.readBytes(Native Method)
at java.io.FileInputStream.read(FileInputStream.java:246)
at sun.security.provider.SeedGenerator$URLSeedGenerator.getSeedBytes(SeedGenerator.java:539)
at sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:144)
at sun.security.provider.SecureRandom$SeederHolder.<clinit>(SecureRandom.java:192)
at sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:210)
- locked <0x00000000f06e6ce8> (a sun.security.provider.SecureRandom)
at java.security.SecureRandom.nextBytes(SecureRandom.java:457)
- locked <0x00000000f06e71c0> (a java.security.SecureRandom)
at java.security.SecureRandom.next(SecureRandom.java:480)
at java.util.Random.nextInt(Random.java:329)
at org.apache.catalina.util.SessionIdGeneratorBase.createSecureRandom(SessionIdGeneratorBase.java:234)
After more Googling I discovered that the SeedGenerator shipped with the JDK is the source of my problem. Interestingly, sometimes the SeedGenerator came back after several minutes and sometimes it just hung (running out of entropy? I checked via cat /proc/sys/kernel/random/entropy_avail). After more research I found that a config variable called securerandom.source in $JAVA_HOME/lib/security/java.security defines the source of randomness. In my case, or rather in the Oracle JDK 8 install for Linux, it was /dev/random. I am not a Linux expert (I am a Java developer), but what I understood is that /dev/random can run out of entropy (whatever that means; perhaps that at some point it can't generate any more random numbers). I switched to /dev/urandom and everything was fine with my Tomcat.
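To see which seed source a given JVM is actually configured with, without opening java.security by hand, something like the following sketch may help. The class and method names are made up for illustration. It assumes the java.security.egd system property, when set, takes precedence over securerandom.source, which matches the Sun provider as far as I know.

```java
import java.security.Security;

public class SeedSourceCheck {

    /**
     * Returns the configured seed source: the java.security.egd system
     * property if it is set and non-empty, otherwise the securerandom.source
     * entry from the java.security file (may be null if neither is set).
     */
    static String configuredSeedSource() {
        String egd = System.getProperty("java.security.egd");
        if (egd != null && !egd.isEmpty()) {
            return egd;
        }
        return Security.getProperty("securerandom.source");
    }

    public static void main(String[] args) {
        // Print both raw values and the one assumed to win.
        System.out.println("java.security.egd   = " + System.getProperty("java.security.egd"));
        System.out.println("securerandom.source = " + Security.getProperty("securerandom.source"));
        System.out.println("configured source   = " + configuredSeedSource());
    }
}
```

Running it with the same java binary that starts Tomcat shows what that particular install uses.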
Then I checked what the JDK installs look like on my various other servers, which are a wild mix of OpenJDK and older Oracle JDK installs. At least OpenJDK always used /dev/urandom, which might explain why I never had the problem before.
Now to my question: Is it sane of Oracle to rely on /dev/random when there can be corner cases where the OS can't produce any more numbers? I mean, servers like Tomcat and many others rely on the JDK's SeedGenerator, and debugging this kind of error is really advanced. It took me 2 hours to get to the point where I am now.
I think the answer lies in this link for WebLogic support: https://docs.oracle.com/cd/E13209_01/wlcp/wlss30/configwlss/jvmrand.html, where they mention that "random" is more secure, and also in the Oracle bug comment (already mentioned by David): http://bugs.java.com/view_bug.do?bug_id=4705093, with particular regard to this part:
Because SHA1PRNG is a MessageDigest-based PRNG, it historically has always used /dev/random for initial seeding if seed data has not been provided by the application. Since all future values depend on the existing state of the MessageDigest, it's important to start with a strong initial seed.
Changing that behavior was troubling to the original developer. So he did created a new SecureRandom impl called NativePRNG, which does respect the java.security.egd value.
If you call:
new SecureRandom() on Linux and the default values are used, it will read from /dev/urandom and not block. (By default on Solaris, the PKCS11 SecureRandom is used, and also calls into /dev/urandom.)
SecureRandom.getInstance("SHA1PRNG") and do not specify a seed, OR new SecureRandom() but have specified an alternate java.security.egd besides "file:/dev/urandom", it will use the SHA1PRNG which calls into /dev/random and may potentially block.
SecureRandom.getInstance("NativePRNG"), it will depend on what java.security.egd is pointing to.
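As a rough illustration of the three cases in the quoted comment, the sketch below only asks each variant which algorithm it resolved to. The class and helper names are hypothetical. It deliberately never draws random bytes, so it should not trigger seeding or block. NativePRNG is not available on every platform, which is why the lookup is wrapped.

```java
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

public class SecureRandomVariants {

    /** Returns the resolved algorithm name, or "unavailable" if the provider lacks it. */
    static String algorithmOrUnavailable(String name) {
        try {
            return SecureRandom.getInstance(name).getAlgorithm();
        } catch (NoSuchAlgorithmException e) {
            return "unavailable";
        }
    }

    public static void main(String[] args) {
        // Case 1: default constructor; the result depends on platform and java.security.egd.
        System.out.println("new SecureRandom() -> " + new SecureRandom().getAlgorithm());

        // Case 2: explicit SHA1PRNG; per the comment, it seeds itself on first use
        // if the application supplied no seed, and that seeding may block.
        System.out.println("SHA1PRNG           -> " + algorithmOrUnavailable("SHA1PRNG"));

        // Case 3: explicit NativePRNG; per the comment, it follows java.security.egd.
        System.out.println("NativePRNG         -> " + algorithmOrUnavailable("NativePRNG"));
    }
}
```

Comparing the first line of output with and without -Djava.security.egd=file:/dev/urandom is a quick way to check which implementation a default new SecureRandom() ends up with on a given JDK.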
what I understood is that /dev/random can run out of entropy (whatever this means) but perhaps it means at some point it cant generate any more random numbers).
It can temporarily run out of entropy and block until it gathers enough to dispense more. The JVM only needs a little to seed a SecureRandom instance.
How long it takes depends on how noisy your system is and how the kernel gathers entropy.
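If you want to watch this from inside the JVM, the kernel's entropy estimate that the question checked with cat can also be read from Java. This is a minimal sketch with made-up class and method names. It assumes a Linux host that exposes /proc/sys/kernel/random/entropy_avail and returns an empty result anywhere else.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.OptionalInt;

public class EntropyCheck {

    // Linux-specific path, taken from the question.
    private static final String ENTROPY_FILE = "/proc/sys/kernel/random/entropy_avail";

    /** Parses the file content (a decimal number plus newline); empty if malformed. */
    static OptionalInt parseEntropy(String raw) {
        try {
            return OptionalInt.of(Integer.parseInt(raw.trim()));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    /** Reads the kernel's current entropy estimate; empty if the file is missing or unreadable. */
    static OptionalInt availableEntropy() {
        try {
            byte[] bytes = Files.readAllBytes(Paths.get(ENTROPY_FILE));
            return parseEntropy(new String(bytes, StandardCharsets.US_ASCII));
        } catch (IOException | SecurityException e) {
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        OptionalInt entropy = availableEntropy();
        System.out.println(entropy.isPresent()
                ? "entropy_avail = " + entropy.getAsInt()
                : "entropy estimate not available on this system");
    }
}
```

Logging this value just before the first SecureRandom use can help confirm whether a startup hang lines up with a depleted pool.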
Is it sane from Oracle to rely on /dev/random when there can be corner cases where the OS cant produce any more numbers?
The lack of entropy can be problematic on embedded systems or in VMs on first boot: they come with a very deterministic image, have few entropy sources that can be harvested compared to real PCs, and have no RDRAND instruction or similar for the kernel to use when initializing the entropy pool.
Insufficient randomness can be catastrophic for key generation and other cryptographic algorithms, e.g. DSA is quite sensitive to the quality of your entropy source.
So yes, it is quite sane to wait rather than end up with a compromised system.
To quote from Mining Your Ps and Qs: Detection of Widespread Weak Keys in Network Devices by N.Heninger et al.
To understand why these problems are occurring, we manually investigated hundreds of the vulnerable hosts, which were representative of the most commonly repeated keys as well as each of the private keys we obtained (Section 3.2). Nearly all served information identifying them as headless or embedded systems, including routers, server management cards, firewalls, and other network devices. Such devices typically generate keys automatically on first boot, and may have limited entropy sources compared to traditional PCs.
Furthermore, when we examined clusters of hosts that shared a key or factor, in nearly all cases these appeared to be linked by a manufacturer or device model. These observations lead us to conclude that the problems are caused by specific defective implementations that generate keys without having collected sufficient entropy.
If you have an external source of entropy and root privileges, you can push additional entropy into the pool and increment its counter (I think rngd can do this for you). Just writing to /dev/random will add your entropy to the pool but not increase the counter.
For VMs there also are virtualization drivers to get entropy from the host.
Pointing the JVM to a hardware RNG device (/dev/hwrng, /dev/misc/hw_random or something like that) may also be an option, if one is available.
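A small helper can check which of those devices exists and print the matching JVM flag. This is only a sketch. The class and method names are hypothetical, the device paths are the ones guessed above, and falling back to /dev/urandom is my assumption (it is what the question ended up using). Hardware RNG devices are often readable by root only, so the check may fail for an unprivileged user even when the device is present.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class SeedDevicePicker {

    /** Returns the first readable candidate device, or the fallback if none is readable. */
    static String pickDevice(List<String> candidates, String fallback) {
        for (String candidate : candidates) {
            if (Files.isReadable(Paths.get(candidate))) {
                return candidate;
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        // Candidate paths are guesses; adjust them for your distribution.
        List<String> hardwareDevices = Arrays.asList("/dev/hwrng", "/dev/misc/hw_random");
        String device = pickDevice(hardwareDevices, "/dev/urandom");

        // Print the flag to add to the JVM options (e.g. for Tomcat's startup).
        System.out.println("-Djava.security.egd=file:" + device);
    }
}
```

Whether a particular SecureRandom implementation honors the flag still depends on the cases described in the bug comment quoted in the other answer.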