We have three web applications (standard Spring MVC/Hibernate) running in a JBoss 6.1 server. All three applications share a common authentication method, which is compiled as a JAR and included within each WAR file. Our authentication method uses org.springframework.security.crypto.bcrypt.BCrypt to hash user passwords; the check looks like this:
hashedPassword.equals(BCrypt.hashpw(plainTextPassword, salt));
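For context, here is a simplified sketch of the shared check (class and method names are illustrative, not the actual code in our JAR):

import org.springframework.security.crypto.bcrypt.BCrypt;

// Illustrative helper; BCrypt stores the salt inside the hash itself,
// so the stored hash doubles as the salt argument when re-hashing.
public class PasswordChecker {
    // At registration: hash the plain-text password with a fresh salt.
    public static String hash(String plainTextPassword) {
        return BCrypt.hashpw(plainTextPassword, BCrypt.gensalt());
    }

    // At login: re-hash the candidate with the salt embedded in the
    // stored hash and compare the results.
    public static boolean check(String plainTextPassword, String hashedPassword) {
        return hashedPassword.equals(BCrypt.hashpw(plainTextPassword, hashedPassword));
    }
}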
JBoss startup options:
set "JAVA_OPTS=-Xms2048m -Xmx4096m -XX:PermSize=256m -XX:MaxPermSize=512m -XX:+HeapDumpOnOutOfMemoryError -verbosegc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -Xloggc:gc.txt -XX:+UseParallelOldGC"
Problem:
It appears that after the server is restarted, BCrypt.hashpw takes about 100ms to hash a password. However, after some time (there is no pattern), BCrypt.hashpw performance suddenly spikes from 100ms to tens of seconds. There is no obvious reason for this.
More information:
- Hibernate Version: 4.2.4.Final
- Spring Version: 4.0.5.RELEASE
- Spring Security Version: 3.2.4.RELEASE
Has anyone else seen this problem before?
The problem is that /dev/random sometimes blocks, and when it does, it will appear to be random :) The more confusing thing is that while trying to test how it works, you'll run up against the Observer Effect: while trying to observe random behavior you're generating entropy, which can lead to a ton of confusion, i.e. my results won't be the same as yours, etc. This is also why it looks like there's no pattern.
I'll demonstrate the problem and show you how to recreate it (within reason) on your own servers so you can test solutions, and I'll try to provide a couple of fixes. Note this is on Linux, but the same problem will happen on any system that requires entropy to generate random numbers and runs out.
On Linux, /dev/random is a stream of random bytes. As you read from the stream you deplete the available entropy; once it drops below a certain point, reads from /dev/random block. You can see the available entropy using this command:
cat /proc/sys/kernel/random/entropy_avail
If you run the following bash script and also monitor entropy_avail, you'll notice that entropy dips dramatically as the script consumes it.
while :
do
cat /dev/random > /dev/null
done
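If you'd rather watch the pool from code while the script runs, here's a rough Java poller over Linux's /proc interface (a sketch; the class name is mine and it's Linux-only):

import java.nio.file.Files;
import java.nio.file.Paths;

public class EntropyWatch {
    public static void main(String[] args) throws Exception {
        // Print the kernel's available-entropy counter once per second.
        while (true) {
            String avail = new String(Files.readAllBytes(
                    Paths.get("/proc/sys/kernel/random/entropy_avail"))).trim();
            System.out.println("entropy_avail = " + avail);
            Thread.sleep(1000);
        }
    }
}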
This also hints at how to recreate the problem on your own servers: run the above bash script to drain the available entropy and the problem will manifest itself.
If you want to see just how many bytes per second your system is creating, you can use pv to measure it, i.e.
pv /dev/random
If you leave pv running it has an effect: it's consuming the random stream of bytes, which means other services might start to block. Note that pv is also displaying its output, so it might also be increasing the available entropy on the system :).
On systems with little or no entropy, pv /dev/random will seem glacially slow. I've also found that VMs sometimes have major issues generating entropy.
To recreate the issue, use the following class:
import org.mindrot.jbcrypt.BCrypt;

public class RandTest {
    public static void main(String[] args) {
        String password = "very-strong-password-1729";
        String hashed;
        for (int i = 0; i < 200000; i++) {
            hashed = BCrypt.hashpw(password, BCrypt.gensalt());
            // If we print, we're generating entropy :)
            // System.out.println(hashed);
        }
    }
}
I downloaded jBCrypt to a local directory, then compiled and ran the test as follows:
javac -cp ./jBCrypt-0.4/src/ RandTest.java
java -cp ./jBCrypt-0.4/src/:. RandTest
If you then run the bash script from earlier while running RandTest, you'll see long pauses where the system blocks, waiting for more entropy. If you run it under strace you'll see the following...
11067 [pid 22481] open("/dev/random", O_RDONLY|O_LARGEFILE) = 12
11068 [pid 22481] fstat64(12, {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 8), ...}) = 0
11069 [pid 22481] fcntl64(12, F_GETFD) = 0
11070 [pid 22481] fcntl64(12, F_SETFD, FD_CLOEXEC) = 0
.....
11510 [pid 22481] read(12, "\320\244\317RB\370", 8) = 6
The program is reading from /dev/random. The problem with testing entropy is that you might be generating more of it while trying to test it, i.e. the Observer Effect.
Fixes
The first fix is to switch from /dev/random to /dev/urandom, i.e.
time java -Djava.security.egd=file:///dev/./urandom -cp ./jBCrypt-0.4/src/:. RandTest
The extra /./ in the path is deliberate: some JDK versions special-case the exact string file:/dev/urandom and ignore it, and the indirection sidesteps that check.
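To sanity-check what the JVM is actually configured with, you can print both the java.security.egd system property and the securerandom.source security property (set in the JRE's java.security file); the command-line property, when present, takes precedence:

import java.security.Security;

public class SeedSourceCheck {
    public static void main(String[] args) {
        // Set via -Djava.security.egd=... on the command line (null if unset).
        System.out.println("java.security.egd   = "
                + System.getProperty("java.security.egd"));
        // Default seed source from the JRE's java.security file.
        System.out.println("securerandom.source = "
                + Security.getProperty("securerandom.source"));
    }
}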
An alternative fix is to recreate the /dev/random device as a /dev/urandom device. You can find out how to do this from the man page, i.e., instead of creating them as...
mknod -m 644 /dev/random c 1 8
mknod -m 644 /dev/urandom c 1 9
chown root:root /dev/random /dev/urandom
we delete one and fake it, i.e.
rm /dev/random
mknod -m 644 /dev/random c 1 9
chown root:root /dev/random
/dev/random is now actually /dev/urandom.
The key thing to remember is that testing random data which draws entropy from the system you're testing on is difficult because of the Observer Effect.
One possible explanation is that the SeedGenerator of SecureRandom is causing the delays.
Spring's BCrypt implementation uses SecureRandom, which in turn uses a SeedGenerator, which in turn may use the blocking /dev/random. Here is a good description of those classes.
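You can also watch the seed path block without BCrypt in the mix. On Linux, SecureRandom.generateSeed() typically reads the configured seed source (the blocking /dev/random by default), while nextBytes() usually does not, so a crude timing loop like this (a sketch) will stall when entropy runs low:

import java.security.SecureRandom;

public class SeedTiming {
    public static void main(String[] args) {
        SecureRandom sr = new SecureRandom();
        for (int i = 0; i < 10; i++) {
            long start = System.nanoTime();
            // generateSeed() goes to the seed source and can block;
            // sr.nextBytes(...) would typically not.
            byte[] seed = sr.generateSeed(16);
            long ms = (System.nanoTime() - start) / 1000000;
            System.out.println("generateSeed returned " + seed.length
                    + " bytes in " + ms + " ms");
        }
    }
}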
That bug report also describes performance problems in BCrypt and traces them back to the seed generator, showing full stack traces. The BCrypt implementation there is different, but the stack trace below SecureRandom should be identical to the Spring implementation's. Their solution was to reduce the reseed frequency of BCrypt.
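In the same spirit, you can reduce how often the seed source is consulted by creating one SecureRandom up front and reusing it for every salt, rather than letting each gensalt() call construct its own. jBCrypt 0.4 has a gensalt overload that accepts a SecureRandom (a sketch, assuming that overload; 10 is the default cost factor):

import java.security.SecureRandom;
import org.mindrot.jbcrypt.BCrypt;

public class SharedSaltSource {
    // Seeded once at class load and reused for every salt, so the
    // (possibly blocking) seed source is hit far less often.
    private static final SecureRandom RANDOM = new SecureRandom();

    public static String hash(String plainTextPassword) {
        return BCrypt.hashpw(plainTextPassword, BCrypt.gensalt(10, RANDOM));
    }
}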
Changing to urandom will only work on JDK 8 or above. We faced this for a long time; switching to urandom on 1.7 didn't help, but on 1.8 it solved the issue.