BCrypt performance deterioration

Published 2020-05-17 08:36

Question:

We have three web applications (standard Spring MVC-Hibernate) running within a JBoss 6.1 server. All three applications share a common authentication method, which is compiled as a JAR and included within each WAR file. Our authentication method uses org.springframework.security.crypto.bcrypt.BCrypt to hash user passwords; see below:

hashedPassword.equals(BCrypt.hashpw(plainTextPassword, salt));
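As an aside, comparing hashes with String.equals() is not constant-time; Spring's BCrypt class also exposes BCrypt.checkpw(plainText, hashed) for this comparison. Below is a minimal, stdlib-only sketch of a timing-safe comparison using java.security.MessageDigest.isEqual (the class and method names here are illustrative, not part of the original code):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class ConstantTimeCompare {
    // Compare two hash strings without leaking where they first differ.
    static boolean slowEquals(String a, String b) {
        return MessageDigest.isEqual(
            a.getBytes(StandardCharsets.UTF_8),
            b.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        System.out.println(slowEquals("$2a$10$abc", "$2a$10$abc")); // prints: true
        System.out.println(slowEquals("$2a$10$abc", "$2a$10$xyz")); // prints: false
    }
}
```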

JBoss startup options

set "JAVA_OPTS=-Xms2048m -Xmx4096m -XX:PermSize=256m -XX:MaxPermSize=512m -XX:+HeapDumpOnOutOfMemoryError -verbosegc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -Xloggc:gc.txt -XX:+UseParallelOldGC"

Problem: It appears that when the server is restarted, BCrypt.hashpw takes about 100 ms to hash a password. However, after some time (there is no pattern) the BCrypt.hashpw time suddenly spikes from 100 ms to tens of seconds. There is no obvious reason for this.

More information:

  • Hibernate version: 4.2.4.Final
  • Spring version: 4.0.5.RELEASE
  • Spring Security version: 3.2.4.RELEASE

Has anyone else seen this problem before?

Answer 1:

Problem: It appears that when the server is restarted, BCrypt.hashpw takes about 100 ms to hash a password. However, after some time (there is no pattern) the BCrypt.hashpw time suddenly spikes from 100 ms to tens of seconds. There is no obvious reason for this.

The problem is that /dev/random sometimes blocks, and when it does, the blocking will appear to be random :) The more confusing thing is that while trying to test how it works you'll run up against the observer effect, i.e. while trying to observe random behavior you're generating entropy, and this can lead to a ton of confusion (my results won't be the same as yours, etc.). This is also why it looks like there's no pattern.

I'll demonstrate the problem and show you how to recreate it (within reason) on your own servers so you can test solutions, and I'll provide a couple of fixes. Note that this is on Linux, but the same problem will happen on any system that requires entropy to generate random numbers and runs out of it.

On Linux, /dev/random is a stream of random bytes. As you read from the stream you deplete the available entropy. When it drops below a certain point, reads from /dev/random block. You can see the available entropy using this command:

cat /proc/sys/kernel/random/entropy_avail
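The same reading can be taken from inside the JVM, which is handy when you suspect the application itself is draining the pool. A Linux-only sketch (the class name is illustrative; on other systems the proc file simply won't exist):

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class EntropyCheck {
    public static void main(String[] args) throws Exception {
        Path p = Paths.get("/proc/sys/kernel/random/entropy_avail");
        if (!Files.exists(p)) {
            // Non-Linux systems don't expose this file
            System.out.println("entropy_avail not available on this system");
            return;
        }
        int bits = Integer.parseInt(Files.readAllLines(p).get(0).trim());
        System.out.println("Available entropy: " + bits + " bits");
        if (bits < 1000) {
            System.out.println("Warning: low entropy; reads from /dev/random may block");
        }
    }
}
```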

If you run the following bash script while monitoring entropy_avail, you'll notice that entropy dips dramatically as the script consumes it:

while :
do
  cat /dev/random > /dev/null
done

This should also give you a hint on how to recreate this problem on your servers, i.e. run the above bash script to reduce the available entropy and the problem will manifest itself.

If you want to see just how many bytes per second your system is creating, you can use pv to measure it, i.e.:

pv /dev/random

If you leave pv running it has an effect: it's consuming the random stream of bytes, which means other services might start to block. Note that pv is also displaying its output, so it might also be increasing the available entropy on the system :).

On systems with little or no entropy, pv /dev/random will seem glacially slow. I've also experienced that VMs sometimes have major issues with generating entropy.

To recreate the issue, use the following class:

import org.mindrot.jbcrypt.BCrypt;

public class RandTest {
    public static void main(String[] args) {
        String password = "very-strong-password-1729";
        String hashed;
        for (int i = 0; i < 200000; i++) {
            // gensalt() draws from SecureRandom on every iteration
            hashed = BCrypt.hashpw(password, BCrypt.gensalt());
            // If we print, we're generating entropy :) System.out.println(hashed);
        }
    }
}

I downloaded jBCrypt to a local directory, then compiled and ran the test as follows:

javac -cp ./jBCrypt-0.4/src/   RandTest.java
java  -cp ./jBCrypt-0.4/src/:. RandTest

If you then run the bash script from earlier while running RandTest, you'll see long pauses where the system blocks waiting for more entropy. If you run strace you'll see the following...

11067 [pid 22481] open("/dev/random", O_RDONLY|O_LARGEFILE) = 12
11068 [pid 22481] fstat64(12, {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 8), ...}) = 0
11069 [pid 22481] fcntl64(12, F_GETFD)        = 0
11070 [pid 22481] fcntl64(12, F_SETFD, FD_CLOEXEC) = 0
.....
11510 [pid 22481] read(12, "\320\244\317RB\370", 8) = 6

The program is reading from /dev/random. The problem with testing entropy is that you might be generating more of it while trying to test it, i.e. the observer effect.

Fixes

The first fix is to change from using /dev/random to /dev/urandom, i.e.:

time java  -Djava.security.egd=file:///dev/./urandom -cp ./jBCrypt-0.4/src/:.  RandTest
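For context on why this flag helps: on Linux JDKs the default SecureRandom (NativePRNG) typically reads /dev/urandom for nextBytes() but may touch the blocking /dev/random for seeding, while SecureRandom.getInstanceStrong() (Java 8+) usually maps to a blocking variant. A small sketch to inspect what your JVM picks (exact algorithm names vary by JDK version and platform, so treat the printed values as informational):

```java
import java.security.SecureRandom;

public class RandomSources {
    public static void main(String[] args) throws Exception {
        // Default instance: nextBytes() is non-blocking on typical Linux JDKs
        SecureRandom fast = new SecureRandom();
        byte[] salt = new byte[16];
        fast.nextBytes(salt);
        System.out.println("Default algorithm: " + fast.getAlgorithm());

        // getInstanceStrong() usually maps to a blocking source (e.g. NativePRNGBlocking)
        // and can stall when entropy_avail is low -- avoid it on hot request paths.
        SecureRandom strong = SecureRandom.getInstanceStrong();
        System.out.println("Strong algorithm:  " + strong.getAlgorithm());
    }
}
```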

An alternative fix is to recreate the /dev/random device as a /dev/urandom device. You can find out how to do this from the man page: instead of creating the devices as...

mknod -m 644 /dev/random c 1 8
mknod -m 644 /dev/urandom c 1 9
chown root:root /dev/random /dev/urandom

we delete /dev/random and fake it, i.e.:

rm /dev/random
mknod -m 644 /dev/random c 1 9
chown root:root /dev/random

/dev/random is now actually /dev/urandom.
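A JVM-wide alternative to passing -Djava.security.egd on every command line is the securerandom.source property in the JDK's java.security file (the path varies by JDK version, e.g. $JAVA_HOME/jre/lib/security/java.security). The /dev/./urandom spelling is reportedly needed because some older JDKs special-case the plain file:/dev/urandom URL; this is a sketch of the property, not a full file:

```
# In $JAVA_HOME/jre/lib/security/java.security (path varies by JDK)
securerandom.source=file:/dev/./urandom
```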

The key thing to remember is that testing random data which requires entropy from the system you're testing on is difficult because of the observer effect.



Answer 2:

One possible explanation is that the SeedGenerator of SecureRandom is causing the delays.

Spring's BCrypt implementation uses SecureRandom, which in turn uses a SeedGenerator, which in turn may use the blocking /dev/random. Here is a good description of those classes.

That bug report also describes performance problems in BCrypt and traces them back to the seed generator, showing full stack traces. The BCrypt implementation there is different, but the stack trace below SecureRandom must be identical to the Spring implementation's. Their solution was to reduce the reseed frequency of BCrypt.



Answer 3:

Changing to the urandom flag will work only on JDK 8 or above. We faced this for a long time; changing to urandom on 1.7 didn't help, but on 1.8 it did solve the issue.