SSL handshake problems

Posted 2019-03-29 10:59

Question:

Our server application becomes extremely slow at some customer sites. A server restart fixes the slowness, but it returns after a couple of weeks.

Java CPU usage is always around 100% (out of 200%); all other metrics look fine. Investigation showed that most of the CPU is consumed by the "HandshakeCompletedNotify-Thread" thread. A tcpdump capture shows that the SSL handshake takes 2-8 seconds, which is very long, and sometimes it times out.

Our SSL provider is BSAFE. The server runs on Linux (CentOS) with a 640 MB heap and 2 cores. We use Hibernate and Spring, with a local Oracle database.

What could cause this behavior, and how can we track down the cause?

P.S. We cannot switch the traffic to HTTP at our customers.

Update: The system recovers completely when outgoing connections from the Java process are blocked with iptables. What resource is freed in that situation? We also see that the SSL handshake frequently gets stuck at the "Change Cipher Spec" stage. The client (my Java process) tries to reuse the SSL session, but the server is completely stateless and generates a new session each time.

Answer 1:

This is a known bug that was introduced when Sun rolled out the Next Generation Java Plugin in 6u10. Oracle finally fixed it in Java 7u2, but they have not backported it to Java 6, at least as of 6u33.

Details on the bug, #7060523, can be found here.



Answer 2:

You might want to take a look at this issue reported against JBoss (not sure if that's what you're using). That issue indicates that HandshakeCompletedNotify-Thread can throw ConcurrentModificationException, which is one possible outcome of a race condition. Other outcomes include code that gets stuck in an endless loop and pegs a CPU, which sounds like your symptom. I'd consider upgrading JBoss if you're using it, or whichever library is responsible for the reported issue. It might fix your problem.
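To check whether the handshake-notification threads really are the ones burning CPU, something like the following could be run inside the affected JVM. It is only a sketch: the class name `ThreadCpuByName` and the method `cpuNanosByPrefix` are made up for illustration, and it relies on the standard `ThreadMXBean` API, which reports CPU time only for threads in the JVM that runs the code.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of any library mentioned in this thread.
public class ThreadCpuByName {

    /** Returns CPU time in nanoseconds for each live thread whose name starts with the prefix. */
    public static Map<String, Long> cpuNanosByPrefix(String prefix) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<String, Long> result = new LinkedHashMap<>();
        if (!mx.isThreadCpuTimeSupported()) {
            return result; // this JVM cannot measure per-thread CPU time
        }
        for (long id : mx.getAllThreadIds()) {
            ThreadInfo info = mx.getThreadInfo(id);
            if (info == null) {
                continue; // thread ended between the two calls
            }
            long cpu = mx.getThreadCpuTime(id); // -1 if measurement is disabled or the thread died
            if (cpu >= 0 && info.getThreadName().startsWith(prefix)) {
                result.put(info.getThreadName() + "#" + id, cpu);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String prefix = args.length > 0 ? args[0] : "HandshakeCompletedNotify-Thread";
        for (Map.Entry<String, Long> e : cpuNanosByPrefix(prefix).entrySet()) {
            System.out.println(e.getKey() + " " + (e.getValue() / 1_000_000) + " ms CPU");
        }
    }
}
```

Calling it twice a few seconds apart and comparing the values shows which threads are actively consuming CPU rather than merely having accumulated it earlier. Many short-lived threads with this name, or a few with steadily growing CPU time, would be consistent with the loop described above, though that alone does not prove it.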



Answer 3:

You could try switching to the JRE default JSSE implementation to see if a BSAFE bug is the issue.
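Before and after such a switch, it can help to confirm which provider the JVM actually hands out for TLS. The sketch below uses only standard `java.security` and `javax.net.ssl` calls; the class name `TlsProviderCheck` is hypothetical. It assumes the application uses the JVM default `SSLContext`, which may not be true if the code requests BSAFE explicitly by provider name.

```java
import java.security.NoSuchAlgorithmException;
import java.security.Provider;
import java.security.Security;
import javax.net.ssl.SSLContext;

// Hypothetical helper for inspecting the provider configuration.
public class TlsProviderCheck {

    /** Name of the provider that supplies the JVM's default SSLContext. */
    public static String defaultTlsProviderName() throws NoSuchAlgorithmException {
        return SSLContext.getDefault().getProvider().getName();
    }

    public static void main(String[] args) throws Exception {
        // Registered providers, listed in preference order.
        Provider[] providers = Security.getProviders();
        for (int i = 0; i < providers.length; i++) {
            System.out.println((i + 1) + ". " + providers[i].getName()
                    + " - " + providers[i].getInfo());
        }
        System.out.println("Default SSLContext provider: " + defaultTlsProviderName());
    }
}
```

On a stock Oracle or OpenJDK runtime the default provider is typically SunJSSE; if a different name is printed, the default has been overridden, for example through the provider list in the `java.security` file.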

Enabling the JSSE debug output can be invaluable too (javax.net.debug property).

These links are helpful for debugging JSSE:

http://download.oracle.com/javase/1.5.0/docs/guide/security/jsse/JSSERefGuide.html#Debug

http://download.oracle.com/javase/1.5.0/docs/guide/security/jsse/ReadDebug.html
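The debug property is normally passed on the command line, for example `-Djavax.net.debug=ssl,handshake`. As a rough sketch, it can also be set in code and combined with a timed handshake against one of the slow endpoints. The class `HandshakeDebug` and its methods are hypothetical names; the sketch assumes the default JSSE socket factory is in use and that the property is set before any JSSE class is loaded, since the setting is generally read once at initialization.

```java
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

// Hypothetical helper: turns on JSSE handshake tracing and times one handshake.
public class HandshakeDebug {

    /** Sets javax.net.debug; call this before any SSL class is used. */
    public static String enableHandshakeDebug() {
        System.setProperty("javax.net.debug", "ssl,handshake");
        return System.getProperty("javax.net.debug");
    }

    /** Connects to host:port and returns the handshake duration in milliseconds. */
    public static long timeHandshakeMillis(String host, int port) throws Exception {
        SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();
        // createSocket performs the TCP connect, so the timer starts afterwards
        // and covers only the SSL/TLS handshake itself.
        try (SSLSocket socket = (SSLSocket) factory.createSocket(host, port)) {
            long start = System.nanoTime();
            socket.startHandshake();
            return (System.nanoTime() - start) / 1_000_000;
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: HandshakeDebug <host> <port>");
            return;
        }
        enableHandshakeDebug();
        long millis = timeHandshakeMillis(args[0], Integer.parseInt(args[1]));
        System.out.println("Handshake took " + millis + " ms");
    }
}
```

The trace printed to the console lists each handshake message in order, which should make it possible to see where the time goes and whether a session is resumed or negotiated from scratch.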



Answer 4:

Have you analyzed your DNS lookups? An SSL handshake can take longer when DNS lookups are slow; both the forward lookup and the reverse lookup need to be fast.
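One way to test this from the affected machine is to time both lookups from Java itself. This is a minimal sketch with a hypothetical class name, `DnsLookupTimer`; it uses the standard `InetAddress` API and assumes the host name passed in is one of the endpoints the server connects to.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper for timing forward and reverse DNS lookups.
public class DnsLookupTimer {

    /** Returns {forwardMillis, reverseMillis} for the given host name. */
    public static long[] timeLookups(String host) throws UnknownHostException {
        long t0 = System.nanoTime();
        InetAddress address = InetAddress.getByName(host); // forward lookup
        long t1 = System.nanoTime();
        // Rebuild the address from raw bytes so that no host name is attached,
        // which makes getCanonicalHostName() perform a reverse lookup.
        InetAddress bare = InetAddress.getByAddress(address.getAddress());
        bare.getCanonicalHostName();
        long t2 = System.nanoTime();
        return new long[] {(t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000};
    }

    public static void main(String[] args) throws Exception {
        String host = args.length > 0 ? args[0] : "localhost";
        long[] millis = timeLookups(host);
        System.out.println("forward: " + millis[0] + " ms, reverse: " + millis[1] + " ms");
    }
}
```

The JVM caches lookup results, so only the first run in a fresh process reflects the real resolver latency. Lookups that take seconds rather than milliseconds would point toward the resolver configuration rather than the SSL provider.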