Our server application becomes extremely slow at some customer sites. A server restart fixes the slowness, but it returns after a couple of weeks.
The Java process's CPU usage is always around 100% (out of 200%); all other metrics look fine. Investigation showed that most of the CPU is consumed by the "HandshakeCompletedNotify-Thread" thread. From a tcpdump capture we see that the SSL handshake takes 2-8 seconds, which is very long, and sometimes it times out.
Our SSL provider is BSAFE. The server runs on Linux (CentOS) with a 640 MB heap and 2 cores. We use Hibernate and Spring with a local Oracle database.
What could be the reasons for this behavior? What can be done to track them down?
P.S. We cannot switch the traffic to HTTP at our customers.
Update: The system recovers completely when the outgoing connections of the Java process are blocked with iptables. What resource is freed in such a situation?
We see that the SSL handshake frequently gets stuck at the "Change Cipher Spec" stage. The client (my Java process) tries to reuse the SSL session, but the server is completely stateless and generates a new session each time.
This is a known bug that was introduced when Sun rolled out the Next Generation Java Plugin in 6u10. Oracle finally fixed it in Java 7u2, but they have not backported it to Java 6, at least as of 6u33.
Details on the bug, #7060523, can be found here.
You might want to take a look at this issue reported against JBoss (not sure if that's what you're using). That issue indicates that HandshakeCompletedNotify-Thread can throw ConcurrentModificationException, which is one possible outcome of a race condition. Other outcomes include code that gets stuck in an endless loop and pegs a CPU, which sounds like your symptom. I'd consider upgrading JBoss if you're using it, or whichever library is responsible for the reported issue. It might fix your problem.
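To confirm which thread is pegging the CPU, per-thread CPU time can be read from inside the JVM. The following is only a minimal sketch using the standard `java.lang.management` API; the class name `ThreadCpuReport` and its `topThreads` method are made up for illustration and would have to be wired into the server somehow (for example behind an admin endpoint).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the live threads that have used the most CPU time.
public class ThreadCpuReport {

    /** A thread name together with the CPU time it has consumed so far. */
    public static final class Entry {
        public final String name;
        public final long cpuNanos;

        Entry(String name, long cpuNanos) {
            this.name = name;
            this.cpuNanos = cpuNanos;
        }
    }

    /** Returns at most 'limit' entries, sorted by CPU time, highest first. */
    public static List<Entry> topThreads(int limit) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<Entry> entries = new ArrayList<Entry>();
        if (!mx.isThreadCpuTimeSupported()) {
            return entries; // this JVM cannot measure per-thread CPU time
        }
        for (long id : mx.getAllThreadIds()) {
            long cpu = mx.getThreadCpuTime(id);     // -1 if the thread died or measurement is disabled
            ThreadInfo info = mx.getThreadInfo(id); // null if the thread is no longer alive
            if (cpu >= 0 && info != null) {
                entries.add(new Entry(info.getThreadName(), cpu));
            }
        }
        entries.sort((a, b) -> Long.compare(b.cpuNanos, a.cpuNanos));
        return new ArrayList<Entry>(entries.subList(0, Math.min(limit, entries.size())));
    }

    public static void main(String[] args) {
        for (Entry e : topThreads(10)) {
            System.out.printf("%-50s %d ms%n", e.name, e.cpuNanos / 1000000L);
        }
    }
}
```

The values are cumulative, so calling this twice a few seconds apart and comparing the numbers shows which threads are burning CPU right now. From outside the process, `top -H -p <pid>` combined with a `jstack` thread dump gives similar information.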
You could try switching to the JRE default JSSE implementation to see if a BSAFE bug is the issue.
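One way to try this in code is to request an `SSLContext` from the JDK's own provider explicitly. This is a sketch only: it assumes a Sun/Oracle (or OpenJDK) JRE where the provider is registered under the name "SunJSSE", and it does not know how your application currently plugs in BSAFE. The class name `DefaultJsseContext` is hypothetical.

```java
import java.security.GeneralSecurityException;
import javax.net.ssl.SSLContext;

// Hypothetical helper: builds an SSLContext backed by the JDK's bundled JSSE provider.
public class DefaultJsseContext {

    public static SSLContext create() {
        try {
            // Ask for the "SunJSSE" provider by name instead of whichever provider is registered first.
            SSLContext ctx = SSLContext.getInstance("TLS", "SunJSSE");
            // null arguments select the default key managers, trust managers and SecureRandom.
            ctx.init(null, null, null);
            return ctx;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("SunJSSE provider is not available in this JRE", e);
        }
    }

    public static void main(String[] args) {
        SSLContext ctx = create();
        System.out.println("Provider: " + ctx.getProvider().getName());
    }
}
```

Sockets created from `create().getSocketFactory()` would then bypass BSAFE for the handshake. If the slowness disappears with this context, that points at the BSAFE provider rather than at your own code.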
Enabling the JSSE debug output can be invaluable too (javax.net.debug property).
These links are pretty helpful for debugging JSSE:
http://download.oracle.com/javase/1.5.0/docs/guide/security/jsse/JSSERefGuide.html#Debug
http://download.oracle.com/javase/1.5.0/docs/guide/security/jsse/ReadDebug.html
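The `javax.net.debug` property is normally passed on the command line (`-Djavax.net.debug=ssl,handshake`), but it can also be set programmatically, as sketched below. The class name `JsseDebug` is made up for illustration. Note that the property is read by the JDK's own JSSE implementation; whether BSAFE honors it is something I can't confirm.

```java
// Hypothetical helper: turns on JSSE handshake tracing for the JDK's built-in SSL implementation.
public class JsseDebug {

    /**
     * Must run before any SSL classes are initialized, because the JDK reads
     * the property once, when its SSL debug support is first loaded.
     */
    public static void enableHandshakeTracing() {
        // "ssl,handshake" restricts the output to handshake messages;
        // "all" is more verbose and "help" prints the available options.
        System.setProperty("javax.net.debug", "ssl,handshake");
    }

    public static void main(String[] args) {
        enableHandshakeTracing();
        System.out.println("javax.net.debug=" + System.getProperty("javax.net.debug"));
    }
}
```

The trace goes to standard output and is very verbose, so on a production server it is probably best enabled only for a short period.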
Have you analyzed your DNS lookups? An SSL handshake can take longer when DNS lookups are slow, because it may require both a forward lookup and a reverse lookup.
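A quick way to check this from the affected machine is to time both lookups with `java.net.InetAddress`. This is just a sketch; the class name `DnsTiming` is hypothetical and the host name is whatever peer your server connects to.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: measures how long forward and reverse DNS lookups take.
public class DnsTiming {

    /** Returns {forwardMillis, reverseMillis} for the given host name. */
    public static long[] timeLookups(String host) {
        try {
            long t0 = System.nanoTime();
            InetAddress addr = InetAddress.getByName(host); // forward lookup: name -> address
            long t1 = System.nanoTime();
            addr.getCanonicalHostName();                    // reverse lookup: address -> name
            long t2 = System.nanoTime();
            return new long[] {
                TimeUnit.NANOSECONDS.toMillis(t1 - t0),
                TimeUnit.NANOSECONDS.toMillis(t2 - t1)
            };
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Cannot resolve host: " + host, e);
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";
        long[] ms = timeLookups(host);
        System.out.println("forward: " + ms[0] + " ms, reverse: " + ms[1] + " ms");
    }
}
```

The JVM caches successful lookups, so only the first call per host in a fresh process reflects the real resolver latency. Lookups that take seconds would be consistent with the 2-8 second handshakes in the question.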