I have a server component that I'm trying to load-test. All connections to the server use TLS 1.0. I have a simple test program that essentially does this on as many threads as I want:
Full TLS handshake to the server
send a request
read reply
close connection
repeat ad nauseam
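A minimal sketch of one iteration of such a loop, not the actual test program: the class and method names are hypothetical, and the reply is assumed to fit in a single read of up to 4096 bytes. The socket is closed in a `finally` block so that a failed request cannot leave a connection open.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import javax.net.SocketFactory;

// Hypothetical helper; the names are illustrative and not taken from the real test program.
class LoadTestClient {

    /**
     * Runs one connect / send / read / close cycle and returns the number of
     * reply bytes read (or -1 if the server closed the connection first).
     * With an SSLSocketFactory, the TLS handshake is performed implicitly
     * on the first write.
     */
    static int requestOnce(SocketFactory factory, String host, int port, byte[] request)
            throws IOException {
        Socket socket = factory.createSocket(host, port);
        try {
            OutputStream out = socket.getOutputStream();
            out.write(request);
            out.flush();

            InputStream in = socket.getInputStream();
            byte[] reply = new byte[4096];
            return in.read(reply); // assumption: the reply arrives in one chunk
        } finally {
            socket.close(); // always release the connection, even on failure
        }
    }
}
```

Each test thread would call `requestOnce(SSLSocketFactory.getDefault(), host, port, request)` in a loop, with `host`, `port`, and `request` standing in for the real server address and payload.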
My virtual machine is as follows:
Java(TM) SE Runtime Environment (build 1.6.0_16-b01)
Java HotSpot(TM) Server VM (build 14.2-b01, mixed mode)
I have a memory leak. My memory footprint increases by about 1 MB per second when I heavily test my server, which makes it block after 15-20 minutes with an OutOfMemoryError.
I ran it in the NetBeans profiler, which showed that the memory growth comes from deep within the TLS API.
Has anyone ever experienced something similar? Is there any workaround I can implement at my level?
Edit. As requested, here's the profiling call trace which generates a lot of these byte[]:
.java.io.ByteArrayOutputStream.<init>(int)
..com.sun.net.ssl.internal.ssl.OutputRecord.<init>(byte, int)
...com.sun.net.ssl.internal.ssl.OutputRecord.<init>(byte)
....com.sun.net.ssl.internal.ssl.AppOutputStream.<init>(com.sun.net.ssl.internal.ssl.SSLSocketImpl)
.....com.sun.net.ssl.internal.ssl.SSLSocketImpl.init(com.sun.net.ssl.internal.ssl.SSLContextImpl, boolean)
......com.sun.net.ssl.internal.ssl.SSLSocketImpl.<init>(com.sun.net.ssl.internal.ssl.SSLContextImpl, java.net.Socket, String, int, boolean)
.......com.sun.net.ssl.internal.ssl.SSLSocketFactoryImpl.createSocket(java.net.Socket, String, int, boolean)
<my code>
There are many more I could post, but that would get long. Here are the entry points that the profiler gives me:
....com.sun.net.ssl.internal.ssl.AppOutputStream.<init>(com.sun.net.ssl.internal.ssl.SSLSocketImpl)
....com.sun.net.ssl.internal.ssl.HandshakeOutStream.<init>(com.sun.net.ssl.internal.ssl.ProtocolVersion, com.sun.net.ssl.internal.ssl.ProtocolVersion, com.sun.net.ssl.internal.ssl.HandshakeHash, com.sun.net.ssl.internal.ssl.SSLSocketImpl)
....com.sun.net.ssl.internal.ssl.SSLSocketImpl.sendAlert(byte, byte)
..com.sun.net.ssl.internal.ssl.AppInputStream.<init>(com.sun.net.ssl.internal.ssl.SSLSocketImpl)
..com.sun.net.ssl.internal.ssl.SSLSocketImpl.performInitialHandshake()
..com.sun.net.ssl.internal.ssl.HandshakeInStream.<init>(com.sun.net.ssl.internal.ssl.HandshakeHash)
Have you checked that the connection actually closes? Most likely it is still open somehow. 1 MB is a sign of an extra thread. However, I am not sure what exactly the reason would be.
What hardware are you running on? Can you do a netstat and verify the state of your connections?
I've load tested Tomcat, and had no trouble achieving 500 new SSL requests/sec, running for hours, with a 1 GB heap on Solaris. Also, you may want to monitor the number of threads running in the container.
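As a rough sketch of the thread monitoring suggested above, the standard `java.lang.management` beans can report the live thread count and used heap from inside the JVM under test. The class name `JvmMonitor` and the one-second sampling interval are illustrative assumptions; in a container you would more likely read the same beans remotely over JMX.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.ThreadMXBean;

// Hypothetical helper that samples thread and heap usage of the current JVM.
class JvmMonitor {

    /** Returns a one-line summary of live threads, peak threads, and used heap. */
    static String snapshot() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        long usedHeapBytes = memory.getHeapMemoryUsage().getUsed();
        return "threads=" + threads.getThreadCount()
                + " peakThreads=" + threads.getPeakThreadCount()
                + " heapUsedBytes=" + usedHeapBytes;
    }

    public static void main(String[] args) throws InterruptedException {
        // Print a few samples, one second apart.
        for (int i = 0; i < 3; i++) {
            System.out.println(snapshot());
            Thread.sleep(1000);
        }
    }
}
```

A thread count that keeps climbing during the test points at leaked threads, while a flat thread count with growing heap points elsewhere.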
1MB is the memory required to create a thread, extra or not.
Are there any entries on the bug list for that class or package? The first step would be to check it.
The second step is to assume that the problem lies in your code, not in Sun's. That's more likely, simply because a commonly used class in the JDK has been banged on by users all over the world. If there were an error, it would have come to light by now.
That's not to say that the JDK code is bug-free, just that you should suspect your code first.
Get a profiler and measure. Don't guess.
All SSL connections are associated with an SSL session, which may be reused across distinct TCP connections to reduce the handshake overhead of negotiating temporary encryption keys after the TCP connection has been established. It could be that your clients are somehow forcing the creation of a new session for every connection, and since the default configuration for Java 6 seems to be to cache an unlimited number of sessions for 24 hours, you may easily run into a memory problem.
You can manipulate these settings for your server socket by getting the SSLSessionContext from your server socket with getSession().getSessionContext() and set the cache size with setSessionCacheSize and timeout (in seconds) with setSessionTimeout. I would have expected it to be possible to change the default configuration through system properties, but I'm not able to find any documentation on that. Perhaps you can find something yourself by googling a little bit longer than I did.
Are you sure that you're setting the limit on the correct session context? I was mistaken about the context being reachable from the server socket. You have to set it through the SSLContext before creating the server socket:
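A minimal sketch of that approach, assuming the JVM's default `SSLContext` is the one used to create the server socket. The class name, the method name, and the limits (1000 sessions, 300 seconds) are illustrative assumptions, not recommended values.

```java
import java.security.NoSuchAlgorithmException;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSessionContext;

// Hypothetical helper; only the javax.net.ssl calls are real API.
class SessionCacheConfig {

    /**
     * Caps the server-side session cache of the default SSLContext and
     * returns that context. The default context is shared JVM-wide, so
     * the limits apply to every server socket created from it.
     */
    static SSLContext boundedDefaultContext(int maxSessions, int timeoutSeconds) {
        try {
            SSLContext context = SSLContext.getDefault();
            SSLSessionContext sessions = context.getServerSessionContext();
            if (sessions == null) {
                // The API allows null when no session context is available.
                throw new IllegalStateException("No server session context available");
            }
            sessions.setSessionCacheSize(maxSessions);   // 0 would mean "no limit"
            sessions.setSessionTimeout(timeoutSeconds);  // lifetime of a cached session
            return context;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("No default SSLContext available", e);
        }
    }
}
```

The server socket is then created from the returned context, for example with `boundedDefaultContext(1000, 300).getServerSocketFactory().createServerSocket(port)`, where `port` is whatever port the server listens on.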
Without this limit, it was easy to reproduce your memory "leak", as each cached SSL session seems to use roughly 700-800 bytes of heap memory. With the session count limit in place, my server has been running under stress for about 15 minutes now and is still using only 3-4 MB of heap memory.