How do you find a memory leak in Java (using, for example, JHat)? I have tried to load the heap dump up in JHat to take a basic look. However, I do not understand how I am supposed to be able to find the root reference (ref) or whatever it is called. Basically, I can tell that there are several hundred megabytes of hash table entries ([java.util.HashMap$Entry or something like that), but maps are used all over the place... Is there some way to search for large maps, or perhaps find general roots of large object trees?
[Edit] Ok, I've read the answers so far but let's just say I am a cheap bastard (meaning I am more interested in learning how to use JHat than to pay for JProfiler). Also, JHat is always available since it is part of the JDK. Unless of course there is no way with JHat but brute force, but I can't believe that can be the case.
Also, I do not think I will be able to actually modify (adding logging of all map sizes) and run it for long enough for me to notice the leak.
A tool is a big help.
However, there are times when you can't use a tool: the heap dump is so huge it crashes the tool, you are trying to troubleshoot a machine in some production environment to which you only have shell access, etc.
In that case, it helps to know your way around the hprof dump file.
Look for SITES BEGIN. This shows you what objects are using the most memory. But the objects aren't lumped together solely by type: each entry also includes a "trace" ID. You can then search for that "TRACE nnnn" to see the top few frames of the stack where the object was allocated. Often, once I see where the object is allocated, I find a bug and I'm done. Also, note that you can control how many frames are recorded in the stack with the options to -Xrunhprof.
If you check out the allocation site, and don't see anything wrong, you have to start backward chaining from some of those live objects to root objects, to find the unexpected reference chain. This is where a tool really helps, but you can do the same thing by hand (well, with grep). There is not just one root object (i.e., object not subject to garbage collection). Threads, classes, and stack frames act as root objects, and anything they reference strongly is not collectible.
To do the chaining, look in the HEAP DUMP section for entries with the bad trace id. This will take you to an OBJ or ARR entry, which shows a unique object identifier in hexadecimal. Search for all occurrences of that id to find who's got a strong reference to the object. Follow each of those paths backward as they branch until you figure out where the leak is. See why a tool is so handy?
Static members are a repeat offender for memory leaks. In fact, even without a tool, it'd be worth spending a few minutes looking through your code for static Map members. Can a map grow large? Does anything ever clean up its entries?
Most of the time, in enterprise applications the Java heap given is larger than the ideal size of max 12 to 16 GB. I have found it hard to make the NetBeans profiler work directly on these big java apps.
But usually this is not needed. You can use the jmap utility that comes with the jdk to take a "live" heap dump , that is jmap will dump the heap after running GC. Do some operation on the application, wait till the operation is completed, then take another "live" heap dump. Use tools like Eclipse MAT to load the heapdumps, sort on the histogram, see which objects have increased, or which are the highest, This would give a clue.
There is only one problem with this approach; Huge heap dumps, even with the live option, may be too big to transfer out to development lap, and may need a machine with enough memory/RAM to open.
That is where the class histogram comes into picture. You can dump a live class histogram with the jmap tool. This will give only the class histogram of memory usage.Basically it won't have the information to chain the reference. For example it may put char array at the top. And String class somewhere below. You have to draw the connection yourself.
Instead of taking two heap dumps, take two class histograms, like as described above; Then compare the class histograms and see the classes that are increasing. See if you can relate the Java classes to your application classes. This will give a pretty good hint. Here is a pythons script that can help you compare two jmap histogram dumps. histogramparser.py
Finally tools like JConolse and VisualVm are essential to see the memory growth over time, and see if there is a memory leak. Finally sometimes your problem may not be a memory leak , but high memory usage.For this enable GC logging;use a more advanced and new compacting GC like G1GC; and you can use jdk tools like jstat to see the GC behaviour live
Other referecences to google for -jhat, jmap, Full GC, Humongous allocation, G1GC
you may want to check out jconsole. It's also part of the JDK and I have found it helpful to find memory/reference leaks in conjunction with jhat. Also take a look at this blog entry.