eclipse memory analyzer sees small part (363,2MB)

2019-01-23 19:39发布

I was trying to investigate java.lang.OutOfMemoryError: GC limit exceeded which occurs at high load of our web app deployed in tomcat. Heap size was set to 8GB (-Xms2048m -Xmx8192m)

At some point in time our application become unresponsive due to GC activity overhead. I could see in logs that Full GC was occurring multiple times in a row. So I took heap dump with following command (jmap -F -dump:format=b,file=/root/dump2.hprof 4963). File containing dump was 9GB in size. After dump was taken (app was frozen for about 45minutes), multiple full GCs occured till OutOfMemoryError was thrown.

Here is a log sample of GC activity

[Full GC [PSYoungGen: 932096K->875513K(1864128K)] [ParOldGen: 5592447K->5592447K(5592448K)] 6524543K->6467961K(7456576K) [PSPermGen: 112285K->112285K(262144K)], 12.3954040 secs] [Times: user=47.60 sys=0.43, real=12.39 secs]
[Full GC [PSYoungGen: 932096K->890562K(1864128K)] [ParOldGen: 5592447K->5592447K(5592448K)] 6524543K->6483009K(7456576K) [PSPermGen: 112285K->112285K(262144K)], 12.6131900 secs] [Times: user=48.45 sys=0.49, real=12.61 secs]
[Full GC [PSYoungGen: 932096K->895268K(1864128K)] [ParOldGen: 5592447K->5592447K(5592448K)] 6524543K->6487715K(7456576K) [PSPermGen: 112285K->112285K(262144K)], 12.9488670 secs] [Times: user=49.61 sys=0.46, real=12.95 secs]

Heap
 PSYoungGen      total 1864128K, used 896698K [0x0000000755560000, 0x0000000800000000, 0x0000000800000000)
  eden space 932096K, 96% used [0x0000000755560000,0x000000078c10e8a8,0x000000078e3a0000)
  from space 932032K, 0% used [0x000000078e3a0000,0x000000078e3a0000,0x00000007c71d0000)
  to   space 932032K, 0% used [0x00000007c71d0000,0x00000007c71d0000,0x0000000800000000)
ParOldGen       total 5592448K, used 5592447K [0x0000000600000000, 0x0000000755560000, 0x0000000755560000)
  object space 5592448K, 99% used [0x0000000600000000,0x000000075555ff30,0x0000000755560000)
PSPermGen       total 262144K, used 112285K [0x00000005e0000000, 0x00000005f0000000, 0x0000000600000000)
  object space 262144K, 42% used [0x00000005e0000000,0x00000005e6da7530,0x00000005f0000000)

heap dump is taken (ca 45minutes freeze)
[Full GC [PSYoungGen: 932096K->903362K(1864128K)] [ParOldGen: 5592447K->5592447K(5592448K)] 6524543K->6495810K(7456576K) [PSPermGen: 112285K->112285K(262144K)], 2883.9864390 secs] [Times: user=49.41 sys=0.47, real=2884.17 secs]
[Full GC [PSYoungGen: 932096K->897728K(1864128K)] [ParOldGen: 5592447K->5592444K(5592448K)] 6524543K->6490173K(7456576K) [PSPermGen: 112288K->112288K(262144K)], 13.3092680 secs] [Times: user=50.75 sys=0.40, real=13.31 secs]

To analyze heap dump I opened it in eclipse memory analyzer (MAT). Unfortunately MAT displays that heap size was 363.2MB (in overview tab or heap dump details tab), whereas according to GC logs heap was filled up to 6467961K (6.4G). Unreachable Objects Histogram shows in total 75 511 736 (75 MB). Histogram view also confirmed that total shallow heap was 380 837 136 (363.2MB)

My question is why MAT doesn't display all objects from heap dump if GC cannot reclaim memory?

env details:
Eclipse Memory Analyzer Version 1.2.1
heap dump taken on
java version "1.7.0_13"
Java(TM) SE Runtime Environment (build 1.7.0_13-b20)
Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode)

Here are screenshots of imported heap dump in MAT:

4条回答
做个烂人
2楼-- · 2019-01-23 19:51

This is a usual behavior of MAT when collecting a heap dump of such a large heap. I frequently collect heap dumps of 8GB heaps and usually get a MAT profile showing ~1GB of live objects.

The 45 minutes freeze is also expected. My explanation is that during the heap dump collection time several FullGC cycles occur which reduce the size of the actual heap dump acquired. But I have not found official explanation or reference documenting why there is such a big difference.

Further, see this reference - MAT Does Not Show the Complete Heap :

Symptom: When monitoring the memory usage interactively, the used heap size is much bigger than what MAT reports.

During the index creation, the Memory Analyzer removes unreachable objects because the various garbage collector algorithms tend to leave some garbage behind (if the object is too small, moving and re-assigning addresses is to expensive). This should, however, be no more than 3 to 4 percent. If you want to know what objects are removed, enable debug output as explained here: MemoryAnalyzer/FAQ#Enable_Debug_Output

查看更多
男人必须洒脱
3楼-- · 2019-01-23 19:55

Actually I don't recommend that you use the jmap utility when you are troubleshooting a java.lang.OutOfMemoryError: GC limit exceeded problem. I have seen problems with this approach while the JVM is trashing (excessive # of major collections).

Please try the following approach instead and see if you get better results from the MAT parsing & analysis process:

  • Add the following JVM HotSpot parameter -XX:+HeapDumpOnOutOfMemoryError
  • Replicate the problem again and wait for the OOM event. The heap dump generation should be much faster which will increase the value and validity of the data.
  • The JVM will then generate a JVM Heap Dump (HPROF format) following the OOM event.
  • Load the JVM Heap Dump again in MAT and see if you are getting better results e.g. bigger heap footprint.

Regards, P-H

查看更多
一纸荒年 Trace。
4楼-- · 2019-01-23 20:00

MAT does not display the unreachable objects by default.

You can enable the option by going to Preferences -> Memory Analyzer -> Keep Unreachable Objects. Load the heap again once the option is enabled.

It will show the complete heap once the option is enabled. Even I was in same situation and was not able to get much information online and my manager showed me the option.Hope it helps.

查看更多
ら.Afraid
5楼-- · 2019-01-23 20:07

We found a bug in MAT recently where it would only see a subset of the heap: https://bugs.eclipse.org/bugs/show_bug.cgi?id=404679#c3

The problem was that the JVM wrote a single HEAP_DUMP record that was more than 4GB and therefore the length in the header wrapped.

What version of MAT are you running? Please try a more recent build.

查看更多
登录 后发表回答