Java runs out of memory, even though I give it plenty


Question:

So, I'm running a java server (specifically Winstone: http://winstone.sourceforge.net/ )

Like this: java -server -Xmx12288M -jar /usr/share/java/winstone-0.9.10.jar --useSavedSessions=false --webappsDir=/var/servlets --commonLibFolder=/usr/share/java

This has worked fine in the past, but now it needs to load a bunch more stuff into memory than it has before.

The odd part is that, according to 'top', it has 15.0g of VIRT(ual memory) and its RES(ident set) is 8.4g. Once it hits 8.4g, the CPU hangs at 100% (even though it's loading from disk), and eventually, I get Java's OutOfMemoryError. Presumably, the CPU hanging at 100% is Java doing garbage collection.

So, my question is, what gives? I gave it 12 gigs of memory! And it's only using 8.4 gigs before it throws in the towel. What am I doing wrong?

Oh, and I'm running on Linux with:

    java version "1.6.0_07"
    Java(TM) SE Runtime Environment (build 1.6.0_07-b06)
    Java HotSpot(TM) 64-Bit Server VM (build 10.0-b23, mixed mode)

Thanks, Matt

Answer 1:

The odd part is that, according to 'top', it has 15.0g of VIRT(ual memory) and its RES(ident set) is 8.4g. Once it hits 8.4g, the CPU hangs at 100% (even though it's loading from disk), and eventually, I get Java's OutOfMemoryError.

I think you are misinterpreting things. The "-Xmx12288M" option does not reserve physical memory. Rather it sets an upper limit on the size of the Java heap. Java also needs memory for non-heap objects; e.g. permgen space, code space, memory mapped files, etcetera. It is quite plausible for a 12g heap + the non-heap memory used/shared by the JVM to add up to 15g.

The 8.4g reported by top as RES is the amount of physical memory that is currently being used to run the JVM. It is not directly related to the size of your Java heap. Indeed, you would expect the RES number to move up and down as different processes' pages are swapped in and out by the OS virtual memory system. This is entirely outside the control of the JVM.
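
(Just as an illustration, not part of the original answer: you can ask the JVM directly for its own view of the heap, which is the number that -Xmx governs, and compare it with what top reports for the whole process.)

    // Illustrative snippet: report the JVM's own heap figures (governed by -Xmx),
    // which are separate from the process-level VIRT/RES numbers that top shows.
    public class HeapReport {
        public static void main(String[] args) {
            Runtime rt = Runtime.getRuntime();
            long mb = 1024L * 1024L;
            System.out.println("max heap (-Xmx):   " + rt.maxMemory() / mb + " MB");
            System.out.println("committed heap:    " + rt.totalMemory() / mb + " MB");
            System.out.println("free of committed: " + rt.freeMemory() / mb + " MB");
        }
    }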

Presumably, the CPU hanging at 100% is Java doing garbage collection.

Yes. That's typically what happens.

I can think of three possible explanations:

  • Most likely, the operating system is unable to give your JVM the memory it is asking for because there is not enough swap space. For example, if you have 2 processes with 15g of virtual memory each, that's 30g. Given that you have 24g of physical memory, you will need at least 8g (probably more) of swap space. If the amount of physical memory allocatable to user processes plus the amount of swap space is less than the total virtual space used by processes, the OS will start refusing requests by the JVM to expand the heap. You can run "swapon -s" to see how much swap space is available / in use.

  • Your application may really be using all of the 12g of heap that you've said it can use, and that is not enough. (Maybe you've got a storage leak. Maybe it really needs more memory. A quick class histogram, as sketched after this list, can help you tell the two apart.)

  • It is also possible (but highly unlikely) that someone has set a process limit. You can use the shell builtin 'ulimit' command to see if this has been done; refer to "man ulimit" for details.
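
(Illustrative aside for the second point: the jmap tool that ships with the JDK can print a class histogram of a running JVM's heap, which makes a storage leak fairly easy to spot -- the pid below stands for whichever one 'ps' reports for the winstone process.)

    jmap -histo <winstone-pid>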

EDIT

  • If you use the -verbose:gc and -XX:+PrintGCDetails options, the GC might give you more clues as to what is happening. In particular, it will tell you how big the Java heap really is when you run out of memory.

  • You could write a little Java app that simply allocates and doesn't free lots of memory, and see how much it manages to allocate before falling over with an OOM error when run with the same options as you are currently using; see the sketch after this list. (I don't think this will tell you anything new. EDIT 2: actually, it will tell you something if you run it in the way that @Dan suggests!)

  • If (as seems likely) the real problem is that you don't have enough swap space, there is nothing you can do on the Java side to fix this. You need to reconfigure your system to have more swap space. Refer to your Linux system administration documentation, the man pages for swapon, mkswap etcetera.
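
Here is the kind of throwaway allocator meant in the second bullet (the class name and the 1 MB block size are just illustrative). Run it with the same options as the server, plus the GC flags above, e.g. "java -server -Xmx12288M -verbose:gc -XX:+PrintGCDetails AllocUntilOom":

    import java.util.ArrayList;
    import java.util.List;

    // Throwaway sketch: allocate 1 MB blocks and keep references so nothing can be
    // collected, then report how much the JVM handed out before OutOfMemoryError.
    public class AllocUntilOom {
        public static void main(String[] args) {
            List<byte[]> blocks = new ArrayList<byte[]>();
            try {
                while (true) {
                    blocks.add(new byte[1024 * 1024]); // 1 MB per iteration
                }
            } catch (OutOfMemoryError e) {
                int allocatedMb = blocks.size();
                blocks.clear(); // drop the references so the println below cannot fail too
                System.out.println("Allocated roughly " + allocatedMb + " MB before OOM");
            }
        }
    }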



Answer 2:

Sometimes an OutOfMemoryError doesn't mean the object heap is used up -- it can be something else. Was there any other info on the error stack?

In particular, the JVM needs a bunch of memory/address-space separate from the heap space. At some point, giving the process more object heap can leave less space for this other pool -- paradoxically making OOMEs more likely with larger '-Xmx' settings!

Memory-mapped files can eat a lot of address space; failing to close files properly can leave this space allocated until GC/finalization, which happens at an unpredictable time. Also, having a larger heap puts off GC/finalization -- meaning this native address space may stay reserved longer, so again: larger heaps can mean more frequent OOMEs due to depleted other memory.

You can also set '-Xms' to the same value to force all heap address space to be grabbed immediately at startup -- perhaps triggering the failure at a more convenient/understandable/reproducible stage.
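
For example, reusing the command line from the question with only '-Xms' added:

    java -server -Xms12288M -Xmx12288M -jar /usr/share/java/winstone-0.9.10.jar --useSavedSessions=false --webappsDir=/var/servlets --commonLibFolder=/usr/share/java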

Finally, there's also a threshold at which an OOME is thrown even if no memory request has failed -- but GC is taking "too much" time, indicating garbage being created as fast as it's being collected, probably in a tight loop. A big inefficient load-from-disk where lots of garbage is created for a small amount of retained data might cause this. Look up [GC overhead limit] for details.
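
A rough, hypothetical sketch of that pattern -- lots of short-lived garbage for a little retained data -- which, when run with a deliberately small heap such as -Xmx64M, will often die with "GC overhead limit exceeded" rather than a plain "Java heap space" error:

    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical sketch of the failure mode described above: each pass through the
    // loop creates several temporary strings (garbage) but retains only a small one,
    // so near the heap limit the collector runs constantly while reclaiming little.
    public class GcOverheadDemo {
        public static void main(String[] args) {
            Map<Integer, String> retained = new HashMap<Integer, String>();
            for (int i = 0; ; i++) {
                String temp = "row-" + i + "-" + Integer.toBinaryString(i);
                retained.put(i, temp.substring(0, Math.min(8, temp.length())));
            }
        }
    }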



Answer 3:

Can you upgrade to the latest version of java (1.6.0_17)?



Answer 4:

Maybe a dumb question, but are you sure that your swap partition has at least 15GB free when the program starts?



Answer 5:

And just to confirm, it ain't the PermGen space running out, right?
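
(If it is, the stack trace will typically say "java.lang.OutOfMemoryError: PermGen space" rather than "Java heap space", and on a 1.6 VM the usual knob is -XX:MaxPermSize -- for instance adding -XX:MaxPermSize=256M to the command line in the question; 256M is only an illustrative value.)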



Answer 6:

The odd part is that, according to 'top', it has 15.0g of VIRT(ual memory) and its RES(ident set) is 8.4g. Once it hits 8.4g, the CPU hangs at 100% (even though it's loading from disk), and eventually, I get Java's OutOfMemoryError. Presumably, the CPU hanging at 100% is Java doing garbage collection.

I would guess that Java is using so much memory that the system decides (or is forced) to push some of your data out to swap on disk. The next time the garbage collector runs, it has to do IO to reach those swapped-out objects, and doing garbage collection through swap is very expensive. And because you have that many objects, the garbage collector may kick in again and again. This may be why your CPU is at 100% ... doing garbage collection forever.

As Stephen has said, -verbose:gc and -XX:+PrintGCDetails will give more hints.

But apart from this, maybe it would pay off to invest in an implementation that doesn't have to load all the files at once? Working at the limit of your memory doesn't look like a winning strategy.
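
For instance (a purely hypothetical sketch, since the original post doesn't show how the files are loaded): processing each file as a stream, one line at a time, keeps the working set small instead of pulling everything into the heap at once.

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;

    // Hypothetical sketch: stream a large file line by line instead of reading it
    // whole into memory, so only the current line is held on the heap at any time.
    public class StreamingLoad {
        public static void main(String[] args) throws IOException {
            BufferedReader reader = new BufferedReader(new FileReader(args[0]));
            try {
                String line;
                long count = 0;
                while ((line = reader.readLine()) != null) {
                    count++; // process the line here instead of adding it to a big list
                }
                System.out.println("Processed " + count + " lines");
            } finally {
                reader.close();
            }
        }
    }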