Reducing JVM pause time > 1 second using UseConcMa

2019-01-18 13:02发布

I'm running a memory intensive app on a machine with 16Gb of RAM, and an 8-core processor, and Java 1.6 all running on CentOS release 5.2 (Final). Exact JVM details are:

java version "1.6.0_10"
Java(TM) SE Runtime Environment (build 1.6.0_10-b33)
Java HotSpot(TM) 64-Bit Server VM (build 11.0-b15, mixed mode)

I'm launching the app with the following command line options:

java -XX:+UseConcMarkSweepGC -verbose:gc -server -Xmx10g -Xms10g ...

My application exposes a JSON-RPC API, and my goal is to respond to requests within 25ms. Unfortunately, I'm seeing delays up to and exceeding 1 second and it appears to be caused by garbage collection. Here are some of the longer examples:

[GC 4592788K->4462162K(10468736K), 1.3606660 secs]
[GC 5881547K->5768559K(10468736K), 1.2559860 secs]
[GC 6045823K->5914115K(10468736K), 1.3250050 secs]

Each of these garbage collection events was accompanied by a delayed API response of very similar duration to the length of the garbage collection shown (to within a few ms).

Here are some typical examples (these were all produced within a few seconds):

[GC 3373764K->3336654K(10468736K), 0.6677560 secs]
[GC 3472974K->3427592K(10468736K), 0.5059650 secs]
[GC 3563912K->3517273K(10468736K), 0.6844440 secs]
[GC 3622292K->3589011K(10468736K), 0.4528480 secs]

The thing is that I thought the UseConcMarkSweepGC would avoid this, or at least make it extremely rare. On the contrary, delays exceeding 100ms are occurring almost once a minute or more (although delays of over 1 second are considerably rarer, perhaps once every 10 or 15 minutes).

The other thing is that I thought only a FULL GC would cause threads to be paused, yet these don't appear to be full GCs.

It may be relevant to note that most of the memory is occupied by a LRU memory cache that makes use of soft references.

Any assistance or advice would be greatly appreciated.

8条回答
ゆ 、 Hurt°
2楼-- · 2019-01-18 13:24

I haven't personally used such a huge heap but I've experienced very low latency in general using following switches for Oracle/Sun Java 1.6.x:

-Xincgc -XX:+UseConcMarkSweepGC -XX:CMSIncrementalSafetyFactor=50
-XX:+UseParNewGC
-XX:+CMSConcurrentMTEnabled -XX:ConcGCThreads=2 -XX:ParallelGCThreads=2
-XX:CMSIncrementalDutyCycleMin=0 -XX:CMSIncrementalDutyCycle=5
-XX:GCTimeRatio=90 -XX:MaxGCPauseMillis=20 -XX:GCPauseIntervalMillis=1000

The important parts are, in my opinion, the use of CMS for tenured generation and ParNewGC for young generation. In addition, this adds a pretty big safety factor for CMS (default is 10% instead of 50%) and request short pause times. As you're targeting for 25 ms response time, I'd try setting -XX:MaxGCPauseMillis to even smaller value. You could even try to use more than two cores for concurrent GC but I would guess that is not worth the CPU usage.

You should probably also check the HotSpot JVM GC cheat sheet.

查看更多
欢心
3楼-- · 2019-01-18 13:29

Some places to start looking:

Also I'd run the code through a profiler.. I like the one in NetBeans but there are other ones as well. You can view the gc behaviour in real time. The Visual VM does that as well... but I haven't run it yet (been looking for a reason to... but haven't had the time or the need yet).

查看更多
登录 后发表回答