Java cannot use terabytes of RAM because the GC pause is way too long (minutes). With the recent update to the Go GC, I'm wondering if its GC pauses are short enough for use with huge amounts of RAM, such as a couple of terabytes.
Are there any benchmarks of this yet? Can we use a garbage-collected language with this much RAM now?
tl;dr:
- You can't use TBs of RAM with a single Go process right now. The max is 512 GB on Linux, and the most I've seen tested is 240 GB.
- With the current background GC, GC workload tends to be more important than GC pauses.
- You can understand GC workload as pointers * allocation rate / spare RAM. Of apps using tons of RAM, only those with few pointers or little allocation will have a low GC workload.
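To make the "spare RAM" term concrete: it's the knob that GOGC (or debug.SetGCPercent) controls. Here's a minimal sketch of my own, just to show the effect rather than anything you'd tune in production:

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// sink keeps each allocation reachable for a moment so the compiler can't
// optimize the churn away.
var sink []byte

func main() {
	// GOGC=100 (the default) lets the heap grow to roughly 2x the live set
	// before the next collection; 200 allows roughly 3x. More spare RAM means
	// collections run less often for the same allocation rate.
	debug.SetGCPercent(200)

	// Churn through ~1 GB of allocations so a few collections actually happen.
	for i := 0; i < 1000; i++ {
		sink = make([]byte, 1<<20)
	}

	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	fmt.Printf("GC cycles: %d, last pause: %d ns\n",
		ms.NumGC, ms.PauseNs[(ms.NumGC+255)%256])
}
```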
I agree with inf's comment that huge heaps are worth asking other folks about (or testing). JimB notes that Go heaps have a hard limit of 512 GB right now, and 240 GB is the most I've seen tested.
Some things we know about huge heaps, from the design document and the GopherCon 2015 slides:
Folks running a couple of production apps that initially had about 300ms pauses reported drops to ~4ms and ~20ms. Another app reported that its 95th percentile GC time went from 279ms to ~10ms.
Go 1.6 added polish and pushed some of the remaining work to the background. As a result, tests with heaps up to a bit over 200GB still saw a max pause time of 20ms, as shown in a slide from an early 2016 State of Go talk.
The same application that had 20ms pause times under 1.5 had 3-4ms pauses under 1.6, with about an 8GB heap and 150M allocations/minute.
Twitch, who use Go for their chat service, reported that by Go 1.7 pause times had been reduced to 1ms with lots of running goroutines.
1.8 took stack scanning out of the stop-the-world phase, bringing most pauses well under 1ms, even on large heaps. Early numbers look good. Occasionally applications still have code patterns that make a goroutine hard to pause, effectively lengthening the pause for all other threads, but generally it's fair to say the GC's background work is now usually much more important than GC pauses.
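As an illustration of my own (not from any of those reports) of the kind of code pattern that makes a goroutine hard to pause: a tight loop whose body contains no function calls gives the runtime no safepoint, so a stop-the-world phase can end up waiting on it (Go 1.14's asynchronous preemption later addressed this):

```go
package main

import "fmt"

// sumTight is a tight loop with no function calls in its body. Before
// asynchronous preemption, a goroutine stuck in a loop like this offered the
// scheduler no safepoint, so it could delay the stop-the-world phase for
// everything else in the process.
func sumTight(xs []int64) int64 {
	var s int64
	for _, x := range xs {
		s += x
	}
	return s
}

func main() {
	fmt.Println(sumTight(make([]int64, 1<<20)))
}
```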
Some general observations on garbage collection, not specific to Go:
- The frequency of collections depends on how quickly you use up the RAM you're willing to give to the process.
- The amount of work each collection does depends in part on how many pointers are in use.
(That includes the pointers within slices, interface values, strings, etc.)
Rephrased, an application accessing lots of memory might still not have a GC problem if it only has a few pointers (e.g., it handles relatively few large []byte buffers), and collections happen less often if the allocation rate is low (e.g., because you applied sync.Pool to reuse memory wherever you were chewing through RAM most quickly).
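A minimal sketch of both ideas together (the buffer size and function names here are just placeholders of mine): keep the bulk data in a handful of big []byte buffers so there are few pointers to scan, and reuse those buffers through a sync.Pool so the allocation rate stays low:

```go
package main

import "sync"

const bufSize = 1 << 20 // 1 MiB per buffer; bulk data lives in []byte, not pointer-rich structs

// bufPool hands out reusable buffers so hot paths don't chew through RAM.
var bufPool = sync.Pool{
	New: func() interface{} { return make([]byte, bufSize) },
}

// handle borrows a buffer, lets process use it, then returns it to the pool.
func handle(process func([]byte)) {
	buf := bufPool.Get().([]byte)
	defer bufPool.Put(buf)
	process(buf)
}

func main() {
	handle(func(b []byte) {
		copy(b, "payload bytes go here")
	})
}
```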
So if you're looking at something involving heaps of hundreds of GB that's not naturally GC-friendly, I'd suggest you consider any of:
- writing in C or such
- moving the bulky data out of the object graph. For example, you could manage data in an embedded DB like bolt, put it in an outside DB service, or use something like groupcache or memcache if you want more of a cache than a DB (a minimal bolt sketch follows this list)
- running a set of processes with smaller heaps instead of one big one
- just carefully prototyping, testing, and optimizing to avoid memory issues.
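For the "move the bulky data out of the object graph" option, here's a minimal sketch of stashing blobs in bolt (shown with the maintained bbolt fork's import path; the file, bucket, and key names are placeholders). The point is that the bulk of the data sits in the DB's memory-mapped file rather than as millions of pointer-laden objects the GC has to scan:

```go
package main

import (
	"log"

	bolt "go.etcd.io/bbolt"
)

func main() {
	// Open (or create) the database file; the data lives here, not in the Go heap.
	db, err := bolt.Open("blobs.db", 0600, nil)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Store a value under a bucket/key of your choosing.
	err = db.Update(func(tx *bolt.Tx) error {
		b, err := tx.CreateBucketIfNotExists([]byte("blobs"))
		if err != nil {
			return err
		}
		return b.Put([]byte("some-key"), []byte("a large value"))
	})
	if err != nil {
		log.Fatal(err)
	}

	// Read it back; the returned slice is only valid inside the transaction.
	err = db.View(func(tx *bolt.Tx) error {
		v := tx.Bucket([]byte("blobs")).Get([]byte("some-key"))
		log.Printf("read %d bytes", len(v))
		return nil
	})
	if err != nil {
		log.Fatal(err)
	}
}
```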