After answering a question about how to force-free objects in Java (the guy was clearing a 1.5GB HashMap) with System.gc()
, I was told it's bad practice to call System.gc()
manually, but the comments were not entirely convincing. In addition, no one seemed to dare to upvote, nor downvote my answer.
I was told there that it's bad practice, but then I was also told that garbage collector runs don't systematically stop the world anymore, and that it could also effectively be used by the JVM only as a hint, so I'm kind of at loss.
I do understand that the JVM usually knows better than you when it needs to reclaim memory. I also understand that worrying about a few kilobytes of data is silly. I also understand that even megabytes of data isn't what it was a few years back. But still, 1.5 gigabytes? And you know there's like 1.5 GB of data hanging around in memory; it's not like it's a shot in the dark. Is System.gc()
systematically bad, or is there some point at which it becomes okay?
So the question is actually double:
- Why is or isn't it bad practice to call
System.gc()
? Is it really merely a hint to the JVM under certain implementations, or is it always a full collection cycle? Are there really garbage collector implementations that can do their work without stopping the world? Please shed some light over the various assertions people have made in the comments to my answer. - Where's the threshold? Is it never a good idea to call
System.gc()
, or are there times when it's acceptable? If so, what are those times?
It has already been explained that calling
system.gc()
may do nothing, and that any code that "needs" the garbage collector to run is broken.However, the pragmatic reason that it is bad practice to call
System.gc()
is that it is inefficient. And in the worst case, it is horribly inefficient! Let me explain.A typical GC algorithm identifies garbage by traversing all non-garbage objects in the heap, and inferring that any object not visited must be garbage. From this, we can model the total work of a garbage collection consists of one part that is proportional to the amount of live data, and another part that is proportional to the amount of garbage; i.e.
work = (live * W1 + garbage * W2)
.Now suppose that you do the following in a single-threaded application.
The first call will (we predict) do
(live * W1 + garbage * W2)
work, and get rid of the outstanding garbage.The second call will do
(live* W1 + 0 * W2)
work and reclaim nothing. In other words we have done(live * W1)
work and achieved absolutely nothing.We can model the efficiency of the collector as the amount of work needed to collect a unit of garbage; i.e.
efficiency = (live * W1 + garbage * W2) / garbage
. So to make the GC as efficient as possible, we need to maximize the value ofgarbage
when we run the GC; i.e. wait until the heap is full. (And also, make the heap as big as possible. But that is a separate topic.)If the application does not interfere (by calling
System.gc()
), the GC will wait until the heap is full before running, resulting in efficient collection of garbage1. But if the application forces the GC to run, the chances are that the heap won't be full, and the result will be that garbage is collected inefficiently. And the more often the application forces GC, the more inefficient the GC becomes.Note: the above explanation glosses over the fact that a typical modern GC partitions the heap into "spaces", the GC may dynamically expand the heap, the application's working set of non-garbage objects may vary and so on. Even so, the same basic principal applies across the board to all true garbage collectors2. It is inefficient to force the GC to run.
1 - This is how the "throughput" collector works. Concurrent collectors such as CMS and G1 use different criteria to decide when to start the garbage collector.
2 - I'm also excluding memory managers that use reference counting exclusively, but no current Java implementation uses that approach ... for good reason.
This is a very bothersome question, and I feel contributes to many being opposed to Java despite how useful of a language it is.
The fact that you can't trust "System.gc" to do anything is incredibly daunting and can easily invoke "Fear, Uncertainty, Doubt" feel to the language.
In many cases, it is nice to deal with memory spikes that you cause on purpose before an important event occurs, which would cause users to think your program is badly designed/unresponsive.
Having ability to control the garbage collection would be very a great education tool, in turn improving people's understanding how the garbage collection works and how to make programs exploit it's default behavior as well as controlled behavior.
Let me review the arguments of this thread.
Often, the program may not be doing anything and you know it's not doing anything because of the way it was designed. For instance, it might be doing some kind of long wait with a large wait message box, and at the end it may as well add a call to collect garbage because the time to run it will take a really small fraction of the time of the long wait but will avoid gc from acting up in the middle of a more important operation.
I disagree, it doesn't matter what garbage collector you have. Its' job is to track garbage and clean it.
By calling the gc during times where usage is less critical, you reduce odds of it running when your life relies on the specific code being run but instead it decides to collect garbage.
Sure, it might not behave the way you want or expect, but when you do want to call it, you know nothing is happening, and user is willing to tolerate slowness/downtime. If the System.gc works, great! If it doesn't, at least you tried. There's simply no down side unless the garbage collector has inherent side effects that do something horribly unexpected to how a garbage collector is suppose to behave if invoked manually, and this by itself causes distrust.
It is a use case that cannot be achieved reliably, but could be if the system was designed that way. It's like making a traffic light and making it so that some/all of the traffic lights' buttons don't do anything, it makes you question why the button is there to begin with, javascript doesn't have garbage collection function so we don't scrutinize it as much for it.
what is a "hint"? what is "ignore"? a computer cannot simply take hints or ignore something, there are strict behavior paths it takes that may be dynamic that are guided by the intent of the system. A proper answer would include what the garbage collector is actually doing, at implementation level, that causes it to not perform collection when you request it. Is the feature simply a nop? Is there some kind of conditions that must me met? What are these conditions?
As it stands, Java's GC often seems like a monster that you just don't trust. You don't know when it's going to come or go, you don't know what it's going to do, how it's going to do it. I can imagine some experts having better idea of how their Garbage Collection works on per-instruction basis, but vast majority simply hopes it "just works", and having to trust an opaque-seeming algorithm to do work for you is frustrating.
There is a big gap between reading about something or being taught something, and actually seeing the implementation of it, the differences across systems, and being able to play with it without having to look at the source code. This creates confidence and feeling of mastery/understanding/control.
To summarize, there is an inherent problem with the answers "this feature might not do anything, and I won't go into details how to tell when it does do something and when it doesn't and why it won't or will, often implying that it is simply against the philosophy to try to do it, even if the intent behind it is reasonable".
It might be okay for Java GC to behave the way it does, or it might not, but to understand it, it is difficult to truly follow in which direction to go to get a comprehensive overview of what you can trust the GC to do and not to do, so it's too easy simply distrust the language, because the purpose of a language is to have controlled behavior up to philosophical extent(it's easy for a programmer, especially novices to fall into existential crisis from certain system/language behaviors) you are capable of tolerating(and if you can't, you just won't use the language until you have to), and more things you can't control for no known reason why you can't control them is inherently harmful.
Maybe I write crappy code, but I've come to realize that clicking the trash-can icon on eclipse and netbeans IDEs is a 'good practice'.
First, there is a difference between spec and reality. The spec says that System.gc() is a hint that GC should run and the VM is free to ignore it. The reality is, the VM will never ignore a call to System.gc().
Calling GC comes with a non-trivial overhead to the call and if you do this at some random point in time it's likely you'll see no reward for your efforts. On the other hand, a naturally triggered collection is very likely to recoup the costs of the call. If you have information that indicates that a GC should be run than you can make the call to System.gc() and you should see benefits. However, it's my experience that this happens only in a few edge cases as it's very unlikely that you'll have enough information to understand if and when System.gc() should be called.
One example listed here, hitting the garbage can in your IDE. If you're off to a meeting why not hit it. The overhead isn't going to affect you and heap might be cleaned up for when you get back. Do this in a production system and frequent calls to collect will bring it to a grinding halt! Even occasional calls such as those made by RMI can be disruptive to performance.
The reason everyone always says to avoid
System.gc()
is that it is a pretty good indicator of fundamentally broken code. Any code that depends on it for correctness is certainly broken; any that rely on it for performance are most likely broken.You don't know what sort of garbage collector you are running under. There are certainly some that do not "stop the world" as you assert, but some JVMs aren't that smart or for various reasons (perhaps they are on a phone?) don't do it. You don't know what it's going to do.
Also, it's not guaranteed to do anything. The JVM may just entirely ignore your request.
The combination of "you don't know what it will do," "you don't know if it will even help," and "you shouldn't need to call it anyway" are why people are so forceful in saying that generally you shouldn't call it. I think it's a case of "if you need to ask whether you should be using this, you shouldn't"
EDIT to address a few concerns from the other thread:
After reading the thread you linked, there's a few more things I'd like to point out. First, someone suggested that calling
gc()
may return memory to the system. That's certainly not necessarily true - the Java heap itself grows independently of Java allocations.As in, the JVM will hold memory (many tens of megabytes) and grow the heap as necessary. It doesn't necessarily return that memory to the system even when you free Java objects; it is perfectly free to hold on to the allocated memory to use for future Java allocations.
To show that it's possible that
System.gc()
does nothing, view:http://bugs.sun.com/view_bug.do?bug_id=6668279
and in particular that there's a -XX:DisableExplicitGC VM option.
Since objects are dynamically allocated by using the new operator,
you might be wondering how such objects are destroyed and their
memory released for later reallocation.
In some languages, such as C++, dynamically allocated objects must be manually released by use of a delete operator.