In the last 10 years of discussing Java and/or garbage collection, the only performance penalty I have not been able to defend is that garbage collection algorithms more or less break when running on a paged memory architecture and parts of the heap get paged out.
Unix systems (and especially Linux) aggressively page out memory that has not been touched for a while. While that is good for your average leaking C application, it kills Java's performance in memory-tight situations.
I know the best practice is to keep the max heap smaller than physical memory (or you will watch your application swap itself to death), but the idea, at least in the Unix world, is that the memory could be better spent on filesystem caches and so on.
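As a quick sanity check of that constraint, here is a small sketch that compares the effective -Xmx against physical RAM. It assumes a HotSpot-compatible JVM, since it casts to the com.sun.management extension of OperatingSystemMXBean; the class name HeapVsRam is just for illustration:

```java
import java.lang.management.ManagementFactory;

public class HeapVsRam {
    public static void main(String[] args) {
        long maxHeap = Runtime.getRuntime().maxMemory(); // effective -Xmx
        // The cast assumes a HotSpot-style JVM that exposes the
        // com.sun.management extension of OperatingSystemMXBean.
        com.sun.management.OperatingSystemMXBean os =
                (com.sun.management.OperatingSystemMXBean)
                        ManagementFactory.getOperatingSystemMXBean();
        long physicalRam = os.getTotalPhysicalMemorySize();
        System.out.printf("max heap: %d MB, physical RAM: %d MB%n",
                maxHeap >> 20, physicalRam >> 20);
        if (maxHeap > physicalRam) {
            System.out.println("Warning: the heap can outgrow RAM -- expect swapping.");
        }
    }
}
```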
My question is:
Are there any paging-aware garbage collection algorithms?
You are correct: the garbage collector and the virtual memory manager have to collaborate, otherwise the GC will thrash the system. Such GC/kernel collaboration has been investigated by Matthew Hertz, Yi Feng and Emery D. Berger. To obtain good performance, they had to extend the kernel slightly and also tweak the garbage collector.
Under high memory pressure, their benchmark ran something like 160x slower with the GenMS Java GC. With the new, page-aware GC, it was only 1.6x slower. In other words, a properly tuned GC gives roughly a 100x performance gain (160 / 1.6) in that scenario.
http://lambda-the-ultimate.org/node/2391
I'm going to contend that this is not as big an issue as you think.
To make sure that we're describing the same thing: a complete collection requires the JVM to walk the object graph to identify every reachable object; the ones left over are garbage. While doing so, it will touch every page in the application heap, which will cause any page that has been swapped out to be faulted back into memory.
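To make that concrete, here is a minimal sketch of a mark phase (not HotSpot's actual implementation; Node is a hypothetical stand-in for an arbitrary heap object). The point is that both reading an object's reference list and writing its mark bit force the page it lives on to be resident:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class MarkSketch {
    // Hypothetical stand-in for an arbitrary heap object.
    static class Node {
        final List<Node> references = new ArrayList<>();
        boolean marked;
    }

    // Iterative depth-first mark: every reachable object is read (its
    // reference list) and written (its mark bit) at least once, so every
    // heap page holding a live object must be resident or be faulted in.
    static void mark(Node root) {
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Node n = stack.pop();
            if (n.marked) {
                continue;
            }
            n.marked = true; // even this one-bit flip touches the object's page
            for (Node ref : n.references) {
                stack.push(ref);
            }
        }
    }
}
```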
I think that's a non-concern for several reasons. First, modern JVMs use generational collectors, and most objects never make it out of the young generation, which is almost guaranteed to be in the resident set.
Second, the objects that do move out of the young generation still tend to be accessed frequently, which again means they should be in the resident set. This is a more tenuous argument, and there are in fact lots of cases where long-lived objects won't get touched except by the GC (which is one reason I don't believe in memory-limited caches).
The third reason (and there may be more) is that the JVM (at least, the Sun JVM) uses a mark-sweep-compact collector. After GC, the live objects in the heap occupy a smaller number of pages, reducing the portion of the heap that needs to stay resident. This, incidentally, is the main driver for Swing apps to explicitly call System.gc() when they're minimized: by compacting the heap, there's less to swap back in when they get maximized again.
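For what it's worth, that Swing pattern looks roughly like this (a sketch only; System.gc() is a hint, and whether it actually triggers a full compacting collection is up to the JVM):

```java
import java.awt.event.WindowAdapter;
import java.awt.event.WindowEvent;
import javax.swing.JFrame;

public class MinimizeGcDemo {
    public static void main(String[] args) {
        JFrame frame = new JFrame("demo");
        frame.addWindowListener(new WindowAdapter() {
            @Override
            public void windowIconified(WindowEvent e) {
                // Hint to the JVM: compacting the heap now means fewer
                // pages to swap back in when the window is restored.
                System.gc();
            }
        });
        frame.setSize(400, 300);
        frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
        frame.setVisible(true);
    }
}
```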
Also, recognize that heap fragmentation in C/C++ programs can get extreme, with young objects sprinkled amongst older ones, so their resident set has to be larger.
I'm not an expert, but any kind of generational garbage collection should help somewhat. The pages which are being aggressively swapped are likely to contain older objects rather than newer ones, so the GC will naturally touch them less frequently.
I'd also ask the counter-question: are there any Unix paging algorithms that are garbage-collection aware? If a given page is being hauled into memory on a regular (if not frequent) basis, then maybe it isn't such a great candidate after all to be ditched in favour of more disk cache ;-)
Unix systems (and especially Linux) aggressively page out memory that has not been touched for a while. While that is good for your average leaking C application, it kills Java's performance in memory-tight situations.
Keep in mind that this is typically a tunable setting -- vm.swappiness for the Linux kernel, for example, which you can lower to make the kernel less eager to swap. You can read a blog article I wrote on this if you want more in-depth information about swap tuning on Linux.
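As a small illustration (assuming Linux, and Java 11+ for Files.readString; the class name is mine), you can read the current value straight out of procfs:

```java
import java.nio.file.Files;
import java.nio.file.Path;

public class ShowSwappiness {
    public static void main(String[] args) throws Exception {
        // Lower values make the kernel less eager to swap anonymous memory
        // (which includes the Java heap) out to disk; 60 is the usual default.
        String value = Files.readString(Path.of("/proc/sys/vm/swappiness")).trim();
        System.out.println("vm.swappiness = " + value);
    }
}
```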
Are there any paging-aware garbage collection algorithms?
Garbage collection algorithms are typically designed to handle a very wide variety of programs and to operate in a large array of possible environments, and their design needs to take this into account. I think it would be really challenging to make a "paging-aware" GC algorithm that was broadly useful. If you were writing one for a very specialized environment where you could lock things down, then I think you'd have a pretty good chance of producing a good result.
In this day and age, letting programs page is a really bad idea. Memory is very cheap.
FWIW, if you are running on a PC from a manufacturer that likes to charge several times over the odds for memory, Windows Vista has a predictive paging algorithm that works quite well (perhaps the only thing that OS does do well).