I'm trying to find any system functionality that would allow a process to allocate "temporary" memory - i.e. memory that is considered discardable by the process, and can be take away by the system when memory is needed, but allowing the process to benefit from available memory when possible. In other words, the process tells the system it's OK to sacrifice the block of memory when the process is not using it. Freeing the block is also preferable to swapping it out (it's more expensive, or as expensive, to swap it out rather then re-constitute its contents).
Systems (e.g. Linux), have those things in the kernel, like F/S memory cache. I am looking for something like this, but available to the user space.
I understand there are ways to do this from the program, but it's really more of a kernel job to deal with this. To some extent, I'm asking the kernel:
- if you need to reduce my, or another process residency, take these temporary pages off first
- if you are taking these temporary pages off, don't swap them out, just unmap them
Specifically, I'm interested on a solution that would work on Linux, but would be interested to learn if any exist for any other O/S.
UPDATE
An example on how I expect this to work:
- map a page (over swap). No difference to what's available right now.
- tell the kernel that the page is "temporary" (for the lack of a better name), meaning that if this page goes away, I don't want it paged in.
- tell the kernel that I need the temporary page "back". If the page was unmapped since I marked it "temporary", I am told that happened. If it hasn't, then it starts behaving as a regular page.
Here are the problems to have that done over existing MM:
To make pages not being paged in, I have to allocate them over nothing. But then, they can get paged out at any time, without notice. Testing with mincore() doesn't guarantee that the page will still be there by the time mincore() finishes. Using mlock() requires elevated privileges.
So, the closest I can get to this is by using mlock(), and anonymous pages. Following the expectations I outlined earlier, it would be:
- map an anonymous, locked page. (MAP_ANON|MAP_LOCKED|MAP_NORESERVE). Stamp the page with magic.
- for making page "temporary", unlock the page
- when needing the page, lock it again. If the magic is there, it's my data, otherwise it's been lost, and I need to reconstitute it.
However, I don't really need for pages to be locked in RAM when I'm using them. Also, MAP_NORESERVE is problematic if memory is overcommitted.
This is what the
VmWare ESXi
server aka theVirtual Machine Monitor
(VMM) layer implements. This is used in the Virtual Machines and is a way toreclaim
memory from the virtual machine guests. Virtual machines that have more memory allocated than they actually are using/require are made to release/free it to the VMM so that it can assign it back to the Virtual Machines guests that are in need of it.This technique of
Memory Reclamation
is mentioned in this paper: http://www.vmware.com/files/pdf/mem_mgmt_perf_vsphere5.pdfOn similar lines, something similar you can implement in your kernel.
I'm not sure to understand exactly your needs. Remember that processes run in virtual memory (their address space is virtual), that the kernel is dealing with virtual to physical address translation (using the MMU) and with paging. So page fault can happen at any time. The kernel will choose to page-in or page-out at arbitrary moments - and will choose which page to swap (only the kernel care about RAM, and it can page-out any physical RAM page at will). Perhaps you want the kernel to tell you when a page is genuinely discarded. How would the kernel take away temporary memory from your process without your process being notified ? The kernel could take away and later give back some RAM.... (so you want to know when the given back memory is fresh)
You might use mmap(2) with
MAP_NORESERVE
first, then again (on the same memory range) withMAP_FIXED|MAP_PRIVATE
. See also mincore(2) and mlock(2)You can also later use madvise(2) with
MADV_WONTNEED
orMADV_WILLNEED
etc..Perhaps you want to
mmap
some device like/dev/null
,/dev/full
,/dev/zero
or (more likely) write your own kernel module providing a similar device.GNU Hurd has an external pager mechanism... You cannot yet get exactly that on Linux. (Perhaps consider
mmap
on some FUSE mounted file).I don't understand what you want to happen when the kernel is paging out your memory, and what you want to happen when the kernel is paging in again such a page because your process is accessing it. Do you want to get a zero-ed page, or a
SIGSEGV
?