I am working on a C++ application that requires a large amount of memory (more than 20 GB) for a batch run.
Some of my customers are running into memory limits: sometimes the OS starts swapping and the total run time doubles or worse.
I have read that I can use mlockall() to keep the process from being swapped out. What would happen when the process's memory requirements approach or exceed the available physical memory in this way?
I guess the answer might be OS-specific, so please list the OS in your answer.
I tried this on Linux. After I exceed my "ulimit -l" limit (max locked memory, in kilobytes), malloc fails.
Here's the test script:
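It went roughly like this (a minimal sketch reconstructed from memory; the 1 KB block size is an assumption, chosen because it is consistent with the allocation counts reported below):

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

int main(void)
{
    /* Lock all current and future pages into physical RAM. */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        perror("mlockall");
        return 1;
    }

    /* Allocate 1 KB blocks until malloc gives up. The blocks are
       deliberately never freed; exhausting the limit is the point. */
    size_t count = 0;
    while (malloc(1024) != NULL)
        count++;

    printf("%zu allocations before malloc failed\n", count);
    return 0;
}
```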
(Did I mention I'm not a C programmer?)
The docs say that this behavior is "implementation-dependent", so you will probably have to test your own implementation to see what happens. I ran my script by su-ing to root, setting the locked-memory ulimit to 10000, su-ing to a normal user from that shell, and then running the script. It made 10,057 allocations before malloc failed, and many more than that without the mlockall call.
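For what it's worth, you can also read that limit from inside the program with getrlimit, which is standard POSIX; note that RLIMIT_MEMLOCK is reported in bytes, while "ulimit -l" uses kilobytes:

```c
#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit rl;

    /* RLIMIT_MEMLOCK is the same cap that "ulimit -l" shows,
       but expressed in bytes rather than kilobytes. */
    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0) {
        perror("getrlimit");
        return 1;
    }
    printf("locked-memory limit: soft=%llu bytes, hard=%llu bytes\n",
           (unsigned long long)rl.rlim_cur,
           (unsigned long long)rl.rlim_max);
    return 0;
}
```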
What would happen is exactly what you are seeing: failure to allocate more memory. Your application has acquired all the physical memory it can, and since locked pages cannot be swapped out, there is nothing for malloc to do but fail. This behaviour will be the same across most modern operating systems.
If you want to use mlockall (and you really shouldn't), you had better make sure the system has the required amount of physical memory, or you will be in a world of pain: malloc will start failing for other processes as well, and that might crash the system.
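If you do go down that road anyway, sanity-check the machine before locking. Here is a sketch that compares free physical memory against the expected footprint; _SC_PHYS_PAGES and _SC_AVPHYS_PAGES are glibc/Linux extensions rather than portable POSIX, and the 20 GB figure is just your stated requirement:

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>

int main(void)
{
    /* _SC_PHYS_PAGES and _SC_AVPHYS_PAGES are glibc extensions. */
    long long page  = sysconf(_SC_PAGESIZE);
    long long total = sysconf(_SC_PHYS_PAGES) * page;
    long long avail = sysconf(_SC_AVPHYS_PAGES) * page;

    printf("physical: %lld MB, currently free: %lld MB\n",
           total >> 20, avail >> 20);

    /* Refuse to lock unless the whole 20 GB batch fits in free
       physical memory, with ~2 GB of headroom left for the OS. */
    if (avail < (20LL << 30) + (2LL << 30)) {
        fprintf(stderr, "not enough free RAM to lock safely\n");
        return 1;
    }
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        perror("mlockall");
        return 1;
    }
    /* ... run the batch ... */
    return 0;
}
```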
What happens in this situation is that your requirements exceed your system's resources. You need to redesign the application so that it requires less memory.
Why do you need 20 GB of RAM? That's very unusual. I have some jobs that are this big, and you can usually break them up into a number of smaller jobs and run them sequentially, or simultaneously on multiple machines.
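For instance, assuming the batch input is a flat binary file of doubles (a guess about your workload), you can stream it through a fixed-size buffer instead of loading all 20 GB at once:

```c
#include <stdio.h>
#include <stdlib.h>

/* Process a large binary file of doubles in fixed-size chunks. The
   running sum stands in for whatever per-chunk work the batch does. */
int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <datafile>\n", argv[0]);
        return 1;
    }
    FILE *in = fopen(argv[1], "rb");
    if (in == NULL) {
        perror("fopen");
        return 1;
    }

    enum { CHUNK_ELEMS = 1 << 26 };   /* 64M doubles = 512 MB buffer */
    double *buf = malloc(CHUNK_ELEMS * sizeof *buf);
    if (buf == NULL) {
        perror("malloc");
        return 1;
    }

    double sum = 0.0;
    size_t got;
    while ((got = fread(buf, sizeof *buf, CHUNK_ELEMS, in)) > 0) {
        for (size_t i = 0; i < got; i++)
            sum += buf[i];            /* real per-chunk work goes here */
    }
    printf("sum = %g\n", sum);

    free(buf);
    fclose(in);
    return 0;
}
```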
If you really do need all 20 GB at the same time, you might look at making your working data set smaller so that more of it fits in the L2 cache. That can give you substantial improvements.
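Shrinking often just means narrower fields. A sketch with a hypothetical record layout (the field widths are assumptions about what precision your data actually needs):

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical record for one batch item. Narrower fields, ordered to
   avoid padding, shrink the working set so more of it stays in cache. */
struct record_wide {       /* 24 bytes after alignment padding */
    double  value;
    int64_t id;
    int32_t flags;
};

struct record_narrow {     /* 12 bytes, no padding */
    float   value;         /* enough if ~7 significant digits suffice */
    int32_t id;            /* enough if ids fit in 31 bits */
    int32_t flags;
};

int main(void)
{
    /* For 10 million records: ~240 MB wide versus ~120 MB narrow. */
    printf("wide:   %zu bytes/record\n", sizeof(struct record_wide));
    printf("narrow: %zu bytes/record\n", sizeof(struct record_narrow));
    return 0;
}
```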
What is the process that is taking 20 GB of RAM? Is it your own program, or something like MySQL? What language is it written in? I was once able to take a Python program that used 4 GB of RAM and shrink it to 500 MB of RAM with a streamlined C++ implementation.