Is there a way to tell Linux that it shouldn't swap out a particular processes' memory to disk?
Its a Java app, so ideally I'm hoping for a way to do this from the command line.
I'm aware that you can set the global swappiness to 0, but is this wise?
You can do this via the mlockall(2) system call under Linux; this will work for the whole process, but do read about the argument you need to pass.
Do you really need to pull the whole thing in-core? If it's a java app, you would presumably lock the whole JVM in-core. I don't know of a command-line method for doing this, but you could write a trivial program to call fork
, call mlockall
, then exec
.
You might also look to see if one of the access pattern notifications in madvise(2) meets your needs. Advising the VM subsystem about a better paging strategy might work out better if it's applicable for you.
Note that a long time ago now under SunOS, there was a mechanism similar to madvise called vadvise(2).
If you wish to change the swappiness for a process add it to a cgroup and set the value for that cgroup:
https://unix.stackexchange.com/questions/10214/per-process-swapiness-for-linux#10227
There exist a class of applications in which you never want them to swap. One such class is a database. Databases will use memory as caches and buffers for their disk areas, and it makes absolutely no sense that these are ever put to swap. The particular memory may hold some relevant data that is not needed for a week until one day when a client asks for it. Without the caching/swapping, the database would simply find the relevant record on disk, which would be quite fast; but with swapping, your service might suddenly be taking a long time to respond.
mysqld
includes code to use the OS / system call memlock
. On Linux, since at least 2.6.9, this system call will work for non-root processes that have the CAP_IPC_LOCK
capability[1]. When using memlock()
, the process must still work within the bounds of the LimitMEMLOCK
limit. [2]. One of the (few) good things about systemd
is that you can grant the mysqld
process these capabilities, without requiring a special program. If can also set the rlimits as you'd expect with ulimit
. Here is an override
file for mysqld
that does the requisite steps, including a few others that you might need for a process such as a database:
[Service]
# Prevent mysql from swapping
CapabilityBoundingSet=CAP_IPC_LOCK
# Let mysqld lock all memory to core (don't swap)
LimitMEMLOCK=-1
# do not kills this process if low on memory
OOMScoreAdjust=-900
# Use higher io scheduling
IOSchedulingClass=realtime
Type=simple
ExecStart=
ExecStart=/usr/sbin/mysqld --memlock $MYSQLD_OPTS
Note The standard community mysql currently ships with Type=forking
and adds --daemonize
in the option to the service on the ExecStart
line. This is inherently less stable than the above method.
UPDATE I am not 100% happy with this solution. After several days of runtime, I noticed the process still had enormous amounts of swap! Examining /proc/XXXX/smaps
, I note the following:
- The largest contributor of swap is from a stack segment! 437 MB and fluctuating. This presents obvious performance issues. It also indicates stack-based memory leak.
- There are zero Locked pages. This indicates the
memlock
option in MySQL (or Linux) is broken. In this case, it wouldn't matter much because MySQL can't memlock stack.
You can do that by the mlock family of syscalls. I'm not sure, however, if you can do it for a different process.
As super user you can 'nice' it to the highest priority level -20 and hope that's enough to keep it from being swapped out. It usually is. Positive numbers lower scheduling priority. Normal users cannot nice upwards (negative nos.)
Except in extremely unusual circumstances, asking this question means that You're Doing It Wrong(tm).
Seriously, if Linux wants to swap and you're trying to keep your process in memory then you're putting an unreasonable demand on the OS. If your app is that important then 1) buy more memory, 2) remove other apps/daemons from the machine, or dedicate a machine to your app, and/or 3) invest in a really fast disk subsystem. These steps are reasonable for an important app. If you can't justify them, then you probably can't justify wiring memory and starving other processes either.
Why do you want to do this?
If you are trying to increase performance of this app then you are probably on the wrong track. The OS will swap out a process to increase memory for disk cache - even if there is free RAM, the kernel knows best (actauly the samrt guys that wrote the scheduler know best).
If you have a process that needs responsiveness (it's swapped out while not used and you need it to restart quickly) then nice it to high priority, mlock, or using a real time kernel might help.