Hypothetically, suppose I want to perform sequential writing to a potentially very large file.
If I mmap() a gigantic region and madvise(MADV_SEQUENTIAL) on that entire region, then I can write to the memory in a relatively efficient manner. This I have gotten to work just fine.
Now, in order to free up various OS resources as I am writing, I occasionally perform a munmap() on small chunks of memory that have already been written to. My concern is that munmap() and msync() will block my thread, waiting for the data to be physically committed to disk. I cannot slow down my writer at all, so I need to find another way.
Would it be better to use madvise(MADV_DONTNEED) on the small, already-written chunk of memory? I want to tell the OS to write that memory to disk lazily, and not to block my calling thread.
The manpage on madvise() has this to say, which is rather ambiguous:
MADV_DONTNEED
Do not expect access in the near future. (For the time being, the
application is finished with the given range, so the kernel can free
resources associated with it.) Subsequent accesses of pages in this
range will succeed, but will result either in re-loading of the memory
contents from the underlying mapped file (see mmap(2)) or
zero-fill-on-demand pages for mappings without an underlying file.
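For reference, the mmap()/munmap() pattern I described above looks roughly like this. It is a stripped-down sketch rather than my real code: the name write_sequential_mmap, the 'x' fill pattern, and the requirement that the total size be a whole number of page-aligned chunks are all placeholders for illustration.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: fill `total` bytes of `path` through one large shared
 * mapping, unmapping each chunk as soon as it has been written. */
int write_sequential_mmap(const char *path, size_t total, size_t chunk)
{
    long page = sysconf(_SC_PAGESIZE);
    /* munmap() needs page-aligned addresses, so chunks must be page multiples. */
    assert(page > 0 && chunk > 0 && chunk % (size_t)page == 0 && total % chunk == 0);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)total) != 0) {   /* file must cover the mapping */
        close(fd);
        return -1;
    }

    char *base = mmap(NULL, total, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                                /* the mapping stays valid after close */
    if (base == MAP_FAILED)
        return -1;

    madvise(base, total, MADV_SEQUENTIAL);    /* advisory only; failure is ignored */

    for (size_t off = 0; off < total; off += chunk) {
        memset(base + off, 'x', chunk);       /* stand-in for the real writer */
        if (munmap(base + off, chunk) != 0)   /* release the finished chunk */
            return -1;                        /* sketch: the remainder stays mapped */
    }
    return 0;
}
```

The munmap() call inside the loop is the one I am worried about.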
First, MADV_SEQUENTIAL enables aggressive readahead, which only helps reads, so you don't need it here. Second, the OS will lazily write dirty file-backed memory to disk anyway, even if you do nothing, but MADV_DONTNEED will instruct it to free the memory immediately (what you call "various OS resources"). Third, it is not clear that mmapping files for sequential writing has any advantage. You will probably be better served by plain write(2) (but use buffers, either manual or stdio).
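A rough sketch of the buffered alternative, assuming stdio with an enlarged buffer. The name write_sequential_stdio and the 1 MiB buffer size are made up for illustration, not a tuned recommendation.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: write `count` copies of a `len`-byte record through
 * stdio, using a 1 MiB user-space buffer so that most fwrite() calls are
 * plain memory copies and never enter the kernel. */
int write_sequential_stdio(const char *path, const void *rec, size_t len, size_t count)
{
    assert(rec != NULL && len > 0);

    FILE *f = fopen(path, "wb");
    if (!f)
        return -1;

    size_t bufsz = (size_t)1 << 20;
    char *buf = malloc(bufsz);
    /* setvbuf() must be called before any other operation on the stream. */
    if (!buf || setvbuf(f, buf, _IOFBF, bufsz) != 0) {
        fclose(f);
        free(buf);
        return -1;
    }

    int rc = 0;
    for (size_t i = 0; i < count; i++) {
        if (fwrite(rec, 1, len, f) != len) {
            rc = -1;
            break;
        }
    }

    /* fclose() hands the remaining bytes to the kernel page cache;
     * it does not wait for them to reach the disk. */
    if (fclose(f) != 0)
        rc = -1;
    free(buf);   /* only safe once the stream is closed */
    return rc;
}
```

Write-back to disk is then left to the kernel, just as it would be for a dirty mapping.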
No!
For your own good, stay away from MADV_DONTNEED. Linux will not take this as a hint to throw pages away after writing them back, but to throw them away immediately. This is not considered a bug, but a deliberate decision.
Ironically, the reasoning is that the functionality of a non-destructive MADV_DONTNEED is already given by msync(MS_INVALIDATE|MS_ASYNC). MS_ASYNC, on the other hand, does not start I/O (in fact, it does nothing at all, following the reasoning that dirty page writeback works fine anyway), fsync always blocks, and sync_file_range may block if you exceed some obscure limit and is considered "extremely dangerous" by the documentation, whatever that means.
Either way, you must msync(MS_SYNC), or fsync (both blocking), or sync_file_range (possibly blocking) followed by fsync, or you will lose data with MADV_DONTNEED. If you cannot afford to possibly block, you have no choice, sadly, but to do this in another thread.
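A minimal sketch of the "another thread" approach, assuming POSIX threads and a small fixed-size queue of finished chunks. The struct flusher type, the flusher_* function names and the QCAP limit are all invented for illustration. The writer only takes a short mutex to enqueue a chunk, while the blocking msync(MS_SYNC) and the subsequent MADV_DONTNEED run on the helper thread.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>

#define QCAP 64   /* assumed upper bound on chunks waiting to be flushed */

struct chunk { void *addr; size_t len; };

struct flusher {
    pthread_t tid;
    pthread_mutex_t mu;
    pthread_cond_t cv;
    struct chunk q[QCAP];
    size_t head, tail;   /* tail - head = number of queued chunks */
    int done;            /* set by flusher_stop() */
    int errors;          /* chunks whose msync()/madvise() failed */
};

static void *flusher_main(void *arg)
{
    struct flusher *f = arg;
    for (;;) {
        pthread_mutex_lock(&f->mu);
        while (f->head == f->tail && !f->done)
            pthread_cond_wait(&f->cv, &f->mu);
        if (f->head == f->tail) {             /* stopped and fully drained */
            pthread_mutex_unlock(&f->mu);
            return NULL;
        }
        struct chunk c = f->q[f->head++ % QCAP];
        pthread_mutex_unlock(&f->mu);

        /* The blocking write-back happens here, off the writer's thread.
         * Pages are dropped only after msync(MS_SYNC) has returned. */
        int bad = msync(c.addr, c.len, MS_SYNC) != 0;
        bad |= madvise(c.addr, c.len, MADV_DONTNEED) != 0;
        if (bad) {
            pthread_mutex_lock(&f->mu);
            f->errors++;
            pthread_mutex_unlock(&f->mu);
        }
    }
}

int flusher_start(struct flusher *f)
{
    f->head = f->tail = 0;
    f->done = 0;
    f->errors = 0;
    pthread_mutex_init(&f->mu, NULL);
    pthread_cond_init(&f->cv, NULL);
    return pthread_create(&f->tid, NULL, flusher_main, f);
}

/* Called by the writer. Returns -1 instead of waiting when the queue is full. */
int flusher_submit(struct flusher *f, void *addr, size_t len)
{
    assert(addr != NULL && len > 0);
    int rc = -1;
    pthread_mutex_lock(&f->mu);
    if (f->tail - f->head < QCAP) {
        f->q[f->tail++ % QCAP] = (struct chunk){ addr, len };
        pthread_cond_signal(&f->cv);
        rc = 0;
    }
    pthread_mutex_unlock(&f->mu);
    return rc;
}

/* Drains the queue, joins the thread, and returns the number of failed chunks. */
int flusher_stop(struct flusher *f)
{
    pthread_mutex_lock(&f->mu);
    f->done = 1;
    pthread_cond_signal(&f->cv);
    pthread_mutex_unlock(&f->mu);
    pthread_join(f->tid, NULL);
    pthread_mutex_destroy(&f->mu);
    pthread_cond_destroy(&f->cv);
    return f->errors;
}
```

The writer would call flusher_submit() with a page-aligned chunk each time it finishes one. If the disk cannot keep up, the queue fills and flusher_submit() returns -1, so the writer has to decide what to do with that chunk (retry later, or simply leave it to normal write-back).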