I want to write a log file in an unstructured format (one line at a time) using mmap (for speed). What is the best procedure? Do I open an empty file, truncate it to one page size (write an empty string to resize the file?), then mmap it, and repeat when the mmapped area is full?
I usually use mmap for writing fixed-size structures, usually just one page at a time. This, however, is for writing log files (anywhere from 0.5 to 10 GB) using mmap, and I am not sure what the best practice is once the first mmapped area is filled: munmap, resize the file with truncate, and mmap the next page?
While writing logs to the memory area I would track the size and msync. What is the proper handling once I get to the end of the mapped memory area?
Let's say I never need to go back or overwrite existing data, so I only ever append new data to the file.
Q1: When I get to the end of the mapped area, do I munmap, ftruncate the file to grow it by another page size, and mmap the next page?
Q2: Is there a standard way to preempt this and have the next page ready in memory for the next write? Should I do this on another thread when we get close to the end of the mapped area?
Q3: Do I madvise for sequential access?
This is for real-time data processing with a requirement to keep a log file; currently I just write to the file. The log file is unstructured, text format, line based.
This is for Linux/C++/C, optionally testing on Mac (so no mremap?).
Any links/pointers to best practices appreciated.
I wrote my bachelor thesis about the comparison of fwrite vs. mmap ("An Experiment to Measure the Performance Trade-off between Traditional I/O and Memory-mapped Files"). First of all, for writing you don't have to go for memory-mapped files, especially for large files. fwrite is totally fine and will nearly always outperform approaches using mmap. mmap gives you the biggest performance boost for parallel data reading; for sequential data writing your real limitation with fwrite is your hardware.
In my examples, remapSize is the initial size of the file and the size by which the file gets increased on each remapping. fileSize keeps track of the size of the file, mappedSpace represents the size of the current mapping (its length), and alreadyWrittenBytes is the number of bytes that have already been written to the file.
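For completeness, here is a minimal sketch of the shared state these excerpts assume (the variable names come from the answer; the headers, types, and concrete values are my assumption):
#include <stddef.h>   // size_t
#include <fcntl.h>    // open
#include <sys/mman.h> // mmap64, munmap, msync, madvise
#include <unistd.h>   // ftruncate, fsync, sysconf
static const char* outputPath = "output.log";  // hypothetical output path
static size_t remapSize = 64 * 1024 * 1024;    // initial size and grow step, e.g. 64 MiB
static size_t pageSize = 4096;                 // better: sysconf(_SC_PAGESIZE)
static int fileDescriptor = -1;
static int result = 0;
static char* memoryMappedFile = NULL;
static size_t fileSize = 0;                    // current size of the file on disk
static size_t mappedSpace = 0;                 // length of the current mapping
static size_t alreadyWrittenBytes = 0;         // log bytes written so far
static size_t remapAt = 0;                     // page-aligned offset used by smallRemap
Note that mmap64 is a glibc name; on a 64-bit build plain mmap behaves the same.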
Here is the example initialization:
void init() {
    fileDescriptor = open(outputPath, O_RDWR | O_CREAT | O_TRUNC, (mode_t) 0600); // Open file
    result = ftruncate(fileDescriptor, remapSize); // Init size
    fsync(fileDescriptor); // Flush
    memoryMappedFile = (char*) mmap64(0, remapSize, PROT_WRITE, MAP_SHARED, fileDescriptor, 0); // Create mmap
    fileSize = remapSize; // Store file size
    mappedSpace = remapSize; // Store mapped size
}
Ad Q1:
I used an "Unmap-Remap" mechanism.
Unmap
- first flushes (msync)
- and then unmaps the memory-mapped file.
This could look like the following:
void unmap() {
    msync(memoryMappedFile, mappedSpace, MS_SYNC); // Flush
    munmap(memoryMappedFile, mappedSpace); // Remove the mapping
}
For Remap, you have the choice to remap the whole file or only the newly appended part.
Remap basically
- increases the file size
- creates the new memory map
Example implementation for a full remap:
void fullRemap() {
    ftruncate(fileDescriptor, mappedSpace + remapSize); // Make file bigger
    fsync(fileDescriptor); // Flush file
    memoryMappedFile = (char*) mmap64(0, mappedSpace + remapSize, PROT_WRITE, MAP_SHARED, fileDescriptor, 0); // Create new mapping on the bigger file
    fileSize += remapSize; // Track the new file size
    mappedSpace += remapSize; // Set mappedSpace to new size
}
Example implementation for the small remap:
void smallRemap() {
    ftruncate(fileDescriptor, fileSize + remapSize); // Make file bigger
    fsync(fileDescriptor); // Flush file
    remapAt = alreadyWrittenBytes % pageSize == 0
        ? alreadyWrittenBytes
        : alreadyWrittenBytes - (alreadyWrittenBytes % pageSize); // Adjust remap location to pagesize
    memoryMappedFile = (char*) mmap64(0, fileSize + remapSize - remapAt, PROT_WRITE, MAP_SHARED, fileDescriptor, remapAt); // Create memory-map
    fileSize += remapSize;
    mappedSpace = fileSize - remapAt;
}
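Neither excerpt shows the actual append step, so here is a sketch of how a write call could tie init/unmap/fullRemap together (writeLog and its size handling are my own illustration, not code from the thesis):
#include <string.h> // memcpy
// Append one log line into the mapping; grow the file when the mapping is full.
// Assumes a single line is always smaller than remapSize.
void writeLog(const char* line, size_t length) {
    if (alreadyWrittenBytes + length > mappedSpace) { // Not enough room left in the mapping
        unmap();     // Flush and drop the old mapping
        fullRemap(); // Grow the file and map it again from offset 0
    }
    memcpy(memoryMappedFile + alreadyWrittenBytes, line, length); // Copy the line into the mapped file
    alreadyWrittenBytes += length; // Advance the write position
}
With smallRemap the write position inside the mapping would be alreadyWrittenBytes - remapAt instead, because only the tail of the file is mapped. On shutdown, a final ftruncate(fileDescriptor, alreadyWrittenBytes) would trim the unused tail of the last grow step.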
There is a mremap function out there, yet its man page states:
This call is Linux-specific, and should not be used in programs intended to be portable.
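On Linux only, that means the grow step could skip the full unmap/mmap cycle; a sketch using the same variables as above (not code from the thesis, and it requires _GNU_SOURCE):
void linuxRemap() {
    ftruncate(fileDescriptor, fileSize + remapSize); // Make the file bigger first, so the new pages are backed
    memoryMappedFile = (char*) mremap(memoryMappedFile, mappedSpace,
                                      mappedSpace + remapSize, MREMAP_MAYMOVE); // Grow the mapping; it may move
    fileSize += remapSize;
    mappedSpace += remapSize;
}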
Ad Q2:
I'm not sure if I understood that point right. If you want to tell the kernel "and now load the next page", then no, this is not possible (at least to my knowledge). But see Ad Q3 on how to advise the kernel.
Ad Q3:
You can use madvise with the flag MADV_SEQUENTIAL, yet keep in mind that this does not force the kernel to read ahead, it only advises it to.
Excerpt from the man page:
This may cause the kernel to aggressively read-ahead
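Applied to the examples above, the call is a single line (madvise and MADV_SEQUENTIAL are standard; placing it right after each mmap64 is my suggestion):
// Hint that the mapping will be accessed sequentially; the kernel may ignore it.
madvise(memoryMappedFile, mappedSpace, MADV_SEQUENTIAL);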
Personal conclusion:
Do not use mmap for sequential data writing. It will just cause much more overhead and lead to much more "unnatural" code than a simple writing algorithm using fwrite.
Use mmap for random access reads on large files.
These are also the results that were obtained during my thesis: I was not able to achieve any speedup by using mmap for sequential writing; in fact, it was always slower for this purpose.
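For reference, the buffered alternative the conclusion recommends can be as simple as this (a minimal sketch, not code from the thesis; error handling omitted):
#include <stdio.h>
// Append one log line using buffered stdio; fwrite batches small writes
// internally, so the disk stays the limiting factor.
void appendLine(FILE* logFile, const char* line, size_t length) {
    fwrite(line, 1, length, logFile);
    // fflush(logFile) only when each line must reach the kernel immediately.
}
// Usage: FILE* logFile = fopen("output.log", "a");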
using mmap (for speed). What is the best procedure?
Don't use mmap, use write. Seriously. Why do people always seem to think that mmap would somehow magically speed things up?
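For the log-appending case that means something as plain as this (a minimal sketch of the write-based approach; the file name is made up, error handling omitted):
#include <fcntl.h>
#include <unistd.h>
// Open once with O_APPEND: every write() atomically appends and the kernel
// grows the file, so there is no truncate/unmap/remap bookkeeping at all.
static int logFd = -1;
void openLog(void) {
    logFd = open("output.log", O_WRONLY | O_CREAT | O_APPEND, 0600);
}
void logLine(const char* line, size_t length) {
    write(logFd, line, length); // short writes/errors ignored for brevity
}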
Creating a mmap is not cheap; those page tables are not going to populate by themselves. When you want to append to a file you have to
- truncate to the new size (with modern file systems that's quite cheap actually)
- unmap the old mapping (leaving around dirty pages that may or may not have to be written out)
- mmap the new mapping, which requires populating the page tables. Also, every time you write to a previously unfaulted page, you're invoking the page fault handler.
There are a few good uses for mmap, for example when doing random access reads in a large data set or recurrent reads from the same dataset.
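That good case looks quite different from appending a log; a sketch of random-access reads over a large mapped file (names and structure are my own illustration, error handling omitted):
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
// Map a large file read-only once, then touch it at arbitrary offsets like an
// array; each access is a plain memory read (plus at most one page fault).
long sumBytesAt(const char* path, const size_t* offsets, size_t count) {
    int fd = open(path, O_RDONLY);
    struct stat st;
    fstat(fd, &st);
    const char* data = (const char*) mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    long sum = 0;
    for (size_t i = 0; i < count; ++i)
        sum += (unsigned char) data[offsets[i]];
    munmap((void*) data, (size_t) st.st_size);
    close(fd);
    return sum;
}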
For further elaboration I'll refer to Linus Torvalds himself:
http://lkml.iu.edu/hypermail/linux/kernel/0004.0/0728.html
In article <200004042249.SAA06325@op.net>, Paul Barton-Davis wrote:
I was very disheartened to find that on my system the mmap/mlock
approach took 3 TIMES as long as the read solution. It seemed to me
that mmap/mlock should be at least as fast as read. Comments are
invited.
People love mmap() and other ways to play with the page tables to
optimize away a copy operation, and sometimes it is worth it.
HOWEVER, playing games with the virtual memory mapping is very
expensive in itself. It has a number of quite real disadvantages that
people tend to ignore because memory copying is seen as something very
slow, and sometimes optimizing that copy away is seen as an obvious
improvment.
Downsides to mmap:
- quite noticeable setup and teardown costs. And I mean noticeable. It's things like following the page tables to unmap everything cleanly. It's the book-keeping for maintaining a list of all the mappings. It's The TLB flush needed after unmapping stuff.
- page faulting is expensive. That's how the mapping gets populated, and it's quite slow.
Upsides of mmap:
- if the data gets re-used over and over again (within a single map operation), or if you can avoid a lot of other logic by just mapping something in, mmap() is just the greatest thing since sliced bread. This may be a file that you go over many times (the binary image of an executable is the obvious case here - the code jumps all around the place), or a setup where it's just so convenient to map the whole thing in without regard of the actual usage patterns that mmap() just wins. You may have random access patterns, and use mmap() as a way of keeping track of what data you actually needed.
- if the data is large, mmap() is a great way to let the system know what it can do with the data-set. The kernel can forget pages as memory pressure forces the system to page stuff out, and then just automatically re-fetch them again.
And the automatic sharing is obviously a case of this..
But your test-suite (just copying the data once) is probably pessimal
for mmap().
Linus