I am working on a script in Python that maps a file for processing using mmap().
The task requires me to change the file's contents by:
- Replacing data
- Adding data into the file at an offset
- Removing data from within the file (not whiting it out)
Replacing data works great as long as the old data and the new data have the same number of bytes:
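import mmap
# f is assumed to be a file object opened for update in binary mode ("r+b")
VDATA = mmap.mmap(f.fileno(), 0)
start = 10
end = 20
# The replacement must be a bytes object of exactly end - start bytes
VDATA[start:end] = b"0123456789"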
However, when I try to remove data (replacing the range with "") or insert data (replacing the range with contents longer than the range), I receive the error message:
IndexError: mmap slice assignment is wrong size
This makes sense.
The question now is: how can I insert data into and delete data from the mmap'ed file? From reading the documentation, it seems I can move the file's entire contents back and forth using a chain of low-level actions, but I'd rather avoid this if there is an easier solution.
Lacking an alternative, I went ahead and wrote two helper functions - deleteFromMmap() and insertIntoMmap() - to handle the low-level file actions and ease development.
The mmap is closed and reopened instead of calling resize() because of a bug in Python on Unix derivatives that causes resize() to fail (http://mail.python.org/pipermail/python-bugs-list/2003-May/017446.html).
The functions are included in a complete example. The use of a global is due to the format of the main project but you can easily adapt it to match your coding standards.
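For illustration, here is a minimal sketch of what such helpers could look like. It is not necessarily the exact code from the project: the signatures are my assumption, and it passes the file object and the map explicitly and returns the new map instead of relying on a global. It follows the close-and-reopen approach described above and changes the file size with truncate() rather than resize().

```python
import mmap
import tempfile


def deleteFromMmap(f, mm, start, end):
    """Remove bytes [start, end) from the mapped file and return a new mmap.

    Assumes at least one byte remains afterwards, because an empty file
    cannot be mapped.
    """
    size = len(mm)
    # Shift everything after the deleted range left over it.
    mm.move(start, end, size - end)
    mm.flush()
    mm.close()
    # Cut off the now-stale tail, then map the file again at its new size.
    f.truncate(size - (end - start))
    f.flush()
    return mmap.mmap(f.fileno(), 0)


def insertIntoMmap(f, mm, offset, data):
    """Insert the bytes in data at offset and return a new mmap."""
    size = len(mm)
    mm.flush()
    mm.close()
    # Grow the file first so the new map has room for the extra bytes.
    f.truncate(size + len(data))
    f.flush()
    mm = mmap.mmap(f.fileno(), 0)
    # Shift the tail right to open a gap, then fill the gap.
    mm.move(offset + len(data), offset, size - offset)
    mm[offset:offset + len(data)] = data
    return mm


if __name__ == "__main__":
    with tempfile.TemporaryFile() as f:  # opened in "w+b" mode
        f.write(b"0123456789")
        f.flush()
        mm = mmap.mmap(f.fileno(), 0)
        mm = insertIntoMmap(f, mm, 5, b"abc")
        print(mm[:])  # b'01234abc56789'
        mm = deleteFromMmap(f, mm, 0, 5)
        print(mm[:])  # b'abc56789'
        mm.close()
```

Because the old map is closed, any reference to it becomes invalid; callers have to keep using the returned object.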
There is no way to shift contents of a file (be it mmap'ed or plain) without doing it explicitly. In the case of a mmap'ed file, you'll have to use the mmap.move method.
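To make the behaviour of mmap.move concrete, here is a small sketch on an anonymous 10-byte map (an assumption chosen so that no file on disk is needed). It shows that move(dest, src, count) only copies bytes within the map and never changes its size.

```python
import mmap

# Anonymous 10-byte map (fileno -1), so the sketch needs no file on disk.
mm = mmap.mmap(-1, 10)
mm[:] = b"0123456789"

# "Delete" bytes 2..4 by moving the 5-byte tail left over them.
# move(dest, src, count) copies count bytes starting at src to dest.
mm.move(2, 5, 5)

shifted = mm[:]   # b'0156789789': still 10 bytes, the last 3 are stale
logical = mm[:7]  # b'0156789': the contents after the deletion
mm.close()

print(shifted, logical)
```

The stale trailing bytes are why the shift alone is not enough: for a file-backed map, the file still has to be truncated (or the map resized, where that works) after the move.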