Delete / Insert Data in mmap'ed File

2020-02-24 04:27发布

问题:

I am working on a script in Python that maps a file for processing using mmap().

The tasks requires me to change the file's contents by

  1. Replacing data
  2. Adding data into the file at an offset
  3. Removing data from within the file (not whiting it out)

Replacing data works great as long as the old data and the new data have the same number of bytes:

VDATA = mmap.mmap(f.fileno(),0)
start = 10
end = 20
VDATA[start:end] = "0123456789"

However, when I try to remove data (replacing the range with "") or inserting data (replacing the range with contents longer than the range), I receive the error message:

IndexError: mmap slice assignment is wrong size

This makes sense.

The question now is, how can I insert and delete data from the mmap'ed file? From reading the documentation, it seems I can move the file's entire contents back and forth using a chain of low-level actions but I'd rather avoid this if there is an easier solution.

回答1:

In lack of an alternative, I went ahead and wrote two helper functions - deleteFromMmap() and insertIntoMmap() - to handle the low level file actions and ease the development.

The closing and reopening of the mmap instead of using resize() is do to a bug in python on unix derivates leading resize() to fail. (http://mail.python.org/pipermail/python-bugs-list/2003-May/017446.html)

The functions are included in a complete example. The use of a global is due to the format of the main project but you can easily adapt it to match your coding standards.

import mmap

# f contains "0000111122223333444455556666777788889999"

f = open("data","r+")
VDATA = mmap.mmap(f.fileno(),0)

def deleteFromMmap(start,end):
    global VDATA
    length = end - start
    size = len(VDATA)
    newsize = size - length

    VDATA.move(start,end,size-end)
    VDATA.flush()
    VDATA.close()
    f.truncate(newsize)
    VDATA = mmap.mmap(f.fileno(),0)

def insertIntoMmap(offset,data):
    global VDATA
    length = len(data)
    size = len(VDATA)
    newsize = size + length

    VDATA.flush()
    VDATA.close()
    f.seek(size)
    f.write("A"*length)
    f.flush()
    VDATA = mmap.mmap(f.fileno(),0)

    VDATA.move(offset+length,offset,size-offset)
    VDATA.seek(offset)
    VDATA.write(data)
    VDATA.flush()

deleteFromMmap(4,8)

# -> 000022223333444455556666777788889999

insertIntoMmap(4,"AAAA")

# -> 0000AAAA22223333444455556666777788889999


回答2:

There is no way to shift contents of a file (be it mmap'ed or plain) without doing it explicitly. In the case of a mmap'ed file, you'll have to use the mmap.move method.