Delete / Insert Data in mmap'ed File

2020-02-24 04:10发布

I am working on a script in Python that maps a file for processing using mmap().

The tasks requires me to change the file's contents by

  1. Replacing data
  2. Adding data into the file at an offset
  3. Removing data from within the file (not whiting it out)

Replacing data works great as long as the old data and the new data have the same number of bytes:

VDATA = mmap.mmap(f.fileno(),0)
start = 10
end = 20
VDATA[start:end] = "0123456789"

However, when I try to remove data (replacing the range with "") or inserting data (replacing the range with contents longer than the range), I receive the error message:

IndexError: mmap slice assignment is wrong size

This makes sense.

The question now is, how can I insert and delete data from the mmap'ed file? From reading the documentation, it seems I can move the file's entire contents back and forth using a chain of low-level actions but I'd rather avoid this if there is an easier solution.

2条回答
狗以群分
2楼-- · 2020-02-24 04:50

In lack of an alternative, I went ahead and wrote two helper functions - deleteFromMmap() and insertIntoMmap() - to handle the low level file actions and ease the development.

The closing and reopening of the mmap instead of using resize() is do to a bug in python on unix derivates leading resize() to fail. (http://mail.python.org/pipermail/python-bugs-list/2003-May/017446.html)

The functions are included in a complete example. The use of a global is due to the format of the main project but you can easily adapt it to match your coding standards.

import mmap

# f contains "0000111122223333444455556666777788889999"

f = open("data","r+")
VDATA = mmap.mmap(f.fileno(),0)

def deleteFromMmap(start,end):
    global VDATA
    length = end - start
    size = len(VDATA)
    newsize = size - length

    VDATA.move(start,end,size-end)
    VDATA.flush()
    VDATA.close()
    f.truncate(newsize)
    VDATA = mmap.mmap(f.fileno(),0)

def insertIntoMmap(offset,data):
    global VDATA
    length = len(data)
    size = len(VDATA)
    newsize = size + length

    VDATA.flush()
    VDATA.close()
    f.seek(size)
    f.write("A"*length)
    f.flush()
    VDATA = mmap.mmap(f.fileno(),0)

    VDATA.move(offset+length,offset,size-offset)
    VDATA.seek(offset)
    VDATA.write(data)
    VDATA.flush()

deleteFromMmap(4,8)

# -> 000022223333444455556666777788889999

insertIntoMmap(4,"AAAA")

# -> 0000AAAA22223333444455556666777788889999
查看更多
兄弟一词,经得起流年.
3楼-- · 2020-02-24 05:00

There is no way to shift contents of a file (be it mmap'ed or plain) without doing it explicitly. In the case of a mmap'ed file, you'll have to use the mmap.move method.

查看更多
登录 后发表回答