Python Replace one line in >20GB text file

Posted 2019-08-06 04:50

Question:

I am fully aware that there are many approaches to this problem.

What I need is a simple Python script that would replace only 1 line in a large text file.

It is always the fourth line from the beginning.

As the file (actually, files) is bigger than 20GB, I don't want to load it into memory or create a copy, just replace one line efficiently.

I'll be glad for any help in this regard.

A.

PS. I know vi can do it, but I need it as a script, so that someone non-vi-compatible would be able to do it as well.

Answer 1:

You can open a file for updating, or use mmap as the other answer suggested. Here is an example of how to edit in the middle of a file:

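def example(fname):
    # 'r+b' opens the file for reading and writing without truncating it
    with open(fname, 'r+b') as f:
        f.seek(100)          # jump to byte offset 100
        f.write(b'foobar')   # binary mode needs bytes in Python 3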

That will overwrite six bytes with "foobar" starting at offset 100 in the file. However, in the general case, where the edited line becomes either longer or shorter, you still have to rewrite everything from that line to the end of the file (a file can only be extended or truncated at its end, not at its head). Vi is not magic in this regard; the same rules apply to it.
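If the replacement is no longer than the existing line, the overwrite idea above can be applied to a specific line without touching the rest of the file. The following is only a sketch under assumptions: the function name `replace_line_in_place`, the UTF-8 encoding, and the choice to pad a shorter replacement with spaces are illustrative and not part of the original answer.

```python
def replace_line_in_place(fname, line_no, new_text, pad=b' '):
    """Overwrite line `line_no` (1-based) without changing the file size.

    Assumption: the replacement is not longer than the old line. A shorter
    replacement is padded with `pad` so every later byte stays where it is.
    """
    new_bytes = new_text.encode('utf-8')
    with open(fname, 'r+b') as f:
        # Skip the lines before the target; only these few lines are read.
        for _ in range(line_no - 1):
            if not f.readline():
                raise ValueError('file has fewer lines than expected')
        start = f.tell()            # byte offset where the target line begins
        old = f.readline()
        if not old:
            raise ValueError('file has fewer lines than expected')
        body = old.rstrip(b'\r\n')  # line content without its line ending
        ending = old[len(body):]    # keep the original line ending as-is
        if len(new_bytes) > len(body):
            raise ValueError('replacement is longer than the original line')
        f.seek(start)
        f.write(new_bytes.ljust(len(body), pad) + ending)
```

For the question's case this would be called as `replace_line_in_place(path, 4, 'new text')`. Whether trailing padding is acceptable depends on what later reads the file.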

To keep it simple, I would iterate through the whole file and output a new, edited file. You definitely don't want to read it all into memory at once. Do it line by line until the line you need to edit, and block by block after that.
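A minimal sketch of that copy-based approach follows. The function name `replace_line_by_copy`, its parameters, and the 16 MB block size are assumptions chosen for illustration. It reads line by line up to the target line and then hands the rest to `shutil.copyfileobj`, which copies in fixed-size blocks.

```python
import shutil

def replace_line_by_copy(src, dst, line_no, new_line, block_size=16 * 1024 * 1024):
    """Write a copy of `src` to `dst` with line `line_no` (1-based) replaced.

    `new_line` is a bytes object and should include its own line ending.
    """
    with open(src, 'rb') as fin, open(dst, 'wb') as fout:
        # Line by line up to and including the line being replaced.
        for current in range(1, line_no + 1):
            line = fin.readline()
            if not line:
                raise ValueError('file has fewer lines than expected')
            fout.write(new_line if current == line_no else line)
        # Block by block for everything after it; memory use stays bounded.
        shutil.copyfileobj(fin, fout, block_size)
```

Once the new file is complete, `os.replace(dst, src)` can swap it into place. This needs free disk space equal to the file size, so it trades the "no copy" wish in the question for the ability to change the line's length.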

You can also use the ed or sed commands as these are arguably simpler to script than vi.



Answer 2:

Try using a memory mapped file. https://docs.python.org/2/library/mmap.html
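As a rough illustration of the memory-mapped approach (written for Python 3, while the link above points at the Python 2 documentation), the sketch below locates a line by scanning for newlines and overwrites it in place. The function name `replace_line_mmap` is hypothetical. A mapping cannot change the file's size through slice assignment, so this assumes the replacement has exactly the same byte length as the old line, and mapping a file larger than 20GB assumes a 64-bit Python.

```python
import mmap

def replace_line_mmap(fname, line_no, new_bytes):
    """Overwrite line `line_no` (1-based) through a memory map.

    Assumption: `new_bytes` (without line ending) has exactly the same
    length as the existing line content.
    """
    with open(fname, 'r+b') as f, mmap.mmap(f.fileno(), 0) as mm:
        start = 0
        # Walk past the newlines that precede the target line.
        for _ in range(line_no - 1):
            nl = mm.find(b'\n', start)
            if nl == -1:
                raise ValueError('file has fewer lines than expected')
            start = nl + 1
        end = mm.find(b'\n', start)
        if end == -1:
            end = len(mm)           # last line has no trailing newline
        if len(new_bytes) != end - start:
            raise ValueError('replacement must have the same byte length')
        mm[start:end] = new_bytes   # only the touched pages are written back
        mm.flush()
```

The operating system pages in only the regions that are accessed, so finding the fourth line does not load the whole file into memory.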