可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

I'm trying to edit a text file in-place in python. It is very large (so loading it into memory is not an option). I intend to replace byte-for-byte strings I find inside.

with f as open("filename.txt", "r+b"):
    if f.read(8) == "01234567":
        f.seek(-8, 1)
        f.write("87654321")

However, the write() operation adds onto the end of the file when I tried it:

>>> n.read()
'sdf'
>>> n.read(1)
''
>>> n.seek(0,0)
>>> n.read(1)
's'
>>> n.read(1)
'd'
>>> n.write("sdf")
>>> n.read(1)
''
>>> n.seek(0,0)
>>> n.read()
'sdfsdf'
`

I want the result of that to be sdsdf.

回答1:

The original ANSI / ISO C standards required a seek operation when switching a read-write mode stream from read mode to write mode, and vice versa. This restriction persists, e.g., n1570 includes this text:

When a file is opened with update mode ('+' as the second or third character in the above list of mode argument values), both input and output may be performed on the associated stream. However, output shall not be directly followed by input without an intervening call to the fflush function or to a file positioning function (fseek, fsetpos, or rewind), and input shall not be directly followed by output without an intervening call to a file positioning function, unless the input operation encounters end-of-file. Opening (or creating) a text file with update mode may instead open (or create) a binary stream in some implementations.

For whatever reason this restriction has been imported into Python,¹ even though it would be possible for the Python wrappers to handle it automatically.

For what it's worth, the reason for the original ANSI C restriction was the low-budget implementation found on many Unix-based systems: they kept, for each stream, a "current byte count" and "current pointer". The current byte count was 0 if the macro-ized getc and putc operations had to call into underlying implementation, which could check whether a stream was opened in update mode and switch it as needed. But once you successfully obtained a character, the counter would hold the number of characters that could continue to be read from the underlying stream; and once you successfully wrote a character, the counter would hold the number of buffer-locations that allowed adding characters.

This meant that if you did a successful getc that filled an internal buffer, but followed it by a putc, the "written" character from putc would simply overwrite the buffered data. If you had a successful putc but followed with a poorly-implemented getc, you would see un-set value out of the buffer.

This problem was trivial to fix (just provide separate input and output counters, one of which is always zero, and have the functions that implement buffer-refill check for mode-switch as well).

¹Citation needed :-)

回答2:

You can check the difference of following codes:

>>> f = open("file.txt", "r+b")
>>> f.seek(2)
>>> f.write("sdf")
>>> f.seek(0)
>>> f.read()
'sdsdf'


>>> f = open("file.txt", "r+b")
>>> f.read(1)
's'
>>> f.read(1)
'd'
>>> f.write("sdf")
>>> f.seek(0)
>>> f.read()
'sdfsdf'

The pointer of .write is originally at the end of the file. Only .seek() will change its position, but not .read(). So you have to call .seek() before writing the bytes. The following code works well:

>>> f = open("file.txt", "r+b")
>>> f.read(1)
's'
>>> f.read(1)
'd'
>>> f.seek(2)
>>> f.write("sdf")
>>> f.seek(0)
>>> f.read()
'sdsdf'

Python in-place write to file at arbitrary positio

问题:

回答1:

回答2:

收藏的人(0)

Python in-place write to file at arbitrary positio

问题:

回答1:

回答2:

收藏的人(0)

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮