Mixing read() and write() on Python files in Windows

Posted 2019-01-15 19:57

Question:

It appears that a write() immediately following a read() on a file opened with r+ (or r+b) permissions in Windows doesn't update the file.

Assume there is a file testfile.txt in the current directory with the following contents:

This is a test file.

I execute the following code:

with open("testfile.txt", "r+b") as fd:
    print fd.read(4)
    fd.write("----")

I would expect the code to print This and update the file contents to this:

This----a test file.

This works fine on at least Linux. However, when I run it on Windows then the message is displayed correctly, but the file isn't altered - it's like the write() is being ignored. If I call tell() on the filehandle it shows that the position has been updated (it's 4 before the write() and 8 afterwards), but no change to the file.

However, if I put an explicit fd.seek(4) just before the write() line then everything works as I'd expect.

Does anybody know the reason for this behaviour under Windows?

For reference I'm using Python 2.7.3 on Windows 7 with an NTFS partition.

EDIT

In response to comments, I tried both r+b and rb+ - the official Python docs seem to imply the former is canonical.

I put calls to fd.flush() in various places, and placing one between the read() and the write() like this:

with open("testfile.txt", "r+b") as fd:
    print fd.read(4)
    fd.flush()
    fd.write("----")

... yields the following interesting error:

IOError: [Errno 0] Error

EDIT 2

Indirectly, adding the flush() helped, because it led me to this post describing a similar problem. If one of the commenters on it is correct, it's a bug in the underlying Windows C library.

Answer 1:

Python's file operations should follow the libc conventions, since internally they are implemented using the C file I/O functions.

Quoting from the fopen man page, or the fopen page on cplusplus:

For files open for update (those which include a "+" sign), on which both input and output operations are allowed, the stream should be flushed (fflush) or repositioned (fseek, fsetpos, rewind) between either a writing operation followed by a reading operation or a reading operation which did not reach the end-of-file followed by a writing operation.

So, to summarize: if you need to read from a file after writing to it, you need to fflush the buffer first, and a write after a read should be preceded by an fseek, for example fd.seek(0, os.SEEK_CUR).

So just change your code snippet to:

import os

with open("test1.txt", "r+b") as fd:
    print fd.read(4)
    # Reposition without moving, so the stream can switch from reading to writing.
    fd.seek(0, os.SEEK_CUR)
    fd.write("----")

This behavior is consistent with how a similar C program would behave:

#include <stdio.h>

int main(void)
{
    char buffer[5] = {0};
    FILE *fp = fopen("D:\\Temp\\test1.txt", "rb+");
    if (fp == NULL) {
        perror("fopen");
        return 1;
    }
    fread(buffer, sizeof(char), 4, fp);
    printf("%s\n", buffer);
    /* Without this fseek, the file would not be updated. */
    fseek(fp, 0, SEEK_CUR);
    fwrite("----", sizeof(char), 4, fp);
    fclose(fp);
    return 0;
}
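
The rule quoted above applies in both directions, so here is a minimal sketch that exercises both: a seek between the read and the write, then a flush before reading again. The names read_then_overwrite and demo are hypothetical helpers, not part of any library, and the scratch file is created only for the demonstration. It uses bytes literals and a single-argument print() so that it should run unchanged on Python 2.7 and Python 3.

```python
import os
import tempfile


def read_then_overwrite(path, count, data):
    """Read `count` bytes, overwrite the bytes that follow, then re-read the file."""
    with open(path, "r+b") as fd:
        head = fd.read(count)
        # Read -> write: reposition without moving before the write.
        fd.seek(0, os.SEEK_CUR)
        fd.write(data)
        # Write -> read: flush (the seek below would also satisfy the rule).
        fd.flush()
        fd.seek(0)
        return head, fd.read()


def demo():
    # Hypothetical scratch file, created only for this demonstration.
    handle, path = tempfile.mkstemp()
    os.close(handle)
    try:
        with open(path, "wb") as fd:
            fd.write(b"This is a test file.")
        return read_then_overwrite(path, 4, b"----")
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

Reading the file back through the same handle is a quick way to confirm that the write really landed, rather than trusting tell().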


Answer 2:

It appears that this is due to the behaviour of the underlying Windows libraries (which personally I regard as being in error), and that there is nothing wrong with Python. On adding a flush() call between reading and writing (which is apparently good practice) I got an IOError with a zero errno, which is the same issue as discussed in this blog post.

From that post I found this Python issue which mentions the problem and says that the seek() call is actually the best workaround, along with a flush() every time you change from reading to writing.

All that taken into account, it seems the best way to write the code above such that it successfully runs on Windows is:

with open("testfile.txt", "r+b") as fd:
    print fd.read(4)
    fd.flush()
    fd.seek(4)
    fd.write("----")

Might be something to bear in mind for anybody attempting to write portable code.
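
For anyone who wants to check this on their own platform, the following is a rough sketch that mirrors the flush-then-seek sequence above and then re-opens the file to confirm the change. It assumes that sequence behaves as described; patch_after_read and file_was_updated are hypothetical names, the scratch file merely stands in for testfile.txt, and fd.seek(fd.tell()) is my substitution for the hard-coded fd.seek(4).

```python
import os
import tempfile


def patch_after_read(path, count, data):
    """Read `count` bytes, then overwrite the bytes that follow with `data`."""
    with open(path, "r+b") as fd:
        head = fd.read(count)
        fd.flush()
        # Seek to where we already are instead of hard-coding the offset.
        fd.seek(fd.tell())
        fd.write(data)
    return head


def file_was_updated():
    # Hypothetical scratch file standing in for testfile.txt.
    handle, path = tempfile.mkstemp()
    os.close(handle)
    try:
        with open(path, "wb") as fd:
            fd.write(b"This is a test file.")
        patch_after_read(path, 4, b"----")
        # Re-open the file to see what actually reached the disk.
        with open(path, "rb") as fd:
            return fd.read() == b"This----a test file."
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(file_was_updated())
```

Dropping the flush() and seek() lines from patch_after_read and running it again on the platform in question should show whether the write is being lost there.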



Answer 3:

Have you tried flushing?

fd.flush()

It is OS-dependent, as write uses the filesystem caching mechanism.



Answer 4:

Is it possible that the implementation misinterprets "r+b"? As far as I know, "rb+" is the mode for reading and writing in binary.