It appears that a write() immediately following a read() on a file opened with r+ (or r+b) permissions in Windows doesn't update the file.
Assume there is a file testfile.txt in the current directory with the following contents:
This is a test file.
I execute the following code:
with open("testfile.txt", "r+b") as fd:
    print fd.read(4)
    fd.write("----")
I would expect the code to print This and update the file contents to this:
This----a test file.
This works fine on at least Linux. However, when I run it on Windows the message is displayed correctly but the file isn't altered - it's as if the write() is being ignored. If I call tell() on the file handle it shows that the position has been updated (it's 4 before the write() and 8 afterwards), but there is no change to the file.
However, if I put an explicit fd.seek(4) just before the write() line then everything works as I'd expect.
Does anybody know the reason for this behaviour under Windows?
For reference I'm using Python 2.7.3 on Windows 7 with an NTFS partition.
EDIT
In response to comments, I tried both r+b and rb+ - the official Python docs seem to imply the former is canonical.
I put calls to fd.flush() in various places, and placing one between the read() and the write() like this:
with open("testfile.txt", "r+b") as fd:
    print fd.read(4)
    fd.flush()
    fd.write("----")
... yields the following interesting error:
IOError: [Errno 0] Error
EDIT 2
Indirectly, the addition of a flush() helped because it led me to this post describing a similar problem. If one of the commenters on it is correct, it's a bug in the underlying Windows C library.
Python's file operations should follow the libc convention, since internally they are implemented using the C file I/O functions.
Quoting from the fopen man page or the fopen page on cplusplus:
For files open for appending (those which include a "+" sign), on
which both input and output operations are allowed, the stream should
be flushed (fflush) or repositioned (fseek, fsetpos, rewind) between
either a writing operation followed by a reading operation or a
reading operation which did not reach the end-of-file followed by a
writing operation.
So to summarize: if you need to read from a file after writing to it, you need to fflush the buffer first, and a write operation after a read should be preceded by an fseek, such as fd.seek(0, os.SEEK_CUR).
So just change your code snippet to:
import os

with open("test1.txt", "r+b") as fd:
    print fd.read(4)
    # Zero-offset seek: keeps the position but lets the stream switch to writing
    fd.seek(0, os.SEEK_CUR)
    fd.write("----")
The behavior is consistent with how a similar C program would behave:
#include <stdio.h>

int main()
{
    char buffer[5] = {0};
    FILE *fp = fopen("D:\\Temp\\test1.txt", "rb+");
    if (fp == NULL)
        return 1;  /* file could not be opened */
    fread(buffer, sizeof(char), 4, fp);
    printf("%s\n", buffer);
    /* without fseek, file would not be updated */
    fseek(fp, 0, SEEK_CUR);
    fwrite("----", sizeof(char), 4, fp);
    fclose(fp);
    return 0;
}
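The quoted rule also covers the opposite direction (a write followed by a read), which the snippets above don't exercise. Here is a minimal sketch of that case, assuming a flush() between the two calls is enough, as the quote says. The function name write_then_read and the temporary file location are made up for illustration, and the code is written so that it should run on both Python 2.7 and Python 3.

```python
import os
import tempfile


def write_then_read(path, payload, nbytes):
    """Overwrite the start of the file with payload, then read nbytes after it."""
    with open(path, "r+b") as fd:
        fd.write(payload)
        # Switching from writing to reading: flush (or reposition) first.
        fd.flush()
        return fd.read(nbytes)


def demo_write_then_read():
    # Create the sample file in a throwaway directory (hypothetical location).
    path = os.path.join(tempfile.mkdtemp(), "test1.txt")
    with open(path, "wb") as fd:
        fd.write(b"This is a test file.")
    tail = write_then_read(path, b"----", 4)
    with open(path, "rb") as fd:
        return tail, fd.read()
```

Calling demo_write_then_read() returns the four bytes that follow the overwritten region together with the final file contents, so you can check both on whatever platform you are testing.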
It appears that this is due to the behaviour of the underlying Windows libraries (which personally I regard as being in error) and that nothing is wrong with Python. On adding a flush() call between reading and writing (which is apparently good practice) I got an IOError with a zero errno, which is the same issue as discussed in this blog post.
From that post I found this Python issue which mentions the problem and says that the seek() call is actually the best workaround, along with a flush() every time you change from reading to writing.
All that taken into account, it seems the best way to write the code above such that it successfully runs on Windows is:
with open("testfile.txt", "r+b") as fd:
    print fd.read(4)
    fd.flush()
    fd.seek(4)
    fd.write("----")
Might be something to bear in mind for anybody attempting to write portable code.
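For anyone who wants to check this on their own platform, here is a small self-contained sketch of the seek workaround. It is only an illustration: the helper name overwrite_after_read and the temporary file location are my own, not anything from Python itself, and it uses the zero-offset relative seek so the position doesn't have to be hard-coded. It should run unchanged on Python 2.7 and Python 3.

```python
import os
import tempfile


def overwrite_after_read(path, nbytes, payload):
    """Read nbytes from the start of path, then overwrite what follows with payload."""
    with open(path, "r+b") as fd:
        head = fd.read(nbytes)
        # Switching from reading to writing: reposition the stream first.
        # A zero-offset relative seek leaves the position where the read ended.
        fd.seek(0, os.SEEK_CUR)
        fd.write(payload)
    return head


def demo():
    # Build the sample file in a throwaway directory (hypothetical location).
    path = os.path.join(tempfile.mkdtemp(), "testfile.txt")
    with open(path, "wb") as fd:
        fd.write(b"This is a test file.")
    head = overwrite_after_read(path, 4, b"----")
    with open(path, "rb") as fd:
        return head, fd.read()
```

demo() returns the bytes that were read and the file contents afterwards, so comparing the second value against This----a test file. tells you whether the write actually landed.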
Have you tried flushing?
fd.flush()
It is OS-dependent, as write uses the filesystem caching mechanism.
Is it possible that the implementation misinterprets "r+b"? AFAIK "rb+" is for reading and writing in binary.