I had an interesting experience with Python's file buffering and want to make sure I understand it correctly. Given:
[Python 2.7 shell]
...
model = (really big Numpy model)
f = open('file.out','w')
pickle.dump(model, f)
(pickle.dump() finishes while I'm doing other things)
[Bash shell]
$ ls -l
-rw-r--r-- 1 john staff 270655488 Dec 6 21:32 file.out
[Return to Python shell]
model = (different really big Numpy model)
f = open('newfile.out','w')
pickle.dump(model,f)
(pickle.dump() finishes)
[Bash shell]
$ ls -l
-rw-r--r-- 1 john staff 270659455 Dec 7 07:09 file.out
-rw-r--r-- 1 john staff 270659451 Dec 6 20:48 newfile.out
Note file.out is now a different size.
Now, I know that Python's file buffering defaults to the system buffer size (I'm on Mac OS X), so it seems there were still 3,967 bytes sitting in the buffer while I was doing other things, which makes sense because the default buffer size on OS X is larger than that.
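(For comparison, my understanding is that the only way to be certain every byte is on disk is an explicit flush and, if it really matters, an fsync. The file name and the dummy data below are just stand-ins for what I actually had:)

import os
import pickle

model = {'weights': range(10)}   # stand-in for the real Numpy model
f = open('file.out', 'wb')       # 'wb' since pickle output is binary
pickle.dump(model, f)
f.flush()                        # push Python's buffer out to the OS
os.fsync(f.fileno())             # ask the OS to commit its cache to disk
f.close()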
What interested me was that I had reassigned the file object 'f' to another open file without ever calling f.close() (honestly, I was just working fast to test something else and forgot). When I looked at the file size, I half expected it to have stayed the same, which would have meant the last buffered chunk of output was lost.
So, the question is whether this is a safe procedure. Is reassignment handled in such a way that either the Python garbage collector or the file object itself flushes the buffer and closes the file when the variable is suddenly rebound, even if you never call the close() method? More importantly, is this always the case, or is it possible that rebinding the variable actually did, or in another situation might, discard the buffered bytes before they were flushed to the file?
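Here is a minimal version of what I could run to test it. The file names and the dummy data are just placeholders; the question is whether the second size check is guaranteed to show the full pickle, or whether I'm only relying on CPython's reference counting being nice to me:

import os
import pickle

data = {'x': list(range(1000000))}   # stand-in for the big Numpy model

f = open('first.out', 'wb')
pickle.dump(data, f)
print(os.path.getsize('first.out'))  # likely short: the tail is still buffered

# Rebind f without calling f.close(); the old file object loses its
# last reference right here.
f = open('second.out', 'wb')
pickle.dump(data, f)

print(os.path.getsize('first.out'))  # full size now, but is that guaranteed?
f.close()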
I guess it's really a question of how gracefully and safely file objects and the Python garbage collector behave when you yank objects around without properly closing them.
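I do know what the by-the-book version looks like; the question is only about what happens when I skip it. Something like:

import pickle

model = {'weights': [0.0] * 100}   # stand-in for the real model

# The context manager guarantees the flush and close, even on exceptions.
with open('file.out', 'wb') as f:
    pickle.dump(model, f)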