How does IO buffering work in Ruby? How often is data flushed to the underlying stream when using the `IO` and `File` classes? How does this compare to OS buffering? What needs to be done to guarantee that given data has been written to disk, before confidently reading it back for processing?
Answer 1:
The Ruby IO documentation is not 100% clear on how this buffering works, but this is what you can extract from the documentation:
- Ruby IO has its own internal buffer
- In addition to that the underlying operating system may or may not further buffer data.
The relevant methods to look at:
- `IO.flush`: Flushes `IO`. I also looked at the Ruby source, and a call to `IO.flush` also calls the underlying OS `fflush()`. This should be enough to get the file cached, but it does not guarantee that the data physically reaches the disk.
- `IO.sync=`: If set to `true`, no Ruby internal buffering is done. Everything is immediately sent to the OS, and `fflush()` is called for each write.
- `IO.sync`: Returns the current sync setting (`true` or `false`).
- `IO.fsync`: Flushes the Ruby buffers and calls `fsync()` on the OS (if it supports it). This will guarantee a full flush all the way to the physical disk file.
- `IO.close`: Closes the Ruby `IO` and writes pending data to the OS. Note that this does not imply `fsync()`. The POSIX documentation on `close()` says that it does NOT guarantee data is physically written to the file, so you need an explicit `fsync()` call for that.
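As a rough illustration of the methods listed above, the sketch below writes to a scratch file and calls each of them in turn. The file name prefix `buffer_demo` and the variable names are made up for this example, and `fsync` is assumed to be supported on the platform (Ruby raises `NotImplementedError` where it is not).

```ruby
require "tempfile"

# Hypothetical scratch file, used only for this illustration.
tmp = Tempfile.new("buffer_demo")
path = tmp.path
tmp.close

File.open(path, "w") do |f|
  default_sync = f.sync   # false for a freshly opened File

  f.write("first\n")      # may still sit in Ruby's internal buffer
  f.flush                 # hand Ruby's buffered data over to the OS

  f.sync = true           # from here on, writes are not held in Ruby's buffer
  f.write("second\n")

  f.fsync                 # flush Ruby's buffer, then ask the OS to commit to the device
  puts "sync was #{default_sync}, is now #{f.sync}"
end

content = File.read(path)
```

Leaving `sync` at `false` and calling `fsync` only at the points where durability matters is usually the cheaper option, since `sync = true` gives up the batching that Ruby's buffer provides.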
Conclusion: `flush` and/or `close` should be enough to get the file cached so that it can be read fully by another process or operation. To get the file all the way to the physical media with certainty, you need to call `IO.fsync`.
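A minimal sketch of that conclusion, assuming a small write to a regular file: the data becomes visible to a second reader once `flush` has been called, and `fsync` is added only for durability. The `readback_demo` prefix and all variable names are illustrative.

```ruby
require "tempfile"

# Hypothetical scratch file, used only for this illustration.
tmp = Tempfile.new("readback_demo")
path = tmp.path
tmp.close

writer = File.open(path, "w")
writer.write("payload")

# A separate read at this point typically sees nothing yet, because the
# small write is still held in Ruby's internal buffer.
before_flush = File.read(path)

writer.flush                    # data is now with the OS and readable by others
after_flush = File.read(path)

writer.fsync                    # only needed if the data must reach the physical media
writer.close

puts "before flush: #{before_flush.inspect}, after flush: #{after_flush.inspect}"
```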
Other related methods:
- `IO.syswrite`: Bypasses Ruby's internal buffers and does a straight OS `write`. If you use this, do not mix it with `IO.read`/`IO.write`.
- `IO.sysread`: Same as above, but for reading.
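The unbuffered calls can be sketched as follows. Each handle here uses only the `sys*` methods, in line with the advice not to mix them with the buffered ones; the `sys_demo` prefix and variable names are assumptions made for the example.

```ruby
require "tempfile"

# Hypothetical scratch file, used only for this illustration.
tmp = Tempfile.new("sys_demo")
path = tmp.path
tmp.close

# syswrite goes straight to the OS and returns the number of bytes written.
bytes_written = File.open(path, "w") { |f| f.syswrite("raw bytes") }

# sysread reads directly from the OS, up to the requested number of bytes.
data = File.open(path, "r") { |f| f.sysread(bytes_written) }

puts "wrote #{bytes_written} bytes, read back #{data.inspect}"
```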
Answer 2:
Ruby does its internal buffering on top of the OS. When you call `file.flush`, Ruby flushes its internal buffer. To ensure the file is written to disk, you need to call `file.fsync`. In the end, though, you cannot be certain the file is written to disk: that depends on the OS, the hard disk controller, and the hard disk itself.