Calling fsync(2) after close(2)

Posted 2019-05-22 15:28

Question:

Scenario:

Task code (error checking omitted):

// open, write and close
fd = open(name);
write(fd, buf, len);
close(fd);
/* ... more code here, NOT issuing reads/writes to name, but possibly open()ing it ... */
// open again and fsync
fd = open(name);
fsync(fd);

No other task in the system accesses name concurrently.

Is this defined and, more importantly, will it sync any outstanding writes on the inode referred to by name? I.e., will I read back buf from the file after the fsync?

From POSIX http://pubs.opengroup.org/onlinepubs/009695399/functions/fsync.html I would say it seems legit ...

Thanks.

Edit, May 18: Thanks for the answers and the research. I took this question (in 2016) to one of the lead extfs developers (Ted) and got this answer: "It's not guaranteed by POSIX, but in practice it should work on most file systems, including ext4. The key wording in the POSIX specification is:

The fsync() function shall request that all data for the open file descriptor named by fildes is to be transferred to the storage device associated with the file described by fildes.

It does not say "all data for the file described by fildes"; it says "all data for the open file descriptor". So technically, data written through another file descriptor is not guaranteed to be synced to disk.

In practice, file systems don't track dirty data by which fd it came in on, so you don't need to worry. And an OS which writes more than what is strictly required is standards compliant, and so that's what you will find in general, even if it isn't guaranteed." This is less specific than "exact same durability guarantees", but it is quite authoritative, even though it may be outdated.

What I was trying to build was a 'sync' command that works on single files, something like fsync /some/file, without having to sync the whole filesystem, for use in shell scripts for example. GNU coreutils 'sync' has supported single files for a few years now and does exactly this (open/fsync). Commit: https://github.com/coreutils/coreutils/commit/8b2bf5295f353016d4f5e6a2317d55b6a8e7fd00
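For illustration, here is a minimal sketch of the open/fsync approach described above. The helper name sync_path is hypothetical and this is not the coreutils code. Opening with O_RDONLY is an assumption that holds on Linux; some systems may require a writable descriptor for fsync(). As discussed in the edit above, whether this flushes data written through an earlier, already closed descriptor is not guaranteed by POSIX.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open an existing file by path and fsync() it.
 * Returns 0 on success, -1 on failure with errno set by the failing call.
 * Assumption: fsync() is accepted on a read-only descriptor (true on Linux). */
int sync_path(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    if (fsync(fd) != 0) {
        int saved = errno;   /* keep the fsync() error, not close()'s */
        close(fd);
        errno = saved;
        return -1;
    }
    return close(fd);
}
```

A caller would use it as sync_path("/some/file") and check for a non-zero return.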

Answer 1:

No, close()+re-open()+fsync() does not provide the same guarantees as fsync()+close().

Source: I took this question to the linux-fsdevel mailing list and got the answer:

Does a sequence of close()/re-open()/fsync() provide the same durability guarantees as fsync()/close()?

The short answer is no, the latter provides a better guarantee. The longer answer is that durability guarantees depend on the kernel version, because the situation changed in v4.13, in v4.14, and now again in v4.17-rc and the stable kernels.

Further relevant links are:

  • https://wiki.postgresql.org/wiki/Fsync_Errors ("fsyncgate")
  • Mailing list entry "PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS"
  • "Writing programs to cope with I/O errors causing lost writes on Linux", from the same author

In particular, the latter links describe how

  • after closing an FD, you lose all ways to enforce durability
  • after an fsync() fails, you cannot call fsync() again in the hope that now your data would be written
  • you must re-do/confirm all writing work if that happens
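
The points above suggest the following shape for code that cares about durability: fsync() while the descriptor is still open, and treat a failure as "the data may be lost", not as something to retry. This is only a sketch; write_durably is a hypothetical name, and the open flags and file mode are assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes to path and fsync() them before
 * the descriptor is closed. Returns 0 on success, -1 on error (errno set). */
int write_durably(const char *path, const void *buf, size_t len)
{
    const char *p = buf;
    size_t left = len;
    int saved;

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* write() may be short or interrupted, so loop until done. */
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            goto fail;
        }
        p += n;
        left -= (size_t)n;
    }

    /* Sync BEFORE close. If this fails, do not call fsync() again and
     * hope: the caller must redo the write from its own copy of the data. */
    if (fsync(fd) != 0)
        goto fail;

    return close(fd);

fail:
    saved = errno;
    close(fd);
    errno = saved;
    return -1;
}
```

On a -1 return the caller still holds buf, so it can rewrite the file (or report the error) instead of assuming the data reached the disk.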


Answer 2:

The current (2017) specification of POSIX fsync() recognizes a base functionality and an optional functionality:

The fsync() function shall request that all data for the open file descriptor named by fildes is to be transferred to the storage device associated with the file described by fildes. The nature of the transfer is implementation-defined. The fsync() function shall not return until the system has completed that action or until an error is detected.

[SIO] ⌦ If _POSIX_SYNCHRONIZED_IO is defined, the fsync() function shall force all currently queued I/O operations associated with the file indicated by file descriptor fildes to the synchronized I/O completion state. All I/O operations shall be completed as defined for synchronized I/O file integrity completion. ⌫

If _POSIX_SYNCHRONIZED_IO is not defined by the implementation, then your reopened file descriptor has no unwritten data to be transferred to the storage device, so the fsync() call is effectively a no-op.

If _POSIX_SYNCHRONIZED_IO is defined by the implementation, then calling fsync() on your reopened file descriptor will cause all data written through any file descriptor associated with the file to be transferred to the storage device.
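
To find out which of the two cases applies on a given system, you can test the option macro at compile time and fall back to sysconf() at run time. The sketch below assumes the usual POSIX convention for option macros (a value greater than 0 means always supported, -1 means never supported, anything else means ask at run time); the function name is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <unistd.h>

/* Hypothetical helper: returns 1 if the Synchronized I/O option is
 * reported as supported, 0 otherwise. */
int synchronized_io_supported(void)
{
#if defined(_POSIX_SYNCHRONIZED_IO) && _POSIX_SYNCHRONIZED_IO > 0
    return 1;                                  /* always supported */
#elif defined(_POSIX_SYNCHRONIZED_IO) && _POSIX_SYNCHRONIZED_IO == -1
    return 0;                                  /* never supported */
#else
    return sysconf(_SC_SYNCHRONIZED_IO) > 0;   /* decide at run time */
#endif
}
```

Note that this only reports what the implementation claims; it says nothing about how a particular file system or storage device behaves.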

The section of the standard on Conformance has information about options and option groups. The Definitions section has definitions 3.382 to 3.387, which define aspects of Synchronized I/O and Synchronous I/O (yes, they're different; beware open file descriptors and open file descriptions, too). The section on Realtime defers to the Definitions section for what synchronized I/O means.

It defines:

3.382 Synchronized Input and Output

A determinism and robustness improvement mechanism to enhance the data input and output mechanisms, so that an application can ensure that the data being manipulated is physically present on secondary mass storage devices.

3.383 Synchronized I/O Completion

The state of an I/O operation that has either been successfully transferred or diagnosed as unsuccessful.

3.384 Synchronized I/O Data Integrity Completion

For read, when the operation has been completed or diagnosed if unsuccessful. The read is complete only when an image of the data has been successfully transferred to the requesting process. If there were any pending write requests affecting the data to be read at the time that the synchronized read operation was requested, these write requests are successfully transferred prior to reading the data.

For write, when the operation has been completed or diagnosed if unsuccessful. The write is complete only when the data specified in the write request is successfully transferred and all file system information required to retrieve the data is successfully transferred.

File attributes that are not necessary for data retrieval (access time, modification time, status change time) need not be successfully transferred prior to returning to the calling process.

3.385 Synchronized I/O File Integrity Completion

Identical to a synchronized I/O data integrity completion with the addition that all file attributes relative to the I/O operation (including access time, modification time, status change time) are successfully transferred prior to returning to the calling process.

3.386 Synchronized I/O Operation

An I/O operation performed on a file that provides the application assurance of the integrity of its data and files.

3.387 Synchronous I/O Operation

An I/O operation that causes the thread requesting the I/O to be blocked from further use of the processor until that I/O operation completes.

Note: A synchronous I/O operation does not imply synchronized I/O data integrity completion or synchronized I/O file integrity completion.
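
The difference between data integrity completion (3.384) and file integrity completion (3.385) shows up in the API as well: where the Synchronized I/O option is supported, POSIX specifies fdatasync() as completing to data integrity and fsync() as completing to file integrity. A small sketch, with a hypothetical helper name and assuming a platform that declares fdatasync() (Linux does):

```c
#define _POSIX_C_SOURCE 200809L
#include <unistd.h>

/* Hypothetical helper.
 * data_only != 0: request data integrity completion (fdatasync), which may
 *                 skip attributes not needed to retrieve the data.
 * data_only == 0: request file integrity completion (fsync), which also
 *                 covers the file's timestamps and other attributes.
 * Returns 0 on success, -1 on error with errno set. */
int flush_fd(int fd, int data_only)
{
    return data_only ? fdatasync(fd) : fsync(fd);
}
```

Either call blocks the caller until it returns, so both are synchronous as well as synchronized; the note in 3.387 is a reminder that the reverse does not hold.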

It is not 100% clear whether the 'all currently queued I/O operations associated with the file indicated by [the] file descriptor' applies across processes. Conceptually, I think it should, but the wording isn't there in black and white (or black on pale yellow). It certainly should apply to any open file descriptors in the current process referring to the same file. It is not clear that it would apply to the previously opened (and closed) file descriptor in the current process. If it applies across all processes, then it should include the I/O queued by the current process through the closed descriptor. If it does not apply across all processes, it is possible that it does not include that I/O either.

In view of this and the rationale notes for fsync(), it is by far safest to assume that the fsync() operation has no effect on the queued operations associated with the closed file descriptor. If you want fsync() to be effective, call it before you close the file descriptor.