I felt at peace with POSIX after many years of experience.
Then I read this message from Linus Torvalds, circa 2002:
    int ret;
    do {
        ret = close(fd);
    } while (ret == -1 && errno != EBADF);
NO.
The above is
(a) not portable
(b) not current practice
The "not portable" part comes from the fact that (as somebody pointed
out), a threaded environment in which the kernel does close the FD
on errors, the FD may have been validly re-used (by the kernel) for
some other thread, and closing the FD a second time is a BUG.
Not only is looping until EBADF unportable, but any loop is, due to a race condition that I probably would have noticed if I hadn't "made peace" by taking such things for granted.
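To make the race concrete: POSIX requires open() to return the lowest-numbered unused descriptor, so a slot freed by the kernel can be handed out again immediately. Below is a minimal single-threaded sketch of the number reuse; the actual threaded race is timing-dependent, so this only illustrates why a retried close() can hit someone else's file.

    #include <stdio.h>
    #include <fcntl.h>
    #include <unistd.h>

    int main(void)
    {
        int fd1 = open("/dev/null", O_RDONLY);
        close(fd1);                             /* slot is now free */
        int fd2 = open("/dev/null", O_RDONLY); /* POSIX: lowest unused fd */
        printf("fd1 = %d, fd2 = %d\n", fd1, fd2); /* same number */
        /* In a threaded program, another thread's open() can grab the
         * slot between a failed close() and its retry, so retrying
         * close(fd1) here would actually close fd2's file. */
        close(fd2);
        return 0;
    }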
However, in the GCC C++ standard library implementation, basic_file_stdio.cc, we have
    do
        __err = fclose(_M_cfile);
    while (__err && errno == EINTR);
The primary target for this library is Linux, but it seems not to be heeding Linus.
As far as I've come to understand, EINTR happens only after a system call blocks, which implies that the kernel received the request to free the descriptor before commencing whatever work got interrupted. So there's no need to loop. Indeed, the SA_RESTART signal behavior does not apply to close (which would otherwise generate such a loop by default), precisely because retrying is unsafe.
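For context, SA_RESTART is opted into per handler via sigaction(). A minimal sketch (the SIGALRM choice and handler name are just illustrative):

    #include <signal.h>
    #include <string.h>

    static void on_signal(int sig) { (void)sig; }

    /* Install a SIGALRM handler with SA_RESTART set. */
    int install_restart_handler(void)
    {
        struct sigaction sa;

        memset(&sa, 0, sizeof sa);
        sa.sa_handler = on_signal;
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = SA_RESTART;  /* auto-restart interruptible syscalls */

        /* Even with SA_RESTART, close(2) is not among the calls that
         * signal(7) lists as restartable on Linux, so an interrupted
         * close() still returns -1 with errno == EINTR. */
        return sigaction(SIGALRM, &sa, NULL);
    }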
This is a standard library bug then, right? On every file ever closed by a C++ application.
EDIT: To avoid causing too much alarm before some guru comes along with an answer, I should note that close only seems to be allowed to block under specific circumstances, perhaps none of which ever apply to regular files. I'm not clear on all the details, but you should not see EINTR from close without opting into something by fcntl or setsockopt. Nevertheless the possibility makes generic library code more dangerous.
With respect to POSIX, R..'s answer to a related question is very clear and concise: close() is a non-restartable special case, and no loop should be used.
This was surprising to me, so I decided to describe my findings, followed by my conclusions and chosen solution at the end.
This is not really an answer. Consider this more like the opinion of a fellow programmer, including the reasoning behind that opinion.
POSIX.1-2001 and POSIX.1-2008 describe three possible errno values that may occur: EBADF, EINTR, and EIO. The descriptor state after EINTR and EIO is "unspecified", which means it may or may not have been closed. EBADF indicates fd is not a valid descriptor. In other words, POSIX.1 clearly recommends using
    if (close(fd) == -1) {
        /* An error occurred, see 'errno'. */
    }
without any retry looping to close file descriptors.
(Even Austin Group defect #519, which R.. mentioned, does not help with recovering from close() errors: it leaves it unspecified whether any I/O is possible after an EINTR error, even if the descriptor itself is left open.)
For Linux, the close() syscall is defined in fs/open.c, with __do_close() in fs/file.c managing the descriptor table locking, and filp_close() back in fs/open.c taking care of the details.
In summary, the descriptor entry is removed from the table unconditionally first, followed by filesystem-specific flushing (f_op->flush()), followed by notification (dnotify/fsnotify hook), and finally by removing any record or file locks. (Most local filesystems like ext2, ext3, ext4, xfs, bfs, tmpfs, and so on, do not have ->flush(), so given a valid descriptor, close() cannot fail. Only ecryptfs, exofs, fuse, cifs, and nfs have ->flush() handlers in Linux-3.13.6, as far as I can tell.)
This does mean that in Linux, if a write error occurs in the filesystem-specific ->flush() handler during close(), there is no way to retry; the file descriptor is always closed, just like Torvalds said.
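That behaviour is easy to observe from user space: once close() has been called, the same descriptor number is gone, and a retry reports EBADF (or, in a threaded program, hits a recycled descriptor). A quick sketch:

    #include <assert.h>
    #include <errno.h>
    #include <fcntl.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/dev/null", O_RDONLY);
        assert(fd != -1);
        assert(close(fd) == 0);                    /* entry removed from table */
        assert(close(fd) == -1 && errno == EBADF); /* already gone */
        return 0;
    }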
The FreeBSD close() man page describes the exact same behaviour.
Neither the OpenBSD nor the Mac OS X close() man pages describe whether the descriptor is closed in case of errors, but I believe they share the FreeBSD behaviour.
It seems clear to me that no loop is necessary or required to close a file descriptor safely. However, close() may still return an error.
errno == EBADF indicates the file descriptor was already closed. If my code encounters this unexpectedly, to me it indicates there is a significant fault in the code logic, and the process should exit gracefully; I'd rather my processes die than produce garbage.
Any other errno values indicate an error in finalizing the file state. In Linux, it is definitely an error related to flushing any remaining data to the actual storage. In particular, I can imagine ENOMEM in case there is no room to buffer the data, EIO if the data could not be sent or written to the actual device or media, EPIPE if the connection to the storage was lost, ENOSPC if the storage is already full with no space reserved for the unflushed data, and so on. If the file is a log file, I'd have the process report the failure and exit gracefully. If the file contents are still available in memory, I would remove (unlink) the entire file and retry. Otherwise I'd report the failure to the user.
(Remember that in Linux and FreeBSD, you do not "leak" file descriptors in the error case; they are guaranteed to be closed even if an error occurs. I am assuming all other operating systems I might use behave the same way.)
The helper function I'll use from now on will be something like
    #include <unistd.h>
    #include <errno.h>

    /**
     * closefd - close file descriptor and return error (errno) code
     *
     * @descriptor: file descriptor to close
     *
     * Actual errno will stay unmodified.
     */
    static int closefd(const int descriptor)
    {
        int saved_errno, result;

        if (descriptor == -1)
            return EBADF;

        saved_errno = errno;

        result = close(descriptor);
        if (result == -1)
            result = errno;

        errno = saved_errno;
        return result;
    }
I know the above is safe on Linux and FreeBSD, and I assume it is safe on all other POSIX-y systems. If I encounter one that is not, I can simply replace the above with a custom version, wrapping it in a suitable #ifdef for that OS. The reason this leaves errno unchanged is just a quirk of my coding style; it makes short-circuiting error paths shorter (less repeated code).
If I am closing a file that contains important user information, I will do an fsync() or fdatasync() on it prior to closing. This ensures the data hits the storage, but it also causes a delay compared to normal operation; therefore I won't do it for ordinary data files.
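A sketch of that pattern, building on the closefd() helper above (the sync_and_close() name is mine, and whether fdatasync() suffices instead of fsync() depends on whether the file's metadata matters):

    #include <unistd.h>
    #include <errno.h>

    /* Flush file contents to storage, then close. Returns 0 on success,
     * an errno code otherwise. Uses closefd() defined earlier. */
    static int sync_and_close(const int descriptor)
    {
        if (fdatasync(descriptor) == -1) {
            const int saved = errno;
            closefd(descriptor);  /* do not leak the descriptor on flush failure */
            return saved;
        }
        return closefd(descriptor);
    }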
Unless I will be unlink()ing the closed file, I will check the closefd() return value and act accordingly. If I can easily retry, I will, but at most once or twice. For log files and generated/streamed files, I only warn the user.
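As a usage example, here is roughly how I would wire closefd() into that policy (report_close() is a hypothetical name; the caller decides whether rewriting the file from in-memory data is possible):

    #include <errno.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Close a descriptor and report failures; never calls close() twice,
     * since the descriptor is gone regardless of the error. */
    static int report_close(const int descriptor, const char *const path)
    {
        const int err = closefd(descriptor);

        if (err == EBADF) {
            /* Double close or broken descriptor bookkeeping: fail loudly. */
            fprintf(stderr, "%s: descriptor already closed! Aborting.\n", path);
            abort();
        }
        if (err) {
            /* Finalization failed; the caller decides whether to rewrite
             * the file from in-memory data or just warn the user. */
            fprintf(stderr, "%s: close failed: %s.\n", path, strerror(err));
            return -1;
        }
        return 0;
    }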
I want to remind anyone reading this far that we cannot make anything completely reliable; it is just not possible. What we can do, and in my opinion should do, is to detect when an error occurs, as reliably as we can. If we can easily and with negligible resource use retry, we should. In all cases, we should make sure the notification (about the error) is propagated to the actual human user. Let the human worry about whether some other action, possibly complex, needs to be done before the operation is retried. After all, a lot of tools are used only as a part of a larger task, and the best course of action usually depends on that larger task.