While researching this question I came across the fact that in POSIX (and Linux) there simply is not a truncateat system call.
Certain system calls, for instance unlink, have an equivalent variant with an "at" suffix appended to their names, i.e. unlinkat. The difference is that the variants with the "at" suffix accept an additional argument: a file descriptor referring to an open directory. A relative path passed to unlinkat is therefore resolved relative to that directory rather than to the current working directory. This is really useful under certain circumstances.
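To illustrate the directory-relative behaviour described above, here is a minimal sketch using Python's os module, which exposes unlinkat through the dir_fd parameter of os.unlink. It assumes Linux; the temporary directory and the file name example.txt are made up for this illustration.

```python
import os
import tempfile

# Hypothetical scratch directory and file name, used only for this illustration.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "example.txt")
    with open(target, "w") as f:
        f.write("data")

    # Open the directory itself; this descriptor plays the role of dirfd.
    dir_fd = os.open(tmp, os.O_RDONLY | os.O_DIRECTORY)
    try:
        # Roughly unlinkat(dir_fd, "example.txt", 0): the relative path is
        # resolved against dir_fd, not against the current working directory.
        os.unlink("example.txt", dir_fd=dir_fd)
    finally:
        os.close(dir_fd)

    removed = not os.path.exists(target)
```

The current working directory is never changed or consulted here, which is the property I would like to have for truncating as well.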
Looking at truncate, there is only ftruncate next to it. truncate works on paths - absolute or relative to the current working directory. ftruncate works directly on an open file descriptor - without any path being specified. There is no truncateat.
A lot of libraries (various "alternative" C libraries) do what I did and mimic truncateat by using an openat-ftruncate-close sequence. This works in most cases, except ...
I ran into the following issue. It took me months to figure out what was happening. Tested on Linux, different 3.X and 4.X kernels. Imagine two processes (not threads):
- Process "A"
- Process "B"
Now imagine the following sequence of events (pseudo code):
A: fd = open(path = 'filename', mode = write)
A: ftruncate(fd, 100)
A: write(fd, 'abc')
B: truncate('filename', 200)
A: write(fd, 'def')
A: close(fd)
The above works just fine. Right after process "A" has opened the file, set its size to 100 and written some data into it, process "B" resets its size to 200. Then process "A" continues. At the end, the file has a size of 200 and contains "abcdef" at its beginning, followed by zero bytes.
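For reference, the first sequence can be sketched in Python roughly as follows. This is a simplification under stated assumptions: both roles run in a single process (process "B" only touches the path, so this should not matter here), the file is created on open, and it lives in a temporary directory instead of the current working directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "filename")

    # Role "A": open for writing (creating the file is an assumption), resize, write.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    os.ftruncate(fd, 100)
    os.write(fd, b"abc")

    # Role "B": path-based truncate while "A" still holds its descriptor.
    os.truncate(path, 200)

    # Role "A" continues at its own file offset (3) and closes.
    os.write(fd, b"def")
    os.close(fd)

    with open(path, "rb") as f:
        first_result = f.read()
```

Here first_result is 200 bytes long and starts with "abcdef", matching what I see with two real processes.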
Now, let's try and mimic something like truncateat:
A: fd_a = open(path = 'filename', mode = write)
A: ftruncate(fd_a, 100)
A: write(fd_a, 'abc')
B: fd_b = openat(dirfd = X, path = 'filename', mode = write | truncate)
B: ftruncate(fd_b, 200)
B: close(fd_b)
A: write(fd_a, 'def')
A: close(fd_a)
My file has a length of 200, which is fine. However, it starts with three zero bytes, followed by "def", and then zero bytes again. I have lost the first write from process "A", while the "def" technically landed at the correct position (three bytes in, as if I had called seek(fd_a, 3) before writing it).
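The second sequence can be reproduced with a short Python sketch. The assumptions are the same as for any simplified reproducer: both roles run in one process with two independent descriptors instead of two processes, the file is created on the first open, a temporary directory stands in for X, and the pseudo code's mode "write | truncate" is mapped to os.O_WRONLY | os.O_TRUNC.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Role "A": open for writing (creating the file is an assumption), resize, write.
    fd_a = os.open(os.path.join(tmp, "filename"), os.O_WRONLY | os.O_CREAT, 0o644)
    os.ftruncate(fd_a, 100)
    os.write(fd_a, b"abc")

    # Role "B": mimic truncateat with openat + ftruncate + close,
    # using the same flags as in the pseudo code above.
    dir_fd = os.open(tmp, os.O_RDONLY | os.O_DIRECTORY)
    fd_b = os.open("filename", os.O_WRONLY | os.O_TRUNC, dir_fd=dir_fd)
    os.ftruncate(fd_b, 200)
    os.close(fd_b)
    os.close(dir_fd)

    # Role "A" continues at its own file offset (3) and closes.
    os.write(fd_a, b"def")
    os.close(fd_a)

    with open(os.path.join(tmp, "filename"), "rb") as f:
        second_result = f.read()
```

With this sketch, second_result is 200 bytes long and consists of three zero bytes, then "def", then zero bytes, which is exactly the result described above.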
I can work with the first sequence of operations just fine. But in my use case, I cannot rely on paths relative to the current working directory as far as process "B" is concerned. I really want to work with paths relative to a file descriptor. How can I achieve that without running into the issue demonstrated in the second sequence of operations? Calling fsync from process "A" just after write(fd_a, 'abc') does not solve this.