How to implement something similar to “truncateat”

Posted 2019-08-19 06:58

Question:

While researching this question I came across the fact that in POSIX (and Linux) there simply is no truncateat system call.

Certain system calls, for instance unlink, have an alternative variant with an at suffix appended to the name, i.e. unlinkat. The difference is that the at variants accept an additional argument, a file descriptor pointing to a directory. A relative path passed into unlinkat is therefore resolved not against the current working directory but against the provided file descriptor (an open directory). This is really useful under certain circumstances.
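To make the difference concrete, here is a minimal sketch in Python, whose os module exposes the at variants through a dir_fd keyword (os.unlink(path, dir_fd=...) corresponds to unlinkat). The scratch directory and the file name example.txt are made up for illustration, and the sketch assumes a platform where dir_fd is supported, such as Linux.

```python
import os
import tempfile

# Hypothetical scratch directory and file name, for illustration only.
base = tempfile.mkdtemp()
target = os.path.join(base, "example.txt")
with open(target, "w") as f:
    f.write("data")

# Open the directory itself; this descriptor plays the role of dirfd.
dir_fd = os.open(base, os.O_RDONLY)
try:
    # Counterpart of unlinkat(dir_fd, "example.txt", 0): the relative path
    # is resolved against dir_fd, not the current working directory.
    os.unlink("example.txt", dir_fd=dir_fd)
finally:
    os.close(dir_fd)

removed = not os.path.exists(target)
os.rmdir(base)
```

Note that the current working directory never has to change for the relative path to resolve correctly.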

Looking at truncate, there is only ftruncate next to it. truncate works on paths - absolute or relative to the current working directory. ftruncate works directly on an open file descriptor - without any path being specified. There is no truncateat.

A lot of libraries (various "alternative" C libraries) do what I did and mimic truncateat by using an openat-ftruncate-close sequence. This works in most cases, except ...

I ran into the following issue. It took me months to figure out what was happening. Tested on Linux, different 3.X and 4.X kernels. Imagine two processes (not threads):

  • Process "A"
  • Process "B"

Now imagine the following sequence of events (pseudo code):

A: fd = open(path = 'filename', mode = write)
A: ftruncate(fd, 100)
A: write(fd, 'abc')
B: truncate('filename', 200)
A: write(fd, 'def')
A: close(fd)

The above works just fine. Just after process "A" has opened the file, set its size to 100 and written some data into it, process "B" resets its size to 200. Then process "A" continues. At the end, the file has a size of 200 and contains "abcdef" at its beginning followed by zero bytes.

Now, let's try and mimic something like truncateat:

A: fd_a = open(path = 'filename', mode = write)
A: ftruncate(fd_a, 100)
A: write(fd_a, 'abc')
B: fd_b = openat(dirfd = X, path = 'filename', mode = write | truncate)
B: ftruncate(fd_b, 200)
B: close(fd_b)
A: write(fd_a, 'def')
A: close(fd_a)

My file has a length of 200, ok. It starts with three zero bytes, not ok, then the "def", then again zero bytes. I have just lost the first write from process "A" while the "def" technically landed at the correct position (three bytes in, as if I had called seek(fd_a, 3) before writing it).

I can work with the first sequence of operations just fine. But in my use case, I cannot rely on paths relative to the current working directory as far as process "B" is concerned. I really want to work with paths relative to a file descriptor. How can I achieve that - without running into the issue demonstrated in the second sequence of operations? Calling fsync from process "A" just after write(fd_a, 'abc') does not solve this.

Answer 1:

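The reason why your second case overwrites everything with zeroes is that mode=truncate (i.e. openat(.., O_TRUNC)) will first truncate the file to length 0. That discards the "abc" process "A" has already written. The following ftruncate to 200 then extends the file with zero bytes, and A's next write lands at A's unchanged file offset of 3.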
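The effect can be reproduced without a second process, since all that matters is that two independent file descriptors refer to the same file. The following is a rough sketch in Python (os.open with dir_fd corresponds to openat); the temporary directory stands in for the directory descriptor X from the question, and the variable names are only illustrative.

```python
import os
import tempfile

base = tempfile.mkdtemp()             # stand-in for the directory behind X
dir_fd = os.open(base, os.O_RDONLY)

# "Process A": open, size the file to 100, write the first chunk.
fd_a = os.open("filename", os.O_WRONLY | os.O_CREAT, 0o600, dir_fd=dir_fd)
os.ftruncate(fd_a, 100)
os.write(fd_a, b"abc")                # fd_a's offset is now 3

# "Process B": O_TRUNC cuts the file to length 0 at open time, so "abc"
# is gone before ftruncate even runs.
fd_b = os.open("filename", os.O_WRONLY | os.O_TRUNC, dir_fd=dir_fd)
os.ftruncate(fd_b, 200)               # extends the now empty file with zeros
os.close(fd_b)

# "Process A" continues at its own offset 3.
os.write(fd_a, b"def")
os.close(fd_a)

fd = os.open("filename", os.O_RDONLY, dir_fd=dir_fd)
content = os.read(fd, 1024)
os.close(fd)
print(len(content), content[:6])      # 200 b'\x00\x00\x00def'

# Clean up the scratch file and directory.
os.unlink("filename", dir_fd=dir_fd)
os.close(dir_fd)
os.rmdir(base)
```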

If you instead ftruncate to 200 immediately without first truncating to 0, the existing data up until that point will remain untouched.
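As a minimal sketch of that suggestion, again in Python: the helper name truncateat is hypothetical (no such function exists in the os module or in POSIX), and it simply opens the file relative to a directory descriptor without O_TRUNC, calls ftruncate, and closes. The replay of the question's sequence below uses two descriptors in one process as a stand-in for processes "A" and "B".

```python
import os
import tempfile


def truncateat(dir_fd, path, length):
    """Hypothetical helper: resize `path`, resolved relative to `dir_fd`."""
    # No O_TRUNC here: the file keeps its contents until ftruncate runs.
    fd = os.open(path, os.O_WRONLY, dir_fd=dir_fd)
    try:
        os.ftruncate(fd, length)
    finally:
        os.close(fd)


base = tempfile.mkdtemp()
dir_fd = os.open(base, os.O_RDONLY)

# "Process A": open, size the file to 100, write the first chunk.
fd_a = os.open("filename", os.O_WRONLY | os.O_CREAT, 0o600, dir_fd=dir_fd)
os.ftruncate(fd_a, 100)
os.write(fd_a, b"abc")

# "Process B": resize to 200 through the directory descriptor.
truncateat(dir_fd, "filename", 200)

# "Process A" continues and closes.
os.write(fd_a, b"def")
os.close(fd_a)

fd = os.open("filename", os.O_RDONLY, dir_fd=dir_fd)
content = os.read(fd, 1024)
os.close(fd)
print(len(content), content[:6])      # 200 b'abcdef'

# Clean up the scratch file and directory.
os.unlink("filename", dir_fd=dir_fd)
os.close(dir_fd)
os.rmdir(base)
```

With this variant the outcome matches the first sequence from the question: a 200-byte file that begins with "abcdef".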