How does this canonical flock example work?

Posted 2020-03-22 11:10

Question:

When programs (shell scripts) must be synchronized via the file system, I have found a flock-based solution to be recommended (it should also work on NFS). The canonical usage example from within a script (from http://linux.die.net/man/1/flock) is:

(
  flock -s 200

  # ... commands executed under lock ...

) 200>/var/lock/mylockfile

I don't quite get why this whole construct ensures atomicity. In particular, I am wondering in which order flock -s 200 and 200>/var/lock/mylockfile are executed when e.g. bash executes these lines of code. Is this order guaranteed/deterministic? The way I understand it, it must be deterministic for this idiom to work. But since a sub-shell is spawned in a child process, I do not understand how these two processes synchronize with each other. To me, these two commands already look like a race condition.

I would appreciate it if someone could clear up my confusion and explain why this construct can be used to safely synchronize processes.

At the same time, if someone knows, I would be interested in how safe it is to choose just some arbitrary file descriptor (such as 200 in the example), especially in the context of a large NFS file system with many clients.

Answer 1:

The whole I/O context of the sub-shell (...) 200>/var/lock/mylockfile has to be evaluated, and the I/O redirection done, before any commands can be executed in the sub-shell, so the redirection always precedes the flock -s 200. Consider what would happen if the sub-shell had its standard output piped to another command: that pipe has to be created before the sub-shell is created. The same applies to the redirection of file descriptor 200.
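
The ordering can be observed without flock at all. The following is a minimal sketch, not part of the original example: check_order is a hypothetical helper, the lock path is a throwaway temporary file, and descriptor 9 stands in for 200 so that the snippet also runs in shells that accept only single-digit descriptors in redirections. It deletes the lock file, then checks from the first command inside the sub-shell whether the redirection has already created it again.

```shell
# Hypothetical helper: report whether the lock file already exists
# when the first command inside the sub-shell runs.
check_order() {
  lockfile=$1
  rm -f "$lockfile"              # make sure the file is absent beforehand
  (
    if [ -e "$lockfile" ]; then
      echo "redirection first"   # the file was created before this test ran
    else
      echo "command first"
    fi
  ) 9>"$lockfile"
}

tmpdir=$(mktemp -d)              # throwaway directory, an assumption of this sketch
check_order "$tmpdir/mylockfile" # prints "redirection first"
rm -rf "$tmpdir"
```

Because the file can only reappear through the redirection, the output shows that the descriptor is open before any command in the sub-shell starts, which is what flock relies on.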

The choice of file descriptor number really doesn't matter in the slightest, except that it is advisable not to use file descriptors 0-2 (standard input, output, error). The file name is what matters: different processes could use different file descriptors, and as long as the name is agreed upon, it should be fine.
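
A small experiment illustrates that the lock follows the file rather than the descriptor number. This is only a sketch under stated assumptions: try_second_lock is a hypothetical helper, the lock file lives in a temporary directory on a local file system, and flock is the util-linux command from the question. The outer sub-shell takes an exclusive lock through descriptor 8; the inner one opens the same file separately on descriptor 9 and tries a non-blocking lock.

```shell
# Hypothetical helper: hold an exclusive lock through descriptor 8, then
# try to take the same lock through descriptor 9 without blocking.
try_second_lock() {
  lockfile=$1
  (
    flock -x 8 || exit 1         # first holder, via descriptor 8
    (
      # Separate open of the same file, via descriptor 9.
      if flock -n 9; then
        echo "acquired"
      else
        echo "busy"              # lock is already held through descriptor 8
      fi
    ) 9>"$lockfile"
  ) 8>"$lockfile"
}

tmpdir=$(mktemp -d)              # throwaway directory, an assumption of this sketch
try_second_lock "$tmpdir/mylockfile"
rm -rf "$tmpdir"
```

On a local Linux file system this should report "busy": the two descriptors have different numbers, but they refer to the same file name, so the second attempt conflicts with the first.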