Some notes on understanding how to use flock(1)

August 4, 2019

The flock(1) command has rather complicated usage, with a bunch of options, which makes it not entirely clear how to use it for shell script locking in various circumstances. Here are some notes on this, starting with understanding what it's doing and the implications of that.

The key to understanding all of flock(1)'s weird options is to know that flock(2) automatically releases your lock when the last copy of the file descriptor you locked is closed, and that file descriptors are shared with child processes. Given this, we can start with the common basic flock(1) usage of:

flock -n LOCKFILE SHELL-SCRIPT [ARGS ...]

flock(1) opens LOCKFILE, locks it with flock(2), and then starts the shell script, which will inherit an open (and locked) file descriptor for LOCKFILE. As long as the shell script process or any sub-process it starts still exists with that file descriptor open, the lock is held, even if flock(1) itself is killed for some reason.

This is generally what you want; so long as any component of the shell script and the commands it runs is still running, it's potentially not safe to start another copy. Only when everything has exited, flock included, is the lock released.
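This inheritance is easy to see in action. The following is a minimal runnable sketch (using a hypothetical lock file under /tmp): the 'sleep' child inherits the locked descriptor, so a second non-blocking flock fails while it is still alive.

```shell
#!/bin/sh
LOCK=/tmp/flock-demo.$$.lock

# Start a trivial script under flock; its child ('sleep') inherits
# the open and locked file descriptor for $LOCK.
flock -n "$LOCK" sh -c 'sleep 3' &
sleep 1

# While that child is still running, a second non-blocking flock
# on the same file cannot get the lock.
if flock -n "$LOCK" true; then
    result=free
else
    result=held
fi
echo "while the script runs, the lock is: $result"

wait
rm -f "$LOCK"
```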

However, this is perhaps not what you want if flock is used to start a daemon that doesn't close all of its file descriptors; the daemon will inherit the open (and locked) file descriptor for LOCKFILE and the lock will never be released. In that case, you want to start flock with the -o option, which does not pass the open file descriptor for LOCKFILE on to the commands that flock winds up running:

flock -n -o LOCKFILE SHELL-SCRIPT [ARGS ...]

Run this way, the only thing holding the lock is flock itself. When flock exits (for whatever reason), the file descriptor will be closed and the lock released, even if SHELL-SCRIPT is still running.

(Of course, having a daemon inherit an open and locked file descriptor for LOCKFILE is a convenient way to only have one copy of the daemon running. As long as the first copy is still running, further attempts to get the lock will fail; if it exits, the lock is released.)
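A side benefit of this pattern is that anything else can cheaply probe whether the daemon is (probably) still running by trying the lock itself. A sketch, again with a hypothetical lock file; since nothing in this example actually holds the lock, the probe reports "not running":

```shell
#!/bin/sh
LOCK=/tmp/flock-probe-demo.$$.lock

# 'flock -n LOCKFILE true' succeeds immediately if nothing holds the
# lock, and fails if some (hypothetical) daemon still holds it through
# an inherited file descriptor.
if flock -n "$LOCK" true; then
    status="not running"
else
    status="running"
fi
echo "daemon is $status"
rm -f "$LOCK"
```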

The final usage is that flock(1) can be directly told the file descriptor number to lock. In order to be useful, this requires some shared file descriptor that will live on after flock exits; the usual place to get this is by redirecting some file descriptor of your choice to or from a file for an entire block of a shell script, like this:

(
flock -n 9 || exit 1
... # locked commands
) 9>/var/lock/mylockfile

This is convenient if you only want to lock some portions of a shell script, or don't want to split a shell script into two (where the first script would just be 'flock -n /var/lock/mylockfile script-part-2'). On the other hand, it is sort of tricky and clever, perhaps too clever; I'd certainly want to comment it heavily in any shell script I wrote.

However, you don't necessarily have to go all the way to doing this if you just want to flock some stuff that involves shell operations like redirecting files and so on, because you can use 'flock -c' to run a shell command line instead of just a program:

flock -n LOCKFILE -c '(command1 | command2 >/some/where) && command3'

This can also get too tricky, of course. There's only so much that's sensible to wedge into a single shell command line, regardless of what's technically possible.
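To make the 'flock -c' form concrete, here is a small runnable sketch (with a hypothetical lock file and a trivial pipeline standing in for real work); the quoted command line is run by a shell that holds the lock for its whole duration:

```shell
#!/bin/sh
LOCK=/tmp/flock-c-demo.$$.lock

# The pipeline inside the quotes runs under the lock; flock hands the
# whole string to a shell with -c.
out=$(flock -n "$LOCK" -c 'echo hello | tr a-z A-Z')
echo "$out"
rm -f "$LOCK"
```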

Once you're locking file descriptors, you can also unlock them with 'flock -u'. This is probably useful mostly if you're going to unlock and then re-lock, in which case you probably want to use flock without the '-n' option for at least the re-lock (so that it waits for the lock instead of failing). I imagine you could use this in a shell script loop, for example something like:

(
for file in "$@"; do
  flock 9; big-process "$file"; flock -u 9
  more-work ...
done
) 9>/var/lock/mylockfile

This would allow more-work to run in parallel with another invocation's big-process, while never allowing two copies of big-process to run at once.
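The mechanics of 'flock -u' can be checked with a minimal sketch (hypothetical lock file again): after the explicit unlock, file descriptor 9 is still open on the lock file, but a fresh non-blocking flock succeeds because the lock itself has been released.

```shell
#!/bin/sh
LOCK=/tmp/flock-u-demo.$$.lock

exec 9>"$LOCK"       # open fd 9 on the lock file
flock 9              # take the lock (waiting if necessary)
flock -u 9           # explicitly release it; fd 9 stays open

# A fresh flock on the same file now succeeds, even though fd 9
# is still open, because we unlocked rather than just closing.
if flock -n "$LOCK" true; then
    u_result=free
else
    u_result=held
fi
echo "after unlock: $u_result"

exec 9>&-            # close fd 9
rm -f "$LOCK"
```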

(This feels even more tricky and clever than the basic usage of flock'ing a file descriptor in a shell '( ... )' block, so I suspect I'll never use it.)


Comments on this page:

By John Wiersba at 2019-08-05 16:26:26:

Wouldn't that last example be better written as

for file in "$@"; do
  flock /var/lock/mylockfile big-process "$file"
  more-work ...
done

Or use your previous subshell idiom locally around the big-process.

By cks at 2019-08-05 20:48:24:

I think you're right. In thinking about it more, I suspect that sensible cases for 'flock -u' are hard to illustrate in a small example, as I tried here. It's probably only sensible if there's a bunch of shell code that has to run with the lock held and you're only unlocking for moderate amounts of time in the middle (especially if you only unlock some of the time). It's hard to come up with cases where you wouldn't split your shell code up into blocks, some locked and some not:

(
flock 9
....
) 9>/var/lock/mylockfile

# unlocked section
...

# back to locked
(
flock 9
....
) 9>/var/lock/mylockfile

By George Shuklin at 2019-08-07 16:15:18:

Thank you a lot for a great overview. I've known about it, but was always deterred from using it because of the complexity of the manual. Now the picture becomes clearer.
