Flock() and fcntl() file locks and Linux NFS (v3)

May 4, 2023

Unix broadly and Linux specifically have long had three functions that can do file locks: flock(), fcntl(), and lockf(). The latter two are collectively known as 'POSIX' file locks because they appear in the POSIX specification (and on Linux lockf() is just a layer over fcntl()), while flock() is a separate thing with somewhat different semantics, as it originated in BSD Unix. In /proc/locks, flock() locks are type 'FLOCK' and fcntl()/lockf() locks are type 'POSIX', and you can see both on a local system.

(In one of those amusing things, in Ubuntu 22.04 crond takes a flock() lock on /run/crond.pid while atd takes a POSIX lock on /run/atd.pid.)
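If you want to check which type of lock a given file has, a short script can pull that out of /proc/locks. This is only a sketch: the function names are my own invention, and it assumes the usual /proc/locks layout (lock type in the second column, PID in the fifth, and a 'major:minor:inode' triple in the sixth, with blocked waiters marked by '->').

```python
import os


def parse_proc_locks(text):
    """Parse /proc/locks style text into a list of dicts."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if "->" in fields:
            # Blocked waiters carry an extra '->' marker; drop it so the
            # remaining columns line up with ordinary entries.
            fields.remove("->")
        if len(fields) < 6:
            continue
        # Columns: id, type, ADVISORY/MANDATORY, READ/WRITE, pid, maj:min:inode, ...
        entries.append({
            "type": fields[1],
            "mode": fields[3],
            "pid": int(fields[4]),
            "inode": int(fields[5].rsplit(":", 1)[1]),
        })
    return entries


def lock_types_for(path, locks_file="/proc/locks"):
    """Return the set of lock types currently listed for path's inode."""
    inode = os.stat(path).st_ino
    with open(locks_file) as f:
        entries = parse_proc_locks(f.read())
    # Matching on inode alone ignores the device numbers, so this can give
    # false positives if two filesystems happen to reuse an inode number.
    return {e["type"] for e in entries if e["inode"] == inode}
```

Calling lock_types_for("/run/crond.pid") on a system where crond holds its lock would be one way to try this out; lslocks gives you much the same information in a friendlier form.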

Because they're different types of locks, you can normally obtain both an exclusive flock() lock and an exclusive fcntl() POSIX lock on the same file. As a result, some programs adopted the habit of obtaining both sorts of locks, just to cover their bases when interacting with other unknown programs that might lock the file.
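Here is a minimal sketch of that independence, assuming a local (non-NFS) Linux filesystem and Python's fcntl module, where fcntl.lockf() is a wrapper around fcntl() locking. The helper names are hypothetical. The parent holds an exclusive flock() lock while a forked child tries both lock types on the same file.

```python
import fcntl
import os
import tempfile


def probe_from_child(path):
    """Fork a child that tries both lock types on path.

    Returns (flock_ok, posix_ok) as observed by the child.
    """
    pid = os.fork()
    if pid == 0:
        status = 0
        try:
            with open(path, "r+b") as f:
                try:
                    fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
                    status |= 1
                except OSError:
                    pass  # a conflicting flock() lock is held elsewhere
                try:
                    fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
                    status |= 2
                except OSError:
                    pass  # a conflicting POSIX lock is held elsewhere
        finally:
            # Leave without running the parent's cleanup handlers.
            os._exit(status)
    _, wait_status = os.waitpid(pid, 0)
    code = os.WEXITSTATUS(wait_status)
    return bool(code & 1), bool(code & 2)


def demo():
    with tempfile.NamedTemporaryFile() as tmp:
        # The parent takes an exclusive flock() lock and keeps it while
        # the child probes the same file through its own open().
        fcntl.flock(tmp, fcntl.LOCK_EX)
        return probe_from_child(tmp.name)
```

On a local Linux filesystem, demo() should return (False, True): the child can't get a second exclusive flock() lock, but its exclusive POSIX lock goes through without complaint.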

In the beginning (before 2005), flock() locks didn't work at all over NFS on Linux; they were strictly local to the current machine, so two programs on two different machines could both obtain 'exclusive' flock() locks on the same file. Then the NFS client code in kernel 2.6.12 was modified to accept flock() locks and silently change them into POSIX locks (which did work over NFS, in NFS v3 through the NLM protocol). This caused heartburn for programs and setups that were obtaining both sorts of (exclusive) locks on the same file, because two POSIX locks obviously conflict with each other and your NFS server will not let you have conflicting locks like that.

This change is effectively invisible to the NFS client's kernel, so flock() locks on a NFS mounted filesystem will show up in the client's /proc/locks (and lslocks) as type 'FLOCK'. However, on your NFS server all locks from NFS clients are listed as type 'POSIX' in /proc/locks (and these days they're all 'owned' by lockd), because that is what they are.

(One reason for this is that the NFS v3 NLM protocol doesn't have an idea of different types of locks, apart from exclusive or non-exclusive.)

Unfortunately, this change creates another surprising situation, which is that the NFS server and a NFS client can both obtain an exclusive flock() lock on the same file. Two NFS clients trying to exclusively flock() the same file will conflict with each other and only one will succeed, but the NFS server and a NFS client won't conflict, and both will 'win' the lock (and everyone loses). This is the inevitable but surprising consequence of client side flock() locks being changed to POSIX locks on the NFS server, and POSIX locks not conflicting with flock() locks. From the NFS server's perspective, it's not two flock() exclusive locks on a file; it's one exclusive POSIX lock (from a NFS client) and one exclusive local flock() lock, and that's nominally fine.

In my opinion, this makes using flock() locking dangerous in general, which is unfortunate since the flock command uses flock() and it's pretty much your best bet for locking in shell scripts (see also flock(1)). Flock() is only safe as a potentially cross-machine locking mechanism if you can be confident that your NFS server will never be doing anything except serving files via NFS. If things may be running locally on the NFS server, for example because you moved a very active NFS filesystem to the primary machine that uses it, then flock() becomes dangerous.

It also means that if you have a lock testing program, as I do, you should make it default to either fcntl() or lockf() locks, whichever you find easier, rather than flock() locks. Flock() has the easiest API out of the three locking functions, but it may give you results that are between misleading and wrong if you're trying to use it in a situation where you want to check locking behavior between a NFS server and a NFS client, as I did recently.
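To illustrate the sort of lock tester I mean, here is a rough sketch (not my actual program) that defaults to POSIX locks. It assumes Python's fcntl module, where fcntl.lockf() wraps fcntl() locking and so produces a POSIX lock; try_lock(), main(), and the option names are all made up for this example.

```python
import argparse
import errno
import fcntl
import time

# "posix" goes through fcntl.lockf(), which Python implements on top of
# fcntl() locking; "flock" calls flock(2).
LOCKERS = {"posix": fcntl.lockf, "flock": fcntl.flock}


def try_lock(path, kind="posix"):
    """Return an open file holding an exclusive lock, or None if it's taken."""
    if kind not in LOCKERS:
        raise ValueError(f"unknown lock type: {kind!r}")
    f = open(path, "a+b")  # exclusive POSIX locks need a writable descriptor
    try:
        LOCKERS[kind](f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError as exc:
        f.close()
        if exc.errno in (errno.EAGAIN, errno.EACCES):
            return None  # someone else holds a conflicting lock
        raise
    return f


def main(argv=None):
    parser = argparse.ArgumentParser(
        description="Try to take an exclusive lock on a file and hold it.")
    parser.add_argument("path")
    parser.add_argument("--type", choices=sorted(LOCKERS), default="posix")
    parser.add_argument("--hold", type=float, default=30.0,
                        help="seconds to hold the lock before releasing it")
    args = parser.parse_args(argv)
    f = try_lock(args.path, args.type)
    if f is None:
        print(f"{args.path}: {args.type} lock is held by someone else")
        return 1
    print(f"{args.path}: got exclusive {args.type} lock")
    time.sleep(args.hold)
    f.close()  # closing the file releases the lock
    return 0
```

You would call main() from a small wrapper script and run that on both the NFS server and a NFS client against the same file. With the default POSIX locks the second run should report that the lock is held; with '--type flock' both sides may claim success, which is exactly the misleading result described above.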

(Per nfs(5), you can use the local_lock mount option to make flock() locks purely local again on NFS v3 clients, but this doesn't exactly solve the problem.)

PS: Given the server flock() issue, I kind of wish there was a generic mount option to change flock() locks to POSIX locks, so that you could force this to happen to NFS exported filesystems even on your NFS fileserver. That would at least make the behavior the same on clients and the server.

(This elaborates on a learning experience I mentioned on the Fediverse.)

Written on 04 May 2023.

Last modified: Thu May 4 23:13:16 2023