Some notes on Linux's /proc/locks listing of file locks

March 24, 2022

As covered in the proc(5) manual page, /proc/locks lists current flock(2) and fcntl(2) file locks (and also lockf(3) locks, because those are actually fcntl() locks). Unsurprisingly, this /proc file is the fundamental source of information used by lslocks(8), and so understanding what appears in /proc/locks and what doesn't can tell you something about what lslocks can show you.

The basic format of a /proc/locks line looks like this:

4: POSIX  ADVISORY  READ  13 00:2f:72542 0 EOF
5: FLOCK  ADVISORY  WRITE 17503 00:a2:1912512 0 EOF
6: POSIX  ADVISORY  WRITE 2231 00:16:681 0 EOF

(This is mashing up /proc/locks entries from several systems for illustrative purposes.)

As covered in proc(5), the two interesting fields are generally the fifth field, the "process ID" of what is holding the lock, and the sixth field, which identifies the filesystem and inode of what is locked.
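To make the field layout concrete, here is a minimal Python sketch that splits one /proc/locks line into named fields. The LockEntry type and parse_locks_line() function are names I made up for illustration, and the handling of a '->' marker for blocked waiters is an assumption about how such lines are printed.

```python
from typing import NamedTuple


class LockEntry(NamedTuple):
    index: int        # field 1, without the trailing ':'
    lock_class: str   # field 2: POSIX, FLOCK, ...
    mode: str         # field 3: ADVISORY or MANDATORY
    access: str       # field 4: READ or WRITE
    pid: int          # field 5: the claimed owner PID
    major: int        # field 6, first subfield (hex in the file)
    minor: int        # field 6, second subfield (hex in the file)
    inode: int        # field 6, third subfield (decimal in the file)
    start: str        # field 7: starting byte offset
    end: str          # field 8: ending byte offset, or 'EOF'


def parse_locks_line(line: str) -> LockEntry:
    """Parse a single /proc/locks line (hypothetical helper)."""
    fields = line.split()
    # Assumption: a waiter blocked on another lock is shown with a '->'
    # marker right after the index; drop it so field positions line up.
    if len(fields) > 1 and fields[1] == "->":
        del fields[1]
    major_hex, minor_hex, inode = fields[5].split(":")
    return LockEntry(
        index=int(fields[0].rstrip(":")),
        lock_class=fields[1],
        mode=fields[2],
        access=fields[3],
        pid=int(fields[4]),
        major=int(major_hex, 16),
        minor=int(minor_hex, 16),
        inode=int(inode),
        start=fields[6],
        end=fields[7],
    )
```

On a real system you would feed it each line of open("/proc/locks"); the PID is parsed as a signed integer because proc(5) says OFD locks are shown with a PID of -1.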

The manual page calls the first two subfields of the sixth field the 'major' and 'minor' device numbers of the device with the filesystem. This is wrong. They are actually the major and minor device numbers of the st_dev of files on this filesystem, which can be completely made up for filesystems that don't correspond to a single specific device. Such filesystems include NFS mounts, tmpfs filesystems such as everything in /run, ZFS filesystems, and perhaps BTRFS filesystems. To find what filesystem this actually is, you look through /proc/self/mountinfo for a mount whose third field matches, with the complication that /proc/locks lists these numbers in hex while mountinfo has them in decimal.
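The hex to decimal conversion and the mountinfo lookup can be sketched in a few lines of Python. The function names and the SAMPLE_MOUNTINFO text are made up for illustration (they are not real output from any of my systems), and this sketch does not decode the octal escapes that mountinfo uses for unusual characters in mount points.

```python
from typing import List


def mountinfo_key(locks_dev_field: str) -> str:
    """Turn '00:2f:72542' (or just '00:2f') into the decimal '0:47' form."""
    major_hex, minor_hex = locks_dev_field.split(":")[:2]
    return f"{int(major_hex, 16)}:{int(minor_hex, 16)}"


def find_mount_points(mountinfo_text: str, locks_dev_field: str) -> List[str]:
    """Return the mount points whose major:minor matches a /proc/locks entry."""
    wanted = mountinfo_key(locks_dev_field)
    found = []
    for line in mountinfo_text.splitlines():
        fields = line.split()
        # Field 3 is the decimal major:minor, field 5 is the mount point.
        if len(fields) > 4 and fields[2] == wanted:
            found.append(fields[4])
    return found


# Hypothetical mountinfo contents, for illustration only.
SAMPLE_MOUNTINFO = """\
25 30 0:22 / /run rw,nosuid,nodev shared:5 - tmpfs tmpfs rw,mode=755
60 30 0:47 / /h/281 rw,relatime shared:30 - nfs fs1:/h/281 rw,vers=3
"""
```

On a live system you would pass in the contents of /proc/self/mountinfo instead of the sample text. The function returns a list because the same filesystem can be mounted in more than one place (bind mounts, for example).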

The listed process ID is subject to two issues. First, on NFS servers the process ID for NFS client locks is a somewhat random kernel thread, and has no use other than alerting you to the fact that you have an NFS client lock. That is the case for the '4:' lock in my example output. On an NFS server, different NFS client locks may be 'owned' by different process IDs. Second, sometimes the claimed process doesn't exist any more. This is the case for the '5:' lock on the system I took it from; it appears that the process acquired a lock, passed the locked, open file descriptor to several sub-processes, and then exited.
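One way to spot the second issue is to compare the PIDs in /proc/locks against the processes that currently exist. This is a rough Python sketch with made-up function names; it assumes that the numeric directory names in /proc are a good enough list of live PIDs, and it skips non-positive PIDs (proc(5) says OFD locks are listed with a PID of -1).

```python
import os
from typing import List, Set


def live_pids(proc_root: str = "/proc") -> Set[int]:
    """Collect the PIDs that currently have a /proc/<pid> directory."""
    return {int(name) for name in os.listdir(proc_root) if name.isdigit()}


def stale_lock_lines(locks_text: str, pids: Set[int]) -> List[str]:
    """Return /proc/locks lines whose claimed PID is not in `pids`."""
    stale = []
    for line in locks_text.splitlines():
        fields = line.split()
        # Assumption: blocked waiters carry a '->' marker after the index.
        if len(fields) > 1 and fields[1] == "->":
            del fields[1]
        if len(fields) < 6:
            continue
        pid = int(fields[4])
        # Skip PIDs that don't name a single process (such as -1).
        if pid > 0 and pid not in pids:
            stale.append(line)
    return stale
```

In real use this would be something like stale_lock_lines(open("/proc/locks").read(), live_pids()). It can't help with the first issue, since on an NFS server the kernel thread PID does exist; it just isn't meaningful.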

It would be nice if /proc/locks on NFS servers reflected all current locks held by NFS clients. Unfortunately it does not. Not all current locks held by NFS clients (and known by the NFS server) are listed in /proc/locks, and I don't know what determines which ones make it in and which ones don't. This has the unfortunate consequence that looking at /proc/locks on one of our NFS fileservers will not authoritatively tell us if a process on some NFS client has a particular file locked.

(The usual source of information and control of this sort of NFS server stuff is in /proc/net/rpc (as mentioned in past entries 1, 2, 3 and 4). Unfortunately as far as I'm aware, server lock information is not exposed anywhere. There is /proc/sys/sunrpc/nlm_debug if you're desperate, but I haven't looked into this. The flags that can be set there are documented in include/linux/lockd/debug.h, but I don't know if any of them will let you determine if a file is locked or what client is doing it. At this point I should consider exploring drgn, because it's apparently good for poking around the Linux kernel and the information has to be in there somewhere on the NFS server.)

As far as I know, /proc/locks is authoritative on NFS clients for what locks that particular client holds. The PID may not be accurate (see above), but if there is no entry for the right inode number (on the right filesystem, if applicable), the NFS client doesn't have that file locked.
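Checking whether a particular file shows up in /proc/locks can be sketched like this in Python. The function names are made up, and the sketch assumes (as described above) that the device numbers in /proc/locks match the st_dev that stat() reports for the file, which I have not verified for every filesystem type.

```python
import os
from typing import List


def lock_lines_for(locks_text: str, major: int, minor: int, inode: int) -> List[str]:
    """Return the /proc/locks lines for one (major, minor, inode) triple."""
    matches = []
    for line in locks_text.splitlines():
        fields = line.split()
        # Assumption: blocked waiters carry a '->' marker after the index.
        if len(fields) > 1 and fields[1] == "->":
            del fields[1]
        if len(fields) < 6:
            continue
        maj_hex, min_hex, ino = fields[5].split(":")
        # Major and minor are hex in /proc/locks; the inode is decimal.
        if (int(maj_hex, 16), int(min_hex, 16), int(ino)) == (major, minor, inode):
            matches.append(line)
    return matches


def lock_lines_for_path(path: str, locks_path: str = "/proc/locks") -> List[str]:
    """Look up a file's st_dev and st_ino, then search /proc/locks for them."""
    st = os.stat(path)
    with open(locks_path) as f:
        return lock_lines_for(
            f.read(), os.major(st.st_dev), os.minor(st.st_dev), st.st_ino
        )
```

An empty result from lock_lines_for_path() on an NFS client means that client doesn't have the file locked; as covered earlier, it says nothing authoritative when run on the NFS server.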

Sidebar: forcing unlocking on NFS fileservers

See this interesting linux-nfs thread, which led me to this Red Hat document on some NFSD procfs files, /proc/fs/nfsd/unlock_ip and /proc/fs/nfsd/unlock_filesystem. Note that you can't unlock a specific client IP through this interface.

(Yes, yes, I know, use NFS v4. We are attached to NFS's standard trusted client Unix UID authentication, and do not want to change that. The back and forth in the Ubuntu NFSv4Howto wiki page makes for somewhat alarming reading, at least for a sysadmin who wants all of this to just work.)

Comments on this page:

From at 2022-03-25 01:52:04:

We are attached to NFS's standard trusted client Unix UID authentication, and do not want to change that.

From my short tests with sec=sys, that still works the same way in NFSv4 (partly because it's RPC-layer anyway). It's still the default on exports unless one explicitly requests krb5.

The change in NFSv4 is in how stat() results get transmitted (the server now sends username@domain rather than integer UIDs), but this just means your servers need to run rpc.idmapd so that they can translate back and forth, and all servers & clients need to agree on the configured domain in idmapd.conf. Most likely in your environment it'll already work by default.

(Clients no longer need to run rpc.idmapd as a daemon though; they can use a direct upcall to nfsidmap for the same purpose.)

Written on 24 March 2022.

Last modified: Thu Mar 24 23:55:12 2022