2010-10-06
What is going on with Samba's POSIX locking on NFS on Linux
Once we looked at the evidence from our Samba problem, it was relatively easy to come up with a broad theory of what was going wrong. All of the signs pointed to some sort of NFS-related locking clash where Samba was applying multiple locks that it (and the local kernel) did not think clashed with each other, but where the NFS server felt that they did clash and rejected one.
As far as I can see, this is indeed what is going on.
Current versions of Samba both flock() files and (if POSIX locking is enabled) apply fcntl() locks to them. This does not conflict for local files as the two sorts of locks are independent (and this is even documented in the flock(2) manpage).
In old versions of the Linux kernel, flock() didn't work at all over NFS; flock() locks were purely local (where they continued to not conflict with POSIX locks). In Linux 2.6.12, the Linux NFS client was changed to make flock() locks work over NFS by quietly converting flock() locks to server-side POSIX locks; this conversion happens in the depths of the kernel NFS client, and the local kernel's general locking layer is unaware of it. Since two POSIX locks can obviously conflict with each other, this conversion means that from 2.6.12 onwards, flock() locks conflict with fcntl() locks on NFS filesystems.
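If you want to see the difference for yourself, something like the following rough Python sketch should do. It takes a shared flock() on a file and then tries for an exclusive fcntl() lock on the same file descriptor, which is a guess at the general shape of the dual locking and not what Samba literally does. The function name try_dual_lock is made up for illustration. On a local filesystem I would expect both locks to be granted; on an NFS mount with a 2.6.12 or later client I would expect the second lock to be refused, but I have not verified that this exact sequence reproduces the Samba failure.

```python
import fcntl
import os
import sys


def try_dual_lock(path):
    """Take a shared flock() and then an exclusive fcntl() lock on one file.

    Returns True if both locks were granted, False if the fcntl() lock
    was refused. This is an illustrative helper, not Samba's actual code.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        # BSD-style whole-file lock, roughly what happens at open time.
        fcntl.flock(fd, fcntl.LOCK_SH | fcntl.LOCK_NB)
        try:
            # POSIX lock on the whole file; lockf() here is Python's
            # wrapper around fcntl() locking, not the C lockf(3).
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            return False
        return True
    finally:
        # Closing the descriptor drops both locks.
        os.close(fd)


if __name__ == "__main__" and len(sys.argv) > 1:
    granted = try_dual_lock(sys.argv[1])
    print("both locks granted" if granted else "fcntl() lock refused")
```

Run it once against an existing file on local disk and once against an existing file on an NFS mount and compare the results.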
Hence the common symptom of the problem: if you upgraded your system such that you crossed over the 2.6.12 version boundary, Samba's dual locking went from non-conflicting to conflicting (on NFS filesystems only) if the Windows client program made just the right sort of locking requests. Evidently Office 2003's 'save in Office 2003 format' code does so, and other programs do not.
(I believe that Samba takes a read flock() when clients open files, so from the symptoms it looks like Office 2003 was trying to acquire a write lock of some sort when you opened or saved the file. I can see why this was changed in subsequent versions of Office.)