2014-04-17
Partly getting around NFS's concurrent write problem
In a comment on my entry about NFS's problem with concurrent writes, a commentator asked this very good question:
So if A writes a file to an NFS directory and B needs to read it "immediately" as the file appears, is the only workaround to use low values of actimeo? Or should A and B be communicating directly with some simple mechanism instead of setting, say, actimeo=1?
(Let's assume that we've got 'close to open' consistency to start with, where A fully writes the file before B processes it.)
If I were faced with this problem and I had a free hand with A and B, I would make A create the file with some non-repeating name and then send an explicit message to B saying 'look at file <X>' (using, e.g., a TCP connection between the two). A should probably fsync() the file before it sends this message to make sure that the file's on the server. The goal of this approach is to avoid B's kernel having any cached information about whether or not file <X> might exist (or what the contents of the directory are). With no cached information, B's kernel must go ask the NFS fileserver and thus get accurate information back. I'd want to test this with my actual NFS server and client just to be sure (actual NFS implementations can be endlessly crazy) but I'd expect it to work reliably.
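A's side of this can be sketched roughly as follows, in Python. This is only an illustration under assumptions: the function names, the `msg-<uuid>.dat` naming scheme, and the one-filename-per-line message format are all made up for the example, and I haven't tested it against a real NFS server.

```python
import os
import socket
import uuid


def publish_file(directory: str, data: bytes) -> str:
    """Write data under a never-reused name and push it to the server."""
    # Hypothetical naming scheme: a random UUID will not repeat in practice.
    name = f"msg-{uuid.uuid4().hex}.dat"
    path = os.path.join(directory, name)
    # Mode "xb" fails instead of silently reusing an existing filename.
    with open(path, "xb") as f:
        f.write(data)
        f.flush()
        # fsync() before telling B, so the data is on the fileserver.
        os.fsync(f.fileno())
    return name


def notify(conn: socket.socket, name: str) -> None:
    """Tell B which file to look at (assumed format: one name per line)."""
    conn.sendall(name.encode("utf-8") + b"\n")
```

A would call `publish_file()` first and only then `notify()` on an already connected socket to B, so that B first hears about the name after the file is complete and closed.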
Note that it's important not to reuse filenames. If A ever reuses a filename, B's kernel may have stale information about the old version of the file cached; at best this will get B a stale filehandle error and at worst B will read old information from the old version of the file.
If you can't communicate between A and B directly and B operates by scanning the directory to look for new files, you have a moderate caching problem. B's kernel will normally cache information about the contents of the directory for a while and this caching can delay B noticing that there is a new file in the directory. Your only option is to force B's kernel to cache as little as possible (for example with a low actimeo, as in the question). Note that if B is scanning it will presumably only be scanning, say, once a second and so there's always going to be at least a little processing lag (and this processing lag would happen even if A and B were on the same machine); if you really want 'immediately', you need A to explicitly poke B in some way no matter what.
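For illustration, B's scanning loop might look something like this Python sketch. The function names and the `rounds` parameter are invented for the example, and it assumes that a file only shows up in the directory once A has completely written it. It does nothing about the kernel's directory caching, which is a matter of mount options, so the lag B sees is the scan interval plus however long B's kernel holds on to its cached view of the directory.

```python
import os
import time


def scan_once(directory, seen):
    """Return names not seen in any earlier scan, and remember them."""
    new_names = sorted(n for n in os.listdir(directory) if n not in seen)
    seen.update(new_names)
    return new_names


def watch(directory, handle, interval=1.0, rounds=None):
    """Call handle(path) for each new file; scan every `interval` seconds.

    rounds=None scans forever; a number limits the scans (handy for tests).
    """
    seen = set()
    count = 0
    while rounds is None or count < rounds:
        for name in scan_once(directory, seen):
            handle(os.path.join(directory, name))
        time.sleep(interval)
        count += 1
```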
(I don't think it matters what A's kernel caches about the directory, unless there's communication that runs the other way such as B removing files when it's done with them and A needing to know about this.)
Disclaimer: this is partly theoretical because I've never been trapped in this situation myself. The closest I've come is safely updating files that are read over NFS.