Wandering Thoughts archives

2013-10-21

NFS's problem with (concurrent) writes

If you hang around distributed filesystem developers, you may hear them say grumpy things about NFS's handling of concurrent writes and writes in general. If you're an outsider, this can be a little opaque. I didn't fully remember the details until I was reminded of them recently, so in my usual tradition I am going to write down the core problem. To start with, I should say that the core problem is with NFS the protocol, not any particular implementation.

Suppose that you have two processes, A and B. A is writing to a file and B is reading from it (perhaps they are cooperating database processes or something). If A and B are running on the same machine, the moment that A calls write(), the newly written data is visible to B when it next does a read() (or it's directly visible if B has the file mmap()'d). Now we put A and B on different machines, sharing access to the file over NFS. Suddenly we have a problem, or actually two problems.

First, NFS is silent on how long A's kernel can hold on to the write() before sending it to the NFS server. If A close()s or fsync()s the file, the kernel must ship the writes off to the NFS server, but before then it may hang on to them for some amount of time at its convenience. Second, NFS has no protocol for the server to notify B's kernel that there is updated data in the file. Instead B's kernel may be holding on to what is now old cached data that it will quietly give to B, even though the server has new data. Properly functioning NFS clients check for this when you open() a file (and discard old data if necessary); I believe that they may also check at other times, but that's not guaranteed.
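To make the first problem concrete, here is a minimal C sketch of the writer side; the path /nfs/shared/data is a made-up example of a file on an NFS mount. The point is that only fsync() (or close()) forces A's kernel to actually push the written data to the NFS server.

    /* Writer (process A): a minimal sketch. The path is hypothetical. */
    #include <fcntl.h>
    #include <string.h>
    #include <unistd.h>

    int main(void) {
        int fd = open("/nfs/shared/data", O_WRONLY | O_CREAT, 0644);
        if (fd < 0)
            return 1;
        const char *msg = "new record\n";
        /* The client kernel may buffer this write locally for a while;
           NFS puts no deadline on when it must reach the server. */
        if (write(fd, msg, strlen(msg)) < 0)
            return 1;
        /* fsync() forces the client kernel to ship the buffered writes
           to the NFS server now, rather than at its convenience. */
        if (fsync(fd) < 0)
            return 1;
        close(fd);  /* close() also flushes any outstanding writes */
        return 0;
    }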

The CS way of putting this is that it's a distributed cache invalidation problem and NFS has only very basic support for it. Basically NFS punts and tells you to use higher-level mechanisms to make this work, mechanisms that mean A and B have to be at least a bit NFS-aware. Many modern distributed and cluster filesystems have much more robust support that guarantees processes A and B see a result much closer to what they would if they ran on the same machine (some distributed FSes probably guarantee that it's basically equivalent).

(Apparently one term of art for this is that NFS has only 'close to open' consistency, i.e. you only get consistent results among a pool of clients if A closes the file before B opens it.)
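A sketch of what the reader side has to do under close-to-open consistency follows (same made-up path as above): B re-opens the file every time it wants fresh data, because open() is the moment its kernel revalidates its cached copy against the server.

    /* Reader (process B): a minimal sketch of polling under
       close-to-open consistency. Path and interval are assumptions. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        char buf[4096];
        for (;;) {
            /* open() is where a well-behaved NFS client checks the
               server for newer data and discards its stale cache. */
            int fd = open("/nfs/shared/data", O_RDONLY);
            if (fd >= 0) {
                ssize_t n = read(fd, buf, sizeof(buf) - 1);
                if (n > 0) {
                    buf[n] = '\0';
                    printf("saw: %s", buf);
                }
                close(fd);
            }
            sleep(1);  /* NFS gives B no change notification, so poll */
        }
        return 0;
    }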

NFSWritePlusReadProblem written at 23:56:14

2013-10-10

Sun's NeWS was a mistake, as are all toolkit-in-server windowing systems

One of the great white hopes of the part of the Unix world that never liked X Windows was Sun's NeWS. Never mind all of its practical flaws, all sorts of people held NeWS up as the better way and the bright future that could have been if only things had been different (by which they mean if people had made the 'right' choice instead of settling for X Windows). One of the reasons people often give for liking NeWS is that it put much of the windowing toolkit into the server instead of forcing every client to implement it separately.

Unfortunately for all of these people, history has fairly conclusively shown that NeWS was a mistake. Specifically, the core design of putting as much intelligence as possible into the server instead of the clients has turned out to be a terrible idea. There are at least two big reasons for this.

The first is parallelization. In the increasingly multi-core world, you desperately want as much concurrent processing as possible and it's much easier to run several clients in parallel than it is to parallelize a single server. Even if you do get equal parallelization, separate clients are inherently more resilient because the operating system intrinsically imposes a strong separation of address space and so on, something that's very hard to get in a server where everything is jumbled together.

(I believe that this is one reason that modern X font rendering has been moved from the server to the client. Xft font rendering is increasingly complex and CPU-consuming, so it's better to stick clients with that burden than dump all of it on the server.)

The second is that if you put the toolkit in the server you make evolving the toolkit and its API much more complicated and problematic. The drawback of having everyone use the server toolkit is that everyone has to use the same server toolkit. Well, not completely. You can introduce a mechanism to have multiple toolkit versions and APIs all in the same server and allow clients to select which one they want or need and so on and so forth. The mess of a situation with the current X server and its extensions makes a very educational example of what happens if you go down this path; not very much of it is good.

(Some X extensions are in practice mandatory but still must be probed for and negotiated by the clients, while others are basically historical relics but they still can't be dropped because some client somewhere may ask for them.)
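As an illustration of that probing and negotiation, here is a minimal Xlib sketch using the real XQueryExtension() call; MIT-SHM is just a convenient example of an extension to ask about, not anything special here. Compile with -lX11.

    /* A sketch of run-time extension probing by an X client. */
    #include <stdio.h>
    #include <X11/Xlib.h>

    int main(void) {
        Display *dpy = XOpenDisplay(NULL);
        if (!dpy) {
            fprintf(stderr, "cannot open display\n");
            return 1;
        }
        int opcode, event, error;
        /* Every client that wants an extension has to ask the server
           whether it is there before it can use it. */
        if (XQueryExtension(dpy, "MIT-SHM", &opcode, &event, &error))
            printf("MIT-SHM supported (opcode %d)\n", opcode);
        else
            printf("MIT-SHM not supported; client must fall back\n");
        XCloseDisplay(dpy);
        return 0;
    }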

Toolkits in the client push the burden of dealing with the evolution of the toolkit onto the clients. It is clients that carry around old or new versions of the toolkit, with various APIs, and you naturally have old toolkit versions (and even old toolkits) go away entirely when they are no longer used by any active clients (or even any installed clients, when things get far enough).

(I'm ignoring potential security issues for complex reasons, but they may be a good third reason to be unhappy with server-side toolkits.)

NeWSWasAMistake written at 00:53:05

