Concurrency is tricky

September 13, 2005

Concurrency may or may not be hard (I know people who disagree with me on that). But I am sure that it is tricky. As an illustration, I just fixed a DWiki concurrency bug that I first spotted in MyFirstCommentSpam.

For simplicity, DWiki stores each comment as a file in a directory hierarchy that mirrors the page's DWiki path; if you comment on the DWiki page /foo/bar, the comment will be a file in a /foo/bar/ directory (under a separate top-level directory). DWiki makes these directory hierarchies on demand; the first time someone comments on a DWiki page, DWiki makes all of the elements of the comment directory hierarchy that don't already exist.

DWiki does this with code like this:

try:
  if not os.path.isdir(loc):
    os.makedirs(loc)
except EnvironmentError as e:
  raise ... an internal error

(os.makedirs() conveniently makes all of the directories in one shot, like 'mkdir -p'.)

This is perfectly ordinary code and I didn't think twice about it. Except that there's a concurrency problem: if two (or more) comments on the same page are posted at the same time, and this is the first time the page has been commented on, they can race in this small section. Both can see no directory, then start os.makedirs(), but only one will actually make it; the other one will eventually try to mkdir() a directory that already exists, which is an error.
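The losing interleaving can be replayed by hand in a single process, which is easier than trying to provoke the real race. This is only a sketch, written for Python 3; the function name and the temporary directory are made up for illustration and are not part of DWiki.

```python
import errno
import os
import shutil
import tempfile

def replay_check_then_create_race():
    """Replay the losing interleaving by hand; return the error B gets."""
    base = tempfile.mkdtemp()  # hypothetical stand-in for the comment root
    loc = os.path.join(base, "foo", "bar")
    try:
        # Commenter B checks first and sees no directory ...
        b_saw_dir = os.path.isdir(loc)
        # ... then commenter A runs its whole check-and-create sequence.
        if not os.path.isdir(loc):
            os.makedirs(loc)
        # B resumes and acts on its now-stale check.
        try:
            if not b_saw_dir:
                os.makedirs(loc)
        except EnvironmentError as e:  # an alias of OSError in Python 3
            return e
        return None
    finally:
        shutil.rmtree(base)
```

The returned error is the "directory already exists" failure: B's check was true when it was made and false by the time B acted on it.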

The truly reliable cure requires much more complicated code, because you cannot just do:

try:
  os.makedirs(loc)
except EnvironmentError as e:
  pass
# ... do things

The problem is that os.makedirs() can also fail on an intermediate directory, not just on the final one. Suppose processes A and B are both trying to make all of the directories in /foo/bar/baz/:

  1. process A makes /foo/.
  2. process A is preempted.
  3. process B, which has already seen that /foo/ is missing, tries to make /foo/. Because it now exists, os.makedirs() errors out.
  4. process B ignores the error and continues on, assuming that /foo/bar/baz/ now exists.
  5. incorrectness ensues.
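The steps above can be replayed in one process. This is a sketch under loud assumptions: naive_makedirs() is a hypothetical stand-in shaped like a "check the parent, recurse, then mkdir" routine, not the actual library source, and the between_check_and_mkdir hook exists only so that "process A" can be made to run at exactly the awkward moment.

```python
import os
import shutil
import tempfile

def naive_makedirs(name, between_check_and_mkdir=None):
    """Hypothetical makedirs: check the parent, recurse, then mkdir."""
    head, tail = os.path.split(name)
    if not tail:
        head, tail = os.path.split(head)
    if head and tail and not os.path.exists(head):
        # Illustration-only hook: lets another "process" run right after
        # we have decided that head is missing.
        if between_check_and_mkdir is not None:
            between_check_and_mkdir(head)
        naive_makedirs(head, between_check_and_mkdir)
    os.mkdir(name)

def replay_intermediate_race():
    """Return (foo exists, target exists) after B swallows the error."""
    base = tempfile.mkdtemp()
    foo = os.path.join(base, "foo")
    target = os.path.join(foo, "bar", "baz")

    def process_a(missing_dir):
        # Process A makes foo just after B has seen it missing.
        if missing_dir == foo:
            os.mkdir(foo)

    try:
        try:
            naive_makedirs(target, process_a)  # this is process B
        except EnvironmentError:
            pass  # the "simple cure": ignore the error and carry on
        return os.path.isdir(foo), os.path.isdir(target)
    finally:
        shutil.rmtree(base)
```

Under these assumptions the replay ends with foo present and the full target path absent, which is the state process B wrongly believes cannot happen.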

The concurrency problem with the simple solution is not in my code; it's in how os.makedirs() is implemented. To know about it, I have to either examine the code or run experiments, and producing concurrency races on demand is not trivial. (Fortunately, I can examine the code in this situation.)

Concurrency is tricky because it's easy to overlook cases. And it's not just your code that matters, it's also all the library routines or standard modules that your code depends on. And the authors may not have so much overlooked cases as considered them outside the specification.

Sidebar: the concurrency safe makedirs()

The trick is to modify os.makedirs() slightly to only raise an error after os.mkdir() if the target directory still isn't there, since that's the condition that we really care about. The result:

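import os

def makedirs(name):
  # Split off the last path component, ignoring any trailing slash.
  head, tail = os.path.split(name)
  if not tail:
    head, tail = os.path.split(head)
  # Make any missing parent directories first.
  if head and tail and not os.path.exists(head):
    makedirs(head)
  try:
    os.mkdir(name)
  except EnvironmentError:
    # Losing a race is fine; only complain if the directory
    # really isn't there afterwards.
    if not os.path.isdir(name):
      raise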
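
One way to gain a little confidence in this version is to have several threads make the same fresh hierarchy at once. The harness below is a sketch for Python 3, not something from DWiki: hammer(), the thread count, and the temporary directory are all made up for illustration, and makedirs() is repeated so that the block stands alone.

```python
import os
import shutil
import tempfile
import threading

def makedirs(name):
    """The sidebar's makedirs(), repeated so this block is self-contained."""
    head, tail = os.path.split(name)
    if not tail:
        head, tail = os.path.split(head)
    if head and tail and not os.path.exists(head):
        makedirs(head)
    try:
        os.mkdir(name)
    except EnvironmentError:
        if not os.path.isdir(name):
            raise

def hammer(n_threads=8):
    """Have n_threads create the same path at once; return (errors, exists)."""
    base = tempfile.mkdtemp()
    target = os.path.join(base, "foo", "bar", "baz")
    start = threading.Barrier(n_threads)  # release all workers together
    errors = []

    def worker():
        start.wait()
        try:
            makedirs(target)
        except EnvironmentError as e:
            errors.append(e)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    try:
        return errors, os.path.isdir(target)
    finally:
        shutil.rmtree(base)
```

A clean run does not prove the absence of a race; it only shows that none of the workers tripped this time. On Python 3.2 and later, os.makedirs(name, exist_ok=True) is meant to tolerate a target that already exists, so it may make a hand-written version unnecessary there.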
Written on 13 September 2005.

Last modified: Tue Sep 13 22:41:16 2005