A fun bug I once ran across

October 9, 2009

Here's an interesting and fun bug (in C) that I once ran across, years ago. Suppose that you have a hash table, with entries chained off each other in case of a hash collision, and a function to remove an element from it that looks like this:

void remove(elem *mp)
{
  elem *np;

  np = table[HASH(mp)];
  if (np == mp) {
    table[HASH(mp)] = mp->next;
    return;
  }

  while (np != NULL) {
    if (np->next == mp) {
      np->next = mp->next;
      break;
    }
  }
  return;
}

Since I have told you that there's a bug here, you can probably spot it; the while loop doesn't actually traverse down the chain (it's missing an 'np = np->next;' at the bottom).

What makes this an interesting bug is that this mistake is actually somewhat obscure. The infinite loop only happens if you have a chain that's at least three elements long and you're not trying to remove the first or second element, so how often the bug triggers depends on how sparsely populated the hash table is and on the removal pattern for elements.

In the actual case where I ran across this, it was in fact relatively rare for the infinite loop to trigger, or at least it was rare enough that this code could pass testing and be deployed before it started blowing up (and it did not blow up widely).

(If you check the code coverage of your tests, this is one of the cases where it is important to look for not just 100% code coverage but that both branches of all decision points have been followed. Technically you can get 100% code coverage here without exposing the bug since the problem is missing code, not incorrect code.)

Written on 09 October 2009.
« What is going on with faulted ZFS spares
You should delete obsolete data files »

Page tools: View Source.
Search:
Login: Password:

Last modified: Fri Oct 9 00:17:47 2009
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.