An obvious realization about SQL foreign keys

January 30, 2011

In the process of writing the last entry, I finally realized something about SQL foreign keys. Specifically, about what you could call their lifetime, by analogy to object lifetimes in garbage collected languages.

Everyone knows (hopefully) that you don't want to have long-lived objects hold references to what should be short-lived objects. In a GC'd environment such (strong) references convert the short-lived objects into long-lived ones; they can't be GC'd before the long-lived object drops its reference to them, and that's normally not going to happen until the long-lived object itself is deleted. Poof, some of your short-term objects are now long-term.

In light of the issue with deleting records from the last entry, it's clear that foreign key relationships need to have a similar respect for entry lifetimes. A table for long-lived entries should not have an explicit foreign key to a table for short-lived ones. Instead, you can only safely have foreign key pointers to tables where the data is at least as long-lived as entries in your table.

(In SQL the problems can implicitly run the other way; rather than your short-lived objects being long-lived, your long-lived objects suddenly become short-lived as they are cascade-deleted for you.)

This is especially silly problem to give yourself in SQL. Unlike with objects, in SQL you don't need an explicit foreign key relationship in order to link two tables together. You can trivially construct the SQL equivalent of weak references by just not specifying your field as an explicit foreign key; you still join on it as usual. In ORMs you will generally have to do a little bit more work, but it's not crushingly more.

In retrospect this makes using any explicit foreign keys in my audit table schema an obviously bad idea. Audit table entries are likely to be the longest-lived objects in my database; making them depend on short-lived things like account requests is about as classical an entity lifetime mistake as I could have made. And it's not as if I need to query through the foreign key relationship very often; in fact, not at all in the actual code so far. (As far as the code is concerned, audit entries are write-only.)

I'll definitely have to bear this in mind for any future database schemas that I design.

Comments on this page:

From at 2011-01-30 09:34:30:

This is a very helpful insight. Thank you.

You can have ON DELETE SET NULL on foreign keys, so you don’t lose records, but you’re still losing data, as you can no longer tell whether different audit table rows referred to the same foreign table row or to different ones.

Aristotle Pagaltzis

Written on 30 January 2011.
« Two things I have discovered about Django schema design
A little advantage of Django's automatic primary keys »

Page tools: View Source, View Normal, Add Comment.
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Sun Jan 30 02:21:18 2011
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.