My issues with primary keys are not Django specific

February 2, 2011

A commentator on my recent entry about learning from my Django schema design experience wrote, about me abandoning explicit primary keys:

I'm not sure your data will be best served by giving up the database schema you want in deference to the conventions and limitations of the (Django ORM) framework.

I want to clarify something: the issues in my schema that are driving me away from using explicit primary keys and to surrogate keys are not Django specific.

I've now designed two database schemas, one purely in SQL and one in Django schema. One of the major differences between them is where most of the data came from and how immutable it was once entered. In my pure SQL design most of the data (and almost all of the primary keys) came from automated systems elsewhere. One primary key was entered by users, but it had to be immutable once entered. It's easy to use explicit primary keys in a database schema like this, because errors and changes are rare (you hope) and you can't deal with them anyways; they have to be dealt with by the people and the upstream systems that build data for you.

(Attempts to fix errors and change things locally will just result in you getting out of sync with your data sources. This rarely ends well.)

My current schema is not like this. Essentially all of the data is entered by hand directly into my database and so is subject to errors, which we are on the hook to correct. Many of the natural primary keys actually are subject to change, even if that change is infrequent (people's names do change, and so sometimes do their logins, for good reason). One otherwise natural primary key can sometimes be null, and this is not supported for UNIQUE fields by all database engines (even apart from other issues). This database design is simply not a good fit for explicit primary keys, no matter what ORM or SQL database I use for it. If I did use explicit primary keys, I would expect to be doing a reasonable amount of re-stitching foreign key relationships every so often.

(This assumes that I do not make entry lifetime mistakes like linking my audit table entries directly to account requests. That too would be a problem in any database.)

So, in summary: not all database situations can sensibly use explicit primary keys. Sometimes surrogate keys really are the right choice, not something 'forced' on you by using inferior or awkward technology.

Written on 02 February 2011.
« There are two sorts of standards in the world
Django and primary keys versus surrogate keys »

Page tools: View Source, Add Comment.
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Wed Feb 2 02:29:50 2011
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.