Why 'file as blog entry' blog engines have problems

June 7, 2008

One enduringly popular model for blogging engines is that entries are files in the filesystem, and the blog engine just wraps them up in various simple ways. This approach has a clear and attractive simplicity, but as I have found out from following this route myself, that simplicity hides a number of subtle problems.

All of the problems can be summarized in one word: metadata. Blog entries have (or need) quite a lot of metadata associated with them, and making files your entries does not give you very many good places to put this metadata:

  • you can infer the metadata from things surrounding the file itself; for example, you can infer the entry's publication date from the file's last modification time.

    There are two problems with this: first, this doesn't cover all of the metadata you need, and second, this creates awkward problems when the blog engine's use of file metadata clashes with things you want to do with the file, such as updating an entry without changing its publication date.

  • you can embed the metadata in the file itself, but this clutters up the file contents and makes authoring more annoying.

    (If you go this route, I suggest putting the metadata at the end of the file, not the start, so that it is at least less obtrusive.)

  • you can have your blog engine invent or learn the necessary metadata the first time it sees a new entry and record it somewhere else so that it's stable.

    The problem here is that you've effectively created a database (or you're using a real one), with all of the associated management issues, but things are half in the database and half outside for extra fun.
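
To make the third option concrete, here is a minimal sketch of it in Python. It is only an illustration, not how any particular blog engine does it: the `MetadataStore` class, the `metadata.json` sidecar file, and the `published` field are all names invented for the example. The first time it sees an entry it infers the publication date from the file's modification time, then records that date so later edits don't move it.

```python
import json
import os
import tempfile
from pathlib import Path


class MetadataStore:
    """Hypothetical sidecar store: remembers metadata the first time an entry is seen."""

    def __init__(self, store_path):
        self.store_path = Path(store_path)
        if self.store_path.exists():
            self.records = json.loads(self.store_path.read_text())
        else:
            self.records = {}

    def publication_time(self, entry_path):
        """Return the recorded publication time, learning it from mtime if new."""
        key = Path(entry_path).name
        if key not in self.records:
            # First sighting: infer the date from the file's mtime and pin it.
            self.records[key] = {"published": os.stat(entry_path).st_mtime}
            self.store_path.write_text(json.dumps(self.records, indent=2))
        return self.records[key]["published"]


def demo():
    """Show that editing an entry later does not change its recorded date."""
    with tempfile.TemporaryDirectory() as tmp:
        entry = Path(tmp) / "first-post"
        entry.write_text("Hello, world.\n")
        os.utime(entry, (1_000_000_000, 1_000_000_000))  # pretend it is old

        store = MetadataStore(Path(tmp) / "metadata.json")
        first = store.publication_time(entry)

        # Update the entry; its mtime moves, the recorded date does not.
        entry.write_text("Hello, world. (edited)\n")
        os.utime(entry, (1_500_000_000, 1_500_000_000))

        # Reload the store from disk to show the date survives a restart.
        again = MetadataStore(Path(tmp) / "metadata.json").publication_time(entry)
        return first, again
```

The sketch also shows the catch. The entry and its date now live in two places, and because the record is keyed by filename, renaming or deleting the file leaves a stale record behind that something has to clean up.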

One popular additional non-answer is that you can decide to ignore the need for certain sorts of metadata; however, this will limit what your blog engine can do and periodically cause explosions.

(Ironically, a stable master identifier for entries is one of the easier things to arrange; if the filename is otherwise meaningless, and it probably is since you can't really use it as the entry title, you can just use it for the master identifier. Of course this will give you a blog directory full of peculiarly named things that you can never rename, but that's life without metadata.)
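
As a rough sketch of what using the filename as the master identifier looks like (the function names and the example base URL are hypothetical, chosen only for illustration):

```python
from pathlib import Path


def entry_id(entry_path):
    """Use the bare filename as the entry's permanent identifier."""
    return Path(entry_path).name


def entry_permalink(base_url, entry_path):
    """Build a permalink from the identifier; renaming the file changes this URL."""
    return base_url.rstrip("/") + "/" + entry_id(entry_path)
```

Here `entry_permalink("https://example.org/blog/", "entries/FileBlogProblems")` yields `https://example.org/blog/FileBlogProblems`. Since nothing else records the identifier, renaming the file silently changes the entry's identity and breaks every existing link to it.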

Written on 07 June 2008.

Last modified: Sat Jun 7 00:34:01 2008