Your web application should have an audit log

January 18, 2014

One of the smartest things I did when I was writing our web-based account request management system was giving it an audit log. Pretty much every time the database gets changed, the web app writes an audit record about it that captures all of the high-level details (which user or what automated process, from what IP if applicable, doing what, and so on). Having this audit log has had two advantages.

The first, obviously, is that it tells you what happened (and why). There have been at least three general situations where this is useful. The obvious one is when you're trying to determine what happened, i.e. who did what and when, so that you can tell people about it. The less obvious one (for me) is checking to make sure that certain things actually did happen, generally things done by automated processes. Finally, the audit log is a great place to get an overview of what's been happening on the system, since it's a single spot that sees all activity.

(There are other uses of audit logs, for example generating certain sorts of usage information based on people's logged activities.)
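To make these uses concrete, here is a minimal sketch of the sort of queries involved, using Python's sqlite3 module. Everything specific in it is an assumption made for illustration: the audit_log table layout, the column names, the sample records, and the actions_by and activity_counts helpers are hypothetical and are not taken from the real application.

```python
import sqlite3


def build_demo_log():
    """Create an in-memory audit log table with a few made-up records."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical layout: every field is plain text, one row per action.
    conn.execute(
        """CREATE TABLE audit_log (
               id        INTEGER PRIMARY KEY,
               logged_at TEXT NOT NULL,   -- timestamp stored as plain text
               actor     TEXT NOT NULL,   -- user login or automated process name
               source_ip TEXT,            -- NULL when there is no client IP
               action    TEXT NOT NULL,
               details   TEXT NOT NULL
           )"""
    )
    # Invented sample records, purely for the demonstration.
    conn.executemany(
        "INSERT INTO audit_log (logged_at, actor, source_ip, action, details) "
        "VALUES (?, ?, ?, ?, ?)",
        [
            ("2014-01-15 09:12:00", "prof_a", "192.0.2.10",
             "approve-request", "approved account request for student_x"),
            ("2014-01-16 03:00:00", "cron:expire-requests", None,
             "expire-request", "expired stale request for student_y"),
            ("2014-01-16 10:30:00", "prof_a", "192.0.2.10",
             "reject-request", "rejected account request for student_z"),
        ],
    )
    conn.commit()
    return conn


def actions_by(conn, actor):
    """Return (logged_at, action, details) rows for one actor, oldest first.

    Useful both for "who did what when" and for checking that an
    automated process really did do its job.
    """
    cur = conn.execute(
        "SELECT logged_at, action, details FROM audit_log "
        "WHERE actor = ? ORDER BY logged_at, id",
        (actor,),
    )
    return cur.fetchall()


def activity_counts(conn):
    """Return {actor: number of audit records}, a crude usage overview."""
    cur = conn.execute("SELECT actor, COUNT(*) FROM audit_log GROUP BY actor")
    return dict(cur.fetchall())
```

With a log like this, actions_by(conn, "cron:expire-requests") answers whether the automated expiry actually ran, and activity_counts(conn) gives a rough picture of who has been active.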

The more subtle advantage of having an audit log in this application is that it's simply reassuring, even if we never actually need it. If we come in some day and the entire thing is a complete mess, I know that we have a reasonable chance of sorting out what happened and maybe why. If a professor has questions or concerns about something, we can see what happened and at least take reasonable guesses about why. We are not going to be left looking at web server logs of POST requests from hither and yon and trying to figure out what might have happened.

In order to make all of this work, the audit log needs to capture not just the database-level changes that were made but also the surrounding context. In my case this is what authenticated user or automated process did the action, what IP address they came from, what part of the application they were using (you can often do a particular low-level DB operation in more than one place), and so on. The important thing is that you be able to reconstruct not just what happened but who was doing what at the time.
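As a rough illustration of capturing that context, here is a minimal sketch using Python's sqlite3 module. It is not the real application's code: the accounts and audit_log tables, their columns, and the record_audit and approve_account helpers are all hypothetical. Writing the audit record in the same transaction as the change is also my assumption about a reasonable design, so that you never get one without the other.

```python
import sqlite3
from datetime import datetime, timezone


def create_tables(conn):
    """Create a hypothetical application table plus the audit log table."""
    conn.execute("CREATE TABLE accounts (login TEXT PRIMARY KEY, status TEXT NOT NULL)")
    conn.execute(
        """CREATE TABLE audit_log (
               id        INTEGER PRIMARY KEY,
               logged_at TEXT NOT NULL,
               actor     TEXT NOT NULL,   -- authenticated user or automated process
               source_ip TEXT,            -- NULL when not applicable
               app_area  TEXT NOT NULL,   -- which part of the application was used
               action    TEXT NOT NULL,
               details   TEXT NOT NULL    -- the important information, as plain text
           )"""
    )


def record_audit(conn, *, actor, source_ip, app_area, action, details):
    """Insert one audit record; the caller controls the transaction."""
    now = datetime.now(timezone.utc).isoformat(timespec="seconds")
    conn.execute(
        "INSERT INTO audit_log (logged_at, actor, source_ip, app_area, action, details) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (now, actor, source_ip, app_area, action, details),
    )


def approve_account(conn, login, *, actor, source_ip, app_area):
    """Change the database and write the matching audit record together."""
    # "with conn" commits if the block succeeds and rolls back on an exception,
    # so the change and its audit record are saved or discarded as a pair.
    with conn:
        conn.execute("UPDATE accounts SET status = 'approved' WHERE login = ?", (login,))
        record_audit(
            conn,
            actor=actor,
            source_ip=source_ip,
            app_area=app_area,
            action="approve-account",
            details=f"approved account {login}",
        )
```

Passing the context in as explicit keyword arguments makes it hard to forget a field, and the same low-level operation can be called from different parts of the application with a different app_area each time.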

(I don't actually log the full DB-level changes, just what I consider to be the important information from them. Possibly I should and maybe next time I will; there has been a time or two when I wanted somewhat more information than the audit log provided. On the other hand it's a lot easier to not have to figure out how to encode or embed everything into the audit log.)

PS: There are much more elaborate and complete ways to audit database changes and capture information. My perception is that they generally take more work than simply writing audit log records every time you change things. In our web app the audit log is just another DB table that is basically plain text and the records are generated by hand.

PPS: Don't put foreign key fields into your audit records. Really. Learn from my mistakes.

Written on 18 January 2014.

Last modified: Sat Jan 18 02:37:33 2014