Understanding SQL placeholders

May 27, 2013

As I got wrong once before, the correct way to secure your SQL against SQL injection attacks is not to try to quote all your arguments (even if you do it in a library) but to use SQL placeholders instead. This raises the obvious question of what are SQL placeholders and how do they differ from correctly quoting your SQL arguments.

The simple answer is that SQL placeholders are transmitted to the SQL server separately from your SQL statements. To make a terrible analogy, SQL placeholders are much like using exec*() and separate arguments to run Unix programs securely. In both cases the underlying protocol itself insures that arguments can never be confused with the SQL command or the program to be run.

(Now, a caution: in order for this to work the client libraries, the SQL server, and the communication protocol between the two all have to support SQL placeholders. I believe that everything you want to use as a SQL database these days properly supports placeholders, but I haven't exhaustively checked.)

SQL placeholders are also part of prepared SQL statements. Prepared statements are basically pre-parsed, pre-optimized SQL statements; you send the generic version of the statement to the SQL server, it sets it up once, and then you can repeatedly run it with different specific placeholder values filled in. This avoids the overhead of both the parsing (which I suspect is generally small) and the query optimization (which might be significant for relatively complex queries), but of course it may only pay off if you repeat the same prepared query enough times.

(At the same time any extra overhead may be insignificant for a small application that isn't stressing database performance very much in the first place.)

Single use prepared statements can apparently be slower than non-prepared ones under some circumstances; I suspect that this because you have to make two round trips to the SQL server instead of one before getting your answer. I don't know if any common SQL server can do SQL placeholders outside of prepared statements. Since you really want to use SQL placeholders for security under most circumstances, there isn't really very much you can do about this.

(This is partly the kind of entry I write to get things straight in my own head.)

Sidebar: client library support is tricky

Many languages and environments these days have common, database independent 'database access' APIs (Python has its DB-API, for example). These APIs are usually defined with placeholders for all of the obvious reasons. However this doesn't guarantee that a particular implementation of the DB API for a particular database actually implements SQL placeholders (even if the underlying protocol and database support them). Hopefully the documentation will tell you if the implementation actually supports SQL placeholders or just seems to and is faking them with quoting behind the scenes.

(If not, your SQL database may report statistics that will let you determine this.)

You might ask yourself why on earth anyone would implement a non-placeholder client library if all of the other pieces support SQL placeholders and prepared statements and so on. My impression is that people do it because the non-placeholder communication protocol is easier to implement than supporting prepared statements and placeholders. If they're trying to get some interface, any interface off the ground, it's easier to send straight SQL and hope they get all of the quoting correct.

(Some web searches suggest that there are client libraries like this for young platforms.)

Written on 27 May 2013.
« Our situation with ZFS and 4 Kb physical sector disks
How you should package local-use configuration files »

Page tools: View Source.
Search:
Login: Password:

Last modified: Mon May 27 22:17:08 2013
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.