The advantage of garbage collection for APIs

April 21, 2010

Here's a question that interests me: why don't I have a standard C version of my Python warn() function? The Python version shows up in most programs I write, but in my C programs I generally directly fprintf() errors to stderr, instead of having a utility function for it.

Part of the reason is that a C version of warn() really calls for a different and more complicated set of arguments. Instead of a single string, the natural C-ish approach is printf()-like, creating a prototype of 'warn(const char *fmt, ...)'. In turn this makes warn() a non-trivial function, because varargs functions in C are vaguely annoying.

(Things get even more interesting when one implements die(), since you can't have it call warn(); either you duplicate warn()'s code or you wind up needing a vwarn() with the prototype 'vwarn(const char *fmt, va_list ap)' that both warn() and die() call.)

But why does the C version of warn() need a different API than the Python one? One answer is that Python has a first class operator to format printf-like strings (the string '%' operator, in a marvelous abuse of operator overloading) and C doesn't, so in Python it's idiomatic to format strings directly in your code, instead of handing the format and the arguments to another function. This isn't the whole story, though, since C has sprintf() and could do much of the same tricks if people wanted to.

The problem with sprintf() (and why in practice it is not used for this) is that you have to give it a buffer. If you use sprintf() you have to think about how big a buffer you need and what you do if it isn't big enough. If you use fprintf(), you don't have to think about all of that, so people use fprintf(); in C, people will go to a lot of effort to avoid having to think about buffer issues.

And the reason that sprintf() needs you to supply the buffer is so that no one has to worry about allocating and especially deallocating the buffer, because memory management is a pain in the rear in C. Not just because you have to do it, but also because you have to decide where it's done and which function has to look after it and this generally complicates the code. (For example, warn() can't just free the string it's passed, which means that you can't write 'warn(strformat(fmt, ...))' because this leaks the string.)

Thus I come to the conclusion:

A subtle advantage of garbage collection is that it enables APIs that would otherwise be impossible, or at least dangerous.

Garbage collection is the essential glue that enables my Python versions of warn() and die() to have their simple APIs, because it is what makes Python's '%' operator convenient (among many other Python APIs). You could implement all of the operations of the Python version in C, but you shouldn't; in a non-GC'd language, object allocation is to be avoided as much as possible, and certainly turned into someone else's problem.

(Hence sprintf() does not allocate anything; finding a buffer for it is your problem.)

Written on 21 April 2010.
« Standard format Unix errors in Python programs
The latest Solaris licensing and support rumbles »

Page tools: View Source, Add Comment.
Search:
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Wed Apr 21 02:15:49 2010
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.