2010-05-22
Why I'm wrong about what sort of APIs C's stdargs allows
One of the things that blogging gives me is the chance to be very
wrong in public. Yesterday, I claimed that
C's stdargs didn't let you peel some arguments off the front of a
va_list and then pass the shortened list to another function,
such as vprintf(). Well, no, and now I'll tell you why I'm wrong.
I'm clearly wrong in practice on x86 Unix machines with gcc, as a simple
test program easily demonstrated once a commenter raised doubts and
I bothered to check. But I also believe that I'm wrong even in theory
and that this sort of manipulation of va_list likely has to be
supported by any spec-compliant C compiler. While this is not spelled
out directly in the documentation I've read, I think that it arises by
implication from things like the Single Unix Specification stdarg page.
(A disclaimer: I haven't read the ANSI/ISO C standard, so this may be clearly spelled out there.)
First, assume that you can write a standards-compliant C function
that accepts a va_list argument and works directly with it, an
equivalent of vprintf() or the like. The only way this function has
to extract arguments from the va_list is with va_arg(),
which it's allowed to use. va_arg() requires that you first call
va_start().
However, our va_list receiving function cannot call
va_start() itself; va_start() must be invoked with the
identifier of the rightmost parameter before the ... in the varargs
function definition, which exists only in the context of the caller of
our function. So the caller must call va_start() before invoking
our function, and in fact is the only function that can. And once you
call va_start(), the behavior of va_arg() is quite well
specified and contains no mention of the va_list being reset when
you call a function; each time you call va_arg(), you advance the
va_list to the next parameter (until you run out).
Hence I believe that the C standard almost certainly requires that if
you call va_arg() and then pass the va_list to another
function, that function sees the va_list just as you would and gets
the same results from calling va_arg() that you would. Peeling
arguments off your va_list and then calling a v* function with the
remainder is perfectly spec-compliant behavior.
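Here is a minimal sketch of the sort of test program that shows this working. The function name, the integer tag, and the "[tag] " prefix are all made up for illustration; the point is only that two va_arg() calls happen before the same va_list is handed to vsnprintf().

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical example: the first variable argument is an int tag, the
 * second is a printf-style format, and everything after that belongs
 * to the format. Returns the formatted length, or -1 on error or if
 * the prefix alone did not fit. */
int peel_and_format(char *out, size_t outlen, ...)
{
    va_list ap;
    int n = -1;

    va_start(ap, outlen);

    /* Peel two arguments off the front of the list. */
    int tag = va_arg(ap, int);
    const char *fmt = va_arg(ap, const char *);

    int used = snprintf(out, outlen, "[%d] ", tag);
    if (used >= 0 && (size_t)used < outlen) {
        /* Pass the shortened list on; vsnprintf() starts at the
         * first argument that we did not consume. */
        n = vsnprintf(out + used, outlen - (size_t)used, fmt, ap);
    }

    va_end(ap);
    return n < 0 ? -1 : used + n;
}
```

Calling peel_and_format(buf, sizeof buf, 7, "%s=%d", "x", 42) should leave "[7] x=42" in buf.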
This still leaves the C stdarg stuff moderately constrained, but it's less constrained than I thought.
Sidebar: Why you have to reset va_list after function calls
va_list is an opaque type that is effectively an iterator, and
implementations are free to make it have internal state that is
manipulated by va_arg(). Thus, when you call a function and pass
it a va_list, that function may manipulate the internal state of
your iterator and leave it in some random state, or simply at the end of
the varargs parameters. So you have to reset it (with va_end() followed
by a fresh va_start()) in order to be able to use it again yourself.
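As a rough sketch of what that reset looks like in practice, here is a hypothetical helper (the name and the two-buffer design are my invention) that formats the same arguments twice. It assumes nothing about what state vsnprintf() leaves the va_list in.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical example: format the same arguments into two buffers of
 * the same size. Returns the formatted length if both passes agree,
 * otherwise -1. */
int format_twice(char *a, char *b, size_t len, const char *fmt, ...)
{
    va_list ap;
    int n1, n2;

    va_start(ap, fmt);
    n1 = vsnprintf(a, len, fmt, ap);  /* ap may be in any state now */
    va_end(ap);

    va_start(ap, fmt);                /* reset before using it again */
    n2 = vsnprintf(b, len, fmt, ap);
    va_end(ap);

    return (n1 == n2) ? n1 : -1;
}
```

If you have a C99 compiler, va_copy() is another way to do this: make a copy of the va_list before the first call and give the copy away instead of resetting.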
(This idea is achingly familiar to anyone who has ever passed iterators around in Python.)
2010-05-21
An example of an API that you can't do with C stdargs
(This is a followup to yesterday's entry.)
Update: I'm actually completely wrong about this, both in practice on x86 machines with gcc and, I believe, in standards-compliant theory, as brought to my attention by nothings in comments.
One general class of impossible-in-C varargs APIs is heavily polymorphic dispatch APIs where you need to dispatch to one of several different varargs functions that have different constant arguments. Here is a gratuitous (and probably bad) API example:
void maybeLog(int what, int where, ...)
This acts as a master function that determines whether or not something
gets logged to a particular output destination and then logs it if
it is. Possible output destinations include standard output, some
file descriptor (including standard error), or syslog, and in each
case the remaining arguments are the arguments that you would pass to
the respective normal output function (printf(), fprintf(), and
syslog()) if you were calling it directly.
This API cannot be implemented in (standard) C because each of the
output functions you want has different fixed arguments, and you cannot
extract and remove the fixed arguments from the va_list in order
to call one of the v* versions with the correct arguments. The only
possible prototype for this is:
void maybeLog(int what, int where, FILE *fp, int priority, char *format, ...)
(maybeLog can then call the appropriate v* function with its required
fixed arguments plus the va_list; it ignores fixed arguments not
needed by the output function.)
In other words, you're supplying the union of all fixed arguments potentially needed by any output function. This has several drawbacks, including the lack of extensibility if you want to add another output destination that needs a new set of fixed arguments.
(I believe that the usual approach to doing this kind of API in C is to use a preprocessor macro. This has its own problems.)
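In light of the update at the top of this entry, here is a hedged sketch of how the original maybeLog(int what, int where, ...) prototype could be written by peeling each destination's fixed arguments off the va_list first. The TO_* destination codes, treating 'what' as a plain on/off flag, and the 1024-byte syslog buffer are all assumptions of mine, not part of the original API idea.

```c
#include <stdarg.h>
#include <stdio.h>
#include <syslog.h>

/* Hypothetical destination codes for the 'where' argument. */
enum { TO_STDOUT, TO_FILE, TO_SYSLOG };

/* Assumption: 'what' is simply nonzero when the message should be
 * logged. The variable arguments are whatever you would have passed
 * to printf(), fprintf(), or syslog() directly. */
void maybeLog(int what, int where, ...)
{
    va_list ap;
    const char *fmt;

    if (!what)
        return;

    va_start(ap, where);
    switch (where) {
    case TO_STDOUT:
        fmt = va_arg(ap, const char *);
        vprintf(fmt, ap);
        break;
    case TO_FILE: {
        FILE *fp = va_arg(ap, FILE *);   /* peel fprintf()'s stream */
        fmt = va_arg(ap, const char *);
        vfprintf(fp, fmt, ap);
        break;
    }
    case TO_SYSLOG: {
        char buf[1024];
        int priority = va_arg(ap, int);  /* peel syslog()'s priority */
        fmt = va_arg(ap, const char *);
        /* Format first, then log, to stay with functions that are
         * widely available. */
        vsnprintf(buf, sizeof buf, fmt, ap);
        syslog(priority, "%s", buf);
        break;
    }
    }
    va_end(ap);
}
```

A call such as maybeLog(1, TO_FILE, stderr, "%s: %d\n", "count", 3) then behaves like the equivalent fprintf(). The obvious cost is that the compiler can no longer check the fixed arguments for you.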
2010-05-20
The limitations of C's varargs support
I've noted before that dealing with varargs functions in C is vaguely annoying (although not as annoying as in the days before the ANSI C stdarg support). One of those annoyances is that C's stdarg support is limited compared to what you can do in languages with first class support for variable argument counts.
The core limitation in C is that you can't manipulate the list of
variable arguments that your function gets. Instead you have two
choices: you can pass it completely unaltered to a function that takes a
va_list argument, or you can entirely consume it yourself. You can't
add things to the list, remove things from the list, or create your own
va_list from scratch containing a mixture of things, and in turn
this means that you can't create certain sorts of varargs APIs in C.
(Whether these APIs are a good idea is another question.)
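To make the two choices concrete, here is a small sketch of each. Both function names are hypothetical and exist only for illustration.

```c
#include <stdarg.h>
#include <stdio.h>

/* Choice 1: pass the list, untouched, to a function that takes a
 * va_list (here vsnprintf()). */
int format_into(char *buf, size_t len, const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    int n = vsnprintf(buf, len, fmt, ap);
    va_end(ap);
    return n;
}

/* Choice 2: consume the whole list yourself. The caller has to say
 * how many ints follow, because the list carries no length. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}
```

For example, sum_ints(3, 1, 2, 3) returns 6, and format_into(buf, sizeof buf, "%d+%d", 1, 2) leaves "1+2" in buf.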
Update: I'm somewhat wrong about this; see CStdargWhyWrong.
You can effectively pop stuff off the front of a va_list before
you pass it to another function.
To answer the obvious question: by first class support for variable arguments, I mean that you can both manipulate (and create) varargs argument lists and call an arbitrary function with a varargs argument list, whether you got it because you're a varargs function yourself or because you made it up from scratch. C doesn't support either of these because it's constrained by a desire for efficient implementation of varargs support, one that doesn't require dynamic memory allocation or the like.
(In theory a C compiler could support the latter under some
circumstances; basically, it would need to know the called function's
exact argument list, and the behavior would be undefined if the
va_list was not an exact match for it.)