The pervasive effects of C's malloc() and free() on C APIs
In my entry on the history of looking up host addresses in Unix, I
touched on how from the beginning gethostbyname() had an issue in
its API, one that the BSD Unix people specifically called out in its
manual page's BUGS section:
All information is contained in a static area so it must be copied if it is to be saved. [...]
This became a serious issue when Unix added threads (this static area isn't thread safe), but was seen as a problem from the very beginning. Given that the static return area was known as an issue, why was the API written this way?
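To make the 'must be copied if it is to be saved' part concrete, here is a minimal sketch of the same pattern. The names lookup_static(), lookup_copy() and struct result are hypothetical stand-ins I made up for illustration, not anything from BSD; the point is only that every call hands back the same static area.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a gethostbyname()-style result. */
struct result {
    char name[64];
};

/* Every call reuses one static area and returns a pointer to it. */
struct result *lookup_static(const char *name)
{
    static struct result area;   /* shared by all callers and all calls */

    snprintf(area.name, sizeof(area.name), "%s", name);
    return &area;
}

/* A caller that wants to keep a result has to copy it out before the
 * next call. A plain struct copy is only enough here because this
 * struct has no embedded pointers. */
struct result lookup_copy(const char *name)
{
    return *lookup_static(name);
}
```

Two calls to lookup_static() return the same pointer, so the first result is silently replaced by the second. With something like 'struct hostent', which has embedded pointers, a plain struct copy wouldn't be enough to save it.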
While I don't know for sure, I think we can point fingers at the
hassles that dynamic memory allocation brings you in a C API. The
gethostbyname() API returns a pointer to a 'struct hostent', which
is (from 4.3 BSD onward):
    struct hostent {
        char    *h_name;        /* official name of host */
        char    **h_aliases;    /* alias list */
        int     h_addrtype;     /* address type */
        int     h_length;       /* length of address */
        char    **h_addr_list;  /* list of addresses */
    };
If this structure is dynamically allocated by gethostbyname() and
returned to the caller, either you need an additional API function
to free it or you have to commit to what fields in the structure
have to be freed separately, and how (ie, this is part of the API).
Having the caller free things is also not all that simple. Since
this structure contains embedded pointers (including two that point
to arrays of pointers), there could be quite a lot of things for
the caller to call free() on (and in the right order).
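As a rough sketch of what that freeing would involve, suppose a hypothetical 'struct dyn_hostent' where the name, every alias, every address, both pointer arrays, and the structure itself were each allocated with malloc(). None of this is a real API; the field names just follow the usual <netdb.h> spelling, and free_dyn_hostent() is a name I made up.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical dynamically allocated version of struct hostent. */
struct dyn_hostent {
    char  *h_name;
    char **h_aliases;     /* NULL-terminated array of strings */
    int    h_addrtype;
    int    h_length;
    char **h_addr_list;   /* NULL-terminated array of addresses */
};

/* Free each element of a NULL-terminated array, then the array itself.
 * The elements have to go first; once the array is freed we can no
 * longer find them. */
static void free_ptr_array(char **arr)
{
    if (arr == NULL)
        return;
    for (char **p = arr; *p != NULL; p++)
        free(*p);
    free(arr);
}

/* Assumes every piece was separately malloc()'d, which is exactly the
 * kind of detail the API would have to document. */
void free_dyn_hostent(struct dyn_hostent *h)
{
    if (h == NULL)
        return;
    free(h->h_name);
    free_ptr_array(h->h_aliases);
    free_ptr_array(h->h_addr_list);
    free(h);              /* the structure itself goes last */
}
```

Even for this small structure, the caller needs to know which fields are separately allocated, that both arrays are NULL-terminated, and that the elements must be freed before their array.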
This issue isn't unique to gethostbyname(); it affects any C API
that wants to return (in a conceptual sense) anything more complicated
than a basic type or a simple structure (even in old C, simple
structures can be 'returned' by passing a pointer to the structure
to the function, as is done in stat()). C offers no good solution
to the problem; either you add one or more 'free' functions to your
API (one per dynamically allocated structure you're returning), or
you document and thus freeze the process for freeing what you
return, or you do what BSD opted to in gethostbyname() and return
a pointer to something static.
(Documenting what callers have to free implies that you can't later add extra fields to what you return unless they don't have to be freed separately.)
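For contrast, here is a sketch of the stat()-style approach for a simple structure, where the caller owns the storage and there is nothing to free. The struct host_info type and get_host_info() function are hypothetical, and the address it fills in is a placeholder rather than the result of any real lookup.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical flat result: fixed-size fields, no embedded pointers. */
struct host_info {
    char          name[64];
    unsigned char addr[4];
};

/* Fill in caller-owned storage, the way stat() fills in a struct stat.
 * Returns 0 on success and -1 on failure. */
int get_host_info(const char *name, struct host_info *out)
{
    size_t len;

    if (name == NULL || out == NULL)
        return -1;
    len = strlen(name);
    if (len >= sizeof(out->name))
        return -1;                    /* name does not fit */
    memcpy(out->name, name, len + 1); /* include the trailing NUL */

    /* Placeholder address; a real lookup would go here. */
    out->addr[0] = 127;
    out->addr[1] = 0;
    out->addr[2] = 0;
    out->addr[3] = 1;
    return 0;
}
```

This only works because everything has a fixed maximum size. As soon as you want a variable number of aliases or addresses, you are back to embedded pointers and the question of who frees them.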
In POSIX, this API issue was eventually worked around with the first
approach, when they added a freeaddrinfo() function to go with the
new getaddrinfo(). This is the only particularly good solution, but
it does mean that you get an increasing profusion of 'free something'
functions, which serves as a disincentive to add APIs which would
return something where you'd need such a function.
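As an illustration of the paired allocate and free calls, here is a small sketch using getaddrinfo() and freeaddrinfo(). The count_numeric_addrs() wrapper is my own invention for the example; it uses AI_NUMERICHOST so that it only parses numeric addresses and does no actual name lookup.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <string.h>

/* Count the results getaddrinfo() returns for a numeric address
 * string. Returns -1 if getaddrinfo() fails. */
int count_numeric_addrs(const char *addr)
{
    struct addrinfo hints, *res, *p;
    int n = 0;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_NUMERICHOST;   /* parse only, no DNS lookup */

    if (getaddrinfo(addr, NULL, &hints, &res) != 0)
        return -1;

    /* The result is a linked list that the library allocated. */
    for (p = res; p != NULL; p = p->ai_next)
        n++;

    /* One call releases the whole list; the caller never needs to
     * know how the pieces were allocated. */
    freeaddrinfo(res);
    return n;
}
```

The caller's side of the bargain is small: remember to call freeaddrinfo() exactly once on what getaddrinfo() gave you, and don't touch the list afterward.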