Wandering Thoughts archives


The pervasive effects of C's malloc() and free() on C APIs

In my entry on the history of looking up host addresses in Unix, I touched on how from the beginning gethostbyname() had an issue in its API, one that the BSD Unix people specifically called out in its manual page's BUGS section:

All information is contained in a static area so it must be copied if it is to be saved. [...]

This became a serious issue when Unix added threads (this static area isn't thread safe), but was seen as a problem from the very beginning. Given that the static return area was known as an issue, why was the API written this way?

While I don't know for sure, I think we can point fingers at the hassles that dynamic memory allocation brings you in a C API. The gethostbyname() API returns a pointer to a 'struct hostent', which is (from 4.3 BSD onward):

struct  hostent {
   char  *h_name;     /* official name of host */
   char **h_aliases;  /* alias list */
   int    h_addrtype; /* address type */
   int    h_length;   /* length of address */
   char **h_addr_list; /* list of addresses */
};

If this structure is dynamically allocated by gethostbyname() and returned to the caller, either you need an additional API function to free it or you have to commit to which fields in the structure have to be freed separately, and how (i.e., this becomes part of the API). Having the caller free things is also not all that simple. Since this structure contains embedded pointers (including two that point to arrays of pointers), there could be quite a lot of things for the caller to call free() on, and in the right order.
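To make the amount of work concrete, here is a rough sketch of what a caller would have to do if gethostbyname() handed back a fully heap-allocated 'struct hostent'. Everything about the ownership rules here is an assumption made up for illustration (each string is its own malloc() allocation, both arrays are NULL-terminated and separately allocated), and free_ptr_array() and free_hostent_by_hand() are hypothetical names, not part of any real API:

```c
#include <netdb.h>    /* struct hostent */
#include <stdlib.h>   /* free */

/* Hypothetical helper: free a NULL-terminated, malloc()ed array whose
   elements were each malloc()ed separately. */
static void free_ptr_array(char **arr)
{
    if (arr == NULL)
        return;
    for (char **p = arr; *p != NULL; p++)
        free(*p);      /* every element first ... */
    free(arr);         /* ... then the array that held them */
}

/* Hypothetical: the cleanup a caller would need under the assumed
   ownership rules. The real gethostbyname() promises none of this. */
void free_hostent_by_hand(struct hostent *hp)
{
    if (hp == NULL)
        return;
    free(hp->h_name);
    free_ptr_array(hp->h_aliases);
    free_ptr_array(hp->h_addr_list);
    free(hp);          /* last, because the pointers above live inside it */
}
```

Once callers write code like this, the layout and the allocation scheme are both frozen; a new separately allocated field would leak in every existing program.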

This issue isn't unique to gethostbyname(); it affects any C API that wants to return (in a conceptual sense) anything more complicated than a basic type or a simple structure (even in old C, simple structures can be 'returned' by passing a pointer to the structure to the function, as is done in stat()). C offers no good solution to the problem. Either you add one or more 'free' functions to your API (one per dynamically allocated structure you're returning), or you document and thus freeze the process for freeing what you return, or you do what BSD opted to do in gethostbyname() and return a pointer to something static.

(Documenting what callers have to free implies that you can't later add extra fields to what you return unless they don't have to be freed separately.)
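As an illustration of the static-area choice and of the stat() style of 'returning' a simple structure, here is a small sketch. The struct lookup_result type and the lookup_static() and lookup_into() functions are invented for this example and do no real lookups; they only copy the name they are given:

```c
#include <stdio.h>   /* snprintf */
#include <string.h>  /* strlen */

/* Hypothetical result type: fixed size, no embedded pointers. */
struct lookup_result {
    char name[64];
    int  length;
};

/* Style 1: return a pointer to a static area, as gethostbyname() does.
   Every call overwrites the previous answer. */
struct lookup_result *lookup_static(const char *name)
{
    static struct lookup_result res;
    snprintf(res.name, sizeof res.name, "%s", name);
    res.length = (int)strlen(res.name);
    return &res;
}

/* Style 2: the caller supplies the storage, as with stat(). This only
   works because the result has a fixed size known in advance. */
int lookup_into(const char *name, struct lookup_result *out)
{
    if (name == NULL || out == NULL)
        return -1;
    snprintf(out->name, sizeof out->name, "%s", name);
    out->length = (int)strlen(out->name);
    return 0;
}
```

With lookup_static(), a caller that wants to keep an answer across a second call has to copy it first. For this toy structure a plain structure assignment is enough, but for something like 'struct hostent' a shallow copy would still leave the embedded pointers aimed at the static area.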

In POSIX, this API issue was eventually worked around with the first approach, when they added a freeaddrinfo() function to go with the new getaddrinfo(). This is the only particularly good solution, but it does mean that you get an increasing profusion of 'free something' functions, which serves as a disincentive to add APIs which would return something where you'd need such a function.
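For comparison, here is a minimal sketch of the getaddrinfo() and freeaddrinfo() pairing. The count_addrinfo() helper and its flags parameter are my own invention for the example; the point is only that the caller walks the result list and then hands the whole thing back with one call, without knowing how it was allocated:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* expose getaddrinfo() under strict -std= modes */
#endif
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <string.h>

/* Hypothetical helper: count the entries getaddrinfo() returns for host.
   flags is passed through as ai_flags. Returns -1 if the lookup fails. */
int count_addrinfo(const char *host, int flags)
{
    struct addrinfo hints, *res, *p;
    int n = 0;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;      /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;  /* avoid one entry per socket type */
    hints.ai_flags = flags;

    if (getaddrinfo(host, NULL, &hints, &res) != 0)
        return -1;

    for (p = res; p != NULL; p = p->ai_next)
        n++;

    /* One call releases the entire list, whatever its internal layout. */
    freeaddrinfo(res);
    return n;
}
```

Passing AI_NUMERICHOST as flags, as in count_addrinfo("127.0.0.1", AI_NUMERICHOST), keeps the call from touching DNS at all, which is handy for trying this out.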

programming/CAPIsEffectsOfMalloc written at 21:41:36