The odd return value of the original 4.2 BSD gethostbyname()

August 4, 2022

In my entry on the history of looking up host addresses in Unix, I touched on how from the beginning gethostbyname() had an issue in its API, one that the BSD Unix people specifically called out in its manual page's BUGS section:

All information is contained in a static area so it must be copied if it is to be saved. [...]

But there is another oddity in how the original gethostbyname() behaved and what it returned. The gethostbyname() API returns a pointer to a 'struct hostent', which in 4.2 BSD was documented as:

struct  hostent {
   char  *h_name;     /* official name of host */
   char **h_aliases;  /* alias list */
   int    h_addrtype; /* address type */
   int    h_length;   /* length of address */
   char  *h_addr;     /* address */
};

The oddity is that in 4.2 BSD, gethostbyname() could return only a single IP address for your host, although the host could have several names (a single 'official' name and then aliases).

The reason for this behavior is that in 4.2 BSD, everything was looked up in /etc/hosts, and the specific behavior for doing this was, to quote the manual page:

Gethostbyname and gethostbyaddr sequentially search from the beginning of the file until a matching host name or host address is found, or until EOF is encountered.

Famously, these functions return the first match (and only the first match) that they find even if there are additional matching entries. An /etc/hosts line has the format:

127.0.0.1    localhost.localdomain localhost myalias

That is, each line has a single IP address followed by an official name and then optional aliases. The 4.2 BSD gethostbyname() API is designed to return exactly this information, which means that you get one IP address but multiple host names. As a result, if you put the same name on multiple IP addresses in /etc/hosts in 4.2 BSD (perhaps because your host had multiple interfaces), looking up the name would only ever return the first address.
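As a rough illustration of this first-match behavior, here is a minimal sketch in C. It is not the 4.2 BSD source; 'struct simple_hostent', next_token() and first_match_lookup() are hypothetical names made up for this example, and it scans an in-memory string instead of opening /etc/hosts.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_ALIASES 8
#define MAX_FIELD   64

/* Hypothetical stand-in for the 4.2 BSD 'struct hostent': one address only. */
struct simple_hostent {
    char name[MAX_FIELD];                 /* official name */
    char aliases[MAX_ALIASES][MAX_FIELD]; /* alias list */
    int  naliases;                        /* number of aliases stored */
    char addr[MAX_FIELD];                 /* the single address, as text */
};

/* Return the next blank-separated token at *cursor (modifying the buffer),
 * or NULL if only whitespace is left. */
static char *next_token(char **cursor)
{
    char *s = *cursor + strspn(*cursor, " \t");
    if (*s == '\0')
        return NULL;
    char *end = s + strcspn(s, " \t");
    if (*end != '\0')
        *end++ = '\0';
    *cursor = end;
    return s;
}

/* Scan hosts-file text from the top and stop at the first line on which
 * 'wanted' is the official name or an alias. Returns 1 and fills *out on
 * a match, 0 if the end of the text is reached without one. */
int first_match_lookup(const char *hosts_text, const char *wanted,
                       struct simple_hostent *out)
{
    assert(hosts_text != NULL && wanted != NULL && out != NULL);

    const char *p = hosts_text;
    while (*p != '\0') {
        /* Copy one line into a scratch buffer so it can be cut up. */
        char line[256];
        size_t len = strcspn(p, "\n");
        size_t n = len < sizeof(line) - 1 ? len : sizeof(line) - 1;
        memcpy(line, p, n);
        line[n] = '\0';
        p += len;
        if (*p == '\n')
            p++;

        line[strcspn(line, "#")] = '\0';   /* drop any trailing comment */

        char *cursor = line;
        char *addr = next_token(&cursor);
        char *name = addr ? next_token(&cursor) : NULL;
        if (name == NULL)
            continue;                      /* blank or malformed line */

        struct simple_hostent h;
        memset(&h, 0, sizeof(h));
        snprintf(h.addr, sizeof(h.addr), "%s", addr);
        snprintf(h.name, sizeof(h.name), "%s", name);
        int matched = (strcmp(name, wanted) == 0);

        char *alias;
        while ((alias = next_token(&cursor)) != NULL) {
            if (h.naliases < MAX_ALIASES)
                snprintf(h.aliases[h.naliases++], MAX_FIELD, "%s", alias);
            if (strcmp(alias, wanted) == 0)
                matched = 1;
        }

        if (matched) {
            *out = h;
            return 1;   /* first match wins; later lines are never read */
        }
    }
    return 0;
}
```

Given two lines that both name 'server', a lookup of 'server' with this sketch fills in only the address from the first line, and the second line's address is never seen.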

This is exactly backward from what DNS naturally provides when you look up a host by name: the host may easily have multiple IP addresses, but if it has other names there's no natural way for DNS to tell you. As a result, 'struct hostent' changed in 4.3 BSD:

struct  hostent {
   char  *h_name;      /* official name of host */
   char **h_aliases;   /* alias list */
   int    h_addrtype;  /* address type */
   int    h_length;    /* length of address */
   char **h_addr_list; /* list of addresses from name server */
};

#define h_addr h_addr_list[0] /* address, for backward compatibility */

Now your gethostbyname() lookups could return multiple IP addresses, and still potentially multiple names too. In practice I suspect that name server lookups in 4.3 BSD mostly returned an empty h_aliases list.

(I believe that most gethostbyname() implementations still only returned the first entry they found in /etc/hosts if they searched it, rather than continuing through the whole file and merging the information together from all matching lines.)
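To make the 4.3 BSD layout concrete, here is a small sketch in C of how a caller might walk the NULL-terminated h_aliases and h_addr_list arrays. The helper names (count_list(), hostent_addr_count(), hostent_alias_count(), first_addr_text() and print_hostent()) are made up for this example, and it assumes a system with the usual <netdb.h> and <arpa/inet.h> headers.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/socket.h>
#include <netdb.h>
#include <arpa/inet.h>

/* Count the entries in a NULL-terminated array of pointers. */
static int count_list(char **list)
{
    int n = 0;
    while (list != NULL && list[n] != NULL)
        n++;
    return n;
}

/* How many addresses did the lookup return? */
int hostent_addr_count(const struct hostent *hp)
{
    assert(hp != NULL);
    return count_list(hp->h_addr_list);
}

/* How many aliases did the lookup return? */
int hostent_alias_count(const struct hostent *hp)
{
    assert(hp != NULL);
    return count_list(hp->h_aliases);
}

/* Format the first address as text. h_addr_list[0] is what the backward
 * compatibility h_addr macro expands to. Returns buf, or NULL on failure. */
const char *first_addr_text(const struct hostent *hp, char *buf, size_t len)
{
    assert(hp != NULL && buf != NULL);
    if (hp->h_addr_list == NULL || hp->h_addr_list[0] == NULL)
        return NULL;
    return inet_ntop(hp->h_addrtype, hp->h_addr_list[0], buf, (socklen_t)len);
}

/* Print the official name, every alias, and every address. */
void print_hostent(const struct hostent *hp, FILE *fp)
{
    char buf[INET6_ADDRSTRLEN];

    assert(hp != NULL && fp != NULL);
    fprintf(fp, "official name: %s\n", hp->h_name);
    for (char **al = hp->h_aliases; al != NULL && *al != NULL; al++)
        fprintf(fp, "alias: %s\n", *al);
    for (char **ap = hp->h_addr_list; ap != NULL && *ap != NULL; ap++) {
        if (inet_ntop(hp->h_addrtype, *ap, buf, sizeof(buf)) != NULL)
            fprintf(fp, "address: %s\n", buf);
    }
}
```

A caller would typically pass in the result of gethostbyname() after checking it for NULL, for example print_hostent(hp, stdout). Old code that only ever looks at h_addr keeps compiling against the new structure, but it silently sees just the first address.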

Sidebar: Dealing with multiple interfaces in /etc/hosts

If you had a host with multiple IP addresses, my memory is that you gave the additional IPs special names:

192.168.1.1   server server-net1
192.168.2.1   server-net2
192.168.3.1   server-dev

I believe people were inconsistent about whether the additional IPs should instead all have 'server' as their official name, with the per-interface names only ever being aliases. On the one hand, doing that made gethostbyaddr() give you the official name as, well, the official name; on the other hand, it meant that a gethostbyname() on the official name you'd just gotten back could give you a different IP address than the one you started with.
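The tradeoff can be seen in a small simulation. This is only a sketch: 'struct host_line', by_name(), by_addr() and round_trip() are hypothetical names, each line is simplified to at most one alias, and the lookups just model the first-match rule over an in-memory table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One simplified /etc/hosts line: address, official name, optional alias. */
struct host_line {
    const char *addr;
    const char *official;
    const char *alias;   /* NULL if the line has no alias */
};

/* gethostbyname()-style: the first line that has 'name' as its official
 * name or alias, or NULL if no line does. */
const struct host_line *by_name(const struct host_line *t, size_t n,
                                const char *name)
{
    assert(t != NULL && name != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(t[i].official, name) == 0 ||
            (t[i].alias != NULL && strcmp(t[i].alias, name) == 0))
            return &t[i];
    }
    return NULL;
}

/* gethostbyaddr()-style: the first line with this address, or NULL. */
const struct host_line *by_addr(const struct host_line *t, size_t n,
                                const char *addr)
{
    assert(t != NULL && addr != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(t[i].addr, addr) == 0)
            return &t[i];
    }
    return NULL;
}

/* Address -> official name -> address. Returns the address that the
 * forward lookup of the official name yields, or NULL if either step fails. */
const char *round_trip(const struct host_line *t, size_t n, const char *addr)
{
    const struct host_line *rev = by_addr(t, n, addr);
    if (rev == NULL)
        return NULL;
    const struct host_line *fwd = by_name(t, n, rev->official);
    return fwd != NULL ? fwd->addr : NULL;
}
```

With 'server' as the official name on every line, round_trip() on 192.168.2.1 comes back as 192.168.1.1. With per-interface official names as in the table above, the same round trip returns 192.168.2.1, but the reverse lookup reports 'server-net2' rather than 'server'.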


Comments on this page:

The aliases are used if a DNS response involves one or more CNAMEs: the canonical name is set to the end of the chain and the aliases include the original query name and any intermediate names.

Written on 04 August 2022.

Last modified: Thu Aug 4 23:08:19 2022