A brief history of looking up host addresses in Unix
In the beginning, back in V7 Unix and earlier, Unix didn't have
networking and so the standard C library didn't have anything to
look up host addresses. When BSD famously added IP networking to
BSD Unix, that had to change, so BSD added C library functions to
look up this sort of information, in the form of the gethost*
functions, which first appeared in 4.1c BSD but are probably most
widely known in the 4.2 BSD version.
Because this was before DNS was really a thing, functions like
gethostbyname() searched through /etc/hosts.
The next step in practice in host lookups was taken by Sun, when they introduced what was then called YP (until it had to be renamed to NIS because of trademark issues). To avoid having to distribute a potentially large /etc/hosts to all machines and to speed up lookups in it, Sun made their gethostbyaddr() able to look up host entries through YP; on the YP server, your hosts file was compiled into a database file for efficient lookups (along with all of the other YP information sources). As a fallback, gethostbyaddr() could still use your local /etc/hosts, which was useful to ensure that you weren't completely at sea if the YP server stopped responding to you. People who didn't use YP (which was a lot of us) still used /etc/hosts, and perhaps distributed a (large) local version to all of their machines.
(YP was not universally loved by system administrators, to put it one way.)
When DNS was introduced to the world of BSD Unix, it didn't initially
get integrated into the C library. Instead, my memory is that BIND
shipped with a separate library that implemented DNS-based versions
of the various host lookup functions. This caused a lot of Makefiles
to pick up stanzas to link things with '-lresolv'. The resolver
library also contained additional functions specifically for DNS
lookups, so programs like mail transport agents were soon specifically
using them (MTAs care about MX lookups, which aren't exposed
through the BSD gethost* functions). Later, in 4.3 BSD, nameserver
lookups were directly included in the C library gethost* functions
(see eg the 4.3 BSD manual page).
Still later we got the idea of the Name Service Switch to actually
configure how all of these lookups worked.
(My memory is that Sun integrated DNS lookups into YP, so that if you looked up hosts in YP, YP could then do DNS lookups instead of having to have everything in a static /etc/hosts. They also added direct DNS lookup support to their C library, although I'm not sure if this was only after they added support for DNS lookups through YP.)
The next thing that happened was threads. Unfortunately, the gethost* functions are not thread safe because, to quote the manual page's BUGS section:
All information is contained in a static area so it must be copied if it is to be saved. [...]
When people started adding threads to Unix, this led to the creation of reentrant versions of these functions, such as gethostbyname_r(). Support for these reentrant versions wasn't and isn't universal; for example, FreeBSD doesn't have them. One reason for this is that another API problem came up around the same time.
The other problem for gethostbyname() was IPv6, because there's no
way for you to tell it what sort of IP addresses you want and no
good way for it to return a mix of IPv4 and IPv6 address types.
POSIX solved both the threading problem and the IPv6 problem at
once in getaddrinfo() (and getnameinfo()); see RFC 3493 for
some of the history of the development of these functions. This
more or less brings us to today, where you should probably use
getaddrinfo() (aka 'gai') for everything. I believe that good
versions of getaddrinfo() exist in basically any modern Unix that
you want to use.
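(To make the difference from gethostbyname() concrete, here is a minimal sketch of a lookup through getaddrinfo() that accepts both IPv4 and IPv6 results and formats each one with getnameinfo(). The helper name print_addresses() and its return convention are my own invention for illustration, not anything standard, and the feature test macro at the top is an assumption about what a strict compiler mode needs.)

```c
#define _POSIX_C_SOURCE 200112L  /* assumption: exposes getaddrinfo() in strict modes */
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: print every address found for 'host' and return
 * how many were printed, or -1 if the lookup itself failed.
 * 'flags' is passed straight through as hints.ai_flags. */
int print_addresses(const char *host, int flags)
{
    struct addrinfo hints, *res, *ai;
    char buf[128];  /* roomy enough for a numeric IPv6 address plus scope */
    int count = 0;
    int err;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;      /* accept both IPv4 and IPv6 */
    hints.ai_socktype = SOCK_STREAM;  /* avoid one result per socket type */
    hints.ai_flags = flags;

    err = getaddrinfo(host, NULL, &hints, &res);
    if (err != 0) {
        fprintf(stderr, "%s: %s\n", host, gai_strerror(err));
        return -1;
    }

    /* Results come back as a caller-owned linked list, not a static area. */
    for (ai = res; ai != NULL; ai = ai->ai_next) {
        if (getnameinfo(ai->ai_addr, ai->ai_addrlen, buf, sizeof buf,
                        NULL, 0, NI_NUMERICHOST) == 0) {
            printf("%s %s\n",
                   ai->ai_family == AF_INET6 ? "IPv6" : "IPv4", buf);
            count++;
        }
    }

    freeaddrinfo(res);
    return count;
}
```

(Calling print_addresses("::1", AI_NUMERICHOST) only parses the address and never touches the network; passing a hostname with flags of 0 does a real lookup through whatever sources the system is configured to use. Because the results are allocated for the caller and released with freeaddrinfo(), there is no shared static buffer for threads to fight over.)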
(An early step in trying to get gethostbyname() to deal with IPv6
was the gethostbyname2() function, which sometimes also got a
reentrant _r version.)
PS: Although there was a DNS specification fairly early in the 1980s (cf), it took rather a while for DNS support to appear in actual Unix systems, especially as a standard part of the C library instead of as third party software added by the local sysadmin (which was how you often got a -lresolv back in the day; you could compile and install the BIND libraries yourself, then relink critical programs against them).
(This entry was sparked by "What does it take to resolve a hostname" (via).)