Some notes on using socket.getaddrinfo()

February 18, 2010

Since I've been immersing myself in IPv6 I've naturally become interested in finding out the IPv6 addresses of hosts, which means getaddrinfo(). In many ways, getaddrinfo() is a great API; it's wonderfully tuned for giving you exactly the information you need in order to make connections to places. In other ways, it's less than ideal, and it's unclear how you're supposed to use its API.

(For the rest of this, I'm going to assume that you've read its documentation.)

The big annoyance is that in practice, the socktype argument is not optional (and since family comes before it in the argument list, the family argument isn't either, although you generally want to set it to 0, which is socket.AF_UNSPEC). If you leave out socktype, getaddrinfo() tells you how to make both TCP and UDP connections to the given host and port; this is rarely what you want. So the standard method of using it is to tell getaddrinfo() that you only want to make TCP connections by using 'getaddrinfo(host, port, 0, socket.SOCK_STREAM)'.
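To see the difference for yourself, here is a small sketch that reports which socket types come back with and without the socktype argument. The helper name socktypes_returned is made up for illustration, and the exact set you get without a socktype varies from system to system.

```python
import socket

def socktypes_returned(host, port, socktype=0):
    """Return the set of socket types getaddrinfo() reports for host:port."""
    results = socket.getaddrinfo(host, port, 0, socktype)
    # Each result is (family, socktype, proto, canonname, sockaddr).
    return set(r[1] for r in results)

# A numeric address is used here so that no DNS lookup is involved.
print(socktypes_returned("127.0.0.1", 80))
print(socktypes_returned("127.0.0.1", 80, socket.SOCK_STREAM))
```

The second call should only ever report SOCK_STREAM; the first typically reports several types, one result per type for each address.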

The next annoyance (which is in the C API on some systems) is that getaddrinfo() won't necessarily give you IPv6 connection details if it doesn't think that you can use them. This is fine if you're using getaddrinfo() because you want to make a connection; it's not so fine if you're using getaddrinfo() to look up host addresses themselves. Thus, if you're looking up host addresses and need to get both IPv4 and IPv6 addresses if the host has them, you need to call getaddrinfo() twice, once with socket.AF_INET and once with socket.AF_INET6 as the family argument.

(To use getaddrinfo() to find host addresses in general, you call it, go through the list of results, and find the host address as the first element of the sockaddr element of the result 5-tuple. If you remembered to ask for only TCP connections, you shouldn't normally get duplicate host addresses. This is annoyingly indirect, but that's what utility functions are for.)
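A minimal sketch of that lookup, asking for one address family at a time as described above. The names host_addresses and all_addresses are hypothetical, and this deliberately ignores the extra annoyances that a real utility function has to deal with; in particular it treats every lookup failure as 'no addresses of this family'.

```python
import socket

def host_addresses(host, family):
    """Return the distinct addresses of host for one address family."""
    try:
        results = socket.getaddrinfo(host, None, family, socket.SOCK_STREAM)
    except socket.gaierror:
        # No addresses of this family, or the lookup failed outright.
        return []
    addrs = []
    for _family, _socktype, _proto, _canonname, sockaddr in results:
        # sockaddr[0] is the host address for both IPv4 and IPv6 results.
        if sockaddr[0] not in addrs:
            addrs.append(sockaddr[0])
    return addrs

def all_addresses(host):
    """Look up IPv4 and IPv6 separately; returns (ipv4_list, ipv6_list)."""
    return (host_addresses(host, socket.AF_INET),
            host_addresses(host, socket.AF_INET6))
```

Asking for SOCK_STREAM only is what keeps most duplicates out; the explicit duplicate check is just a precaution.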

There's no guarantee that getaddrinfo() always returns IPv6 addresses before (or after) IPv4 ones if the host has both; what happens seems to vary from system to system. If you want to be sure to try IPv6 before IPv4, you're going to need to apply a sorting function to the list that getaddrinfo() returns.
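One way such a sorting function might look, as a sketch rather than anything definitive (the name ipv6_first is made up here). It only moves AF_INET6 results to the front and otherwise leaves getaddrinfo()'s ordering alone.

```python
import socket

def ipv6_first(results):
    """Reorder getaddrinfo() results so AF_INET6 entries come first."""
    # The key is False for IPv6 entries and True for everything else,
    # and False sorts first. sorted() is stable, so entries within each
    # group keep the relative order that getaddrinfo() gave them.
    return sorted(results, key=lambda r: r[0] != socket.AF_INET6)
```

You would wrap the getaddrinfo() call with it, as in 'for r in ipv6_first(socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM))'.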

(If you want to try IPv4 before IPv6, plain .sort() will probably order things the way you want.)

Sidebar: the standard way to use getaddrinfo()

This is taken more or less straight from create_connection in the socket module (see socket.py in the Python standard library); this version has less error checking and fewer features. I feel like writing it down for my own future convenience:

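import socket

def makeconn(host, port):
    err = None
    for r in socket.getaddrinfo(host, port,
                                0, socket.SOCK_STREAM):
        af, st, pr, _, sa = r
        s = None
        try:
            # Creating the socket can fail too (e.g. no IPv6 support).
            s = socket.socket(af, st, pr)
            s.connect(sa)
            return s
        except socket.error as msg:
            # Remember the failure and move on to the next address.
            err = msg
            if s is not None:
                s.close()
    # Every address failed, or there were none to try.
    if err is None:
        err = socket.error("getaddrinfo returned no addresses")
    raise err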
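If you don't need your own version of the loop, the standard library's socket.create_connection() can simply be called directly. Here is a small sketch of doing so; the wrapper name open_tcp and the choice to return None on failure are assumptions made for illustration.

```python
import socket

def open_tcp(host, port, timeout=5.0):
    """Return a connected TCP socket, or None if every address fails."""
    try:
        # create_connection() does its own getaddrinfo() loop over
        # SOCK_STREAM results and, by default, raises the last
        # connection error if none of the addresses work.
        return socket.create_connection((host, port), timeout)
    except socket.error:
        return None
```

Returning None throws away the reason for the failure, so anything that wants to report errors properly should let the exception propagate instead.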

A function to determine the IPv4 and IPv6 addresses of a host is left for a later entry (or an exercise for the reader), since it involves more annoyance.

Written on 18 February 2010.
