2016-05-13
You can call bind() on outgoing sockets, but you don't want to
It started with some tweets and some data by Julia Evans. In the data she mentions:
and here are a few hundred lines of strace output. What's going on? it is running bind() all the time, but it's making outgoing HTTP connections. that makes no sense!!
It turns out that this is valid behavior according to the Unix API, but you probably don't want to do this for a number of reasons.
First off, let's note more specifically what Erlang is doing here.
It is not just calling bind(), it is calling bind() with no specific port and address picked:
bind(18, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16 <unfinished ...>
Those arguments are INADDR_ANY and the magic port (port 0) that tells bind() that it can pick any ephemeral port. What this bind() does is assign the socket a local ephemeral port (from whatever the ephemeral port range is). Since we specified the local address as INADDR_ANY, the socket remains unbound to any specific local IP; the local IP will only be chosen when we connect() the socket to some address.
(This is visible in anything that exposes the Unix socket API and has a getsockname() operation. I like using Python, since I can do all of this from an interactive REPL.)
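As a minimal sketch of that sort of experiment, something like the following Python shows what getsockname() reports after the bind() and again after the connect(). The helper name bind_then_connect and the throwaway loopback listener are my own illustrative assumptions; they are there only so the example needs no outside network.

```python
import socket

def bind_then_connect(remote):
    """Return (name after bind, name after connect) for a pre-bound TCP socket."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # INADDR_ANY and port 0: the kernel picks an ephemeral port now.
        s.bind(("0.0.0.0", 0))
        after_bind = s.getsockname()
        # Only at connect() time does the socket get a specific local IP.
        s.connect(remote)
        after_connect = s.getsockname()
        return after_bind, after_connect
    finally:
        s.close()

if __name__ == "__main__":
    # A loopback listener to connect to (hypothetical stand-in for a real server).
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    print(bind_then_connect(listener.getsockname()))
    listener.close()
```

The first tuple should show 0.0.0.0 with a nonzero port; the second should show the same port with a real local IP filled in.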
There really isn't very much point in doing this for sockets that you're going to use for outgoing connections; about all it achieves is letting you know your local port before you make the connection, instead of only afterwards. In exchange for this minor advantage you make one extra system call and also increase your chances of running out of ephemeral ports under load, because you're putting an extra constraint on the kernel's port allocation.
In general, IP requires each connected socket to have a unique tuple of (local IP, local port, remote IP, remote port). When you leave an outgoing socket unbound until you connect(), the kernel has the maximum freedom to find a local port that makes the tuple unique, because all it needs is one of the four things to be unique, not necessarily the local port number. If you're connecting to different ports on a remote server, the same port on different remote servers, or whatever, it may be able to reuse a local port number that's already been used for something else. By contrast, if you bind() before you connect and use INADDR_ANY, the kernel pretty much has the minimum freedom; it must ensure that the local port alone is completely unique, so that no matter what you then do with listen() and connect() later you'll never collide with an existing tuple.
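A rough way to see the 'local port alone must be unique' constraint is to pre-bind a batch of sockets and look at the ports the kernel hands out. This is only a sketch: the function name prebound_ports and the count of 20 are made up for illustration, and it assumes the default case where neither SO_REUSEADDR nor SO_REUSEPORT is set.

```python
import socket

def prebound_ports(count):
    """Bind `count` TCP sockets to INADDR_ANY with port 0; return the ports chosen."""
    socks = []
    try:
        for _ in range(count):
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            socks.append(s)
            s.bind(("0.0.0.0", 0))
        # All sockets are still open here, so every port is held at once.
        return [s.getsockname()[1] for s in socks]
    finally:
        for s in socks:
            s.close()

ports = prebound_ports(20)
print(len(ports), len(set(ports)))
```

Every pre-bound socket should come back with its own port, which is exactly the extra pressure on the ephemeral port range that an unbound connect() can sometimes avoid.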
(See this article for a discussion of how the Linux kernel does this, and in fact this entire issue.)
Some Unixes may frontload all of the checks necessary into bind(), but at least some of them defer some checks to connect(), even for pre-bound sockets. This is probably a sensible decision, especially since a normal connect() can fail because of ephemeral port exhaustion.
I'm sure there's some advantage to this 'bind before connect' approach, but I'm honestly hard pressed to think of any.
(There are situations where you want to bind to a specific IP address, but that's not what's happening here.)
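For contrast, here is a small sketch of the 'specific IP address' case in Python, using the source_address argument of socket.create_connection(). The helper name connect_from and the loopback addresses are hypothetical choices for illustration; the port is still left as 0 so it stays ephemeral.

```python
import socket

def connect_from(local_ip, remote):
    """Connect to `remote` with the local IP pinned; the local port stays ephemeral."""
    return socket.create_connection(remote, timeout=5,
                                    source_address=(local_ip, 0))

if __name__ == "__main__":
    # Loopback listener standing in for a real remote server.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    conn = connect_from("127.0.0.1", listener.getsockname())
    print(conn.getsockname())
    conn.close()
    listener.close()
```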
(I sort of always knew that it was valid to bind() before calling connect() but I didn't know the details, so writing this has been useful. For instance, before I started writing I thought maybe the bind() picked an IP address as well as the ephemeral port, which turns out not to be the case; it leaves the IP address unbound. Which is really not that surprising once I think about it, since that's what you often do with servers; they listen to a specific port on INADDR_ANY. All of which goes to show that sometimes it's easier for me to experiment and find out things rather than reason through them from first principles.)