2020-01-15
How Go's net.DialContext() stops things when the context is cancelled
These days, a number of core Go standard library packages provide
functions that take a context.Context argument and abort their
operation if the context is cancelled. This is an interesting trick
in Go, because normally you can't gracefully interrupt a goroutine
doing network IO (which leads to problems in practice). When I started
looking into the relevant standard library code, I expected to find
that things like net.Dialer.DialContext() had special hooks into the
runtime's network poller (netpoller) to do this. This turns out not to
be the case; instead, dialing uses an interesting and elegant approach
that's open to everyone doing network IO.
To abort an outstanding dial operation when the context is cancelled, the net package simply sets an already-expired (write) deadline on the connection. To do this asynchronously, it starts a background goroutine that waits for the context to be cancelled (there is then some complexity involved in cleaning everything up properly and handling potential races; these races caused a number of issues, e.g. issue 16523). Setting read and write deadlines is already explicitly documented as affecting currently pending reads and writes, not just future ones, so dialing is reusing a general mechanism that already needs to exist.
(This reuse is a little bit tricky for dialing, which is taking advantage of a customary and useful property where the underlying OS only reports a network socket as writeable once it's connected. This means that you generally check for a connection having completed by seeing if it's now writeable, and in turn this means you can sensibly limit or abort this check by setting a write deadline.)
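The documented behaviour that this relies on, a deadline interrupting a read that is already pending, can be seen in a small sketch. This is an illustration only, not the net package's code: it assumes net.Pipe() as a stand-in for a real network connection and Go 1.15 or later for os.ErrDeadlineExceeded, and pendingReadError is a hypothetical name.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"time"
)

// pendingReadError (a hypothetical helper) starts a read that can never
// complete, because nothing is written to the other end of the pipe.
// A second goroutine then sets a read deadline that is already in the
// past, which should make the pending Read return with a timeout error.
func pendingReadError() error {
	client, server := net.Pipe()
	defer client.Close()
	defer server.Close()

	go func() {
		// Give the Read below time to block before setting the deadline.
		time.Sleep(50 * time.Millisecond)
		client.SetReadDeadline(time.Now().Add(-time.Second))
	}()

	_, err := client.Read(make([]byte, 16))
	return err
}

func main() {
	err := pendingReadError()
	fmt.Println("error:", err)
	fmt.Println("is deadline exceeded:", errors.Is(err, os.ErrDeadlineExceeded))
}
```

The Read has no deadline when it starts; it only returns because another goroutine sets one afterwards, which is the same property that the dialing code depends on.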
Now that I've discovered this use of deadlines in DialContext(),
it's clear that I can do the same thing to abort outstanding network
reads or writes in my own code.
As a bonus, this will probably return a fairly distinctive error,
or I can wrap this in something that implements 'read with context'
or 'write with context', probably with some of the race precautions
seen in the net package's code.
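A 'read with context' wrapper along these lines might look like the following sketch. It is not taken from the net package; readWithContext and interruptedReadError are hypothetical names, net.Pipe() stands in for a real connection, and the race handling is deliberately simpler than what the standard library does. It also assumes that nothing else is setting read deadlines on the connection at the same time.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"os"
	"time"
)

// readWithContext (hypothetical) reads from conn and forces the read to
// return if ctx is cancelled, by setting an already-expired read deadline.
func readWithContext(ctx context.Context, conn net.Conn, buf []byte) (int, error) {
	// Stop right away if the context is already done.
	if err := ctx.Err(); err != nil {
		return 0, err
	}

	done := make(chan struct{})
	interrupted := make(chan bool, 1)
	go func() {
		select {
		case <-ctx.Done():
			// A deadline in the past fails pending reads, not just future ones.
			conn.SetReadDeadline(time.Unix(1, 0))
			interrupted <- true
		case <-done:
			interrupted <- false
		}
	}()

	n, err := conn.Read(buf)
	close(done)

	// Wait for the watcher goroutine so that it cannot set a deadline
	// after we have returned.
	if <-interrupted {
		// Clear the deadline so the connection remains usable.
		conn.SetReadDeadline(time.Time{})
		if errors.Is(err, os.ErrDeadlineExceeded) {
			// Report the cancellation rather than the synthetic timeout.
			err = ctx.Err()
		}
	}
	return n, err
}

// interruptedReadError runs a read that can only end through cancellation,
// since nothing is ever written to the other end of the pipe.
func interruptedReadError() error {
	client, server := net.Pipe()
	defer client.Close()
	defer server.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()

	_, err := readWithContext(ctx, client, make([]byte, 16))
	return err
}

func main() {
	fmt.Println("interrupted read:", interruptedReadError())
}
```

Translating the timeout error back into ctx.Err() is a design choice in this sketch; returning the deadline error unchanged would also work and would keep the 'fairly distinctive error' visible to the caller.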
PS: I was going to say that this is also how net.ListenConfig.Listen
handles its context being cancelled, but then I went to look at the code
and now I have no idea how that actually works.
PPS: If the context you pass to DialContext() already has a
deadline, DialContext() immediately sets a write deadline on the
underlying network connection, in addition to its handling of
cancellation. There's also some complexity in the code to stop
as soon as possible if the context is cancelled immediately,
before it starts up the whole extra goroutine infrastructure to
wait.
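From the caller's side, both cases look the same as any other DialContext() call. The sketch below is only an illustration: dialCancelled and dialWithTimeout are hypothetical helper names, and 127.0.0.1:9 is a placeholder address that is not expected to have anything listening on it.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// dialCancelled (hypothetical) dials with a context that has already
// been cancelled, so the dial should fail without waiting on the network.
func dialCancelled(addr string) error {
	ctx, cancel := context.WithCancel(context.Background())
	cancel() // cancel before dialing

	var d net.Dialer
	conn, err := d.DialContext(ctx, "tcp", addr)
	if err == nil {
		conn.Close()
	}
	return err
}

// dialWithTimeout (hypothetical) dials with a context that carries a
// deadline. Cancelling the context after a successful dial does not
// affect the returned connection.
func dialWithTimeout(addr string, timeout time.Duration) (net.Conn, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	var d net.Dialer
	return d.DialContext(ctx, "tcp", addr)
}

func main() {
	// Placeholder address; substitute a real host:port to experiment.
	const addr = "127.0.0.1:9"

	fmt.Println("cancelled dial:", dialCancelled(addr))

	conn, err := dialWithTimeout(addr, 2*time.Second)
	if err != nil {
		fmt.Println("dial with deadline:", err)
		return
	}
	conn.Close()
}
```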
Stopping udev from renaming your VLAN interfaces to bad names
Back in early December I wrote about Why udev may be trying to rename your VLAN interfaces to bad names, where modern versions of udev tried to rename VLAN devices from the arbitrary names you give them to the base name of the network device they're on. Since the base name is already taken, this fails.
There turns out to be a simple cause and workaround for this, at
least in my configuration, from Zbigniew Jędrzejewski-Szmek. In Fedora,
all I need to do is add 'NamePolicy=keep' to the [Link] section of my
.link file. This makes my .link file be:
[Match]
MACAddress=60:45:cb:a0:e8:dd

[Link]
Description=Onboard port
MACAddressPolicy=persistent
Name=em0
# Stop VLAN renaming
NamePolicy=keep
Setting 'NamePolicy=keep' doesn't keep the actual network device from being renamed from the kernel's original name for it to 'em0', but it makes udev leave the VLAN devices alone. In turn this means udev and systemd consider them to have been successfully created, so you get the usual systemd 'sys-subsystem-net-devices-<name>.device' units for them showing up as fully up.
In a way, 'NamePolicy=keep' in a .link file is an indirect way for me to tell apart real network hardware from created virtual devices that share the same MAC, or at least ones created through networkd. As covered in the systemd.netdev manpage, giving a name to your virtual device is mandatory (Name= is a required field), so I think such devices will always be considered to already have a name by udev.
(This was a change in systemd-241, apparently. It changes the semantics of existing .link files in a way that's subtly not backward compatible, but such is the systemd way.)
However, I suspect that things might be different if I didn't use
'biosdevname=0' in my kernel command line parameters. These days
this is implemented in udev, so allowing udev to rename your network
devices from the kernel-assigned names to the consistent network
device naming scheme may be considered a rename for the purposes of
'NamePolicy=keep'.
That would leave me with the same problem of telling real hardware
apart from virtual hardware that I had in the original entry.
However, for actual matching against physical hardware, I suspect that you can also generally use a Property= on selected attributes (as suggested by Alex Xu in the comments on the original entry). For instance, most people's network devices are on PCI busses, so:
Property=ID_BUS=pci
There are a whole variety of properties that real network hardware
has that VLANs don't (based on 'udevadm info' output), although I
don't know about other types of virtual network devices. It does
seem pretty safe that no virtual network device will claim to be
on a PCI bus, though.
(I haven't tested the Property= approach, since 'NamePolicy=keep' is sufficient in my case.)
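For concreteness, an equally untested sketch of what the Property= approach might look like when combined with the .link file from earlier. It assumes a systemd version whose systemd.link supports Property= in the [Match] section, and that the physical port really does report ID_BUS=pci in its 'udevadm info' output:

```ini
[Match]
MACAddress=60:45:cb:a0:e8:dd
# Assumption: only the real hardware has this property, so VLAN
# devices sharing the MAC should no longer match this file.
Property=ID_BUS=pci

[Link]
Description=Onboard port
MACAddressPolicy=persistent
Name=em0
```

If the match works as hoped, the VLAN devices would not match this .link file at all, so 'NamePolicy=keep' would not be needed to protect them.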