Wandering Thoughts archives

2020-01-15

How Go's net.DialContext() stops things when the context is cancelled

These days, a number of core Go standard packages support functions that take a context.Context argument and abort their operation if the context is cancelled. This is an interesting trick in Go, because normally you can't gracefully interrupt a goroutine doing network IO (which leads to problems in practice). When I started looking into the relevant standard library code I expected to find that things like net.Dialer.DialContext() had special hooks into the runtime's network poller (netpoller) to do this. This turns out to not be the case; instead dialing uses an interesting and elegant approach that's open to everyone doing network IO.

In order to abort an outstanding dial operation if the context is cancelled, the net package simply sets an expired (write) deadline. In order to do this asynchronously, it starts a background goroutine to listen for the context being cancelled (and then there's some complexity involved to clean everything up properly and handle potential races; races caused a number of issues, eg issue 16523). Setting read and write deadlines is already explicitly documented as affecting currently pending reads (and writes), not just future ones, so dialing is reusing a general mechanism that already needs to exist.
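As a minimal sketch of the underlying mechanism (not the net package's actual code), the following shows an already-expired deadline, set from another goroutine, interrupting a Read that is already in progress. It uses net.Pipe() only to stay self-contained; the function name pendingReadInterrupted is made up for illustration, and the 50 millisecond delay is arbitrary.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"time"
)

// pendingReadInterrupted is a hypothetical demo function. It starts a Read
// that can never complete on its own, sets an expired read deadline from
// another goroutine, and reports whether the Read failed with a timeout.
func pendingReadInterrupted() bool {
	// net.Pipe returns an in-memory connection pair that supports deadlines.
	// Nothing is ever written to c2, so the Read on c1 blocks.
	c1, c2 := net.Pipe()
	defer c1.Close()
	defer c2.Close()

	go func() {
		// Give the Read time to start, then set a deadline in the past.
		// Deadlines apply to pending IO as well as future IO.
		time.Sleep(50 * time.Millisecond)
		c1.SetReadDeadline(time.Unix(1, 0))
	}()

	buf := make([]byte, 1)
	_, err := c1.Read(buf)

	// Deadline failures are reported as errors whose Timeout() is true.
	var ne net.Error
	return errors.As(err, &ne) && ne.Timeout()
}

func main() {
	fmt.Println("pending read interrupted:", pendingReadInterrupted())
}
```

The same SetReadDeadline() and SetWriteDeadline() calls exist on real TCP connections, which is what makes this usable outside of the net package.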

(This reuse is a little bit tricky for dialing, which is taking advantage of a customary and useful property where the underlying OS only reports a network socket as writeable once it's connected. This means that you generally check for a connection having completed by seeing if it's now writeable, and in turn this means you can sensibly limit or abort this check by setting a write deadline.)

Now that I've discovered this use of deadlines in DialContext, it's clear that I can do the same thing to abort outstanding network reads or writes in my own code. Done directly, this will probably return a fairly distinctive (timeout) error. Alternately, I can wrap this in something that implements 'read with context' or 'write with context', probably with some of the race precautions seen in the net package's code.
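A 'read with context' wrapper along these lines might look like the following. This is a sketch under assumptions, not code from the net package: readContext and demoCancelledRead are hypothetical names, and the wrapper assumes the caller is not also using read deadlines of its own on the connection (it clears the read deadline after a cancellation).

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"time"
)

// readContext is a hypothetical helper. It does one Read on conn and aborts
// that Read by setting an expired read deadline if ctx is done first.
func readContext(ctx context.Context, conn net.Conn, buf []byte) (int, error) {
	// Don't start any machinery if the context is already done.
	if err := ctx.Err(); err != nil {
		return 0, err
	}

	stop := make(chan struct{})
	exited := make(chan struct{})
	go func() {
		defer close(exited)
		select {
		case <-ctx.Done():
			// Force the pending Read to fail with a timeout error.
			conn.SetReadDeadline(time.Unix(1, 0))
		case <-stop:
		}
	}()

	n, err := conn.Read(buf)

	// Stop the watcher and wait for it to exit, so that it cannot set a
	// deadline on the connection after we have returned.
	close(stop)
	<-exited

	if ctx.Err() != nil {
		// The watcher may have set an expired deadline. Clear it so the
		// connection stays usable. This assumes no caller-set deadline.
		conn.SetReadDeadline(time.Time{})
		var ne net.Error
		if errors.As(err, &ne) && ne.Timeout() {
			// Report the context's error instead of the induced timeout.
			return n, ctx.Err()
		}
	}
	return n, err
}

// demoCancelledRead blocks in readContext on a pipe that never gets data
// and cancels the context shortly afterward, returning the resulting error.
func demoCancelledRead() error {
	c1, c2 := net.Pipe()
	defer c1.Close()
	defer c2.Close()

	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	time.AfterFunc(50*time.Millisecond, cancel)

	_, err := readContext(ctx, c1, make([]byte, 1))
	return err
}

func main() {
	fmt.Println("read error:", demoCancelledRead())
}
```

Waiting for the watcher goroutine to exit before returning is the main race precaution here; without it, a late cancellation could set an expired deadline on a connection that the caller has already moved on to use for something else.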

PS: I was going to say that this is also how net.ListenConfig.Listen handles its context being cancelled, but then I went to look at the code and now I have no idea how that actually works.

PPS: If the context you pass to DialContext() already has a deadline, DialContext() immediately sets a write deadline on the underlying network connection, in addition to its handling of cancellation. There's also some complexity in the code to stop as soon as possible if the context is cancelled immediately, before it starts up the whole extra goroutine infrastructure to wait.
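For illustration, here is a small sketch of the 'cancelled immediately' case from the caller's side. The function name dialWithCancelledContext and the target address 127.0.0.1:1 are arbitrary choices for the example; with the context already cancelled, I would expect DialContext() to return an error without a connection being established.

```go
package main

import (
	"context"
	"fmt"
	"net"
)

// dialWithCancelledContext is a hypothetical demo function. It calls
// DialContext with a context that has already been cancelled and returns
// whatever error comes back.
func dialWithCancelledContext() error {
	ctx, cancel := context.WithCancel(context.Background())
	cancel() // cancel before dialing

	var d net.Dialer
	conn, err := d.DialContext(ctx, "tcp", "127.0.0.1:1")
	if err == nil {
		// Not expected, but don't leak the connection if it happens.
		conn.Close()
	}
	return err
}

func main() {
	fmt.Println("dial error:", dialWithCancelledContext())
}
```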

programming/GoDialCancellationHow written at 23:45:58

Stopping udev from renaming your VLAN interfaces to bad names

Back in early December I wrote about Why udev may be trying to rename your VLAN interfaces to bad names, where modern versions of udev tried to rename VLAN devices from the arbitrary names you give them to the base name of the network device they're on. Since the base name is already taken, this fails.

There turns out to be a simple cause and workaround for this, at least in my configuration, from Zbigniew Jędrzejewski-Szmek. In Fedora, all I need to do is add 'NamePolicy=keep' to the [Link] section of my .link file. This makes my .link file be:

[Match]
MACAddress=60:45:cb:a0:e8:dd

[Link]
Description=Onboard port
MACAddressPolicy=persistent
Name=em0
# Stop VLAN renaming
NamePolicy=keep

Setting 'NamePolicy=keep' doesn't keep the actual network device from being renamed from the kernel's original name for it to 'em0', but it makes udev leave the VLAN devices alone. In turn this means udev and systemd consider them to have been successfully created, so you get the usual systemd sys-subsystem-net-devices-*.device units for them showing up as fully up.

In a way, 'NamePolicy=keep' in a .link file is an indirect way for me to tell apart real network hardware from created virtual devices that share the same MAC, or at least ones created through networkd. As covered in the systemd.netdev manpage, giving a name to your virtual device is mandatory (Name= is a required field), so I think such devices will always be considered to already have a name by udev.

(This was a change in systemd-241, apparently. It changes the semantics of existing .link files in a way that's subtly not backward compatible, but such is the systemd way.)

However, I suspect that things might be different if I didn't use 'biosdevname=0' in my kernel command line parameters. These days this is implemented in udev, so allowing udev to rename your network devices from the kernel assigned names to the consistent network device naming scheme may be considered a rename for the purposes of 'NamePolicy=keep'. That would leave me with the same problem of telling real hardware apart from virtual hardware that I had in the original entry.

For actual matching against physical hardware, though, I suspect that you can also generally use a Property= match in the [Match] section on selected udev properties (as suggested by Alex Xu in the comments on the original entry). For instance, most people's network devices are on PCI buses, so:

Property=ID_BUS=pci

There are a whole variety of properties that real network hardware has that VLANs don't (based on 'udevadm info' output), although I don't know about other types of virtual network devices. It does seem pretty safe that no virtual network device will claim to be on a PCI bus, though.

(I haven't tested the Property= approach, since 'NamePolicy=keep' is sufficient in my case.)

linux/UdevNetworkdVLANLinkMatchingII written at 00:42:43

