Wandering Thoughts archives

2019-12-25

Some reasons for Go to not make system calls through the standard C library

One of the recent pieces of news in the Unix world is that as part of its general security work, OpenBSD is moving towards only allowing system calls to be made from the C library, not from any other code (you can read about this in OpenBSD system call origin verification). Right now OpenBSD has an exemption for the code of programs themselves, primarily because Go generally makes system calls directly instead of by calling the C library, but they would like to get rid of that. Other people are not happy about Go making direct system calls; for example, on Solaris and Illumos, the only officially supported method of making system calls is also through the C library (although Go does it itself on those operating systems).

(Update: On Illumos and Solaris, Go actually uses the platform C library to make system calls; I was wrong here.)

On the surface this makes Go sound unreasonable, and you might ask why it can't just make Unix system calls through the system's C library the way pretty much every other Unix language does. Although I don't know exactly why the Go developers chose to do it this way, there are reasons why you might want to avoid the C library in a language like Go, because the standard C library's Unix system call API is under-specified and awkward.

The obvious way that the C library API is under-specified for things like Go is the question of how much free stack space you need. C code (even threaded C code) traditionally allocates large or very large stacks, but Go wants to use very small stacks when it can, on the order of a few KB, in order to keep goroutines lightweight. The C library API makes no promises here, so if you want to be safe you need to call into it with larger stacks and even then you're guessing. The issue of how much stack space you need to call C library system calls has already been a problem for Go. Go solved this for now by increasing the stack size, but since the required stack size is not a documented part of the C library API, it may break in the future (on any Unix, not just Linux for calls to vDSOs).

(Unixes that strongly insist you go through the C library to make system calls generally reserve the right to have those 'system call' library functions do any amount of work behind your back, because the apparent API to system calls may not be the real kernel API. Indeed one reason for Unixes to force this is exactly so they can make changes in the kernel API without changing the 'system call' API that programs use. Such a change in internal implementation can of course cause unpredictable and undocumented changes in how much stack space the C library will demand and use during such function calls.)

For system calls, the most obvious awkward area of the C library API is how the specifics of errors are returned in errno, which is nominally a global variable. Using a global variable was sort of okay in the days before multi-threaded programs and wanting to make system calls from multiple threads, but it's clearly a problem now. Making errno work in the modern world requires behind the scenes magic in the C library, which generally means that you must use the entire C runtime (and yes, C has a runtime) to do things like set up thread local storage, create OS level threads so that they have this thread local storage, and retrieve your thread's errno from its TLS. In the extreme, this may require you to use the C library pthreads API to create any threads that will make system calls, then carefully schedule goroutines that want to make system calls onto those pthreads (likely with large stacks, because of the C library API issues there). All of this is completely unnecessary in the underlying kernel API, which already directly provides the error code to you.

The C global errno exists for historical compatibility and because C has no easy way to return multiple values; the natural modern API is Go's approach of returning the result and the errno, which is intrinsically thread safe and has no pseudo-global variables. Requiring all languages to go through the C library's normal Unix API for system calls means constraining all languages to live with C's historical baggage and limits.

(You could invent a new C library API for all of the system calls that directly wrote the error number to a spot the caller provided, which would make life much simpler, but no major Unix or C library is so far proposing to do this. Everyone wants (or requires) you to go through the traditional Unix API, errno warts and all.)

programming/GoCLibraryAPIIssues written at 23:06:19

Why udev may be trying to rename your VLAN interfaces to bad names

When I updated my office workstation to Fedora 30 back in August, I ran into a little issue:

It has been '0' days since systemd/udev blew up my networking. Fedora 30 systemd/udev attempts to rename VLAN devices to the interface's base name and fails spectacularly, causing the sys-subsystem*.device units to not be present. We hope you didn't depend on them! (I did.)

I filed this as Fedora bug #1741678, and just today I got a clue so that now I think I know why this happens.

The symptom of this problem is that during boot, your system will log things like:

systemd-udevd[914]: em-net5: Failed to rename network interface 4 from 'em-net5' to 'em0': Device or resource busy

As you might guess from the name I've given it here, em-net5 is a VLAN on em0. The name 'em0' itself is one that I assigned, because I don't like the network names that systemd-udevd would assign if left on its own (they are what I would call ugly, or at least tangled and long). The failure here prevents systemd from creating the sys-subsystem-net-devices-em-net5.device unit that it normally would (and then this had further consequences because of systemd's lack of good support for networks being ready).

I use networkd with static networking, so I set up the em0 name through a networkd .link file (as covered here). This looks like:

[Match]
MACAddress=60:45:cb:a0:e8:dd

[Link]
Description=Onboard port
MACAddressPolicy=persistent
Name=em0

Based on what 'udevadm test' reports, it appears that when udevd is configuring the em-net5 VLAN, it (still) matches this .link file for the underlying device and applies things from it. My guess is that this is happening because VLANs and their underlying physical interfaces normally share MACs, and so the VLAN MAC matches the MAC here.

This appears to be a behavior change in the version of udev shipped in Fedora 30. Before Fedora 30, systemd-udevd and networkd did not match VLAN MACs against .link files; from Fedora 30 onward, they appear to do so. To stop this, presumably you need to limit your .link files to only matching on physical interfaces, not VLANs, but unfortunately this seems difficult to do. The systemd.link manpage documents a 'Type=' match, but while VLANs have a type that can be used for this, native interfaces do not appear to (and there doesn't seem to be a way to negate the match). There are various hacks that could be committed here, but all of them are somewhat unpleasant to me (such as specifying the kernel driver; if the kernel's opinion of what driver to use for this hardware changes, I am up a creek again).

linux/UdevNetworkdVLANLinkMatching written at 01:47:33

