Some reasons for Go to not make system calls through the standard C library

December 25, 2019

One of the recent pieces of news in the Unix world is that as part of its general security work, OpenBSD is moving towards only allowing system calls to be made from the C library, not from any other code (you can read about this in OpenBSD system call origin verification). Right now OpenBSD has an exemption for the code of programs themselves, primarily because Go generally makes system calls directly instead of by calling the C library, but they would like to get rid of that. Other people are not happy about Go making direct system calls; for example, on Solaris and Illumos, the only officially supported method of making system calls is also through the C library (although Go does it itself on those operating systems).

(Update: On Illumos and Solaris, Go actually uses the platform C library to make system calls; I was wrong here.)

On the surface this makes Go sound unreasonable, and you might ask why it can't just make Unix system calls through the system's C library the way pretty much every other Unix language does. Although I don't know exactly why the Go developers chose to do it this way, there are reasons why you might want to avoid the C library in a language like Go, because the standard C library's Unix system call API is under-specified and awkward.

The obvious way that the C library API is under-specified for things like Go is the question of how much free stack space you need. C code (even threaded C code) traditionally allocates large or very large stacks, but Go wants to use very small stacks when it can, on the order of a few KB, in order to keep goroutines lightweight. The C library API makes no promises here, so if you want to be safe you need to call into it with larger stacks, and even then you're guessing. The issue of how much stack space you need to call C library system calls has already been a problem for Go. Go solved this for now by increasing the stack size, but since the required stack size is not a documented part of the C library API, it may break again in the future (and on any Unix, not just on Linux for calls to vDSOs).

(Unixes that strongly insist you go through the C library to make system calls generally reserve the right to have those 'system call' library functions do any amount of work behind your back, because the apparent API to system calls may not be the real kernel API. Indeed one reason for Unixes to force this is exactly so they can make changes in the kernel API without changing the 'system call' API that programs use. Such a change in internal implementation can of course cause unpredictable and undocumented changes in how much stack space the C library will demand and use during such function calls.)

For system calls, the most obvious awkward area of the C library API is how the specifics of errors are returned in errno, which is nominally a global variable. Using a global variable was sort of okay in the days before multi-threaded programs that want to make system calls from multiple threads, but it's clearly a problem now. Making errno work in the modern world requires behind the scenes magic in the C library, which generally means that you must use the entire C runtime (and yes, C has a runtime) to do things like set up thread local storage, create OS level threads so that they have this thread local storage, and retrieve your thread's errno from its TLS. In the extreme, this may require you to use the C library pthreads API to create any threads that will make system calls, then carefully schedule goroutines that want to make system calls onto those pthreads (likely with large stacks, because of the C library API issues there). All of this is completely unnecessary in the underlying kernel API, which already directly provides the error code to you.
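As a rough illustration of the kernel API handing you the error code directly, here is a minimal sketch in Go. It assumes Linux, where Go's syscall.Syscall makes the system call itself rather than going through the C library; the rawClose helper is a name I made up for this example. The failing close() reports its error as an ordinary return value, with no errno variable or thread local storage involved.

```go
package main

import (
	"fmt"
	"syscall"
)

// rawClose is a hypothetical helper for this sketch. It issues the
// close system call directly (assumption: Linux, where syscall.Syscall
// does not go through the C library) and returns the error number
// that comes back as part of the call's results.
func rawClose(fd int) syscall.Errno {
	_, _, errno := syscall.Syscall(syscall.SYS_CLOSE, uintptr(fd), 0, 0)
	return errno
}

func main() {
	// -1 is never a valid file descriptor, so this call must fail.
	errno := rawClose(-1)
	if errno != 0 {
		fmt.Printf("close(-1) failed: errno=%d (%v)\n", int(errno), errno)
	}
}
```

The error number here is just a local variable in the calling goroutine, which is why nothing like C's per-thread errno machinery is needed.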

The C global errno exists for historical compatibility and because C has no easy way to return multiple values; the natural modern API is Go's approach of returning the result and the errno, which is intrinsically thread safe and has no pseudo-global variables. Requiring all languages to go through the C library's normal Unix API for system calls means constraining all languages to live with C's historical baggage and limits.
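To make the thread safety point concrete, here is a small sketch (not anything from the Go runtime itself) that has many goroutines make failing system calls at the same time through Go's syscall package. The helper names and the nonexistent path are made up for the example. Because each call returns its own error value next to its result, no goroutine can ever see another goroutine's error, however they get scheduled onto OS threads.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"syscall"
)

// openErr tries to open path read-only and returns only the error,
// closing the file descriptor in the unexpected case that it succeeds.
func openErr(path string) error {
	fd, err := syscall.Open(path, syscall.O_RDONLY, 0)
	if err == nil {
		syscall.Close(fd)
	}
	return err
}

// countMismatches starts 2*n goroutines. Half try to open a path that
// is assumed not to exist (expecting ENOENT) and half close an invalid
// file descriptor (expecting EBADF). It returns how many calls saw an
// error other than the one their own system call should produce.
func countMismatches(n int) int {
	var wg sync.WaitGroup
	var mismatches int64
	for i := 0; i < n; i++ {
		wg.Add(2)
		go func() {
			defer wg.Done()
			if openErr("/no/such/file/for-this-sketch") != syscall.ENOENT {
				atomic.AddInt64(&mismatches, 1)
			}
		}()
		go func() {
			defer wg.Done()
			if syscall.Close(-1) != syscall.EBADF {
				atomic.AddInt64(&mismatches, 1)
			}
		}()
	}
	wg.Wait()
	return int(mismatches)
}

func main() {
	fmt.Println("mismatched errors:", countMismatches(100))
}
```

A C version of the same thing only works if every thread has been set up so that errno quietly resolves to per-thread storage, which is exactly the behind the scenes magic discussed above.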

(You could invent a new C library API for all of the system calls that directly wrote the error number to a spot the caller provided, which would make life much simpler, but no major Unix or C library is so far proposing to do this. Everyone wants (or requires) you to go through the traditional Unix API, errno warts and all.)

Written on 25 December 2019.

Last modified: Wed Dec 25 23:06:19 2019