The Unix C library API can only be reliably used from C

December 26, 2019

To fully implement system call origin verification, OpenBSD would like Go to make system calls through the C library instead of directly making system calls from its own runtime (which it has some reasons for doing). On the surface, this sounds like only a moderately small issue; sure, it's a bit awkward, but a language like Go should be able to just make calls to the usual C library functions like open() (and using the C calling ABI). Unfortunately it's not that simple, because very often parts of the normal C library API are actually implemented in the C preprocessor. Because of this, the C library API cannot be reliably and generally used without actually writing your own C glue code.

This sounds extreme, so let me illustrate it with everyone's favorite case of errno, which you consult to get the error value from a failed system call (and from some failed library calls). As covered in yesterday's entry, in the modern world errno must be implemented so that different threads can have different values for it, because they may be making different system calls at the same time. This requires thread local storage, and generally thread local storage cannot be accessed as a plain variable; it must be accessed through some special tricks supported by the C runtime. So here are the definitions of 'errno' from OpenBSD 6.6 and a current Fedora Linux with glibc:

/* OpenBSD */
int *__errno(void);
#define errno (*__errno())

/* Fedora glibc */
extern int *__errno_location (void) __THROW __attribute_const__;
# define errno (*__errno_location ())

In both of these cases, errno is actually a preprocessor definition. The definitions refer to C library functions that are not part of the public, documented API (that's what the leading double underscores signal). If you compile C code against this errno API (by including errno.h in your program), it will work, but that's the only officially supported way of doing it. There is no useful errno variable to load in your own language's runtime after a call to, say, the open() function, and if you call __errno or __errno_location in your runtime, you are using a non-public API and it could break tomorrow (although it probably won't). To build a reliable language runtime that sticks to the public C library API, it's not enough to just call exported functions like open(); you also need to write and compile your own little C function that just returns errno to your runtime.
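As a minimal sketch of what such glue code might look like, here is a small C file that a runtime could compile and link against. The names rt_get_errno, rt_set_errno and rt_open_readonly are hypothetical, invented for this illustration; they are not part of any C library or of Go's runtime.

```c
#include <errno.h>
#include <fcntl.h>

/* Hypothetical glue: return the calling thread's errno. Because this is
   compiled as C with errno.h included, the errno macro expands to
   whatever the platform's C library requires. */
int rt_get_errno(void)
{
    return errno;
}

/* Hypothetical glue: set the calling thread's errno, for the library
   calls where you must clear errno before calling them. */
void rt_set_errno(int value)
{
    errno = value;
}

/* Hypothetical wrapper: call open() and capture errno in the same C
   function, so nothing can run on this thread between the failing
   call and the read of errno. */
int rt_open_readonly(const char *path, int *err_out)
{
    int fd = open(path, O_RDONLY);
    *err_out = (fd < 0) ? errno : 0;
    return fd;
}
```

The third function is probably the safer pattern for a runtime that can move its own threads of execution between OS threads, since a separate call to rt_get_errno() only helps if it happens on the same OS thread as the failed call, with no other C library calls in between.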

(There may be other important cases besides errno; I will leave them to interested parties to find.)

This is not a new issue in Unix, of course. From the beginning of stdio in V7, some of the stdio 'functions' were implemented as preprocessor macros in stdio.h. But for a long time, people didn't insist that the C library was the only officially supported way of making system calls, so you could bypass things like the whole modern errno mess unless you needed to be compatible with C code for some reason.

(Before threading came into the Unix picture, errno was a plain variable and a generally good interface, although not perfect.)

Written on 26 December 2019.
