Wandering Thoughts

2022-09-05

The history of sending signals to Unix process groups

All (Unix) processes are members of some process group. Process groups go very far back in Unix; they're present at least as far back as Fourth Edition (V4) Unix. However, they aren't really "process groups" in the modern sense, as we can see from the relevant proc struct field being called p_ttyp. Instead they were used primarily to send signals to your terminal processes when various things happened (see dmr/tty.c and dmr/dc.c), and the 'process group number' was the address of the 'struct tty' for your terminal.

In V7, h/proc.h changed the p_ttyp field to p_pgrp and now called it the 'process group leader'. However, there's (still) no way to send a signal to a process group from user code, although various tools know about the idea of process groups and will report them to user level (for example pstat.1m, which gets this information in the traditional Unix approach of reading kernel memory, per cmd/pstat.c). V7 is also where the 'process group' number becomes the process ID of the first process to open a (serial) tty after it's been closed.

(The V6 ps is aware of p_ttyp and uses it to report the controlling terminal, but I don't think it prints it. In any case the specific value of the 'process group' in V6 isn't very meaningful, since it's still the address of a kernel structure instead of the PID of the process group leader.)

The inability to send signals to process groups changed, apparently independently, in System III and 4BSD. In System III, kill(2) documents the modern approach of sending a signal to a process group by using a negative 'PID' in the kill(2) system call. System III also has an explicit getpgrp(2) system call and supports setpgrp(2). According to intro.2, System III claims to differentiate between the 'process group' and the 'tty group'; however, proc.h only has the V7 p_pgrp, and the code to do things like handle control-C (in tt0.c) uses p_pgrp (via signal() in sig.c). I don't know enough to say why System III decided to let process groups change and be exposed explicitly.
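
This negative-PID convention is the one that survives in POSIX kill() today. A minimal sketch in C of what it looks like in use, signaling our own process group with signal 0 (which only tests whether we could signal every process in the group, so it's safe to experiment with):

#include <signal.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
	/* A negative PID argument to kill() means "the process group
	   whose ID is the absolute value of the argument". */
	pid_t pgrp = getpgrp();
	if (kill(-pgrp, 0) == -1)
		perror("kill");
	return 0;
}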

In 4BSD the reason for the change is much simpler: 4BSD introduced job control. Job control intrinsically involves multiple process groups, which requires exposing them to user level code and providing user level code with ways to send signals to entire process groups. As I mentioned in yesterday's entry, 4BSD implements the ability to signal process groups in a different way from System III. Although 4BSD has a separate killpg(2j) function that calls itself a system call, the actual implementation uses the kill(2) system call with the signal number negated, rather than the process ID (see the code for kill() in sys4.c, and also killpg.s). By 4.1c BSD there's an actual killpg() system call, although kern_sig.c calls it temporary. Only in 4.3 BSD does the behavior of negative PIDs appear in kill(2), and even then kill.2 says that it's for compatibility with System V. 4.3 BSD is also where the kill() system call stops supporting the 4BSD behavior of sending signals to process groups instead of PIDs through negative signal numbers (see kern_sig.c).

Before I started down this rabbit hole I would have assumed that you could send signals to process groups as far back as at least V7, and that it would have been done in the modern way. I wouldn't have guessed that signaling process groups was developed separately in both main branches of Unix (AT&T and BSD), and that they initially used different APIs.

Since I just looked it up, POSIX standardized both killpg() and the modern version of kill(). You can of course implement your killpg() through a POSIX standard kill(), so you don't need both as actual system calls.
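
A minimal sketch of that, with a made-up function name (POSIX only specifies killpg()'s behavior for process group IDs greater than 1):

#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/* my_killpg is a hypothetical stand-in for a libc killpg(). */
int
my_killpg(pid_t pgrp, int sig)
{
	if (pgrp <= 1) {
		errno = EINVAL;
		return -1;
	}
	/* killpg(pgrp, sig) is equivalent to kill(-pgrp, sig). */
	return kill(-pgrp, sig);
}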

ProcessGroupsAndSignals written at 22:41:31

2022-09-04

Support for 'kill -SIGNAME ...' was added in 4BSD

The Unix 'kill' command that we're familiar with (and that was standardized as POSIX kill(1)) accepts and even perhaps prefers to be invoked with a signal name, as 'kill -SIGNAME ...' (well, POSIX would like you to use 'kill -s SIGNAME'). For reasons beyond the scope of this blog entry, I was curious about when and where in Unix history this was added to kill. The somewhat surprising answer turns out to be in 4BSD.

One reason this surprised me is that I hadn't really heard much of 4BSD before this; I knew of 4.2 BSD (the famous one) and also 4.1c BSD (a sort of interim predecessor). 4BSD turns out to be more interesting than I expected, and apparently the origin of a number of things I thought of as 4.2 BSD features, like job control and curses.

The V7 kill.c is quite simple, and implements 'kill -<signumber>' in basically the obvious way. The 4BSD kill.c has grown more complicated, including with a hard coded table of signal names, and the 4BSD kill.1 manual page documents its new features (which isn't always the case in BSD developments).
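
I haven't reproduced 4BSD's actual table here, but the general shape of the approach is easy to sketch (the table contents and names below are illustrative, not 4BSD's):

#include <signal.h>
#include <string.h>

static struct signame {
	char	*name;
	int	sig;
} signames[] = {
	{ "HUP",  SIGHUP },
	{ "INT",  SIGINT },
	{ "QUIT", SIGQUIT },
	{ "KILL", SIGKILL },
	{ "TERM", SIGTERM },
	{ 0, 0 },
};

/* Map a signal name like "HUP" to its number, or -1 if unknown. */
int
name_to_signal(const char *name)
{
	struct signame *sp;

	for (sp = signames; sp->name != 0; sp++)
		if (strcmp(sp->name, name) == 0)
			return sp->sig;
	return -1;
}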

I wondered if this also showed up in System III, which I have a blind spot about; the answer appears to be that it didn't. The System III kill.c is larger and more carefully formatted than the V7 version, but doesn't have any support for signaling by name. However, it also supports sending signals to process groups, which are a somewhat complicated subject in Unix history. I believe that the 4BSD kill.c doesn't support sending signals to process groups, although the underlying kill(2) system call in sys4.c does support this in an odd way (which is new since V7).

(The 4BSD kill(2) manual page doesn't document this behavior of the system call, but it's there in the source code with an explicit comment about it. Probably 4BSD wanted you to use killpg(2j) instead, which would hide the odd way this was implemented in the actual system call.)

KillBySignalNameOrigin written at 22:19:28

2022-08-04

The odd return value of the original 4.2 BSD gethostbyname()

In my entry on the history of looking up host addresses in Unix, I touched on how from the beginning gethostbyname() had an issue in its API, one that the BSD Unix people specifically called out in its manual page's BUGS section:

All information is contained in a static area so it must be copied if it is to be saved. [...]

But there is another oddity in how the original gethostbyname() behaved and what it returned. The gethostbyname() API returns a pointer to a 'struct hostent', which in 4.2 BSD was documented as:

struct  hostent {
   char  *h_name;     /* official name of host */
   char **h_aliases;  /* alias list */
   int    h_addrtype; /* address type */
   int    h_length;   /* length of address */
   char  *h_addr;     /* address */
};

The oddity is that in 4.2 BSD, gethostbyname() could return only a single IP address for your host, although the host could have several names (a single 'official' name and then aliases).

The reason for this behavior is that in 4.2 BSD, everything was looked up in /etc/hosts, and the specific behavior for doing this was, to quote the manual page:

Gethostbyname and gethostbyaddr sequentially search from the beginning of the file until a matching host name or host address is found, or until EOF is encountered.

Famously, these functions return the first match (and only the first match) that they find even if there are additional matching entries. An /etc/hosts line has the format:

127.0.0.1    localhost.localdomain localhost myalias

Which is to say, a single IP address but then an official name with additional optional aliases. The 4.2 BSD gethostbyname() API is designed to return exactly this information, which means that you get one IP address but multiple host names. The implication of this is that in 4.2 BSD, if you put the same name on multiple IP addresses in /etc/hosts (perhaps because your host had multiple interfaces), looking up the name would only ever return the first address.

This is exactly backward from the information that DNS naturally provides you when you look up a host by name; the host may easily have multiple IP addresses, but if it has other names there's no natural way for DNS to tell you. As a result, the 'struct hostent' in 4.3 BSD changed (cf):

struct  hostent {
   char  *h_name;      /* official name of host */
   char **h_aliases;   /* alias list */
   int    h_addrtype;  /* address type */
   int    h_length;    /* length of address */
   char **h_addr_list; /* list of addresses from name server */
};

#define h_addr h_addr_list[0] /* address, for backward compatibility */

Now your gethostbyname() lookups could return multiple IP addresses, and still potentially multiple names too. In practice I suspect that name server lookups in 4.3 BSD mostly returned an empty h_aliases list.
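
Walking the new h_addr_list looks something like this (a minimal sketch: IPv4 only, a placeholder hostname, and no real error handling):

#include <stdio.h>
#include <string.h>
#include <netdb.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int
main(void)
{
	struct hostent *he = gethostbyname("ahost.example.org");
	struct in_addr addr;
	char **ap;

	if (he == NULL)
		return 1;
	for (ap = he->h_addr_list; *ap != NULL; ap++) {
		/* each entry is h_length bytes of binary address */
		memcpy(&addr, *ap, he->h_length);
		printf("%s\n", inet_ntoa(addr));
	}
	return 0;
}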

(I believe that most gethostbyname() implementations still only returned the first entry they found in /etc/hosts if they searched it, rather than continuing through the whole file and merging the information together from all matching lines.)

Sidebar: Dealing with multiple interfaces in /etc/hosts

If you had a host with multiple IP addresses, my memory is that you gave the additional IPs special names:

192.168.1.1   server server-net1
192.168.2.1   server-net2
192.168.3.1   server-dev

I believe people were inconsistent about whether the additional IPs should have 'server' as their official name, with the per-interface names always aliases. On the one hand, it made gethostbyaddr() give you the official name as, well, the official name; on the other hand, it meant that a gethostbyname() on the official name you'd just gotten back would give you a different IP address.

GethostbynameOddOriginalAPI written at 23:08:19

2022-08-03

Vim settings I'm using for editing YAML (with a sideline into Python)

I normally stick with minimal Vim customizations, partly because as a system administrator I'm not infrequently editing files as a different user instead of myself. However, due to Prometheus and other things I'm editing more and more YAML these days, and YAML files have such a rigid and annoying requirement for their indentation and formatting that it's painful to edit them in a stock vi-like Vim setup. Initially I stuck 'modelines' at the top of most of the Prometheus YAML files, but by default these are ignored if you're root, so I had to remember to ':set' them by hand. Recently I decided that enough was enough and set our Prometheus server up so that YAML editing worked properly.

My eventual .vimrc setting comes from this blog post:

autocmd FileType yaml setlocal expandtab shiftwidth=2 softtabstop=2

Some people will set tabstop as well (or instead of softtabstop), but I'm one of those people who has strong opinions about what an actual tab is (also), opinions that I want programs I use to respect.

(I started out simply turning on modelines for root with 'set modeline', which is safe on the particular machine I did it on, then found some instructions on setting autocmds for 'BufRead,BufNewFile' for YAML file extensions, then finally found the blog entry with the FileType autocmd. Apparently filetype detection is on by default in the Ubuntu 22.04 vim default settings.)
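(For reference, a file-extension based version of the autocmd would look something like the following, assuming your YAML files all end in .yaml or .yml:)

autocmd BufRead,BufNewFile *.yaml,*.yml setlocal expandtab shiftwidth=2 softtabstop=2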

Possibly I should also set autoindent for YAML files. But that feels more questionable and overly semi-intelligent. In YAML files I definitely always want those indentation settings, but whether or not a given new line should be autoindented is more context dependent.

Although I mostly edit Python code in GNU Emacs, where I have a well developed environment for it, I sometimes reach for vim for quick edits to scripts. Not all of my Python code is in the modern Python 3 style so I can't set a global option for it in my .vimrc, but I should probably consider sticking a vim modeline in my modern code to the effect of:

# vim: expandtab shiftwidth=4 softtabstop=4

That way there would at least be less chance of annoying accidents when I made quick edits with vim.

PS: I'm aware that I could install a variety of Vim plugins to make editing YAML in vim more pleasant. For a number of reasons, I want to stick to base vim features with no add-ons.

(This is partly an entry I write for myself so that I can find these settings later when I'm setting up another vimrc on another system.)

VimSettingsForYaml written at 22:09:55

2022-08-01

A brief history of looking up host addresses in Unix

In the beginning, back in V7 Unix and earlier, Unix didn't have networking and so the standard C library didn't have anything to look up host addresses. When BSD famously added IP networking to BSD Unix, that had to change, so BSD added C library functions to look up this sort of information, in the form of the gethost* functions, which first appeared in 4.1c BSD but are probably most widely known in the 4.2 BSD version. Because this was before DNS was really a thing, functions like gethostbyname() searched through /etc/hosts.

The next step in practice in host lookups was done by Sun, when they introduced what was then called YP (until it had to be renamed to NIS because of trademark issues). To avoid having to distribute a potentially large /etc/hosts to all machines and to speed up lookups in it, Sun made their gethostbyaddr() be able to look up host entries through YP; on the YP server, your hosts file was compiled into a database file for efficient lookups (along with all of the other YP information sources). As a fallback, gethostbyaddr could still use your local /etc/hosts, which was useful to ensure that you weren't completely out to sea if the YP server stopped responding to you. People who didn't use YP (which was a lot of us) still used /etc/hosts, and perhaps distributed a (large) local version to all of their machines.

(YP was not universally loved by system administrators, to put it one way.)

When DNS was introduced to the world of BSD Unix, it didn't initially get integrated into the C library. Instead, my memory is that BIND shipped with a separate library that implemented DNS-based versions of the various host lookup functions. This caused a lot of Makefiles to pick up stanzas to link things with '-lresolv'. The resolver library also contained additional functions specifically for DNS lookups, so programs like mail transport agents were soon specifically using them (MTAs care about MX lookups, which aren't exposed through the BSD gethost* functions). Later, in 4.3 BSD, nameserver lookups were directly included in the C library gethost* functions (see eg the 4.3 BSD manual page). Still later we got the idea of the Name Service Switch to actually configure how all of these lookups worked.

(My memory is that Sun integrated DNS lookups into YP, so that if you looked up hosts in YP, YP could then do DNS lookups instead of having to have everything in a static /etc/hosts. They also added direct DNS lookup support to their C library, although I'm not sure if this was only after they added support for DNS lookups through YP.)

The next thing that happened was threads. Unfortunately, the gethost* functions are not thread safe because, to quote the manual page's BUGS section:

All information is contained in a static area so it must be copied if it is to be saved. [...]

When people started adding threads to Unix, this led to the creation of reentrant versions of these functions, such as gethostbyname_r(). Support for these reentrant versions wasn't and isn't universal; for example, FreeBSD doesn't have them. One reason for this is that another API problem came up around the same time.
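
Where gethostbyname_r() does exist, it pushes the storage problem onto the caller. A minimal sketch using glibc's six-argument flavor (other Unixes that have the function chose different signatures, which is part of the portability problem; the hostname is a placeholder):

#include <netdb.h>
#include <stdio.h>

int
main(void)
{
	struct hostent he, *result;
	char buf[2048];
	int herr;

	/* On success, 'result' points at 'he', with all of the
	   strings and addresses stored in our 'buf'. */
	if (gethostbyname_r("ahost.example.org", &he, buf, sizeof(buf),
	                    &result, &herr) == 0 && result != NULL)
		printf("official name: %s\n", result->h_name);
	return 0;
}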

The other problem for gethostbyname() was IPv6, because there's no way for you to tell it what sort of IP addresses you want and no good way for it to return a mix of IPv4 and IPv6 address types. POSIX solved both the threading problem and the IPv6 problem at once in getaddrinfo() (and getnameinfo()); see RFC 3493 for some of the history of the development of these functions. This more or less brings us to today, where you should probably use getaddrinfo() (aka 'gai') for everything. I believe that good versions of getaddrinfo() exist in basically any modern Unix that you want to use.
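
A minimal getaddrinfo() sketch (again with a placeholder hostname and error handling pared down):

#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

int
main(void)
{
	struct addrinfo hints, *res, *ai;
	char host[256];

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_UNSPEC;	/* IPv4 and IPv6 both welcome */
	hints.ai_socktype = SOCK_STREAM;

	if (getaddrinfo("ahost.example.org", "80", &hints, &res) != 0)
		return 1;
	for (ai = res; ai != NULL; ai = ai->ai_next)
		if (getnameinfo(ai->ai_addr, ai->ai_addrlen, host,
		                sizeof(host), NULL, 0, NI_NUMERICHOST) == 0)
			printf("%s\n", host);
	freeaddrinfo(res);
	return 0;
}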

(An early step in trying to get gethostbyname() to deal with IPv6 was the gethostbyname2() function, which sometimes also got a reentrant _r version.)

PS: Although there was a DNS specification fairly early in the 1980s (cf), it took rather a while for DNS support to appear in actual Unix systems, especially as a standard part of the C library instead of as third party software added by the local sysadmin (which was how you often got a -lresolv back in the day; you could compile and install the BIND libraries yourself, then relink critical programs against them).

(This entry was sparked by What does it take to resolve a hostname (via).)

HostLookupHistory written at 21:56:23

2022-07-13

How Unix didn't used to support '#!', a brief history

When I wrote about why an empty executable file is true in Unix, I mentioned that it's traditional Unix behavior that shell scripts without a '#!' line are passed directly to the Bourne shell. The short version of why this behavior exists is that before a certain point, Unix didn't understand '#!' lines at all.

In V7 (and versions before it), the exec*() family of system calls only worked on actual binary executables. In order to make shell scripts more or less work, the shell (and I think perhaps some other programs) reacted to an ENOEXEC error by assuming the executable was actually a shell script. In the traditional Unix manner, the shell didn't do any sort of safety check to see if the file actually had ASCII text in it; if execve() returned ENOEXEC, that was good enough for it (you can see this in the execs() function in sh/service.c).

(The V7 shell was actually reasonably clever about how it handled this case. Rather than exec'ing itself again, it longjmp()'d back to the start of main() in the same process.)
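
A minimal sketch of the general ENOEXEC fallback, roughly what an execvp()-style wrapper does (unlike the V7 shell's in-process trick, this sketch execs /bin/sh, and it only passes the script's path, not any arguments):

#include <errno.h>
#include <unistd.h>

/* Try to exec 'path'; if the kernel rejects it with ENOEXEC,
   assume it's a shell script and hand it to /bin/sh instead.
   This only returns if both execs fail. */
void
run(const char *path)
{
	execl(path, path, (char *)0);
	if (errno == ENOEXEC)
		execl("/bin/sh", "sh", path, (char *)0);
}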

Before I started my research for writing this entry, I would have said that kernel support for '#!' was added in 4.2 BSD. That's certainly where it first appeared and where it became well known, but according to the history section of Wikipedia's page on #!, it was first introduced by Dennis Ritchie in 1980, after V7, although it might have been suggested to him by someone else. It didn't become known through later releases of Research Unix because those were never very widely spread, unlike V7.

(This suggests that at least some of my somewhat apocryphal history of comments in the Bourne shell is wrong. I'd guess that '#' was added as a comment character to the Bourne shell at some point after V7, since otherwise a shell script that started with '#!/bin/sh' would produce an error from the Bourne shell.)

Unix owes a lot to Dennis Ritchie as it is, but I'm still pleased and glad to learn that we owe a bit more to him than I thought (as well as to whoever suggested the idea to him). Plus, finding this out is a nice reminder to myself that the history of Unix can be more interesting than what I remember from the folklore I've absorbed over time.

ExecAndShebangHistory written at 22:42:58

2022-07-04

Why an empty (executable) file is generally true in Unix

Among a certain segment of Unix people, a famous bit of trivia is that /bin/true used to be an empty file until people complicated it (via, which reminded me of this). However, you might wonder why an empty executable file would be true. This more or less follows from a variety of standard and traditional behaviors, like this:

  1. If a shell script ends without an explicit 'exit <status>' command, its exit status is the exit status of the last command run in the script. (This sometimes leads people to put 'exit 0' at the end of scripts to force a successful exit even if the immediately previous command failed.)

  2. If a shell script doesn't run any commands, this 'last command' behavior is extended to have its exit status be 0; after all, no commands have failed.

  3. It's traditional Unix behavior that shell scripts without a #! line are still passed directly to the (Bourne) shell to run. This dates back to V7 Unix (or earlier) and is part of the complicated and surprising history of comments in the Bourne shell.

When you line all of these up, an empty executable file is interpreted as an empty (Bourne) shell script, which has exit status 0. And this is how an empty (executable) file is true.
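
You can see all of this in action in any POSIX shell:

$ : > empty; chmod +x empty
$ ./empty; echo $?
0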

(Empty files that aren't executable are generally going to be an error; Bash on Linux reports 'permission denied', for example.)

Today, all of this behavior is required by the Unix specification. One part of it is in the section on the exec*() functions. This specifically requires the behavior of passing 'shell scripts' to the shell:

[...] In the cases where the other members of the exec family of functions would fail and set errno to [ENOEXEC], the execlp() and execvp() functions shall execute a command interpreter and the environment of the executed command shall be as if the process invoked the sh utility using execl() as follows: [...]

There's similar language in the description of how the POSIX shell is to execute commands (in 1.e.b and 2). The POSIX sh manual page further requires that the shell exit with a 0 status if the script to be executed consisted only of zero or more blank lines and comments.

A shell that doesn't claim to be POSIX compatible doesn't have to have this 'pass files to sh' behavior, but if it lacks it, some number of random programs will probably fail to get run some of the time. On the whole I suspect that most alternate shell authors implement the behavior for compatibility.

(It's both interesting and reassuring how many of the odd historical little corners of Unix have been chased down and carefully standardized by POSIX.)

EmptyFileWhyTrue written at 22:01:13

2022-06-13

In general Unix system calls are not cancellable, just abortable

One of the common wishes in environments and languages that support concurrency is for (Unix) system calls to be cancellable in the way that other operations often are. Unfortunately this is not practical, which is part of why a lot of such environments don't try to support it (Go is famously one of them, which makes people unhappy since it does have a 'context' package that can cancel other things).

All or almost all Unix system calls can be aborted, which is to say that you can interrupt them before they complete and force control to return to the program. However, when you abort a system call this way the effects of the system call may be either incomplete or indeterminate, leaving you with either broken state or unusable state (or at least a peculiar state that you have to sort out). For example, if a close() is aborted, the state of the file descriptor involved is explicitly unknown. Only some Unix system calls can be cancelled, which is to say stopped with things in some orderly and known state. Often these system calls are the least interesting ones because all they do is inquire about the state of things, such as what file descriptors are ready or whether you have dead or stopped children.
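
As a small C illustration of aborting (this assumes a signal handler was installed without SA_RESTART, so that signals actually interrupt blocking system calls):

#include <errno.h>
#include <unistd.h>

ssize_t
read_once(int fd, char *buf, size_t len)
{
	ssize_t n = read(fd, buf, len);
	if (n == -1 && errno == EINTR) {
		/* Aborted. For read() this is relatively benign (no data
		   was transferred, so you can retry), but for a call like
		   close(), the state after an abort is explicitly unknown. */
	}
	return n;
}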

Some interesting system calls can be cancelled under some but not all situations using special mechanisms that may have side effects. You may be able to relatively cleanly cancel certain network IO by setting the file descriptor to non-blocking, for example, but this will probably have done some IO and might affect other threads if they immediately try to do IO on the file descriptor before you can set it back to blocking.

Some languages and environments actually turn certain normally synchronous 'system call' operations into asynchronous actions that can be cancelled (to some degree). For example, Go's runtime and network IO subsystem cooperate to turn what looks like blocking read() or write() operations into non-blocking ones combined with waiting for network sockets to be ready (along with other things, such as periodic timer ticks). This allows these operations to be cancellable (although Go doesn't expose a clean way to do it). But this can only be applied to certain sorts of read()s and write()s; very few Unixes support cleanly cancelling filesystem IO, for example.

The distinction I'm drawing between cancelling a system call and aborting it may seem picky, but I think it's an important one. If you think of it as aborting a system call, it means that you have to carefully consider and deal with partially done operations and potentially uncertain state. If you could cleanly cancel system calls without worrying about all of that, life would be much nicer. But you can't.

(This was sort of sparked by some comments here.)

SystemCallsNotCancellable written at 22:15:33

2022-06-12

Framebuffer consoles have been around before (on Unix workstations)

The early Unix machines were what we would today call 'servers', and generally used serial terminals as their system console (well, they used serial ttys for pretty much everyone, but one of them was special). But then Unix workstations came along, which is to say small Unix machines with graphical displays. In theory these machines could have been designed to boot with their system console as a serial terminal (and often you could reconfigure them that way), but in practice that would have been rather awkward, so the graphics hardware and the graphical display were used as the system console. Which is to say that these machines had a framebuffer console, much like how modern Linux kernels work on x86 hardware. And just like modern Linux machines, these Unix workstations generally had surprisingly slow text output on their framebuffer consoles.

(For example, you can watch the somewhat leisurely text output of the kernel boot messages in this video of a Sun 3/60 booting.)

There generally were three reasons for this. First, these machines were just slow in general (by modern standards). Second, the machines were often using simple unaccelerated graphics to render their console text; using sophisticated code and hardware acceleration was generally something only the display server did (and there wasn't always hardware acceleration). Among other things, this kept the complexity of the kernel framebuffer driver down. And third, they were generally rendering text on the entire screen, often with large fonts. This would be like the difference between a normal 80x24 terminal window with normal sized text and a full screen terminal window with big text, although today's hardware often has fast enough text rendering that you might not notice much speed difference.

(One way for Unixes to cheat here was to render the text console as a more normal sized window in the middle of the display, as if it was running inside a terminal program like xterm.)

In fact, PC Unixes (including Linux) were generally the exception in how fast and good their text consoles were. This is because on traditional BIOS-based x86 systems, the text console really is handled more or less as text, through VGA text mode (also). The PC VGA text console required far less memory to be manipulated to change its contents than even a basic black and white bitmapped display (never mind a colour one, even 8-bit colour).

One of the consequences of this is that you almost never used Unix workstations in their text console mode. The text console was mostly for emergencies and for getting into the graphics system (and sometimes for watching boot messages, if you were the kind of person who did that).

(My fallible memory is that there was also a real range of console text rendering speeds. There were some Unix workstation vendors and workstation models that were well known for very slow console text rendering, and others that weren't so bad.)

Workstation Unix vendors did take advantage of the console's graphics capabilities to do various extra things. Sun famously drew their logo on the screen at the start of the boot process, and some of SGI's workstations had relatively graphical and 'user friendly' boot processes (there are various videos of SGI Indys booting that you can watch for some examples of how this looked).

WorkstationFramebufferConsoles written at 23:05:50

2022-05-21

Some things that make languages easy (or not) to embed in Unix shell scripts

Part of Unix shell scripting is that Unix has a number of little languages (and interpreters for them) that are commonly embedded in shell scripts to do various things. Shell scripts aren't just written in the Bourne shell; they're effectively written in the Bourne shell plus things like sed and awk, and later more things like Perl (the little language used by jq may in time become routine). However, not all languages become used on Unix this way, even if they're interpreted and otherwise used for shell script like things. Recently it occurred to me that one factor in this is how embeddable the language is in a shell script.

If you're putting together a shell script, your life is a lot easier if the shell script is self-contained and doesn't need any additional files distributed with it (files that it will probably have to know where to find). If you're going to use an additional little language in your shell script, you really want to be able to provide the program in the little language as part of the shell script. Interpreters and languages can make this more or less easy, in two ways.

First and obviously, the interpreter mostly needs to accept a program as a command line argument, not require it to be in a file that the interpreter reads (and most especially not require the file to have a specific extension). There is a way to embed file contents in shell scripts but it will make your shell script's life harder. For many people this will probably push them to shipping the program in a separate file, which in turn will probably push them using a more shell script embedding friendly language.

It's convenient but not essential if the interpreter accepts multiple snippets of program as separate command line arguments. The poster child for this is sed, where you can supply multiple lines of program with multiple -e arguments. Lack of this isn't fatal, as shown by awk, especially if even snippets of the overall program are probably going to be multiple lines in themselves.
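
For example, with sed you can build the program up piece by piece ('somefile' is a stand-in name; this strips comments and then deletes blank lines):

sed -e 's/#.*$//' -e '/^$/d' somefile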

Generally, the only practical way to quote a long, multi-line command line argument in the Bourne shell is with single quotes (' .... '); quoting with double quotes ("...") can be done, but you will have heartburn with all sorts of characters. This makes it quite important that a language to be embedded use single quotes as little as possible. If you can't naturally write a program without using single quotes, you'll have problems providing the program as an embedded command line argument in the shell script. If your language wants you to use all of single quotes, double quotes, backslashes, and dollar signs ('$'), you're really going to have heartburn.
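
Awk is a good example of a language that embeds cleanly; a realistic awk program (here with a made-up condition and file name) needs no single quotes inside it:

awk 'BEGIN { count = 0 } $3 > 100 { count += 1 } END { print count }' somefile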

(It also helps if your language isn't picky about formatting and indentation, and lets you squeeze a bunch of statements onto a single physical line.)

There is a way to deal with languages that aren't friendly to shell quoting; you can use a here document to create a shell variable and then supply that variable as the program when you invoke the interpreter. For example:

pyprog="$(cat <<'EOF'
[....]
EOF
)"
python -c "$pyprog" ...

However, this is more awkward than doing the equivalent in awk. This awkwardness acts as friction that pushes people away from using such awkward languages in shell scripts. If they do use them, it's more natural to put the program in a separate file and ship the shell script and the separate file (which will go into some known location, and so on).

ShellScriptLanguageEmbedding written at 21:39:23
