Wandering Thoughts

2018-06-17

The history of terminating the X server with Ctrl + Alt + Backspace

If your Unix machine is suitably configured, hitting Ctrl + Alt + Backspace will immediately terminate the X server, or more accurately will cause the X server to immediately exit. This is an orderly exit from the server's perspective (it will do things like clean up the graphics state), but an abrupt one for clients; the server just closes their connections out of the blue. It turns out that the history of this feature is a bit more complicated than I thought.

Once upon a time, way back when, there were the X releases from the (MIT) X Consortium. These releases came with a canonical X server, with support for various Unix workstation hardware. For a long time, the only way to get this server to terminate abruptly was to send it a SIGINT or SIGQUIT signal. In X11R4, which I believe was released in 1989, IBM added a feature to the server drivers for their hardware (and thus to the X server that would run on their AIX workstations); if you hit Control, Alt, and Backspace, the server would act as if it had received a SIGINT signal and immediately exit.

(HP Apollo workstations also would immediately exit the X server if you hit the 'Abort/Exit' key that they had on their custom keyboard, but I consider this a different sort of thing since it's a dedicated key.)

In X11R5, released in 1991, two things happened. First, IBM actually documented this key sequence in server/ddx/ibm/README (previously it was only mentioned in the server's IBM-specific usage messages). Second, X386 was included in the release, and its X server hardware support also contained a Ctrl + Alt + Backspace 'terminate the server' feature. This feature was carried on into XFree86 and thus the version of the X server that everyone ran on Linux and the *BSDs. The X386 manpage documents it this way:

Ctrl+Alt+Backspace
Immediately kills the server -- no questions asked. (Can be disabled by specifying "dontzap" in the configuration file.)

I never used IBM workstations, so my first encounter with this was with X on either BSDi or Linux. I absorbed it as a PC X thing, one that was periodically handy for various reasons (for instance, if my session got into a weird state and I just wanted to yank the rug out from underneath it and start again).

For a long time, XFree86/Xorg defaulted to having this feature on. Various people thought that this was a bad idea, since it gives people an obscure gun to blow their foot off with, and eventually these people persuaded the Xorg people to change the default. In X11R7.5, released in October of 2009, Xorg changed things around so that C-A-B would default to off in a slightly tricky way and that you would normally use an XKB option to control this; see also the Xorg manpage.

(You can set this option by hand with setxkbmap, or your system may have an xorg.conf.d snippet that sets this up automatically. Note that running setxkbmap by hand normally merges your changes with the system settings; see its manpage.)
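As a rough sketch of both approaches, the XKB option involved is terminate:ctrl_alt_bksp. The function name zap_snippet and the Identifier string here are made up for illustration, and where your system expects such a snippet to live (commonly a file under /etc/X11/xorg.conf.d) may vary.

```shell
# Enable Ctrl+Alt+Backspace for the current X session only.
# (Left commented out so this sketch is safe to paste outside of X.)
#   setxkbmap -option terminate:ctrl_alt_bksp

# Print an xorg.conf.d style snippet that turns the option on for all
# keyboards. The Identifier string is arbitrary (an assumption here).
zap_snippet() {
cat <<'EOF'
Section "InputClass"
    Identifier "keyboard zap"
    MatchIsKeyboard "on"
    Option "XkbOptions" "terminate:ctrl_alt_bksp"
EndSection
EOF
}
zap_snippet
```

Afterward, 'setxkbmap -query' should list the option if it took effect.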

Sidebar: My understanding of how C-A-B works today

In the original X386 implementation (and the IBM one), the handling of C-A-B was directly hard-coded in the low level keyboard handling. If the code saw Backspace while Ctrl and Alt were down, it called the generic server code's GiveUp() function (which was also connected to SIGINT and SIGQUIT) and that was that.

In modern Xorg X with XKB, there's a level of indirection involved. The server has an abstracted Terminate_Server event (let's call it that) that triggers the X server exiting, and in order to use it you need to map some actual key combination to generate this event. The most convenient way to do this is through setxkbmap, provided that all you want is the Ctrl + Alt + Backspace combination, but apparently you can do this with xmodmap too and you'll probably have to do that if you want to invoke it through some other key combination.

The DontZap server setting still exists and still defaults to off, but what it controls today is whether or not the server will pay attention to a Terminate_Server event if you generate one. This is potentially useful if you want to not just disable C-A-B by default but also prevent people from enabling it at all.

I can see why the Xorg people did it this way and why it makes sense, but it does create extra intricacy.

XBackspaceTerminateHistory written at 23:52:57

2018-06-15

Default X resources are host specific (which I forgot today)

I've been using X for a very long time now, which means that over the years I've built up a large and ornate set of X resources, a fair number of which are now obsolete. I recently totally overhauled how my X session loads my X resources, and in the process I went through all of them to cut out things that I didn't need any more. I did this first on my home machine, then copied much of the work over to my office machine; my settings are almost identical on the two machines anyway, and I didn't feel like doing all of the painstaking reform pass a second time.

Xterm has many ornate pieces, one of which is an elaborate system for customizing character classes for double click word selection. The default xterm behavior for this is finely tuned for use on Unix machines; for example, it considers each portion of a path to be a separate word, letting you easily select out one part of it. Way back in Fedora Core 4, the people packaging xterm decided to change this behavior to one that is allegedly more useful in double-clicking on a URL. I found this infuriating and promptly changed it back by setting the XTerm*charClass X resource to xterm's normal default, and all was good. Or rather I changed it on my work machine only, because Fedora apparently rethought the change so rapidly that I never needed to set the charClass resource on my home machine.

(My old grumblings suggest that perhaps I skipped Fedora Core 4 entirely on my own machines, and only had to deal with it on other machines that I ran X programs on. This is foreshadowing.)

When I was reforming my X resources, I noticed this difference between home and work and as a consequence dropped the charClass setting on my work machine because clearly it wasn't necessary any more. Neatening up my X resources and deleting obsolete things was the whole point, after all. Then today I started a remote xterm on an Ubuntu 16.04 machine, typed a command line involving a full path, wanted to select a filename out of the path, and my double-click selected the whole thing. That's when I was pointedly reminded that default X resources for X programs are host specific. Your explicitly set X resources are held by the server, so all programs on all hosts will see them and use them, but if you don't set some resource the program will go look in its resource file on the host it's running on. There is no guarantee that the XTerm resource file on Fedora is the same as the XTerm resource file on Ubuntu, and indeed they're not.

(It turns out that way back in 2006, Debian added a patch to do this to their packaging of xterm 208. They and Ubuntu seem to have been faithfully carrying it forward ever since. You can find it mentioned in things like this version of the Debian package changelog.)

In summary, just because some X resource settings are unnecessary on one machine doesn't mean that they're unnecessary on all machines. If they're necessary anywhere, you need to set them as X resources even if it's redundant for most machines. You may even want to explicitly set some X resources as a precaution; if you really care about some default behavior happening, explicitly setting the resource is a guard against someone (like Debian) getting cute and surprising you someday.

(There's another use of setting X resources to the default values that the program would use anyway, but it's slightly tricky and perhaps not a good idea, so it's for another entry.)

The reason I never had this problem at home, despite not setting the XTerm*charClass resource, is that I almost never use remote X programs at home, especially not xterm. Instead I start a local xterm and run ssh in it, because in practice that's faster and often more reliable (or at least it was). If I run a remote xterm on an Ubuntu machine from home, I have the problem there too, and so I should probably set XTerm*charClass at home just in case.

PS: To add to the fun of checking this stuff, different systems keep the default X resource files in different places. On Fedora you find them in /usr/share/X11/app-defaults, on Ubuntu and probably Debian they're in /etc/X11/app-defaults, and on FreeBSD you want to look in at least /usr/local/lib/X11/app-defaults.
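Here is a small sketch of checking those places for a charClass setting. The function name find_charclass is mine, and the default directory list is just the three locations mentioned above, so treat it as a starting point and not a complete survey.

```shell
# Look for charClass settings in XTerm app-defaults files.
# With no arguments, check the directories mentioned above; otherwise
# check the directories given as arguments.
find_charclass() {
    [ "$#" -gt 0 ] || set -- /usr/share/X11/app-defaults \
        /etc/X11/app-defaults /usr/local/lib/X11/app-defaults
    for dir in "$@"; do
        for f in "$dir"/XTerm*; do
            if [ -f "$f" ]; then
                # The extra /dev/null makes grep print the file name.
                grep -i 'charClass' "$f" /dev/null || :
            fi
        done
    done
}
find_charclass

# To see what the X server itself holds (needs a running X session):
#   xrdb -query | grep -i charClass
```

If the first command prints something on one host and nothing on another, the two hosts have different xterm defaults and you probably want an explicit setting.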

(On OmniOS and other Illumos based systems it's going to depend on where you installed xterm from, since it's not part of the base OS and there are multiple additional package sources and even package systems that all put things in different places. I recommend using find, which is honestly how I found out most of these hiding places even on Linux and FreeBSD.)

XResourcesPerHost written at 01:17:44

2018-06-07

The history of Unix's confusing set of low-level ways to allocate memory

Once upon a time, the Unix memory map of a process was a very simple thing. You had text, which was the code of the program (and later read-only data), the stack, which generally started at the top of memory and grew down, initialized data right after the text, and then bss (for variables and so on that started out as zero). At the very top of the bss, ie the highest address in the process's data segment, was what was called the program break (or, early on, just the break). The space between the program break and the bottom of the stack was unused and not actually in your process's address space, or rather it started out as unused. If you wanted to get more free memory that your program could use, you asked the operating system to raise this point, using what were already two system calls in V7: brk() and sbrk(). This is directly described in the description of brk():

char *brk(addr) [...]

Brk sets the system's idea of the lowest location not used by the program (called the break) to addr (rounded up to the next multiple of 64 bytes on the PDP11, 256 bytes on the Interdata 8/32, 512 bytes on the VAX-11/780). Locations not less than addr and below the stack pointer are not in the address space and will thus cause a memory violation if accessed.

Unix programs used brk() and sbrk() to create the heap, which is used for dynamic memory allocations via things like malloc(). The heap in classical Unix was simply the space you'd added between the top of the bss and the current program break. Usually you didn't call brk() yourself but instead left it to the C library's memory allocation functions to manage for you.

(There were exceptions, including the Bourne shell's very creative approach to memory management.)

All of this maintained Unix's simple linear model of memory, even as Unix moved to the fully page-based virtual memory of the DEC Vax. When functions like malloc() ran out of things on their free list of available space, they'd increase the break, growing the process's memory space up, and use the new memory as more space. If you free()d the right things to create a block of unused space at the top of the break, malloc() and company might eventually call brk() or sbrk() to shrink the program's break and give the memory back to the OS, but you probably didn't want to count on that.

This linear memory simplicity had its downsides. For example, fragmentation was a real worry and unless you felt like wasting memory it was difficult to have different 'arenas' for different sorts of memory allocation. And, as noted, Unix programs rarely shrank the amount of virtual memory that they used, which used to matter a lot.

Then, in SunOS 4, Unix got mmap(), which lets people add (and remove) pages of virtual memory anywhere in the process's memory space, not just right above the program's break (or just below the bottom of the stack). This includes anonymous mappings, which are just pages of memory exactly like the pages of memory that you add to the heap by calling sbrk(). It didn't take the people writing implementations of malloc() very long to realize that they could take advantage of this in various ways; for example, they could mmap() several different chunks of address space and use them for arenas, or they could directly allocate sufficiently large objects by direct mmap() (and then directly free them back to the operating system by dropping the mappings). Pretty soon people were using mmap() not just to map files into memory but also to allocate general dynamic memory (which was still called the 'heap', even if it was no longer continuous and linear).

Over time, there's been a tendency for more and more memory allocation libraries and systems to get most or all of their memory from Unix through mmap(), not by manipulating the old-school heap by using sbrk() to change the program break. Often using mmap() only is simpler, and it's also easier to coexist with other memory allocation systems because you're not all fighting over the program break; each mmap() allocation can be manipulated separately by different pieces of code, and all you have to do is worry about not running out of address space (which is generally not a worry on modern 64-bit systems).

(For example, the Go runtime allocates all of its memory through mmap().)

Today, it's generally safe to assume that the memory for almost any large single memory allocation will be obtained from the Unix kernel by mmap(), not by growing the classical heap through sbrk(). In some C libraries and some environments, smaller memory allocations may still come from the classical heap; if you're curious, you can tell by pointing a system call tracer at a program to see if it even calls sbrk() or brk(). How frequently brk() is used probably depends on the Unix (and on Linux, on the C library). I know that GNU libc does use brk() based allocation for programs that only make small allocations, for example /usr/bin/echo.
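As a sketch of the tracing approach on Linux, assuming strace is installed: the helper below (count_syscall is a name I made up) just counts lines of strace-style output that start with a given system call name. The real invocation is left as a comment, since its results depend on your system and C library.

```shell
# Count how many times a given system call appears in strace-style
# output read from standard input (lines such as 'brk(NULL) = 0x...').
count_syscall() {
    grep -c "^$1(" || :
}

# strace writes its trace to standard error, so send that down the pipe
# and throw away the traced program's own output:
#   strace -e trace=brk,mmap /usr/bin/echo hi 2>&1 >/dev/null | count_syscall brk
#   strace -e trace=brk,mmap /usr/bin/echo hi 2>&1 >/dev/null | count_syscall mmap
```

A non-zero brk count only tells you the program touched the break at all; looking at the raw trace shows whether it actually moved it.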

(Using the classical heap through brk() has some minor advantages, including that it usually doesn't create an additional kernel virtual memory area and those usually have some costs associated with them.)

The current state of low-level Unix memory allocation has thus wound up being somewhat confusing, but that's Unix for you; our current situation is the result of a complicated historical evolution that has been surprisingly focused on backward compatibility. I don't think anyone has ever seriously proposed throwing out brk() entirely, although several of the BSDs call it a historical curiosity or a legacy interface (OpenBSD, FreeBSD), and I suspect that their C libraries never use them for memory allocation.

(This entry was sparked by reading Povilas Versockas' Go Memory Management.)

SbrkVersusMmap written at 01:15:20

2018-05-05

Modern Unix GUIs now need to talk to at least one C library

I've written before about whether the C runtime and library are a legitimate part of the Unix API. The question matters because some languages want to be as self contained as possible on Unix (Go is one example), and so they don't want to have to use anything written in C if at all possible. However, I recently realized that regardless of the answer to this question, it's essentially impossible to do a good quality, modern Unix GUI without using at least one C library.

This isn't because you need to use a toolkit like Gtk+ or Qt, or that you need a C library in order to, for example, speak the X protocol (or Wayland's protocol). You can write a new toolkit if you need to and people have already reimplemented the X protocol in pure non-C languages. Instead the minimal problem is fonts.

Modern fonts are selected and especially rendered in your client, and they're all TrueType fonts. Doing a high quality job of rendering TrueType fonts is extremely complicated, which is why everyone uses the same library for this, namely FreeType. FreeType is written in C, so if you want to use it, you're going to be calling a C library (and it will call on some additional services from something like the C runtime, although apparently you can shim in your own versions of some parts of it).

(Selecting fonts is also a reasonably complicated job, especially if you want to have your fonts match with the rest of the system and be specified in the same way. That's another C library, fontconfig.)

There's no good way out from calling FreeType. Avoiding it requires either abandoning the good modern fonts that users want your UI to have, implementing your own TrueType renderer that works as well as FreeType (and updating it as FreeType improves), or translating FreeType's C code into your language (and then re-translating it every time a significant FreeType update comes out). The latter two are theoretically possible but not particularly practical; the first means that you don't really have a modern Unix GUI program.

(I don't know enough about Wayland to be sure, but it may make this situation worse by essentially requiring you to use Mesa in order to use OpenGL to get decent performance. With X, you can at least have the server do much of the drawing for you by sending X protocol operations; I believe that Wayland requires full client side rendering.)

The direct consequence of this is that there will never be a true pure Go GUI toolkit for Unix that you actually want to use. If the toolkit is one you want to use, it has to be calling FreeType somewhere and somehow; if it isn't calling FreeType, you don't want to use it.

(It's barely possible that the Rust people will be crazy enough to either write their own high-quality equivalent of FreeType or automatically translate its C code into Rust. I'm sure there are people who look at FreeType and want a version of it with guaranteed memory safety and parallel rendering and so on.)

UnixGUIsNeedC written at 00:13:25

2018-05-04

Why you can't put zero bytes in Unix command line arguments

One sensible reaction to all of the rigmarole with 'grep -P' I went through in yesterday's entry in order to search for a zero byte (a null byte) is to ask why I didn't just use a zero byte in the command line argument:

fgrep -e ^@ -l ...

(Using the usual notation for a zero byte.)

You can usually type a zero byte directly at the terminal, along with a number of other unusual control characters (see my writeup of this here), and failing that you could write a shell script in an editor and insert the null byte there. Ignoring character set encoding issues for the moment, this works for any other byte, but if you try it you'll discover that it doesn't work for the zero byte. If you're lucky, your shell will give you an error message about it; if you're not, various weird things will happen. This is because the zero byte can't ever be put into command line arguments in Unix.

Why is ultimately simple. This limitation exists because the Unix API is fundamentally a C API (whether or not the C library and runtime are part of the Unix API), and in C, strings are terminated by a zero byte. When Unix programs such as the shell pass command line arguments to the kernel as part of the exec*() family of system calls, they do so as an array of null-terminated C strings; if you try to put a null byte in there as data, it will just terminate that command line argument early (possibly reducing it to a zero-length argument, which is legal but unusual). When Unix programs start they receive their command line arguments as an array of C strings (in C, the argv argument to main()), and again a null byte passed in as data would be seen as terminating that argument early.

This is true whether or not your shell and the program you're trying to run are written in C. They can both be written in modern languages that are happy to have zero bytes in strings, but the command line arguments moving between them are being squeezed through an API that requires null-terminated strings. The only way around this would be a completely new set of APIs on both sides, and that's extremely unlikely at this point.

Because filenames are also passed to the kernel as C strings, they too can't contain zero bytes. Neither can environment variables, which are passed between programs (through the kernel) as another array of C strings.
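One way to see both halves of this from the shell is a small sketch like the following. Zero bytes are perfectly ordinary data in a pipe, and precisely because they can never appear inside an argument or a filename, tools such as 'find -print0' and 'xargs -0' use them as an unambiguous separator. The function names are mine, and xargs -0 is widely available but was historically an extension, so I'm assuming your xargs has it.

```shell
# A zero byte is ordinary data in a pipe: three bytes in, three bytes out.
pipe_bytes() {
    printf 'a\000b' | wc -c | tr -d ' '
}
pipe_bytes

# Each NUL-terminated chunk becomes exactly one argument to the inner sh,
# embedded space included, so this reports two arguments.
nul_separated_args() {
    printf 'one two\000three\000' | xargs -0 sh -c 'echo "$#"' sh
}
nul_separated_args
```

The first function prints 3 and the second prints 2. What you can't do is get that zero byte into one of the arguments itself.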

As a corollary, certain character set encodings really don't work as locales on Unix because they run into this. Any character set encoding that can generate zero bytes as part of its characters is going to have serious problems with filenames and command line arguments; one obvious example of such a character set is UTF-16. I believe the usual way for Unixes to deal with a filesystem that's natively UCS-2 or UTF-16 is to encode and decode to UTF-8 somewhere in the kernel or the filesystem driver itself.

NoNullsInArguments written at 00:08:25

2018-04-18

The sensible way to use Bourne shell 'here documents' in pipelines

I was recently considering a shell script where I might want to feed a Bourne shell 'here document' to a shell pipeline. This is certainly possible and years ago I wrote an entry on the rules for combining things with here documents, where I carefully wrote down how to do this and the general rule involved. This time around, I realized that I wanted to use a much simpler and more straightforward approach, one that is obviously correct and is going to be clear to everyone. Namely, putting the production of the here document in a subshell.

(
cat <<EOF
your here document goes here
with as much as you want.
EOF
) | sed | whatever

This is not as neat and nominally elegant as taking advantage of the full power of the Bourne shell's arcane rules, and it's probably not as efficient (in at least some sh implementations, you may get an extra process), but I've come around to feeling that that doesn't matter. This may be the brute force solution, but what matters is that I can look at this code and immediately follow it, and I'm going to be able to do that in six months or a year when I come back to the script.

(Here documents are already kind of confusing as it stands without adding extra strangeness.)

Of course you can put multiple things inside the (...) subshell, such as several here documents that you output only conditionally (or chunks of always present static text mixed with text you have to make more decisions about). If you want to process the entire text you produce in some way, you might well generate it all inside the subshell for convenience.

Perhaps you're wondering why you'd want to run a here document through a pipe to something. The case that frequently comes up for me is that I want to generate some text with variable substitution but I also want the text to flow naturally with natural line lengths, and the expansion will have variable length. Here, the natural way out is to use fmt:

(
cat <<EOF
My message to $NAME goes here.
It concerns $HOST, where $PROG
died unexpectedly.
EOF
) | fmt

Using fmt reflows the text regardless of how long the variables expand out to. Depending on the text I'm generating, I may be fine with reflowing all of it (which means that I can put all of the text inside the subshell), or I may have some fixed formatting that I don't want passed through fmt (so I have to have a mix of fmt'd subshells and regular text).

Having written that out, I've just come to the obvious realization that for simple cases I can just directly use fmt with a here document:

fmt <<EOF
My message to $NAME goes here.
It concerns $HOST, where $PROG
died unexpectedly.
EOF

This doesn't work well if there are some paragraphs that I want to include only some of the time, though; then I should still be using a subshell.
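A sketch of that conditional case might look like this. The report function, the default values, and the CORE variable are all hypothetical, invented purely for illustration; the point is only that the optional paragraph is a second here document inside the same subshell.

```shell
# Hypothetical values so that the sketch runs on its own.
NAME=${NAME:-operator}
HOST=${HOST:-apps0}
PROG=${PROG:-frobd}

report() {
    (
    cat <<EOF
My message to $NAME goes here.
It concerns $HOST, where $PROG
died unexpectedly.
EOF
    # This paragraph is only produced if CORE is set and non-empty.
    if [ -n "${CORE:-}" ]; then
        cat <<EOF

A core dump was left in $CORE,
in case someone wants to look at it.
EOF
    fi
    ) | fmt
}
report
```

Everything the subshell prints goes through a single fmt, so both paragraphs get reflowed the same way no matter which ones are present.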

(For whatever reason I apparently have a little blind spot about using here documents as direct input to programs, although there's no reason for it.)

SaneHereDocumentsPipelines written at 23:05:30

2018-04-16

Some notes and issues from trying out urxvt as an xterm replacement

I've been using xterm for a very long time, but I'm also aware that it's not a perfect terminal emulator (especially in today's Unicode world, my hacks notwithstanding). Years ago I wrote up what I wanted added to xterm, and the recommendation I've received over the years (both on that entry and elsewhere) is for urxvt (aka rxvt-unicode). I've made off and on experiments with urxvt, but for various reasons I've recently been trying a bit more seriously to use it regularly and to evaluate it as a serious alternative to xterm for me.

One of my crucial needs in an xterm replacement is an equivalent of xterm's ziconbeep feature, which I use to see when an iconified xterm has new output. Fortunately that need was met a long time ago through a urxvt Perl extension written by Leah Neukirchen; you can get the extension itself here. In my version I took out the audible bell. Without this, urxvt wouldn't be a particularly viable option for me, so I'm glad that it exists.

Urxvt's big draw as an xterm replacement is that it will reflow lines as you widen and narrow it. However, for a long time this didn't seem to work for me, or didn't seem to work reliably. Last September I finally discovered that the issue is that urxvt only reflows lines after a resize if it's already scrolled text in the window. This is the case both for resizing wider and for resizing narrower, which can be especially annoying (since resizing wider can sometimes 'un-scroll' a window). This is something that I can sort of work around; these days I often make it a point to start out my urxvt windows in their basic 80x24 size, dump out the output that I'll want, and only then resize them to read the long lines. This mostly works but it's kind of irritating.

(I'm not sure if this is a urxvt bug or a deliberate design decision. Perhaps I should try reporting it to find out.)

Another difference is that xterm has relatively complicated behavior on double-clicks for what it considers to be separate 'words'; you can read the full details in the manpage's section on character classes. Urxvt has somewhat simpler behavior based on delimiter characters, and its default set of delimiters makes it select bigger 'words' than xterm does. For instance, a standard urxvt setup will consider all of a full path to be one word, because / is not a delimiter character (neither is :, so all of your $PATH is one word as far as urxvt is concerned). I'm highly accustomed to xterm's behavior and I prefer smaller words here, because it's much easier to widen a selection than it is to narrow it. You can customize some of this behavior with urxvt's cutchars resource (see the urxvt manpage). Currently I'm using:

! requires magic quoting for reasons.
URxvt*cutchars:   "\\`\"'&()*,;<=>?@[]^{|}.#%+!/:-"

This improves the situation in urxvt but isn't perfect; in practice I see various glitches, generally when several of these delimiters happen in a row (eg given 'a...', a double-click in urxvt may select up to the entire thing). Since I'm using the default selection Perl extension, possibly I could improve things by writing some complicated regular expressions (or replace the selection extension entirely with a more controllable version where I understand exactly what it's doing). If I want to exactly duplicate xterm's behavior, a Perl extension is probably the only way to achieve it.

(I'm not entirely allergic to writing Perl extensions for urxvt, but it's been a long time since I wrote Perl and I'm not familiar with the urxvt extensions API, so at a minimum it's going to be a pain.)

Given these issues I'm not throwing myself into a complete replacement of my xterm usage with urxvt, but I am reaching for it reasonably frequently and I've taken steps to make it easier to use in my environment. This involves both making it as conveniently accessible as xterm and also teaching various bits of my window manager configuration and scripting that urxvt is a terminal window and should be treated like xterm.

This whole thing has been an interesting experience overall. It's taught me both how much I'm attuned to very specific xterm behaviors and how deeply xterm has become embedded into my overall X environment.

UrxvtNotes written at 00:26:59

2018-03-04

The value locked up in the Unix API makes it pretty durable

Every so often someone proposes or muses about replacing Unix with something more modern and better, or is surprised when new surface OSes (such as ChromeOS) are based on Unix (often Linux, although not always). One reason that this keeps happening and that some form of Unix is probably going to be with us for decades to come is that there is a huge amount of value locked up in the Unix API, and in more ways than are perhaps obvious.

The obvious way that a great deal of value is locked up in the Unix API is the kernels themselves. Whether you look at Linux, FreeBSD, OpenBSD, or even one of the remaining commercial Unixes, all of their kernels represent decades of developer effort. Some of this effort is in the drivers, many of which you could do without in an OS written from scratch for relatively specific hardware, but a decent amount of the effort is in core systems like physical and virtual memory management, process handling, interprocess communication, filesystems and block level IO handling, modern networking, and so on.

However, this is just the tip of the iceberg. The bigger value of the Unix API is in everything that runs on top of it. This comes in at least two parts. The first part is all of the user level components that are involved in booting and running Unix and everything that supports them, especially if you include the core of a graphical environment (such as some form of display server). The second part is all of the stuff that you run on your Unix as its real purpose for existing, whether this is Apache (or some other web server), a database engine, your own custom programs (possibly written in Python or Ruby or whatever), and so on. It's also the support programs for this, which blur the lines between the 'system' and being productive with it; a mailer, a nice shell, an IMAP server, and so on. Then you can add an extra layer of programs used to monitor and diagnose the system and another set of programs if you develop on it or even just edit files. And if you want to use the system as a graphical desktop there is an additional stack of components and programs that all use aspects of the Unix API either directly or indirectly.

All of these programs represent decades or perhaps centuries of accumulated developer effort. Throwing away the Unix API in favour of something else means either doing without these programs, rewriting your own versions from scratch, or porting them and everything they depend on to your new API. Very few people can afford to even think about this, much less undertake it for a large scale environment such as a desktop. Even server environments are relatively complex and multi-layered in practice.

(Worse, some of the Unix API is implicit instead of being explicitly visible in things like system calls. Many programs will expect a 'Unix' to handle process scheduling, memory management, TCP networking, and a number of other things in pretty much the same way that current Unixes do. If your new non-Unix has the necessary system calls but behaves significantly differently here, programs may run but not perform very well, or even malfunction.)

Also, remember that the practical Unix API is a lot more than system calls. Something like Apache or Firefox pretty much requires a large amount of the broad Unix API, not just the core system calls and C library, and as a result you can't get them up on your new system just by implementing a relatively small and confined compatibility layer. (That's been tried in the past and pretty much failed in practice, and is one reason why people almost never write programs to strict POSIX and nothing more.)

(This elaborates on a tweet of mine that has some additional concrete things that you'd be reimplementing in your non-Unix.)

UnixAPIDurableValue written at 18:51:44

The practical Unix API is more than system calls (or POSIX)

What is the 'Unix API'? Some people would be tempted to say that this is straightforward; depending on your perspective it's either the relatively standard set of core Unix system calls or the system calls and library functions required by POSIX. This answer is not wrong at one level, but in practice it is not a useful one.

As people have found out in the past, the real Unix API is the whole collection of behaviors and environments that Unix programs assume. It isn't just POSIX library calls; it's also the shell and standard utilities and files that are in known locations and standard capabilities and various other things. A 'Unix' without a useful $HOME environment variable and /tmp may be specification compliant (I haven't checked POSIX) but it's not useful, in that many programs that people want generally won't run on it.
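(As a rough illustration of the sort of thing programs quietly assume, here is a minimal Python sketch that checks for a usable $HOME and a /tmp directory. The function name and the idea of passing in the environment and the directory test are my own inventions for this example, not anything standard, and real programs rely on far more than these two things.)

```python
import os


def missing_unix_assumptions(environ=None, path_exists=os.path.isdir):
    """Return a list of de facto Unix assumptions that do not hold here.

    Both parameters are hypothetical hooks for this sketch so that the
    check can be exercised against a fake environment.
    """
    environ = os.environ if environ is None else environ
    problems = []
    # Many programs assume $HOME is set to something non-empty.
    if not environ.get("HOME"):
        problems.append("HOME is unset or empty")
    # Many programs assume /tmp exists and is a directory.
    if not path_exists("/tmp"):
        problems.append("/tmp is missing")
    return problems


if __name__ == "__main__":
    for problem in missing_unix_assumptions():
        print("missing:", problem)
```

Neither of these checks involves anything you would call a system call or a library function being absent, which is the point; the environment itself is part of what programs depend on.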

In practice the Unix API is the entire Unix environment. What constitutes the 'Unix' environment instead of the environment specific to a particular flavour of Unix is an ever-evolving topic. Once upon a time mmap() was not part of the Unix environment (cf); today it absolutely is. I'm pretty certain that once upon a time the -o flag to egrep was effectively Linux specific (as it relied on egrep being GNU grep); today it's much closer to being part of Unix, as many Unixes either have GNU grep as egrep or have added support for -o. And so it goes, with the overall Unix API moving forward through de facto evolution.
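(If you depend on something like -o that is only de facto standard, one option is to probe for the feature instead of checking which Unix you're on. What follows is only a sketch of that idea in Python, under some assumptions of mine: it uses 'grep -E' where the text above talks about egrep, the function name is made up, and it treats any failure to run the command as 'not supported'.)

```python
import shutil
import subprocess


def grep_supports_o(grep="grep"):
    """Probe whether the given grep accepts -o by running it on known input.

    This is a feature test, not a version or platform check. The name
    and the choice of 'grep -E' are assumptions made for this sketch.
    """
    if shutil.which(grep) is None:
        return False
    try:
        result = subprocess.run(
            [grep, "-E", "-o", "b+"],
            input="abbbc\n",
            capture_output=True,
            text=True,
            timeout=5,
        )
    except (OSError, subprocess.SubprocessError):
        return False
    # With -o, only the matching part of the line should be printed.
    return result.returncode == 0 and result.stdout.strip() == "bbb"


if __name__ == "__main__":
    print("grep -o available:", grep_supports_o())
```

A probe like this keeps working as the de facto Unix API moves forward, since it doesn't care why -o is there (GNU grep or a native implementation), only that it is.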

Unless you intend for your program to be narrowly and specifically portable to POSIX or an even more minimal standard, it is not a bug for it to rely on portions of the broader, de facto Unix API. It's not even necessarily a bug to rely on APIs that are only there on some Unixes (for example Linux and FreeBSD), although it may limit how widely your program spreads. Even somewhat narrow API choices are not necessarily bugs; you may have decided to be limited in your portability or to at least require some common things to be available.

(The Go build process requires Bash on Unix, for example, although it doesn't require that /bin/sh is Bash.)

PS: This is a broader sense of the 'Unix API' (and a different usage) than I used when I wrote about whether the C runtime and library was a legitimate part of the Unix API. The broad Unix API is and always has been layered, and things like Go are deliberately implementing their own API on top of one of the lower layers. In a way, my earlier entry was partly about how separate the layers of the broad Unix API have to be; for example, can you implement a compatible and fully capable Bourne shell using only public Unix kernel APIs, or at most public C library APIs?

(Many people would say that a system where you could not do this was not really 'Unix', even if it complied with POSIX standards.)

UnixAPIMoreThanSyscalls written at 01:03:15

2018-02-18

Memories of MGR

I recently got into a discussion of MGR on Twitter (via), which definitely brings back memories. MGR is an early Unix windowing system, originally dating from 1987 to 1989 (depending on whether you go from the Usenix presentation, when people got to hear about it, or from the comp.sources.unix posting, when people could get their hands on it). If you know the dates for Unix windowing systems you know that this overlaps with X (both X10 and then X11), which is part of what makes MGR special and nostalgic and what gave it its peculiar appeal at the time.

MGR was small and straightforward at a time when that was not what other Unix window systems were (I'd say it was slipping away with X10 and X11, but let's be honest, Sunview was not small or straightforward either). Given that it was partially inspired by the Blit and had a certain amount of resemblance to it, MGR was also about as close as most people could come to the kind of graphical environment that the Bell Labs people were building in Research Unix.

(You could in theory get a DMD 5620, but in reality most people had far more access to Unix workstations that you could run MGR on than they did to a 5620.)

On a practical level, you could use MGR without having to set up a complicated environment with a lot of moving parts (or compile a big system). This generally made it easy to experiment with (on hardware it supported) and to keep it around as an alternative for people to try out or even use seriously. My impression is that this got a lot of people to at least dabble with MGR and use it for a while.

Part of MGR being small and straightforward was that it also felt like something that was by and for ordinary mortals, not the high peaks of X. It ran well on ordinary machines (even small machines) and it was small enough that you could understand how it worked and how to do things in it. It also had an appealingly simple model of how programs interacted with it; you basically treated it like a funny terminal, where you could draw graphics and do other things by sending escape sequences. As mentioned in this MGR information page, this made it network transparent by default.

MGR was not a perfect window system and in many ways it was a quite limited one. But it worked well in the 'all the world's a terminal' world of the late 1980s and early 1990s, when almost all of what you did even with X was run xterms, and it was often much faster and more minimal than the (fancier) alternatives (like X), especially on basic hardware.

Thinking of MGR brings back nostalgic memories of a simpler time in Unix's history, when things were smaller and more primitive but also bright and shiny and new and exciting in a way that's no longer the case (now they're routine and Unix is everywhere). My nostalgic side would love a version of MGR that ran in an X window, just so I could start it up again and play around with it, but at the same time I'd never use it seriously. Its day in the sun has passed. But it did have a day in the sun, once upon a time, and I remember those days fondly (even if I'm not doing well about explaining why).

(We shouldn't get too nostalgic about the old days. The hardware and software we have today is generally much better and more appealing.)

MGRMemories written at 02:00:03
