Wandering Thoughts

2021-10-10

V7 Unix had no stack size limit, and when Unix acquired one

Famously, modern Unixes limit the (default) size of the main process stack and the size of thread stacks; they pick different limits, which causes issues because C has no way of dealing with this. Today, for reasons beyond the scope of this entry, I became curious whether V7 had any stack size limit and, if it didn't, when such a limit appeared.

The short answer is that V7 had no stack size limit. The combination of your stack and your program's data (and perhaps your program's code, depending on the model of PDP-11) could take up all of the 64 KB of memory your process had available for them. If your process took a memory access fault and its stack pointer was below the kernel's idea of the bottom of your stack, the kernel automatically grew the stack as far as necessary. You can see this in the V7 trap.c, where the decision to grow is made, and sig.c, which has the routine that does the growing (and also ureg.c for one additional function).

On the BSD line of Unixes, 3BSD doesn't seem to have anything, but 4BSD introduces a vlimit(2) system call to get and set limits, including the stack size. By 4.1c BSD, this had changed to the now familiar getrlimit(2) and setrlimit(2) system calls.
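
Those system calls are still with us, so you can see your stack size limit from C today. Here's a minimal sketch (standard POSIX, nothing exotic):

/* Minimal sketch: query (and optionally lower) the stack size limit
   through the modern descendants of 4.1c BSD's getrlimit() and
   setrlimit(). RLIM_INFINITY may print as -1 or a huge number. */
#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0) {
        perror("getrlimit");
        return 1;
    }
    printf("stack soft limit %lld, hard limit %lld\n",
           (long long)rl.rlim_cur, (long long)rl.rlim_max);

    /* The soft limit can be lowered (or raised up to the hard limit). */
    rl.rlim_cur = 8 * 1024 * 1024;   /* 8 MBytes, a common default */
    if (setrlimit(RLIMIT_STACK, &rl) != 0)
        perror("setrlimit");
    return 0;
}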

(Interestingly, Linux glibc provides a vlimit(), according to the manpage.)

On the AT&T side, there doesn't seem to be much that I could find. The System V Interface Definition (SVID) seems to be the official source of a ulimit() system call, but the only thing it officially covers is limiting file size. As far back as System III there is a ulimit(2), but it doesn't set the stack size limit; the closest it comes is that it can get (but not set) the brk(2) limit. The System III VAX trap.c and sig.c still seem to be free of stack size limit checks, and have comments that say things are grown unconditionally.

(The current Oracle Solaris ulimit(2) manual page still talks about being able to get the maximum possible break value, although this isn't required in POSIX.)
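
For illustration, here's roughly what this interface looks like today, as a minimal sketch using glibc's <ulimit.h>. UL_GETFSIZE is the officially blessed part; the 'get the maximum break value' command is the non-POSIX extra, so it only appears in a comment.

/* Sketch of the System V style ulimit() interface as it survives in
   glibc. UL_GETFSIZE (the file size limit, traditionally in 512-byte
   blocks) is the part POSIX kept; 'get the maximum possible break
   value' is a non-standard command (__UL_GETMAXBRK in glibc) that
   POSIX doesn't require. */
#include <stdio.h>
#include <ulimit.h>

int main(void)
{
    long fsize = ulimit(UL_GETFSIZE);
    if (fsize < 0)
        perror("ulimit");
    else
        printf("file size limit: %ld 512-byte blocks\n", fsize);
    return 0;
}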

StackSizeLimitWhen written at 00:14:45

2021-10-01

Firefox on Unix is moving away from X11-based remote control

On Unix, Firefox has had a long-standing feature where you could remote control a running Firefox instance, which is how Firefox ensures that running 'firefox <some URL>' works right when you already have Firefox running (probably all browsers implement this on all platforms). For years, there have been two different ways for the Firefoxes to communicate, an older one that used X properties and a more recent one that used D-Bus. Today I discovered that Firefox Nightly has effectively deprecated and removed the X11 based remote control mechanism.

The specific change has the title "Use DBus remote when Firefox is built with --enable-dbus", and is apparently to fix bug 1724242, "Background update applied when mixing X11/Wayland and opening remote link". The broad outline of the problem (which has been an issue for years) is that a Firefox running on Wayland must use D-Bus remote control, but if an X program running in XWayland then tries to start Firefox (for example to open a URL), the new Firefox may (only) try to find a running Firefox via the X mechanism and then fail, with various consequences. The Mozilla solution is to only use D-Bus remote control and basically drop X11 remote control.

(Currently the code is not removed from the Firefox source, but it's not built if D-Bus is enabled in Firefox. Almost all Firefox builds will have D-Bus enabled, both official and from Unix distributions.)

The advantage of D-Bus remote control is that it's simpler and somewhat more reliable, and obviously it works regardless of what is or isn't running under Wayland. The disadvantage of D-Bus remote control is that it doesn't work from other machines you've forwarded X to, unlike the X based remote control. Unfortunately for me, I very much need this to work from at least some remote machines, and so now I do my custom Firefox builds without D-Bus support, which currently doesn't seem to be very limiting.

Sidebar: cross machine remote control of Firefox in a D-Bus world

I can see two ways to make off-machine remote control of Firefox still work if you're stuck with a D-Bus only version of Firefox. The probably harder approach would be to write an X remote control proxy that you ran locally in your X session. This would pretend to be Firefox for the X remote control protocol and then bridge commands and responses to D-Bus. The probably easier approach would be to write a cooperating pair of programs that talked to each other over a SSH connection and proxied the D-Bus protocol. The remote end would claim to be Firefox on your (remote) D-Bus, then pass requests back to the local end over the SSH connection using some simple protocol, where the local end would make D-Bus requests to your local Firefox and pass the answers back.

(There's also a simpler version of the second approach where you don't try to duplicate the D-Bus protocol, but instead just listen on a Unix domain socket and take URLs to pass back to the other end. This requires a new little command on the remote side to connect to your remote relay's Unix domain socket, but that's pretty easy. You could probably build this simpler version with shell scripts.)
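
To make the simpler version concrete, here's a rough C sketch of the remote end's relay: it listens on a Unix domain socket, reads a URL from each connection, and writes it to standard output, which would be the SSH connection back to your local end. The socket path and the one-URL-per-line 'protocol' are made up for illustration.

/* Rough sketch of the remote end of the simpler relay: accept
   connections on a Unix domain socket, read a URL from each one, and
   pass it to stdout (ie, back down the SSH connection to the local
   end). The socket path here is invented for illustration. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

int main(void)
{
    struct sockaddr_un sun;
    int lfd, cfd;
    char url[4096];
    ssize_t n;

    memset(&sun, 0, sizeof(sun));
    sun.sun_family = AF_UNIX;
    strncpy(sun.sun_path, "/tmp/ff-relay.sock", sizeof(sun.sun_path) - 1);

    lfd = socket(AF_UNIX, SOCK_STREAM, 0);
    unlink(sun.sun_path);
    if (lfd < 0 || bind(lfd, (struct sockaddr *)&sun, sizeof(sun)) != 0 ||
        listen(lfd, 5) != 0) {
        perror("socket/bind/listen");
        return 1;
    }

    while ((cfd = accept(lfd, NULL, NULL)) >= 0) {
        n = read(cfd, url, sizeof(url) - 1);
        if (n > 0) {
            url[n] = '\0';
            /* one URL per line back to the local end over SSH */
            printf("%s\n", url);
            fflush(stdout);
        }
        close(cfd);
    }
    return 0;
}

The local end would then read lines from the SSH connection and hand each URL to your local Firefox however you prefer.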

Each approach has advantages and drawbacks. Since I already use remote X and have my own remote control program that only speaks the X protocol, I'm biased to the first solution provided that I don't have to actually write the X proxy. In practice, if I have to go to D-Bus I'll probably wind up doing the simpler version of the second approach because I'll be writing it myself.

FirefoxNoX11RemoteControl written at 23:20:13

2021-09-25

Notes on updating OpenBSD machines to current, supported versions

One of the frequently noted things about OpenBSD is that its releases have a short support period. OpenBSD generally releases security and other updates for the current release and the previous one, and only updates third party packages in the current release (cf). Since OpenBSD releases more or less every six months, this gives you at most a year of support for what you install. One practical result of this policy is that people wind up running unsupported releases, but let's suppose that you want to stick with supported ones. There are two general approaches.

The gold standard is to reinstall your systems from scratch using the new release, which ensures more reproducible and understandable installs and also gives you a natural opportunity to rethink and re-check your customizations to make sure they're still appropriate. This is generally what we do, following our general habits for everything. Although I find its disk partitioning a bit annoying, OpenBSD installs fast and we generally have minimal customizations. On the downside, you need to set up additional machines to be the new versions, and there are some hassles around the downtime to change over.

(If we had significant state on our OpenBSD machines then this would be much harder.)

The other option is to use OpenBSD's sysupgrade(8) to upgrade an existing system in place. As OpenBSD says repeatedly, upgrades are only supported from one release to the release immediately following; if you need to jump several releases (perhaps because you let a machine sit), you'll need to go through the process repeatedly. Even with sysupgrade(8), you'll need to do some manual steps to fix up differences and adjust configuration files. OpenBSD covers these in its per-release upgrade guides, such as the 6.8 to 6.9 upgrade guide. The most recent upgrade guide is always linked from The OpenBSD FAQ, and it links to the previous one and so on (plus, they have predictable URLs). In my limited experimentation, these version-to-version upgrades work, although I haven't attempted to see how different an upgraded machine is from a machine that was reinstalled from scratch.

Even if you're going to reinstall from scratch, I think it's worth reading the upgrade guide, because the upgrade guides often discuss important changes that you'd otherwise find out the hard way. For example, the 6.8 to 6.9 guide discusses how PF is now stricter about port ranges and how the syntax for routing options has changed.

(Since I looked into all of this and experimented with the sysupgrade method, I wanted to write it down before I forgot it.)

OpenBSDUpgrading written at 22:54:35

2021-09-13

The rc shell's nice feature for subdirectory searching of $PATH

I've used Byron Rakitzis's Unix reimplementation of Tom Duff's Plan 9 shell rc for a long time (for a variety of reasons). One of the ways that rc is different from standard Unix shells is that it has an unusual and surprisingly convenient additional feature in how it handles searching your $PATH.

In most Unix shells, if you type 'fred/barney' as a command name to execute, it's immediately interpreted as a relative path. For example, if you do:

$ fred/barney --help

this is the same as './fred/barney' and your shell will try to run something with that relative file name. This behavior matches the interpretation of this command name as a file name, since all file names that don't start with '/' are relative names.

In rc, relative names that don't start with './' or '../' are instead searched for on your $PATH, as the obvious combination of subdirectory (or subdirectories) and executable. If you do:

; fred/barney --help

then rc searches through your $PATH for something with a fred subdirectory with a barney in it. For example, it might wind up running $HOME/bin/fred/barney. See also this short description of rc's behavior.
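
Mechanically there's nothing magic involved. Here's a rough C sketch of the sort of lookup rc performs (an illustration of the idea, not rc's actual code):

/* Rough sketch of rc's $PATH subdirectory search (not rc's actual
   code): for 'fred/barney', try <dir>/fred/barney for each $PATH
   directory and take the first executable match. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Returns the first executable <dir>/<relname> found on $PATH, or
   NULL. The result lives in a static buffer. */
char *path_search(const char *relname)
{
    static char full[4096];
    char *copy, *dir, *save;
    char *result = NULL;

    if (getenv("PATH") == NULL)
        return NULL;
    copy = strdup(getenv("PATH"));
    for (dir = strtok_r(copy, ":", &save); dir != NULL && result == NULL;
         dir = strtok_r(NULL, ":", &save)) {
        snprintf(full, sizeof(full), "%s/%s", dir, relname);
        if (access(full, X_OK) == 0)
            result = full;        /* eg $HOME/bin/fred/barney */
    }
    free(copy);
    return result;
}

int main(int argc, char **argv)
{
    char *found = path_search(argc > 1 ? argv[1] : "fred/barney");
    printf("%s\n", found ? found : "not found");
    return 0;
}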

This simple-sounding feature turns out to be surprisingly useful. One way of putting it is that it lets you create namespaces for commands (with subdirectories being the namespaces); another way is that it lets you have "subcommands" of "commands" (where the "commands" are subdirectories and the subcommands are programs or shell scripts in them). In the rc approach, you wouldn't have a "git" command with dozens of subcommands that you invoke as 'git <subcommand>'; instead you'd have a 'git' subdirectory somewhere in your $PATH and type 'git/<subcommand>'.

(In a version of rc that has command completion that includes these subdirectories (see here and here), this also creates a natural and general mechanism for completion of these 'subcommands', since they are regular executables.)

As far as I know, this nice little feature isn't widely available in other Unix shells. It's supported in zsh through the PATH_DIRS option, although apparently command completion can be wonky. I don't think Bash implements this directly through any option, but you could probably use Bash's 'command not found' hook function to do a version of it (maybe without command completion).

(I was inspired to write this entry by reading sd: my script directory (via), which basically implements this for normal Unix shells by using a cover program.)

PS: In rc, your $PATH is really your $path. But rc synchronizes the two environment variables for you so I've simplified the situation in this entry.

RcSubdirectoryPATHSearch written at 23:57:04

2021-09-01

Large Unix programs were historically not all that portable between Unixes

I recently read Ruben Schade's I’m not sure that UNIX won (via) and had a number of reactions to it. One of them is about the portability of programs between Unixes, which is one of the issues that Schade sees as a problem today. Unfortunately, I have bad news for people who are yearning for the (good) old days. The reality is that significant Unix programs have never been really portable between Unix variants, and if anything, today is an all-time high for default program portability between Unixes.

Back in the day (the late 1980s and early 1990s specifically), one of the things that Larry Wall was justly famous for was his large, intricate, and comprehensive configure scripts that made rn and Perl build on pretty much any Unix and Unix-like system that you could name. Wall's configure script approach was generalized and broadened by GNU Autoconf, GNU Autotools, and so on. These tools did not automatically make your complex programs portable between different Unixes, but they gave you the machinery to sort out how to achieve that, and to automatically detect various things you needed to do to adapt to the local Unix (and if you used some of them, you automatically got pointed to the right include directories and the right libraries to link with).

People did not create and use all of these tools because they wanted a complex build process or to write lots of extra (and often obscure) code. They used these systems because they had to, because there were all sorts of variations between the Unix systems of the time. Some of these variations were in where programs were and what their capabilities were (the POSIX compatible Bourne shell wasn't always /bin/sh, for example). Others were in what functions were available, what include files you used to get access to them, and what libraries you had to link in.

(Hands up everyone who ever had to add some variation of '-lsocket -lnsl -lresolv' to their compile commands on some Unix to use hostname resolution and make IP connections.)

You might hope that POSIX would have made all of this obsolete in old Unixes. Not so. First, not all Unixes were fully POSIX compatible in the first place; some only added partial POSIX compatibility over time (and I'm not sure any of them became really POSIX compatible very quickly). Second, even when Unixes such as Solaris had a POSIX compatibility layer, they didn't necessarily make it the default; you could have to go out of your way to get POSIX compatible utilities, functions, include files, and libraries. And finally, not everything that substantial Unix programs wanted to use was even covered by POSIX (or free of issues when implemented in practice).

All of this incompatibility was encouraged by the commercial Unix vendors, because it was in their cold-blooded self-interest for your current programs to be hard to build and run outside of Solaris, IRIX, HP-UX, OSF/1, or whatever. The more of a pain it would be to move to another vendor's Unix, the less chance that another vendor could steal you as a customer by offering a cheaper deal. In a related development, Unix vendors spent a long time invoking the specter of "backwards compatibility" as a reason for never changing their systems to make them more POSIX compatible by default, to modernize their command line tools, and so on.

The situation with modern open source Unixes is much better. They are mostly POSIX compatible by default, and they have converged on a relatively standard set of include files, standard library functions, and so on. There are variations between Unixes (including between different libc implementations on Linux) and between current and older releases, but for the most part the differences are much smaller today, to the degree that a lot of the work that GNU Autoconf does by default feels quaint and time-wasting.

(Where there are major differences they tend to be in areas related to system management and system level concerns, instead of user level C programs like rn.)

PS: Unix programs tended to be much more portable between the same Unix on different architectures, but relatively few old Unix vendors ever had such environments, especially for long. And let us not talk about the move from 32-bit to 64-bit environments, or the issue that was known at the time as "all the world's a VAX" (experienced as people began to move to Suns, which among other differences had a different endianness).

ProgramsVsPortability written at 22:06:05

2021-08-13

Learning that Vim has insert mode keystrokes that do special things

I use Vim a fair bit, but most of the time I'm merely doing ordinary text entry, predominantly in insert mode. At the same time, I am not the world's best typist (my Delete key gets a good workout). One of my long-standing Vim experiences is that I will be typing along, happily entering text, and then I will do something and suddenly I will have a jumble of unwanted text and text changes.

(This is different from the classical Vi experience where you fumble what you're typing in command mode and all sorts of things happen.)

For a long time, I assumed that I had probably accidentally escaped into command mode and triggered the classical Vi mistake of typing random things in command mode. However, recently I was reading A Vim Guide for Adept Users (one of my hobbies is reading Vim guides), and hit the section on Useful Keystrokes in Insert Mode. A little light went on in my mind.

I've always known that Vim responds to some control keys and key sequences in insert mode, and in fact one of the ways I'm using Vim instead of Vi is that I want Delete in insert mode to back up past the start of the line. However, I hadn't previously known that Vim had such a significant collection of text modification keystrokes in insert mode. The two keystrokes that seem most likely to be responsible for various of my mistakes are Ctrl-a (which will insert various amounts of text) and Ctrl-@ (which inserts text and then escapes to command mode on the spot, where my continued typing will cause even more damage). Ctrl-a is relatively easy to hit, too.

The ins-special-keys section of the insert mode documentation has the full list. Some of them seem potentially useful, especially Ctrl-t and Ctrl-d.

PS: My unintended text alteration adventures are probably not helped by my habit of escaping to command mode periodically to do various fidgets, like writing the file or reflowing the paragraph. Command mode has all sorts of dangerous characters that can cause lots of havoc, including '.' and the number keys, and there are a number of ways to accidentally start entering a multi-character sequence that will trap and reinterpret the rest of what you think you're typing as commands.

VimHasInsertModeKeystrokes written at 00:19:10

2021-08-08

The xterm terminal emulator can do a lot more than just display text

Recently, a problem report on the fvwm mailing list caused me to discover that Vim can react to losing or gaining X keyboard focus, through automatic commands (specifically FocusGained and FocusLost). I'm sorry, did I say that Vim did that? Well, Vim does, but Vim isn't working alone; the venerable X terminal emulator xterm is a partner in this. Xterm, it turns out, can basically pass through a lot of X events to "terminal" programs running inside it, including FocusIn and FocusOut.
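
As a small illustration of the machinery, a program running inside xterm can turn on focus reporting with a single escape sequence and then watch for the events. This is a hedged sketch based on my reading of the XTerm Control Sequences documentation; a real program like Vim would also put the terminal into raw mode first.

/* Sketch of xterm focus reporting: CSI ? 1004 h turns it on, after
   which xterm sends ESC [ I on focus in and ESC [ O on focus out.
   Without raw mode the events sit in the tty's input queue, so here
   you would have to hit Return to push them through. */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char buf[64];
    ssize_t n;

    printf("\033[?1004h");      /* enable focus reporting */
    fflush(stdout);
    while ((n = read(STDIN_FILENO, buf, sizeof(buf))) > 0) {
        if (n >= 3 && buf[0] == '\033' && buf[1] == '[') {
            if (buf[2] == 'I')
                printf("focus in\n");
            else if (buf[2] == 'O')
                printf("focus out\n");
        }
    }
    printf("\033[?1004l");      /* and off again */
    return 0;
}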

(Xterm also plays a role in Bash's bracketed paste mode. In fact the name for it comes from the fact that xterm will bracket pastes with special control sequences so you can recognize them.)

As you may have guessed by now, xterm (and things that copy it sufficiently completely) can do a lot more than just display text and emulate a DEC VT terminal. The XTerm Control Sequences documentation goes on at significant length, and it doesn't even fully describe everything it covers (it assumes you have a certain amount of existing knowledge about the area). These days, my impression is that XTerm is a fairly capable environment for implementing text-mode X based programs, one where you can get a lot of fairly raw X events (including focus ones) and react to them.

I've used xterm for long enough that this feels a little bit weird to me. I still think of xterm as a terminal emulator, not a general support environment for text-based programs to interact with X. But there's clearly a desire and a need for this sort of thing, both for full featured programs like Vim (which can do a lot in cooperation with xterm if you let it) and for features like Bash's bracketed paste mode.

(I may not like bracketed paste mode myself, but plenty of other people clearly do. I'm not going to stand in the way of them having that capability.)

PS: People who are interested in this in Vim might want to read the summary of the issue from the fvwm mailing list.

XTermQuiteSophisticated written at 22:47:14

2021-08-06

Some bits of how Bash and GNU Readline's "bracketed paste" mode behaves

Recent versions of Bash and GNU Readline now default to special handling of pasting into them, in what is called "bracketed paste" mode. This mode requires you to explicitly hit Return in order to have the pasted text accepted, and allows you to edit the paste before then. This simple description might leave you wondering how bracketed paste mode interacts with various other things, and as it happens I have done some experiments here (out of curiosity, since bracketed paste mode is not for us).
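
Before getting into the details, it's worth sketching the mechanics. The program turns the mode on with an escape sequence, and the terminal then wraps any paste in marker sequences so that pasted text is distinguishable from typed text. Here's a small, hedged C illustration (raw terminal mode handling is omitted to keep it short):

/* Minimal sketch of the mechanics: a program (here standing in for
   Readline) enables bracketed paste with CSI ? 2004 h, and the
   terminal then wraps any paste in ESC [ 200 ~ ... ESC [ 201 ~.
   Raw-mode handling is omitted for brevity. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    char buf[8192];
    ssize_t n;

    printf("\033[?2004h");      /* enable bracketed paste */
    fflush(stdout);

    n = read(STDIN_FILENO, buf, sizeof(buf) - 1);
    if (n > 0) {
        buf[n] = '\0';
        if (strstr(buf, "\033[200~") != NULL)
            printf("that input arrived as a paste\n");
    }

    printf("\033[?2004l");      /* disable it again */
    return 0;
}

Run inside an xterm, paste something, and hit Return; the paste markers arrive along with your pasted text.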

First, although it's not completely clear from the description, bracketed paste treats multiple lines pasted at once as a single unit for the purposes of editing and accepting them (or rejecting them with a Control-C). After they've been accepted, Bash (and presumably Readline) will treat them as separate lines for history and so on. This is probably the behavior you want if you're pasting multiple commands.

Second, bracketed paste sends all of your pasted text to Bash (or the Readline based program you're using), regardless of how it would otherwise be interpreted if you typed it, and Bash will process all lines as command lines. That sounds abstract, so let's give you a concrete example. Suppose that you have the following text to cut and paste from some instructions:

cat >/etc/some-config.conf
directive one
directive two

If you paste this normally (and then hit Ctrl-D afterward), /etc/some-config.conf winds up with the two directives in it, because Bash first runs the cat and then the remaining two lines are read by cat. If you paste this into a bracketed paste Bash and then hit Return to accept it all, Bash first runs cat, which reads from the terminal as standard input and waits for you to type something; then, when you hit Ctrl-D, Bash will attempt to run 'directive one' and 'directive two' as commands. There's no visual indication that this is happening and that Bash has grabbed the second and third lines for itself.

On the one hand, in the Unix model it's hard to see how Bash could do this differently without deep cooperation from the kernel terminal driver (which isn't available). On the other hand, this makes bracketed paste mode dangerously different from un-bracketed paste in a way that is not particularly obvious and I think is not necessarily intuitive.

PS: Because I tested it, Bash really does take all of your pasted text as command lines, even if one of the pasted commands is 'read', which Bash handles internally and so could in theory consume your pasted text. I assume that internally, Readline is buffering all of this up as command input. Programs that use Readline for multiple things might mix your pasted input in a different way.

BracketedPasteBehaviorNotes written at 22:28:25

2021-08-01

Unix APIs are where I first saw C #define used to rename struct fields

In my entry on learning that you can use C unions for grouping struct fields into namespaces, I mentioned that the traditional version of this was done with #define to create convenient short names for fields that were actually hidden deeper in a struct than they appeared. I didn't come up with this idea on my own; instead, I'm pretty sure that where I first saw this was in Unix APIs. More exactly, in the actual implementation of Unix APIs, as visible in C header files. Most often, I think this was done because of unions in the struct, which at the time had to be named.

A typical example that I can find is in the 4.2 BSD arpa/tftp.h header:

struct tftphdr {
  short th_opcode;     /* packet type */
  union {
    short tu_block;    /* block # */
    short tu_code;     /* error code */
    char  tu_stuff[1]; /* request packet stuff */
  } th_u;
  char th_data[1];     /* data or error string */
};

#define th_block  th_u.tu_block
#define th_code   th_u.tu_code
#define th_stuff  th_u.tu_stuff

Here various #defines are being used to give short names to things that are actually inside a union. Today we could do this directly with an anonymous union, but those only officially appeared in C11 after being a GNU gcc extension.
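
For contrast, here's what the same header could look like today with an anonymous union (a sketch, assuming C11 or a gcc with the extension):

/* The modern version: with a C11 anonymous union, th_block and
   friends become directly accessible and the #defines go away. */
struct tftphdr {
  short th_opcode;     /* packet type */
  union {
    short th_block;    /* block # */
    short th_code;     /* error code */
    char  th_stuff[1]; /* request packet stuff */
  };                   /* anonymous; no th_u name needed */
  char th_data[1];     /* data or error string */
};

/* now 'hdr.th_block' works directly instead of 'hdr.th_u.tu_block' */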

(Another example is in the 4.2 BSD wait.h, which despite its position in the tree here was also in /usr/include; see MAKEDIRS. Another generally more internal use is in inode.h.)

This approach would have serious problems with th_block or other #defined fields if they were also used in one of your own structs for some other purpose. But in the very old days of C, struct field names had to be globally unique across your entire program, so it was common to give them a relatively unique name, and you already knew you had problems if you reused a field name from what the manpage said was a struct that was part of a library API.

(I haven't looked at how the early C compilers implemented struct fields, but I suspect that the V7 C compiler just put all struct field names into a global table with their type and offset. This seems a very V7 shortcut.)

UnixCDefinesForFields written at 23:09:22

2021-07-14

Making two Unix permissions mistakes in one

I tweeted:

Today's state of work-brain:
mkdir /tmp/fred
umask 077 /tmp/fred

Immediately after these two commands, I hit cursor-up to change the 'umask' to 'chmod', so that I then ran 'chmod 077 /tmp/fred'. Fortunately I was doing this as a regular user, so my next action exposed my error.

This whole sequence of commands is a set of mistakes jumbled together in a very Unix way. My goal was to create a new /tmp/fred directory that was only accessible to me. My second command is not just wrong because I wanted chmod instead of umask (I should have run umask before the mkdir, not after), but because I had the wrong set of permissions for chmod. It was as if my brain wanted Unix to apply a 'umask 077' to the creation of /tmp/fred after the fact. Since the numeric permissions you give to umask are the inverse of the permissions you give to chmod (you tell umask what you don't want instead of what you do), my change of umask to chmod then left /tmp/fred with completely wrong permissions; instead of being only accessible to me, it was fully accessible to everyone except me.

(Had I been doing this as root, I would then have been able to cd into the directory, put files in it, access files in it, and so on, and might not have noticed that the permissions were reversed from what I actually wanted.)

The traditional Unix umask itself is a very Unix command (well, shell built-in), in that it more or less directly calls umask(). This allows a very simple implementation, which was a priority in early Unixes like V7. A more sensible implementation would be that you specify effectively the maximum permissions that you want (for example, that things can be '755') and then umask would invert this to get the value it uses for umask(). But early Unixes took the direct approach, counting on people to remember the inversion and perform it in their heads.
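
To make the inversion concrete, here's a tiny C sketch of the difference between what I wanted and what I actually did:

/* The umask/chmod inversion in miniature: umask() takes the
   permissions to *deny*, chmod() the permissions to *grant*. */
#include <sys/types.h>
#include <sys/stat.h>

int main(void)
{
    umask(077);                 /* deny group and other everything */
    mkdir("/tmp/fred", 0777);   /* 0777 & ~077 == 0700: what I wanted */

    /* what I actually ran afterward, reversing the permissions: */
    chmod("/tmp/fred", 077);    /* 0077: nothing for me, everything
                                   for group and other */
    return 0;
}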

In the process of writing this entry I learned that POSIX umask supports symbolic modes, and that they work this way. You get and set umask modes like 'u=rwx,g=rx,o=rx' (aka '022', the traditional friendly Unix umask), and they're the same permissions as you would use with chmod. I believe that this symbolic mode is supported by any modern Bourne compatible shell (including zsh), but it isn't necessarily supported by non-Bourne shells such as tcsh or rc (which is my shell).

PermissionsTwoMistakes written at 23:53:11
