Wandering Thoughts

2024-06-10

The NFS server 'subtree' export problem

NFS servers have a variety of interesting problems that ultimately exist because NFS was first defined a long time ago in a world where (Unix) filesystems were simpler and security was perhaps less of a concern. One of their classical problems is that how NFS clients identify files is surprisingly limited. Another problem is what I will call the 'subtree export' issue.

Suppose that you have a filesystem called '/special', and this filesystem contains directory trees '/special/a' and '/special/b'. The first directory tree is exported only to one NFS client, A, and the second directory tree is exported only to another NFS client, B. Now suppose that client A presents the NFS server with a request to read some file, which it identifies by an NFS filehandle. How does the NFS server know that this file is located under /special/a, the only part of the /special filesystem that client A is supposed to have access to? This is the subtree export issue.

(This problem comes up because NFS clients can forge their own NFS filehandles or copy NFS filehandles from other clients, and NFS filehandles generally don't contain the path to the object being accessed. Normally all the NFS server can recover from an NFS filehandle is the filesystem and some non-hierarchical identifier for the object, such as its inode number.)

The very early NFS servers ignored the entire problem because they started out with no NFS filehandle access checks at all. Even when NFS servers started applying some access checks to NFS filehandles, they generally ignored the subtree issue because they had no way to deal with it. If you exported a subtree of a filesystem to some client, in practice the client could access the entire filesystem if it made up appropriate valid NFS filehandles. The manual pages sometimes warned you about this.

(One modern version of this warning appears in the FreeBSD exports manual page. The Illumos share_nfs manual page doesn't seem to discuss this subtree issue, so I don't know how Illumos handles it.)

Some modern NFS servers try to do better, and in particular the Linux kernel NFS server does. Linux does this by trying to work out the full path within the filesystem of everything you access, leveraging the kernel directory entry caches and perhaps filesystem specific information about parent directories. On Linux, and in general on any system where the NFS server attempts to do this, checking for this subtree export issue adds some overhead to NFS operations and may possibly reject some valid NFS operations because the NFS server can't be sure that the request is within an allowed subtree. Because of this, Linux's exports(5) NFS options support a 'no_subtree_check' option that disables this check and in fact any security check that requires working out the parent of something.
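
As a concrete illustration, here is roughly what the two styles look like in a Linux /etc/exports (the client host names here are made up for the example):

/special/a  clienta.example.org(rw,no_subtree_check)
/special/b  clientb.example.org(rw,subtree_check)

The first line opts out of the subtree check (and its overhead and corner cases); the second asks for it explicitly.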

Generally, the subtree export issue is only a problem if you think NFS clients can be compromised to present NFS filehandles that you didn't give them. If you only export a subtree of a filesystem to an NFS client, a properly operating NFS environment will deny the client's request to mount anything else in the filesystem, which will stop the client from getting NFS filehandles for anything outside its allowed subtree.

(This still leaves you with the corner case of moving a file or a directory tree from inside the client's allowed subtree to outside of it. If the NFS client is currently using the file or directory, it will still likely be able to keep accessing it until it stops and forgets the file's NFS filehandle.)

Obviously, life is simpler if you only export entire filesystems to NFS clients and don't try to restrict them to subtrees. If an NFS client only wants a subtree, it can mount just that subtree itself.
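
For example, if the server exports all of /special, a client that only cares about one directory tree can simply mount that part of it (the server name here is a made up example):

mount -t nfs fileserver:/special/a /mnt/a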

NFSServerSubtreeProblem written at 23:06:44; Add Comment

2024-06-05

Maybe understanding uname(1)'s platform and machine fields

When I wrote about some history and limitations of uname(1) fields, I was puzzled by the differences between 'uname -m', 'uname -i', and 'uname -p' in the two variants of uname that have all three, Linux uname and Illumos uname. Illumos is descended from (Open)Solaris, and although I can't find manual pages for old Solaris versions of uname online, I suspect that Solaris is probably the origin of both '-i' and '-p' (the '-m' option comes from the original System V version that also led to POSIX uname). The Illumos manual page doesn't explain the difference, but it does refer to sysinfo(2), which has some quite helpful commentary if you read various bits and pieces. So here is my best guess at the original meanings of the three different options in Solaris.

Going from most general to most specific, it seems that on Solaris:

  • -p tells you the broad processor ISA or architecture, such as 'sparc', 'i386', or 'amd64' (or 'x86_64' if you like that label). This is what Illumos sysinfo(2) calls SI_ARCHITECTURE.

  • -m theoretically tells you a more specific processor and machine type. For SPARC specifically, you can get an idea of Solaris's list of these in the Debian Wiki SunSparc page (and also Wikipedia's Sun-4 architecture list).

  • -i theoretically tells you about the specific desktop or server platform you're on, potentially down to a relatively narrow model family; the Illumos sysinfo(2) section on SI_PLATFORM gives 'SUNW,Sun-Fire-T200' as one example.
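
To make the distinctions concrete, here is roughly what the three options might report on a SPARC Solaris or Illumos machine. The values are taken from the examples above and from sysinfo(2) rather than transcribed from a real system, so treat the exact combination as illustrative only:

$ uname -p
sparc
$ uname -m
sun4v
$ uname -i
SUNW,Sun-Fire-T200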

Of course, 'uname -m' came first, and '-p' and '-i' were added later. I believe that Solaris started out being relatively specific in 'uname -m', going along with the System V and POSIX definition of it as the 'hardware type' or machine type. Once Solaris had done that, it couldn't change the output of 'uname -m' even as people started to want a broader processor ISA label, hence '-p' being the more generic version despite -m being the more portable option.

(GNU Coreutils started with only -m, added '-p' in 1996, and added '-i' in 2001. The implementation of both -p and -i initially only used sysinfo() to obtain the information.)

On x86 hardware, it seems that Unixes chose to interpret 'uname -m' generically, instead of trying to be specific about things like processor families. Especially in the early days of x86 Unix, the information needed for 'uname -i' probably just wasn't available, and also wasn't necessarily particularly useful. The Illumos sysinfo(2) section on SI_PLATFORM suggests that it just returns 'i86pc' on all conventional x86 platforms.

(GNU Coreutils theoretically supports '-i' and '-p' on Linux, but in practice both will normally report "unknown".)

Of course, once x86 Unixes started reporting generic things for 'uname -m', they were stuck with it due to backward compatibility with build scripts and other things that people had based on the existing output (and it's not clear what more specific x86 information would be useful for 'uname -m', although for 32-bit x86, people have done variant names of 'i586' and 'i686'). While there was some reason to support 'uname -p' for compatibility, it is probably not surprising that on both FreeBSD and OpenBSD, the output of 'uname -m' is mostly the same as 'uname -p'.

(OpenBSD draws a distinction between the kernel architecture and the application architecture, per the OpenBSD uname(1). FreeBSD draws a distinction between the 'hardware platform' (uname -m) and the 'processor architecture' (uname -p), per the FreeBSD uname(1), but on the FreeBSD x86 machine I have access to, they produce the same output. However, see the FreeBSD arch(7) manual page and this FreeBSD bug from 2017.)

PS: In a comment on my first uname entry, Phil Pennock showed that 'uname -m' and 'uname -p' differed in some macOS environments. I suspect the difference is following the FreeBSD model but I'm not sure.

Sidebar: The details of uname -p and uname -i on Linux

Linux distributions normally use the GNU Coreutils version of 'uname'. In its normal state, Coreutils' uname.c gets this information from either sysinfo() or sysctl(), if either supports obtaining it (see the end of src/uname.c). On Linux, the normal C library sysinfo() and sysctl() don't support this, so normally 'uname -p' and 'uname -i' will both report 'unknown', since they're left with no code that can determine this information.

The Ubuntu (and perhaps Debian) package for coreutils carries a patch, originally from Fedora (but no longer used there), that uses the machine information that 'uname -m' would report to generate the information for -p and -i. The raw machine information can be modified a bit for both -i and -p. For -i, all 'i?86' results are turned into 'i386', and for -p, if the 'uname -m' result is 'i686', uname checks /proc/cpuinfo to see if you have an AMD and reports 'athlon' instead if you do (although this code may have decayed since it was written).
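
My rough understanding of the patched behaviour, written out as a shell sketch (this is an approximation of the logic, not the actual patch, and the real patch's AMD test is more specific than a plain grep):

m=$(uname -m)

# what the patched 'uname -i' reports: fold 32-bit x86 variants into 'i386'
case "$m" in
i?86) echo i386 ;;
*)    echo "$m" ;;
esac

# what the patched 'uname -p' reports: 'i686' plus an AMD CPU becomes 'athlon'
if [ "$m" = i686 ] && grep -q AMD /proc/cpuinfo; then
    echo athlon
else
    echo "$m"
fi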

UnameOnPlatformFields written at 17:17:34; Add Comment

2024-06-04

Some history and limitations of uname(1) fields

Uname(1) is a command that hypothetically prints some potentially useful information about your system. In practice what information it prints, how useful that information is, and what command line options it supports varies widely between both different sorts of Unixes and between different versions of Linux (due to using different versions of GNU Coreutils, and different patches for it). I was asked recently if this situation ever made any sense and the general answer is 'maybe'.

In POSIX, uname(1) is more or less a program version of the uname() function. It supports only '-m', '-n', '-r', '-s', and '-v', and as a result of all of these arguments being required by POSIX, they are widely supported by the various versions of uname that are out there in various Unixes. All other arguments are non-standard and were added well after uname(1) initially came into being, which is one reason they are so divergent in presence and meaning; there is no enhanced ancestral 'uname' command for things to descend from.

The uname command itself comes from the System V side of Unix; it goes back at least as far as System III, where the System III uname.c accepts -n, -r, -s, and -v with the modern meanings. System III gets the information from the kernel, in a utssys() system call. I believe that System V added the 'machine' information ('-m'), which then was copied straight into POSIX. On the BSD side, a uname command first appeared in 4.4 BSD, and the 4.4 BSD uname(1) manual page says that it also had the POSIX arguments, including -m. The actual implementation didn't use a uname() system call but instead extracted the information with sysctl() calls.

The modern versions of uname that I can find manual pages for are rather divergent; contrast Linux uname(1) (also), FreeBSD uname(1), OpenBSD uname(1), NetBSD uname(1), and Illumos uname(1) (manual pages for other Unixes are left as an exercise). For instance, take the '-i' argument, supported in Linux and Illumos to print a theoretical hardware platform and in FreeBSD to print the 'kernel ident'. On top of that difference, on Linux distributions that use an unpatched build of Coreutils, I believe that 'uname -i' and 'uname -p' will both report 'unknown'.

(Based on how everyone has the -p argument for processor type, I suspect that it was one of the earliest additions beyond the POSIX set of options. How 'uname -m' differs from 'uname -p' in practice is something I don't know, but apparently people felt a need to distinguish the two at some point. Some Internet searches suggest that on Unixes such as Solaris, the processor type might be 'sparc' while the machine hardware name might be more specific, like 'sun4m'.)

On Linux and several other Unixes, much of the core information for uname comes from the kernel, which means that options like 'uname -r' and 'uname -v' have traditionally reported about the kernel version and build string, not anything to do with the general release of Linux (or Unix). On FreeBSD, the kernel release is usually fairly tightly connected to the userland version (although FreeBSD uname can tell you about the latter too), but on Linux it is not, and the Linux uname has no option to report a 'distribution' name or version.

In general, I suspect that the only useful fields you can count on from uname(1) are '-n' (some version of the hostname), '-s' (the broad operating system), and perhaps '-m', although you probably want to be wary about that. One of the cautions with 'uname -m' is that there is no agreement between Unixes about what the same hardware platform should be called; for example, OpenBSD uses 'amd64' for 64-bit x86 while Linux uses 'x86_64'. Illumos recommends using 'uname -p' instead of 'uname -m'.

(This entry's topic was suggested to me by Hunter Matthews, although I suspect I haven't answered their questions about historical uname values.)

Sidebar: An options comparison for non-POSIX options

The common additional options are:

  • -i: semi-supported on Linux, supported on Illumos, and means something different on FreeBSD (it's the kernel identifier instead of the hardware platform).
  • -o: supported on Linux, where it is often different from 'uname -s', FreeBSD, where it is explicitly the same as 'uname -s', and Illumos, where I don't know how it relates to 'uname -s'.
  • -p: semi-supported on Linux and fully supported on FreeBSD, OpenBSD, NetBSD, and Illumos. Illumos specifically suggests using 'uname -p' instead of 'uname -m', which will generally make you sad on Linux.

FreeBSD has -b, -K, and -U as additional FreeBSD specific arguments.

How 'uname -i', 'uname -p', and 'uname -m' differ on Illumos is not something I know; they all report something about the hardware, but the Illumos uname manpage mostly doesn't illuminate the difference. It's possible that this is more or less covered in sysinfo(2).

(The moral is that you can't predict the result of these options without running uname on an applicable system, or at least having a lot of OS-specific knowledge.)

UnameFieldsHistory written at 22:16:12; Add Comment

2024-05-15

Turning off the X server's CapsLock modifier

In the process of upgrading my office desktop to Fedora 40, I wound up needing to turn off the X server's CapsLock modifier. For people with a normal keyboard setup, this is simple; to turn off the CapsLock modifier, you tap the CapsLock key. However, I turn CapsLock into another Ctrl key (and then I make heavy use of tapping CapsLock to start dmenu (also)), which leaves the regular CapsLock functionality unavailable to me under normal circumstances. Since I don't have a CapsLock key, you might wonder how the CapsLock modifier got turned on in the first place.

The answer is that sometimes I have a CapsLock key after all. I turn CapsLock into Ctrl with setxkbmap settings, and apparently some Fedora packages clear these keyboard mapping settings when they're updated. Since upgrading to a new Fedora release updates all of these packages, my 'Ctrl' key resets to CapsLock during the process and I don't necessarily notice immediately. Because I expect my input settings to get cleared, I have a script to re-establish them, which I run when I notice my special Ctrl key handling isn't working. What happened this time around was that I noticed that my keyboard settings had been cleared when CapsLock didn't work as Ctrl, then reflexively invoked the script. Of course at this point I had tapped CapsLock, which turned on the CapsLock modifier, and then when the script reset CapsLock to be Ctrl, I no longer had a key that I could use to turn CapsLock off.

(Actually dealing with this situation was made more complicated by how I could now only type upper case letters in shells, browser windows, and so on. Fortunately I had a phone to do Internet searches on, and I could switch to another Linux virtual console, which had CapsLock off, and access the X server with 'export DISPLAY=:0' so I could run commands that talked to it.)

There are two solutions I wound up with, the narrow one and the general one. The narrow solution is to use xdotool to artificially send a CapsLock key down/up event with this:

xdotool key Caps_Lock

This will toggle the state of the CapsLock modifier in the X server, which will turn CapsLock off if it's currently on, as it was for me. This key down/up event works even if you have the CapsLock key remapped at the time, as I did, and you can run it from another virtual console with 'DISPLAY=:0 xdotool key Caps_Lock' (although you may need to vary the :0 bit). Or you can put it in a script called 'RESET-CAPSLOCK' so you can type its name with CapsLock active.
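
Such a script can be as minimal as:

#!/bin/sh
# RESET-CAPSLOCK: toggle the X CapsLock modifier even when the CapsLock
# key itself is currently remapped to something else.
exec xdotool key Caps_Lock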

(Possibly I should give my 'reset-input' script an all-caps alias. It's also accessible from a window manager menu, but modifiers can make those inaccessible too.)

However, I'd like something to clear the CapsLock modifier that I can put in my 're-establish my keyboard settings' script, and since this xdotool trick only toggles the setting it's not suitable. Fortunately you can clear modifier states from an X client; unfortunately, as far as I know there's no canned 'capslockx' program the way there is a numlockx (which people have and use for good reasons). Fortunately, the same AskUbuntu question and answer that I got the xdotool invocation from also had a working Python program (you want the one from this answer by diegogs). For assorted reasons, I'm putting my current version of that Python program here:

#!/usr/bin/python
from ctypes import *

class Display(Structure):
  """ opaque struct standing in for Xlib's Display """

X11 = cdll.LoadLibrary("libX11.so.6")
X11.XOpenDisplay.restype = POINTER(Display)

# Open the default display (c_int(0) is passed as a NULL display name, so
# $DISPLAY is used), then clear the Lock modifier on the core keyboard:
# 0x0100 is XkbUseCoreKbd, 2 is the Lock (CapsLock) modifier bit, and the
# final 0 turns the affected modifier off rather than on.
display = X11.XOpenDisplay(c_int(0))
X11.XkbLockModifiers(display, c_uint(0x0100), c_uint(2), c_uint(0))
X11.XCloseDisplay(display)

(There is also a C version in the question and answers, but you obviously have to compile it.)

In theory there is probably some way to reset the setxkbmap settings state so that CapsLock is a CapsLock key again (after all, package updates do it), which would have let me directly turn off CapsLock. In practice I couldn't find out how to do this in my flailing Internet searches so I went with the answer I could find. In retrospect I might also have been able to reset settings by unplugging and replugging my USB keyboard or plugging in a second keyboard, and we do have random USB keyboards sitting around in the office.

XTurningOffCapslock written at 23:05:12; Add Comment

2024-05-14

The X Window System and the curse of NumLock

In X, like probably any graphical environment, there are a variety of layers to keys and characters that you type. One of the layers is the input events that the X server sends to applications. As covered in the xlib manual, these contain a keycode, representing the nominal physical key, a keysym, representing what is nominally printed on the key, and a bitmap of the modifiers currently in effect, which are things like 'Shift' or 'Ctrl' (cf). The separation between keycodes and keysyms lets you do things like remap your QWERTY keyboard to Dvorak; you tell X to change what keysyms are generated for a bunch of the keycodes. Programs like GNU Emacs read the state of the modifiers to determine what you've typed (from their perspective), so they can distinguish 'Ctrl-Return' from plain 'Return'.

Ordinary modifiers are normally straightforward, in that they are additional keys that are held down as you type the main key. Control, Shift, and Alt all work this way (by default). However, some modifiers are 'sticky', where you tap their key once to turn them on and then tap their key again to turn them off. The obvious example of this is Caps Lock (unless you turn its effects off, remapping its physical key to be, say, another Ctrl key). Another example, one that many X users have historically wound up quietly cursing, is NumLock. Why people wind up cursing NumLock, and why I have a program to control its state, is because of how X programs (such as window managers) often do their key and mouse button bindings.

(There are also things that will let you make non-sticky modifier keys into sticky keys.)

Suppose, for example, that you have a bunch of custom fvwm mouse bindings that are for things like 'middle mouse button plus Alt', 'middle mouse button plus Shift and Alt', 'plain right mouse button on the root', and so on. Fvwm and most other X programs will normally (have to) interpret this completely literally; when you create a binding for 'middle mouse plus Alt', the state of the current modifiers must be exactly 'Alt' and nothing else. If the X server has NumLock on for some reason (such as you hitting the key on the keyboard), the state of the current modifiers will actually be 'NumLock plus Alt', or 'NumLock plus Alt and Shift', or just 'NumLock' (instead of 'no modifiers in effect'). As a result, fvwm will not match any of your bindings and nothing will happen as you're poking away at your keyboard and your mouse.
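
As a hypothetical sketch of what such fvwm bindings look like (the menu name is invented; the fields are button, context, modifiers, and action, where 'W' is an application window, 'R' is the root window, 'M' is Meta/Alt, 'S' is Shift, and 'N' is no modifiers):

# middle mouse button plus Alt in an application window
Mouse 2 W M Move
# middle mouse button plus Shift and Alt
Mouse 2 W SM Resize
# plain right mouse button on the root window
Mouse 3 R N Menu RootMenu

With NumLock on, the actual modifier state also includes NumLock's modifier (usually Mod2), so none of these exact matches fire.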

Of course, this can also happen with CapsLock, which has the same sticky behavior. But CapsLock has extremely obvious effects when you type ordinary characters in terminal windows, editors, email, and so on, so it generally doesn't take very long before people realize they have CapsLock on. NumLock doesn't normally change the main letters or much of anything else; on some keyboard layouts, it may not change anything you can physically type. As a result, having NumLock on can be all but invisible (or completely invisible on keyboards with no NumLock LED). To make it worse, various things have historically liked 'helpfully' turning NumLock on for you, or starting in a mode with NumLock on.

(X programs can alter the current modifier status, so it's possible for NumLock to get turned on even if there is no NumLock key on your keyboard. The good news is that this also makes it possible to turn it off again. A program can also monitor the state of modifiers, so I believe there are ones that give you virtual LEDs for some combination of CapsLock, ScrollLock, and NumLock.)

So the curse of NumLock in X is that having NumLock on can cause mysterious key binding failures in various programs, while often being more or less invisible. And for X protocol reasons, I believe it's hard for window managers to tell the X server 'ignore NumLock when considering my bindings' (see, for example, the discussion of IgnoreModifiers in the fvwm3 manual).

XNumlockCurse written at 23:15:38; Add Comment

2024-04-25

The importance of an ordinary space in a Unix shell command line

In the sidebar to yesterday's entry I (originally) made a Unix command line mistake by unthinkingly leaving out an ordinary, innocent looking space (it's corrected in the current version of the entry after it was noted by Emilio in a comment). This innocent looking mistake and its consequences are an illustration of something in Unix shell command lines, although I'm not sure of just what, so I'm going to write it up.

The story starts with the general arguments of Bash's 'read' builtin:

read [-ers] [-a aname] [-d delim] [-i text] [-n nchars] [-N nchars] [-p prompt] [-t timeout] [-u fd] [name …]

The 'read' builtin follows the general standard behavior of Unix commands where '-d delim' and other options that take an argument can be shortened to omit the space, so '-ddelim'. So you can write, for example:

echo "a:b:c" | while IFS= read -r -d':' l; do echo "$l"; done

Bash also has a special feature for -d. Normally the first character of delim is taken as the 'line' terminator, but if delim is blank, read will terminate the line when it reads a NUL character (0 byte), which is just what you want to handle the output of, for example, 'find ... -print0'.
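
A typical use of this, writing the blank delimiter as an empty '' argument (more on that in a moment), is safely processing find's NUL-separated output:

find . -type f -print0 | while IFS= read -r -d '' fname; do
    printf '%s\n' "$fname"
done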

The way you create an empty string argument in a Bash command line is to use an empty pair of quotes:

read -r -d '' line

So when I was writing the original command line in yesterday's entry, I absently mashed these two things together in my mind and wrote:

read -r -d'' line

I've used '' to create an empty argument and then I've done the standard thing of removing the space between -d and its argument. So clearly I've given '-d' an empty argument, right? Nope.

In Bash and other conventional shells, '' is nothingness. It only means an argument that is an empty string if it occurs on its own; this is a special interpretation added by the shell, and programs don't actually see the ''s. If you put a '' next to other non-whitespace characters, it disappears in the command line that the program will see. So writing -d'' was the same as writing -d with no argument, and the command line as 'read' would see it was actually:

read -r -d line

Which would have caused 'read' to use 'l' as the line terminator.
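
One way to see what the shell will actually hand to a program is to put the same arguments after printf, which prints each argument it receives separately (the brackets are just there to make the argument boundaries visible):

$ printf '[%s] ' -r -d'' line; echo
[-r] [-d] [line]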

In the process of writing this entry, I realized that there's a more interesting way to make what is fundamentally the same mistake, although it goes deeper into Unix arcana and doesn't look half as obvious. In many modern shells, the Bourne shell included, you can write a NUL character (0 byte) as $'\0'. So you will see people write a 'read with NUL terminated lines' command line as:

IFS= read -r -d $'\0' line

This works fine, and unlike the '' case we obviously have a real argument here, not just an empty argument, so clearly we can shorten this to:

IFS= read -r -d$'\0' line

If you try this you will discover it doesn't work. The fundamental problem is that Unix command line arguments can't include NUL characters, because the Unix command line API passes the arguments as an array of NUL-terminated (C) strings. No matter how you invoke a program, the first NUL character in an argument is the end of that argument from the program's perspective. So although it looked very different as typed, from read's perspective what we did was the same as:

IFS= read -r -d line

(And then it would have the same effect as my mistake.)

PS: This is a little tangled because 'read' is a Bash builtin so in theory Bash doesn't have to stick to the limits of the kernel API, but in practice I think Bash does do so.

ShellImportanceOfASpace written at 23:17:18; Add Comment

2024-04-20

What the original 4.2 BSD csh hashed (which is not what I thought)

Recently, Unix shells keeping track of where they'd found commands came up on the Fediverse again, as it does every so often; for instance, last year I advocated for doing away with the whole thing. As far as I know, (Unix) shell command hashing originated with BSD Unix's csh, which added command hashing and a 'rehash' builtin. However, if you actually read the 4.2 BSD csh(1) manual page, it says something a bit odd (emphasis mine):

rehash: Causes the internal hash table of the contents of the directories in the path variable to be recomputed. This is needed if new commands are added to directories in the path while you are logged in. [...]

The way command hashing typically works in modern shells is that the shell remembers the specific full path to a given command (or sometimes that the command doesn't exist). This is explicitly described in the Bash manual, which says (for example) 'Bash uses a hash table to remember the full pathnames of executable files'. In this case, if you or someone else adds a new command to something in $PATH and you've never run that command before (because it didn't exist before), you're fine and don't need to rehash; your shell will automatically go looking for a new command in $PATH.

It turns out that the 4.2 BSD csh did not hash commands this way. Instead, well, let's quote a comment from sh.exec.c:

Xhash is an array of HSHSIZ chars, which are used to hash execs. If it is allocated, then to tell whether ``name'' is (possibly) present in the i'th component of the variable path, you look at the i'th bit of xhash[hash("name")]. This is setup automatically after .login is executed, and recomputed whenever ``path'' is changed.

To translate that, csh does not 'hash' where commands are found the way modern shells do. Instead of looking up commands and then remembering where it found them, it scans all of the directories on your $PATH and remembers the hash values of the names it saw in each of them. When csh tries to run a command, it gets the hash value of the command name, looks it up in the hash table, and skips all $PATH entries that hash value definitely isn't in. If you run a newly added command, the odds are very low that its name will hash to a hash value that has the right bit set in its hash table entry.

There can be hash value collisions between different command names and if you have more than 8 $PATH entries, more than one entry can set the same bit, so finding a set bit merely means that potentially the command is there. So this is not as good as remembering exactly where the command is, but on the other hand it takes up a lot less memory; the default csh hash size is 511 bytes. It also means that you definitely want to do 'rehash' when you or someone else modifies any directory on your $PATH, because the odds are very high that any new additions won't be properly recognized.

(What 'rehash' does is that it re-runs the code that sets up this hash table, which is also run when $PATH is changed and so on.)
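
To make the mechanism concrete, here is a toy shell model of the scheme. The hash function, the table size, and the details are all invented for illustration; they are not what the real csh used:

#!/bin/bash
# Toy model of 4.2 BSD csh's path hashing. xhash[h] is a bitmask; bit i set
# means 'some name in the i'th $PATH directory hashes to h', so directories
# whose bit is clear can be skipped entirely.

hashname() { printf '%s' "$1" | cksum | awk '{print $1 % 64}'; }

# 'rehash': scan every $PATH directory and record, per hash bucket, a bitmask
# of which directories contain some name hashing to that bucket.
rehash() {
    IFS=: read -ra dirs <<<"$PATH"
    xhash=()
    local i=0 dir name h
    for dir in "${dirs[@]}"; do
        for name in "$dir"/*; do
            [ -e "$name" ] || continue
            h=$(hashname "${name##*/}")
            xhash[h]=$(( ${xhash[h]:-0} | (1 << i) ))
        done
        i=$((i + 1))
    done
}

# Report which $PATH directories *might* contain a command. A clear bit is a
# definite 'not here'; a set bit is only a 'possibly here' (hash collisions).
maybe_in() {
    local h bits i
    h=$(hashname "$1")
    bits=${xhash[h]:-0}
    for ((i = 0; i < ${#dirs[@]}; i++)); do
        (( bits & (1 << i) )) && echo "$1: possibly in ${dirs[i]}"
    done
}

After a 'rehash', 'maybe_in somecommand' will usually report nothing for a command that was added after the table was built, which is the behaviour described above.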

CshWhatItHashed written at 23:38:07; Add Comment

2024-04-09

Bash's sadly flawed smart (programmable) completion

Bash has an interesting and broadly useful feature called 'programmable completion' (this has sort of come up before). Programmable completion makes it possible for Bash to auto-complete things like command line options for the current program for you. Unfortunately one flaw in Bash's programmable completion is that it doesn't understand enough Bash command line syntax and so can get in your way.

Suppose, not hypothetically, that you are typing the following on the Bash command line on a Debian-based Linux system, with the bits in bold being what you typed before you hit tab:

# apt-get install $(grep -v '^#' somefi<TAB>

When you hit TAB to complete the file name, nothing will happen. This is because Bash has been told what the arguments to apt-get are and so 'knows' that they don't include files (this is actually wrong these days, but never mind). Bash isn't smart enough to recognize that by typing '$(' you've started writing a command substitution and are now in a completely different completion context, that of grep, where you definitely should be allowed to complete file names.

Bash could in theory be this smart but there are probably a number of obstacles to doing that in practice. For example, we don't have a well-formed command substitution here, since we haven't typed the closing ')' yet; Bash would effectively have to do something like the sort of on the fly parsing of incomplete code that editor autocompletion does. It's possible that Bash could do better with some heuristics, but the current situation is broadly easy to explain and reason about, even if the result is sometimes frustrating.

There are at least two ways to disable these programmable completions in Bash. You can turn off the feature entirely with 'shopt -u progcomp' or you can flush all of the registered programmable completions with 'complete -r'. In theory you can see the list of current completions with 'complete -p' and then remove just one of them with 'complete -r <name>', but in practice 'complete -p' doesn't always list a program until I've started trying to do completions with it.
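
For instance, to deal with just the apt-get completion from the earlier example (assuming it has actually been loaded by that point):

complete -p apt-get    # show its current completion definition, if any
complete -r apt-get    # remove it, so ordinary filename completion applies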

(Where the shell snippets that define completions go is system dependent, but Linux systems will often put them in '/usr/share/bash-completion/completions'.)

People with better memories than me can also use M-/ instead to force completion of a filename no matter what Bash's programmable completion thinks should go there. M-! will complete command names, M-$ will complete variable names, and M-~ will complete user names. You can find these in the Bash manual page as the various 'complete-*' readline (key binding) command names.

(Another general flaw in programmable completion is that it relies on the people who provide the completion definitions for commands getting it right and they don't always do that, as I've seen in the past.)

BashProgrammableCompletionFlaw written at 23:32:37; Add Comment

2024-03-05

A peculiarity of the X Window System: Windows all the way down

Every window system has windows, as an entity. Usually we think of these as being used for, well, windows and window like things; application windows, those extremely annoying pop-up modal dialogs that are always interrupting you at the wrong time, even perhaps things like pop-up menus. In its original state, X has more windows than that. Part of how and why it does this is that X allows windows to nest inside each other, in a window tree, which you can still see today with 'xwininfo -root -tree'.

One of the reasons that X has copious nested windows is that X was designed with a particular model of writing X programs in mind, and that model made everything into a (nested) window. Seriously, everything. In an old fashioned X application, windows are everywhere. Buttons are windows (or several windows if they're radio buttons or the like), text areas are windows, menu entries are each a window of their own within the window that is the menu, visible containers of things are windows (with more windows nested inside them), and so on.

This copious use of windows allows a lot of things to happen on the server side, because various things (like mouse cursors) are defined on a per-window basis, and also windows can be created with things like server-set borders. So the X server can render sub-window borders to give your buttons an outline and automatically change the cursor when the mouse moves into and out of a sub-window, all without the client having to do anything. And often input events like mouse clicks or keys can be specifically tied to some sub-window, so your program doesn't have to hunt through its widget geometry to figure out what was clicked. There are more tricks; for example, you can get 'enter' and 'leave' events when the mouse enters or leaves a (sub)window, which programs can use to highlight the current thing (ie, subwindow) under the cursor without the full cost of constantly tracking mouse motion and working out what widget is under the cursor every time.

The old, classical X toolkits like Xt and the Athena widget set (Xaw) heavily used this 'tree of nested windows' approach, and you can still see large window trees with 'xwininfo' when you apply it to old applications with lots of visible buttons; one example is 'xfontsel'. Even the venerable xterm normally contains a nested window (for the scrollbar, which I believe it uses partly to automatically change the X cursor when you move the mouse into the scrollbar). However, this doesn't seem to be universal; when I look at one Xaw-based application I have handy, it doesn't seem to use subwindows despite having a list widget of things to click on. Presumably in Xaw and perhaps Xt it depends on what sort of widget you're using, with some widgets using sub-windows and some not. Another program, written using Tk, does use subwindows for its buttons (with them clearly visible in 'xwininfo -tree').
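
If you want to poke at this yourself, something like the following will show the widget tree of a running xfontsel, assuming its window name is still the default 'xfontsel':

xwininfo -name xfontsel -tree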

This approach fell out of favour for various reasons, but certainly one significant one is that it's strongly tied to X's server side rendering. Because these subwindows are 'on top of' their parent (sub)windows, they have to be rendered individually; otherwise they'll cover what was rendered into the parent (and naturally they clip what is rendered to them to their visible boundaries). If you're sending rendering commands to the server, this is just a matter of what windows they're for and what coordinates you draw at, but if you render on the client, you have to ship over a ton of little buffers (one for each sub-window) instead of one big one for your whole window, and in fact you're probably sending extra data (the parts of all of the parent windows that gets covered up by child windows).

So in modern toolkits, the top level window and everything in it is generally only one X window with no nested subwindows, and all buttons and other UI elements are drawn by the client directly into that window (usually with client side drawing). The client itself tracks the mouse pointer and sends 'change the cursors to <X>' requests to the server as the pointer moves in and out of UI elements that should have different mouse cursors, and when it gets events, the client searches its own widget hierarchy to decide what should handle them (possibly including client side window decorations (CSD)).

(I think toolkits may create some invisible sub-windows for event handling reasons. Gnome-terminal and other Gnome applications appear to create a 1x1 sub-window, for example.)

As a side note, another place you can still find this many-window style is in some old fashioned X window managers, such as fvwm. When fvwm puts a frame around a window (such as the ones visible on windows on my desktop), the specific elements of the frame (the title bar, any buttons in the title bar, the side and corner drag-to-resize areas, and so on) are all separate X sub-windows. One thing I believe this is used for is to automatically show an appropriate mouse cursor when the mouse is over the right spot. For example, if your mouse is in the right side 'grab to resize right' border, the mouse cursor changes to show you this.

(The window managers for modern desktops, like Cinnamon, don't handle their window manager decorations like this; they draw everything as decorations and handle the 'widget' nature of title bar buttons and so on internally.)

XWindowsAllTheWayDown written at 21:26:30; Add Comment

2024-03-04

An illustration of how much X cares about memory usage

In a comment on yesterday's entry talking about X's server side graphics rendering, B.Preston mentioned that another reason for this was to conserve memory. This is very true. In general, X is extremely conservative about requiring memory, sometimes to what we now consider extreme lengths, and there are specific protocol features (or limitations) related to this.

The modern approach to multi-window graphics rendering is that each window renders into a buffer that it owns (often with hardware assistance) and then the server composites (appropriate parts of) all of these buffers together to make up the visible screen. Often this compositing is done in hardware, enabling you to spin a cube of desktops and their windows around in real time. One of the things that clients simply don't worry about (at least for their graphics) is what happens when someone else's window is partially or completely on top of their window. From the client's perspective, nothing happens; they keep drawing into their buffer and their buffer is just as it was before, and all of the occlusion and stacking and so on are handled by the composition process.

(In this model, a client program's buffer doesn't normally get changed or taken away behind the client's back, although the client may flip between multiple buffers, only displaying one while completely repainting another.)

The X protocol specifically does not require such memory consuming luxuries as a separate buffer for each window, and early X implementations did not have them. An X server might have only one significant-sized buffer, that being screen memory itself, and X clients drew right on to their portion of the screen (by sending the X server drawing commands, because they didn't have direct access to screen memory). The X server would carefully clip client draw operations to only touch the visible pixels of the client's window. When you moved a window to be on top of part of another window, the X server simply threw away (well, overwrote) the 'under' portion of the other window. When the window on top was moved back away again, the X server mostly dealt with this by sending your client a notification that parts of its window had become visible and the client should repaint them.

(X was far from alone with this model, since at the time almost everyone was facing similar or worse memory constraints.)

The problem with this 'damage and repaint' model is that it can be janky; when a window is moved away, you get an ugly result until the client has had the time to do a redraw, which may take a while. So the X server had some additional protocol level features, called 'backing store' and 'save-under(s)'. If a given X server supported these (and it didn't have to), the client could request (usually during window creation) that the server maintain a copy of the obscured bits of the new window when it was covered by something else ('backing store') and separately that when this window covered part of another window, the obscured parts of that window should be saved ('save-under', which you might set for a transient pop-up window). Even if the server supported these features in general it could specifically stop doing them for you at any time it felt like it, and your client had to cope.

(The X server can also give your window backing store whether or not you asked for it, at its own discretion.)

All of this was to allow an X server to flexibly manage the amount of memory it used on behalf of clients. If an X server had a lot of memory, it could give everything backing store; if it started running short, it could throw some or all of the backing store out and reduce things down to (almost) a model where the major memory use was the screen itself. Even today you can probably arrange to start an X server in a mode where it doesn't have backing store (the '-bs' command line option, cf Xserver(1), which you can try in Xnest or the like today, and also '-wm'). I have a vague memory that back in the day there were serious arguments about whether or not you should disable backing store in order to speed up your X server, although I no longer have any memory about why that would be so (but see).
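
If you want to experiment with this today, a quick sketch (assuming you have Xnest installed and an X session to run it in; the display number is arbitrary):

Xnest :1 -bs &
xdpyinfo -display :1 | grep -i backing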

As far as I know all X servers normally operate with backing store these days. I wouldn't be surprised if some modern X clients would work rather badly if you ran them on an X server that had backing store forced off (much as I suspect that few modern programs will cope well with PseudoColor displays).

PS: Now that I look at 'xdpyinfo', my X server reports 'options: backing-store WHEN MAPPED, save-unders NO'. I suspect that this is a common default, since you don't really need save-unders if everything has backing store enabled when it's visible (well, in X mapped is not quite 'visible', cf, but close enough).

XServerBackingStoreOptional written at 22:02:53; Add Comment
