Wandering Thoughts

2017-03-13

What should it mean for a system call to time out?

I was just reading Evan Klitzke's Unix System Call Timeouts (via), and among my various thoughts about it, one of the things that struck me was a simple question. Namely, what should it mean for a Unix system call to time out?

This question may sound pointlessly philosophical, but it's actually very important because what we expect a system call timeout to mean will make a significant difference in how easy it would be to add system calls with timeouts. So let's sketch out two extreme versions. The first extreme version is that if a timeout occurs, the operation done by the system call is entirely abandoned and undone. For example, if you call rename("a", "b") and the operation times out, the kernel guarantees that the file a has not been renamed to b. This is obviously going to be pretty hard, since the kernel may have to reverse partially complete operations. It's also not always possible, because some operations are genuinely irreversible. If you write() data to a pipe and time out partway through doing so (with some but not all data written), you cannot reach into the pipe and 'unwrite' all of the already sent data; after all, some of it may already have been read by a process on the other side of the pipe.

The second extreme version is that having a system call time out merely causes your process to stop waiting for it to complete, with no effects on the kernel side of things. Effectively, the system call is shunted to a separate thread of control and continues to run; it may complete at some point, or it may error out, but you never have to wait for it to do either. If the system call would normally return a new file descriptor or the like, the new file descriptor will be closed immediately when the system call completes. In practice, implementing a strict version of this would also be relatively hard; you'd need an entire infrastructure for transferring system calls to another kernel context (or more likely, transplanting your user-level process to another kernel context, although that has its own issues). This is also at odds with the existing system calls that take timeouts, which generally result in the operation being abandoned partway through with no guarantees either way about its completion.

(For example, if you make a non-blocking connect() call and then use select() to wait for it with a timeout, the kernel does not guarantee that if the timeout fires the connect() will not be completed. You are in fact in a race between your likely close() of the socket and the connection attempt actually completing.)
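
As a concrete user-level illustration of these 'no guarantees' semantics, here is a minimal Go sketch (the target address is just a placeholder). Even when DialTimeout gives up, the kernel may already have completed the TCP handshake or may still complete it; it's the close of the socket that actually tears the connection down, and the remote end may briefly see an established connection:

package main

import (
    "log"
    "net"
    "time"
)

func main() {
    // "example.com:80" is a placeholder target.
    conn, err := net.DialTimeout("tcp", "example.com:80", 100*time.Millisecond)
    if err != nil {
        // We have stopped waiting, but the connection attempt may
        // still have completed (or yet complete) behind our back.
        log.Fatalf("connect failed or timed out: %v", err)
    }
    defer conn.Close()
    log.Println("connected to", conn.RemoteAddr())
}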

The easiest thing to implement would probably be a middle version. If a timeout happens, control returns to your user level with a timeout indication, but the operation may be partially complete and it may be either abandoned in the middle of things or completed for you behind your back. This satisfies a desire to be able to bound the time you wait for system calls to complete, but it does leave you with a messy situation where you don't know either what has happened or what will happen when a timeout occurs. If your mkdir() times out, the directory may or may not exist when you look for it, and it may or may not come into existence later on.
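
You can approximate this middle version today entirely at user level. Here's a hedged Go sketch (mkdirTimeout is my name, not a real API): we stop waiting after the timeout, but the mkdir itself keeps going in its goroutine and may still succeed behind our back:

package main

import (
    "errors"
    "fmt"
    "os"
    "time"
)

// mkdirTimeout stops waiting for the mkdir after d, but cannot abandon
// or undo the operation itself; the directory may still appear later.
func mkdirTimeout(path string, d time.Duration) error {
    done := make(chan error, 1) // buffered so the goroutine never blocks forever
    go func() { done <- os.Mkdir(path, 0755) }()
    select {
    case err := <-done:
        return err
    case <-time.After(d):
        return errors.New("mkdir timed out (it may still complete)")
    }
}

func main() {
    fmt.Println(mkdirTimeout("/tmp/testdir", 100*time.Millisecond))
}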

(Implementing timeouts in the kernel is difficult for the same reason that asynchronous IO is hard; there is a lot of kernel code that is much simpler if it's written in straight line form, where it doesn't have to worry about abandoning things part way through at essentially any point where it may have to wait for the outside world.)

SystemCallTimeoutMeaning written at 01:03:40

2017-03-06

Modern X Windows can be a very complicated environment

I mentioned Corebird, the GTK+ Twitter client, the other day, and was generally positive about it. That was on a logical weekend. The next day I went in to the office, set up Corebird there, and promptly ran into a problem: I couldn't click on links in Tweets, or rather I could but it didn't activate the link (it would often do other things). Corebird wasn't ignoring left mouse clicks in general, it's just that they wouldn't activate links. I had not had this problem at home (or my views would not have been so positive). I use basically the same fvwm-based window manager environment at home and at work, but since Corebird is a GTK+ application and GTK+ applications can be influenced by all sorts of magic settings and (Gnome) settings daemons, I assumed that it was something subtle that was different in my work GTK+/Gnome environment and filed a Fedora bug in vague hopes. To my surprise, it turned out to be not merely specific to fvwm, but specific to one aspect of my particular fvwm mouse configuration.

The full version is in the thread on the fvwm mailing list, but the short version is this: normally, when you click and release a mouse button, the X server generates two events, a ButtonPress and then a ButtonRelease. However, if fvwm is configured in a way such that it might need to do something with a left button press, a different set of events is generated:

  • a LeaveNotify with mode NotifyGrab, to tell Corebird that the mouse pointer had been grabbed away from it (by fvwm).
  • an EnterNotify with mode NotifyUngrab, to tell Corebird 'here is your mouse pointer back because the grab has been released' (because fvwm was passing the button press through to Corebird).
  • the ButtonPress for the mouse button.

The root issue appears to be that something in the depths of GTK+ takes the LeaveNotify to mean that the link has lost focus. Since GTK+ doesn't think the link is focused, when it receives the mouse click it doesn't activate the link, but it does take other action, since it apparently still understands that the mouse is being clicked in the text of the GtkLabel involved.

(There's a test program that uses a simple GtkLabel to demonstrate this, see this, and apparently there are other anomalies in GtkLabel's input processing in this area.)

If you think that this all sounds very complex, yes, exactly. It is. X has a complicated event model to start with, and then interactions with the window manager add extra peculiarities on top. The GTK+ libraries are probably strictly speaking in the wrong here, but I also rather suspect that this is a corner case that the GTK+ programmers never imagined, much less encountered. In a complex environment, some possibilities will drop through the cracks.

(If you want to read a high level overview of passive and active (mouse button) grabs, see eg this 2010 writeup by Peter Hutterer. Having read it, I feel like I understand a bit more about what fvwm is doing here.)

By the way, some of this complexity is an artifact of the state of computing when X was created, specifically that both computers and networking were slow. Life would be simpler for everyone if all X events were routed through the window manager and then the window manager passed them on to client programs as appropriate. However, this would require all events to pass through an extra process (and possibly an extra one or two network hops), and in the days when X was young this could have had a real impact on overall responsiveness. So X goes to a great deal of effort to deliver events directly to programs whenever possible while still allowing the window manager to step in.

(My understanding is that in Wayland, the compositor handles all events and passes them to clients as it decides. The Wayland compositor is a lot more than just the equivalent of an X window manager, but it fills that role, and so in Wayland this issue wouldn't come up.)

ModernXCanBeVeryComplex written at 22:39:05

2017-03-01

Cheap concurrency is an illusion (at least on Unix)

Recently I wound up reading this article (via), which contains the following:

[...] this memo assumes that there already exists an efficient concurrency implementation where forking a new lightweight process takes at most hundreds of nanoseconds and context switch takes tens of nanoseconds. Note that there are already such concurrency systems deployed in the wild. One well-known example are Golang's goroutines but there are others available as well.

When designing APIs and similar things, it is quite important to understand that extremely lightweight processes are an illusion. I don't mean that in the sense that they aren't actually lightweight in practice (although you probably want to pay attention to CPU cache effects here). I mean that in the sense that they don't actually exist and their nonexistence has real consequences.

All 'lightweight processes' on Unix are some form of what is known as 'green threads', which is to say that they exist purely at the user level. They are extremely cheap to create and switch to because all that the user level has to do here is shuffle some registers around or mark some memory as allocated for the initial stack. But the moment that you have to materialize a kernel entity to back a lightweight process (perhaps because it is about to do a blocking operation), things become much more expensive.

The reality is that there is no such thing as a genuinely lightweight kernel process, at least not in Unixes. Kernel processes have associated data structures and not insignificant kernel stacks, and they take involved locks to maintain reference counts on virtual memory areas and so on. Kernel processes are certainly a lot more lightweight these days than they were ten or twenty years ago, sufficiently so that POSIX threads are mostly 1:1 user to kernel threads because it's simpler, but they don't even start to approach being lightweight enough for you to have as many of them as you can have, say, goroutines in a Go program. Systems like Go engage in a very careful behind-the-scenes dance in order to multiplex all of those goroutines onto many fewer kernel processes.

(One reason kernel processes need not insignificant kernel stacks is that Unix kernels are written in C, and C does not like having its stack relocated around in order to resize it. Languages like Go go to a great deal of effort in their compiler and runtime to make this work (and this can be alarmingly complicated).)
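
You can watch this multiplexing from the outside with a quick Go sketch: park a very large number of goroutines and the process still has only a handful of kernel threads (on Linux, look at the Threads: line in /proc/<pid>/status while this sleeps):

package main

import (
    "fmt"
    "os"
    "runtime"
    "time"
)

func main() {
    for i := 0; i < 100000; i++ {
        go func() { select {} }() // parked forever in the Go scheduler, not the kernel
    }
    fmt.Println("pid:", os.Getpid(), "goroutines:", runtime.NumGoroutine())
    time.Sleep(time.Minute) // meanwhile, inspect the process's kernel thread count
}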

The consequence of this is that if you want to do a lot of concurrent things at once, at the level of the API to the kernel you can't do these things in a 1 to 1 relationship between a single thing and a kernel thread. If you try to have a 1:1 relationship, your system will explode under load with too many relatively expensive kernel threads. Instead you really must have some form of aggregation in the userspace-to-kernel API, regardless of whether or not you expose it to people.

This doesn't mean that the illusion of lightweight processes and cheap concurrency doesn't work or isn't useful. Usually it does work and is useful, partly because people find it much easier to reason about the behavior of straight-line code than they do about the behavior of state machines or callbacks. But it's an abstraction, with all that that implies, and if you are designing APIs you need to think about how they will be implemented at the level where this illusion is not true. Otherwise you may wind up designing APIs that promise things that you can't actually deliver.

(This holds true for more than just user to kernel APIs. If you design a Go based system where you appear to promise that a ton of goroutines can all do file IO all at once, you are probably going to have pain because in most environments each separate bit of file IO will require its own kernel thread. The Go runtime may do its best to deliver this, but you probably won't like what happens to your machine when a few thousand goroutines are busy trying to do their theoretically lightweight file IO.)
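
The usual mitigation is to impose the aggregation yourself. A common sketch of this (the limit of 64 here is an arbitrary assumption) is a buffered channel used as a counting semaphore, so that no matter how many goroutines want to do file IO, only a bounded number are actually blocked in the kernel at once:

package main

import (
    "fmt"
    "os"
    "sync"
)

// ioSlots is a counting semaphore: at most 64 goroutines may hold a
// slot at once, so the runtime needs at most about 64 kernel threads
// for blocking file IO no matter how many goroutines pile up behind it.
var ioSlots = make(chan struct{}, 64)

func readFileThrottled(path string) ([]byte, error) {
    ioSlots <- struct{}{}        // acquire a slot (cheap user-level blocking)
    defer func() { <-ioSlots }() // release it when the IO is done
    return os.ReadFile(path)
}

func main() {
    var wg sync.WaitGroup
    for i := 0; i < 10000; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            if _, err := readFileThrottled("/etc/hosts"); err != nil {
                fmt.Println(err)
            }
        }()
    }
    wg.Wait()
}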

CheapConcurrencyIllusion written at 21:46:41

2017-02-04

My views on X (Windows)

One of the famous quotes about the X Windows System is this one:

Sometimes when you fill a vacuum, it still sucks. - Rob Pike

The first thing to say here is that it is not quite the case that X filled a vacuum and so succeeded by default. Certainly X was the first cross-Unix window system, but Unix workstations existed before X, so of course various Unix vendors had to come up with GUI environments and window systems for them like SunView. However, people do not speak entirely favourably about those systems.

Early Unix window systems were not designed by idiots (despite what some commentary might say); they were designed by dedicated, smart engineers who were doing the best job that they could at the time and on the machines that they had. It's just that people didn't yet know enough about window systems to see what was a good or a bad idea, what sort of APIs worked, and so on, and it didn't help that early machines and Unixes were very limited. Mistakes were made; indeed, mistakes were inevitable. So the early window systems were rather far from ideal and once the dust settled, not entirely appealing.

This is part of the vacuum that X filled (although not all of it). X came along at the right time to learn from prior experience (and evolved a couple of times without having to worry much about backwards compatibility issues), and the result was, well, quite New Jersey. It worked, and through at least the early 1990s, when it didn't work people bashed on it until it did. Generally it even worked better than its competition at the time or delivered compelling additional features, or both (although this is not the only reason it succeeded so well).

At the same time, X has never been elegant. In this it partakes of the spirit of (V7) Unix; not the side that gave us the clean Unix design ideas, but the side that gave us a kernel where active processes were just put in a fixed-size array (that was walked with for loops):

struct proc proc[NPROC];

Why not? It worked well enough, and V7 was a New Jersey approach work in progress.

That X is not elegant or simple or very Unixy is what Rob Pike meant when he said that it sucked. X is a generally unattractive beast under the hood, full of protocol and API complexity, with various messy features and any number of omissions of things that people would really like to be covered (some of which were later sort of added). It's not as policy-free as it claims to be, and anyways being policy-free is ducking various important issues (especially if you want disparate clients to genuinely interoperate). But X did in a sense fill a vacuum and more broadly it definitely filled a need, and doing so was (and is) important. The logic of X existing is inexorable and it's a clear improvement over what came before. For all of X's warts, no one really wants to go back to SunView or the other pre-X options.

These days, my own reaction to X's warts is basically to shrug. Given that it's the best option I have in practice, well, it works well enough; in fact, it generally works pretty great. I'm aware that a great deal of sweat and irritation is being expended behind the scenes to make that happen and every so often I stub my toe on a rough edge of modern X, but for the most part I can ignore all of that. I would like it to be more elegant (or to be magically replaced by something that was), but it's not something I'm passionate about. I am passionate about not going back to a long catalog of not-as-nice window systems that I have used in the past, sometimes out of necessity. And yes, that even includes cool ones like the Blit.

(I'm not convinced we know enough to do a window system design that's significantly more elegant than X and still has X's virtues, including working well across the network. I guess Wayland may give us a test of this. There was NeWS, always the great white hope of Unix window systems, but I think it had the wrong core design and anyways who knows what would have happened to its initial elegance after ten or twenty years of being adapted to the harsh realities of general use.)

PS: Yes, X is not supposed to be called 'X Windows'. Sorry, that ship sailed years ago; in practice, 'X Windows' is a perfectly widely accepted name for it in situations where you don't feel like spelling out the full name but 'X' or 'X11' is too short.

XWindowsViews written at 01:50:21

2017-02-03

What it's sensible to use a bunch of Unix swap space for

I've long written that you don't want too much swap space because if you try to actively use more RAM than your machine actually has, swap space basically just gives you more rope. Your machine is not going to perform very well (or at all) while things are busy frantically paging memory in and out, so it's usually better to just have things fail immediately. But there are sensible uses for decent amounts of swap space, even if they're relatively rare, and today I feel like trying to run them down.

First off, you almost always want a bit of swap space configured even if you never plan to use it, generally something on the order of a few hundred megabytes up to a gigabyte. The sad reality of life is that many Unix kernels contain code that simply assumes you have some amount of swap space, even just a bit (partly because very few kernel developers even try to test with no swap set up). If you have no swap space configured at all, this code can malfunction in various unhappy ways. Giving your system 256 MB or 512 MB of swap is a small price to pay to avoid running into these corner cases.

But that's just a little space, not a bunch of it. So here are some uses for an appreciable amount of swap space that I can think of:

  • Hibernating a laptop or other machine that can do this (under some configurations).

  • If you have a memory-based tmpfs /tmp or other scratch directory, which is increasingly popular and common (although I think it's a bad idea). If you have one and expect to use much space in it while under overall memory pressure, you probably want as much swap space as you ever expect something to use in your /tmp, and then probably some more for insurance.

    (Unfortunately there are likely to be a lot of programs that will write large files to /tmp under the right circumstances, especially if they get fed unusually large inputs. This is one reason I don't like tmpfs /tmp.)

  • If you have a system that insists on some relatively strong form of 'strict overcommit', where it wants to reserve RAM or swap space for almost all of the memory that your programs may potentially use, and you have programs that allocate a lot of memory that they don't touch. Here you're using a bunch of swap space to basically trick the system's memory accounting, so that it's happy letting programs allocate memory they'll never use.

  • If you have mostly inactive programs that use an appreciable amount of memory plus programs with occasional (or periodic) short term spike demands for most or almost all of the memory on the machine. With sufficient swap space, the spike demand will push everything else out to swap, run to completion, and then everything else will slowly wake up and page back in. You won't enjoy the system having to page back in a few gigabytes of memory, but it probably beats the alternatives (including splitting things out to separate machines).

    (Bonus points are awarded here if you have a scheduling system that actively SIGSTOPs or otherwise totally suspends the lower priority programs so that they don't even try to run during the demand spike.)

  • If you have programs with (significant) memory leaks and what they lose track of is dirty memory (as opposed to clean memory that they allocated but never touched). As leaked memory, the program is basically never going to touch it again, but as dirty memory it can only live in RAM or swap and clearly you'd rather have it live in swap.

  • If you have programs that use a lot of memory but touch most of it only very rarely, very slowly, or both. Unlike leaked memory, the program will look at this memory again at some point, but in the meantime it wastes your valuable RAM, so you would rather page it out to swap space for now and then pay the performance impact of swapping it back in later.

    (At this point you're teetering on the edge. If you've misjudged how much memory the program will want to look at how fast, you can shove your overall system right into memory starvation and swap death.)

  • If the most important thing is that the system not crash, even if it's not doing very much else besides swapping madly. Such systems are probably doing their important work either in the kernel (using reserved memory) or in programs that are locked into memory, so that they keep going even while the rest of the system is more or less locked in swap death. This may or may not work very well in practice; among other issues, kernels often want memory themselves and so may get entangled in what theoretically is only 'unimportant' user space swap death.

As is probably clear from how I described these things, the further down the list you go, the more dubious I get about how wise these hacks are (and I maintain that most of them are hacks). Generally you should have pretty strong confidence that you know exactly what your overall system is going to do (and why).

(You can also be desperate and hoping that adding more swap space will let one of these cases limp along. If so, I hope that you have good monitoring so you can reboot entire machines if or when they fall over.)

SwapLotsWhatFor written at 00:51:47

2017-01-29

How Unix erases things when you type a backspace while entering text

Yesterday I mentioned in passing that printing a DEL character doesn't actually erase anything. This raises an interesting question, because when you're typing something into a Unix system and hit your backspace key, Unix sure erases the last character that you entered. So how is it doing that?

The answer turns out to be basically what you'd expect, although the actual implementation rapidly gets complex. When you hit backspace, the kernel tty line discipline rubs out your previous character by printing (in the simple case) Ctrl-H, a space, and then another Ctrl-H.

(In Unix kernel source code you'll generally see this written not with the raw byte value but with the C escape sequence for Ctrl-H, \b. Printing Ctrl-H is not the only use for \b, but it's certainly one reason it's one of the few control characters with its own C escape sequence (cf).)
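
You can reproduce the kernel's rubout sequence from user level; here is a minimal Go sketch of the same '\b \b' idea (run it in a terminal to watch the 'c' disappear):

package main

import (
    "fmt"
    "time"
)

func main() {
    fmt.Print("abc")
    time.Sleep(time.Second)
    fmt.Print("\b \b") // back up, overwrite the 'c' with a space, back up again
    time.Sleep(time.Second)
    fmt.Println()
}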

Of course just backing up one character is not always the correct way of erasing input, and that's when it gets complicated for the kernel. To start with we have tabs, because when you (the user) backspace over a tab you want the cursor to jump all the way back, not just move back one space. The kernel has a certain amount of code to work out what column it thinks you're on and then back up an appropriate number of spaces with Ctrl-Hs.

(By the way, the kernel assumes that tabstops are every 8 characters. I'm not sure any Unix version lets you change this with stty or the equivalent.)
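
The column arithmetic itself is simple; a small Go sketch of it, assuming the same fixed 8-column tabstops:

package main

import (
    "fmt"
    "strings"
)

// tabWidth is how many columns a tab occupies when typed with the
// cursor at column col (0-based), assuming tabstops every 8 columns.
func tabWidth(col int) int {
    return 8 - col%8
}

func main() {
    // Erasing a tab typed at column 3 takes 5 backspaces.
    fmt.Printf("%q\n", strings.Repeat("\b", tabWidth(3)))
}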

Then we have the case when you quoted a control character while entering it, eg by typing Ctrl-V Ctrl-H; this causes the kernel to print the Ctrl-H instead of acting on it, and it prints it as the two character sequence ^H. When you hit backspace to erase that, of course you want both (printed) characters to be rubbed out, not just the 'H'. So the kernel needs to keep track of that and rub out two characters instead of just one.

A final complication for some kernels is multibyte characters with a display width bigger than one (yes, really, some kernels try to handle this). These kernels get to go through interesting gyrations; you can see an example in Illumos's ldterm.c in the ldterm_csi_erase function.

(FreeBSD also handles backspacing a space specially, because you don't need to actually rub that out with a '\b \b' sequence; you can just print a plain \b. Other kernels don't seem to bother with this optimization. The FreeBSD code for this is in sys/kern/tty_ttydisc.c in the ttydisc_rubchar function.)

PS: If you want to see the kernel's handling of backspace in action, you usually can't test it at your shell prompt, because you're almost certainly using a shell that supports command line editing and readline and so on. Command line editing requires taking over input processing from the kernel, and so such shells are handling everything themselves. My usual way to see what the kernel is doing is to run 'cat >/dev/null' and then type away.

HowUnixBackspaces written at 02:13:44

2017-01-28

What we still use ASCII CR for today (on Unix)

I recently read Things Every Hacker Once Knew, which is really mostly about the somewhat less grand topic of ASCII, RS-232, and serial terminals (via, and also). Part of the article is a writeup of all of the ASCII control characters, covering their original purposes and what they're still used for today, if anything. It has this to say about CR (aka Ctrl-M, C's \r, decimal byte 13 (hex 0x0d, octal 015)):

CR (Carriage Return)
It is now possible that the reader has never seen a typewriter, so this needs explanation: "carriage return" is the operation of moving your print head or cursor to the left margin. Windows, other non-Unix operating systems, and some Internet protocols (such as SMTP) tend to use CR-LF as a line terminator, rather than bare LF. Pre-Unix MacOS used a bare CR.

This description may sound like CR is no longer used on Unix, except as part of being carefully compatible with old protocols like SMTP and newer ones like HTTP. This is misleading, because CR is still in active use on Unix today.

(Sadly, HTTP and other new(ish) protocols continue to specify that 'lines' in the protocol are terminated with CR LF instead of plain LF. This is generally an annoying mistake that simply complicates everyone's life, but that's another rant.)

You see, printing a CR has an extremely useful property: it painlessly resets the cursor to the start of the line but doesn't advance it to the next line. So if you print something without a newline, print a CR, and then carefully print again just so, you will overwrite your original output with new output on the same line. This is the traditional and frequently used low-rent way of creating a constantly updated line for program progress, the current status of something, or basically anything where you want frequent updates but not to scroll things madly the way you would if you printed each update as a new line.
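
Here's a minimal Go sketch of the CR trick; the %3d padding keeps every status the same width so that a longer previous status never leaves stray characters behind:

package main

import (
    "fmt"
    "time"
)

func main() {
    for i := 0; i <= 100; i += 10 {
        // \r returns the cursor to column 0 without scrolling, so the
        // next status overwrites the previous one in place.
        fmt.Printf("\rprogress: %3d%%", i)
        time.Sleep(200 * time.Millisecond)
    }
    fmt.Println() // finally move off the status line for good
}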

Of course this has limits. The big limit is that what you want to print and over-print can't be longer than one row in your (emulated) terminal. If your terminal is, say, 60 columns wide and you print a 70-character status, the last ten characters or so will overflow onto the next physical line, along with the cursor, and then your CR will only return the cursor to the start of that second line. If you write another 70-character status update, you'll advance yet another line, and so on.

(Dealing with multi-line status updates requires going to full cursor addressing using curses(3) or the like. This is a lot more complicated, which is why people really like to stick to just printing CRs for as long as possible and thus why some things will explode if you run them in too-narrow terminals or resize the terminal on them as they're running.)

As a side note, this isn't the only way to do same-line status updates; you can also backspace over what you've printed by printing some number of Ctrl-Hs. The Ctrl-H trick tends to be what gets used if you just want to update a bit of status at the end of a line, eg updating the percentage in a message like 'current progress on frobnicating things: XX%'. The CR trick usually gets used when counting how many Ctrl-Hs to print (and printing them all) gets annoying. However, Ctrl-H has a quiet advantage; it often does a better job of handling overly-long status lines, because in many (emulated) terminals enough Ctrl-Hs will back up to previous lines. If you print 90 normal characters and then 90 Ctrl-Hs, you usually wind up with the cursor where you started no matter what the width of the terminal is.
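
A matching Go sketch of the Ctrl-H version, updating just the percentage at the end of the line (the %02d keeps the field a fixed two characters wide so that three backspaces always line up):

package main

import (
    "fmt"
    "time"
)

func main() {
    fmt.Print("current progress on frobnicating things: 00%")
    for i := 0; i <= 99; i++ {
        fmt.Printf("\b\b\b%02d%%", i) // back up over 'NN%' and rewrite it
        time.Sleep(50 * time.Millisecond)
    }
    fmt.Println()
}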

(Reading the description of DEL in the article might make you think you could print DELs instead of BSs, with the extra advantage that this would not merely move the cursor back but also erase that pesky existing status for you. In practice (emulated) terminals generally don't respond at all to having DEL printed out to them; it gets ignored and does nothing.)

CarriageReturnWhatFor written at 00:16:58

2017-01-20

Why tiling window managers are not really for me (on the desktop)

For years now, the Unix geek thing to do has been to use a tiling window manager. There are a relative ton of them, and many of them promise the twin goals of simplicity and (full) keyboard control out of the box. However, I've never been particularly interested in any of them, despite my broad view that fvwm is a fairly complex window manager as these things go and definitely shows its age in some ways.

The short version of a large part of why I don't feel attracted to tiling window managers is that I generally like empty space on my desktop. Tiling window managers seem to generally be built around the idea that you want to fill everything up to make maximal use of your screen real estate. I don't feel this way; unless I'm doing a lot, I actively want there to be empty space on my screen so that my overall environment feels uncluttered. If I only need one or two terminal windows active at the moment, I don't see any reason to have them take up the entire screen.

(Related to this is that I sometimes deliberately overlap windows in order to put certain windows in what I consider good or correct positions for what I'm doing without forcing others to shrink.)

I'm sure that at least some tiling window managers can be taught to leave empty space if you want it and not force the screen to be filled with windows all the time. And probably some can have overlapping, partially occluded windows as well. But it's not how people usually talk about using tiling window managers and so I've wound up feeling that it's not an entirely natural thing for them. I'd rather not try to force a new window manager to operate in a way that it's not really built for when I already have a perfectly good non-tiling window manager.

(There's also the issue of my use of spatial memory for finding windows, both active and especially inactive, which I also have the impression that tiling window managers are not so hot on.)

At the same time, I've seen tiling window layouts work very well in some circumstances when the layout is sufficiently smart; Rob Pike's Acme is the poster child for this for me. There are certainly situations where I would like to switch over to an Acme-style tiled approach to window layout (probably usually when my screen gets busy and cluttered with active windows). It's just that I don't want to live in that world all of the time (and, to be honest, there are issues with xterm that make it annoying to keep changing its width all the time).

It's a pity, though. Some of those tiling window managers definitely do look cool and sound interesting.

PS: All of this is relatively specific to my desktop, where I have a physically large display and so it's frequently uncluttered, with room to spare.

(I'm not particularly attracted by 'control everything from the keyboard', either. I like using the mouse for things it's good at.)

TilingWMNotReallyForMe written at 20:38:42

2017-01-09

Making modern FreeType-using versions of xterm display CJK characters

For a long time, my setup of xterm has not displayed Chinese and Japanese characters (or Korean ones, although I encounter those less often). Until recently it displayed the Unicode 'no such character' empty box in their place, which was okay and told me that there were problems, but after my upgrade to Fedora 25 it started showing spaces instead (or at least some form of whitespace). This is just enough extra irritation that I've been pushed into figuring out how to fix it.

I switched my xterm setup from old style bitmapped fonts to new style XFT/FreeType several years ago. It turns out that enabling CJK fonts in this environment is actually quite simple, as I found out from the Arch Linux wiki. All you need to do is to tell xterm what font to use for these characters with either the -fd command line argument or the faceNameDoublesize X resource (I recommend the latter, unless you already have a frontend script for xterm).
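
In X resources terms that can look like the following; the specific font names here are just plausible examples of the sort of thing you might end up with (finding an actual font is the next problem), not recommendations:

xterm*faceName: DejaVu Sans Mono:size=12
xterm*faceNameDoublesize: Noto Sans Mono CJK JP:size=12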

Well, that elides a small but important detail, namely finding such a font. Modern fonts tend to have a lot more glyphs and language coverage than old fonts did, but common fonts like the monospaced font I'm using for xterm don't go quite as far as covering the CJK glyphs; instead this seems to be reserved for special fonts with extended ranges. Sophisticated systems like Gnome come magically set up to pick the right font(s) in gnome-terminal, but in xterm we're on our own to dig up a suitable font and I'm not quite sure what the right way to do that is.

As far as I know, fontconfig can be used to show us a list of fonts that claim to support a language, for example with 'fc-list :lang=zh-cn family'. A full list of things you can query for is here, and a more useful query may be 'fc-list :weight=medium:lang=zh-cn family', which excludes all of the bold and italic and so on versions.

(It looks like you can find out fonts that include a specific Unicode character by querying for 'charset=<hex codepoint>'.)

What I don't know is whether xterm requires its CJK font to be monospaced (I suspect it does if you want completely correct rendering) and if so, how you tell if any specific font is monospaced in its CJK glyphs. When I ask for 'fc-list :lang=zh-cn:spacing=mono', I get no fonts, although there are CJK fonts with 'Mono' in their names on my system and I'm using one of them in xterm without explosions so far. It may be that CJK fonts with 'Mono' in their name are monospaced in their CJK glyphs even if they are not monospaced in all glyphs. But then there is eevee's exploration into fontconfig, which suggests that 'monospace' in fontconfig is actually kind of arbitrary anyways.

(The other thing I don't know how to do for xterm is set things up if you need multiple fonts in order to get full coverage of the CJK glyphs, possibly in a genuinely monospaced font. This is especially interesting because Google's Noto Sans fonts have a collection of 'Noto Sans Mono CJK <language>' fonts. There appears to be overlap between them, but it's not clear if you need to stitch up one (double-width) font for xterm out of them all or some subset.)

XTermFreeTypeCJKFonts written at 01:26:03

2016-12-30

A few useful standard readline bindings

One day, I'm not sure exactly when, I stumbled over the standard readline binding of M-. for 'yank-last-arg'. The description you can find in your local readline manpage may be a little abstract, so here's the simplified version: M-. inserts whatever was the last word on the previous command line into your current command line.

Did you perhaps type:

$ ls *some*[Cc]omplex*pattern

and now you want to grep through files matching that pattern? Easy; type 'grep thing M-.' and you'll wind up with 'grep thing *some*[Cc]omplex*pattern'.

(You can also use M-_, according to the manpage.)

This turns out to not be the only potentially useful standard readline binding that I wasn't aware of, so here's my current little collection.

  • M-C-y is 'yank first argument'. With an argument, both it and M-. are 'yank nth argument', but I'm not sure I'd ever do that instead of some form of line editing.

    (Providing numeric arguments is a bit awkward in readline and we're into the same 'counting words' problem territory that I have with Vi. It's mentally easier to reshape a command line than count out that I want the third, fourth, and sixth argument.)

  • C-_ or C-x C-u is incremental undo.
  • M-r reverts the line to its original state. This is probably mostly useful if I've accidentally modified a line in the history buffer and want to discard those changes, restoring it to its pristine, as-originally-typed state.

In Bash specifically (although not necessarily other things that use readline), you can force certain completions regardless of bad attempts to be clever. These are:

  • M-/ for filename completion
  • M-! for command name completion
  • M-$ for variable name completion
  • M-@ for hostname completion
  • M-~ for user name completion

All have C-x <char> versions that list all the possible completions. The keys for these are reasonably related to what you're forcing a completion for, so I have at least some chance of remembering them in the future.

(There are probably other readline bindings that are useful, and for that matter you can rebind things with a .inputrc. Advanced users of readline probably do all sorts of things there, especially with conditional constructs and variables and so on.)

ReadlineUsefulBindings written at 03:14:31
