Wandering Thoughts archives

2016-04-14

Unix's file durability problem

The core Unix API is overall a reasonably well put together programming environment, one where you can do what you need and your questions have straightforward answers. It's not complete by any means and some of the practical edges are rough as a result of that, but the basics are solid. Well. Most of the basics.

One area where the Unix API really falls down is the simple question of how to make your file writes durable. Unix will famously hold your writes in RAM for an arbitrary length of time in the interests of performance. Often this is not quite what you want, as there are plenty of files that you very much want to survive a power loss, an abrupt system crash, or the like. Unfortunately, how you make Unix put your writes on disk is what can charitably be called 'underspecified'. The uncharitable would call it a swamp.

The current state of affairs is that it's rather difficult to know how to reliably and portably flush data to disk. Both superstition and uncertainty abound. Do you fsync() or fdatasync() the file? Do you need to fsync() the directory? Are there any extra steps? Do you maybe need to fsync() the parent of the directory too? Who knows for sure.
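To make the question concrete, here's a minimal C sketch of the 'belt and suspenders' sequence that people commonly recommend: write the data, fsync() the file, and then open and fsync() the directory the file lives in. The function here is my own illustration, and whether all of these steps are necessary (or sufficient) on any given Unix, kernel, and filesystem is exactly the underspecified part.

/* A sketch of the commonly recommended sequence for making a newly
   written file durable: fsync() the file, then fsync() its directory.
   This is an illustration, not a guarantee. */
#include <fcntl.h>
#include <unistd.h>

int durable_write(const char *path, const char *dirpath,
                  const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (fd < 0)
        return -1;
    if (write(fd, buf, len) != (ssize_t)len || fsync(fd) < 0) {
        close(fd);
        return -1;
    }
    if (close(fd) < 0)
        return -1;

    /* fsync() the directory so that the file's name is (hopefully)
       durable too. */
    int dirfd = open(dirpath, O_RDONLY);
    if (dirfd < 0)
        return -1;
    int r = fsync(dirfd);
    close(dirfd);
    return r;
}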

One issue is that, unlike many other Unix API issues, it's impossible to test whether you got it all correct and complete. If your steps are incomplete, you don't get any errors; your data is just silently at risk some of the time. Even with a test setup to create system crashes or abrupt power loss (which VMs make much easier), you need uncommon instrumentation to know things like whether your OS actually issued disk flushes or just did normal buffered writes. And straightforward testing can't tell you whether what you're doing will work all the time, because what is required varies by Unix, kernel version, and the specific filesystem involved.

Part of the problem is that any number of filesystem authors have taken advantage of POSIX's weak wording, and of the fact that nothing usually goes wrong, in order to make their filesystems perform faster (most of the time). It's clear why they do this: the standard is underspecified, people benchmark filesystems against each other and reward the fastest ones, and testing actual durability is fiendishly hard, so no one bothers. When actual users lose data, filesystem authors have historically behaved a great deal like the implementors of C compiler optimizations: they find some wording that justifies their practice of not flushing, explain how it makes their filesystem faster for almost everyone, and then blame the software authors for not doing the right magic steps to propitiate the filesystem.

(How people are supposed to know what the right steps are is left carefully out of scope for filesystem authors. That's someone else's job.)

This issue is not unsolvable at a technical level, but it probably is at a political level. Someone would have to determine and write up what is good enough now (on sane setups), and then Unix kernel people would have to say 'enough, we are not accepting changes that break this de facto standard'. You might even get this into the Single Unix Specification in some form if you tried hard, because I really do think there's a need here.

I'll admit that one reason I'm unusually grumpy about this is that I feel rather unhappy not knowing what I need to do to safeguard data that I care about. I could do my best, write code in accordance with my best understanding, and still lose data in a crash because I'd missed some corner case or some new additional requirement that filesystem people have introduced. Just the thought of it is alarming. And of course at the same time I'm selfish, because I want my filesystem activity to go as fast as it can and I'm not going to do 'crazy' things like force lots of IO to be synchronous. In this I'm implicitly one of the people pushing filesystem implementors to find those tricks that I wind up ranting about later.

FileSyncProblem written at 00:36:46; Add Comment

2016-04-06

What is behind Unix's 'Text file is busy' error

Perhaps you have seen this somewhat odd Unix error before:

# cp prog /usr/local/bin/prog
cp: cannot create regular file '/usr/local/bin/prog': Text file is busy

This is not just an unusual error message, it's also a rare instance of Unix being friendly and not letting you blow your foot off with a perfectly valid operation that just happens to be (highly) unwise. To understand it, let's first work out what exact operation is failing. I'll do this with strace on Linux, mostly because it's what I have handy:

$ cp /usr/bin/sleep /tmp/
$ /tmp/sleep 120 &
$ strace cp /usr/bin/sleep /tmp/
[...]
open("/usr/bin/sleep", O_RDONLY)        = 3
fstat(3, {st_mode=S_IFREG|0755, st_size=32600, ...}) = 0
open("/tmp/sleep", O_WRONLY|O_TRUNC)    = -1 ETXTBSY (Text file busy)
[...]

There we go. cp is failing when it attempts to open /tmp/sleep for writing and truncate it while we have it running as a program, and the specific Unix errno value here is ETXTBSY. If you experiment some more you'll discover that we're allowed to remove /tmp/sleep if we want to, just not write to it or truncate it (at least on Linux; the specifics of what's disallowed may vary slightly on other Unixes). This is an odd limitation for Unix, because normally there's nothing that prevents one process from modifying a file out from underneath another process (even in harmful ways). Unix leaves it up to the program(s) involved to coordinate things between themselves, rather than having the kernel enforce a policy of 'no writing if there are readers' or the like.

But running processes are special, because really bad things usually happen if you modify the on-disk code of a running process. The problem is virtual memory, or more exactly paged virtual memory. On a system with paged virtual memory, programs aren't loaded into RAM all at once and then kept there; instead they're paged into RAM in pieces as bits of code (and data) are needed. In fact, sometimes already-loaded pieces are dropped from RAM in order to free up space, since they can always be loaded back in from disk.

Well, they can be loaded back in from disk if some joker hasn't gone and changed them on disk, at least. All of this paging programs into RAM in sections only works if the program's file on disk doesn't ever change while the program is running. If the kernel allowed running programs to change on disk, it could wind up loading in one page of code from version 1 of the program and another page from version 2. If you're lucky, the result would segfault. If you're unlucky, you might get silent malfunctions, data corruption, or other problems. So for once the Unix kernel does not let you blow your foot off if you really want to; instead it refuses to let you write to a program on disk if the program is running. You can truncate or overwrite any other sort of file even if programs are using it, just not things that are part of running programs. Those are special.

Given the story I've just told, you might expect ETXTBSY to have appeared in Unix in or around 3BSD, which is more or less the first version of Unix with paged virtual memory. However, this is not the case. ETXTBSY turns out to be much older than BSD Unix, going back to at least Research V5. Research Unix through V7 didn't have paged virtual memory (it only swapped entire programs in and out), but apparently the Research people decided to simplify their lives by basically locking the files for executing programs against modification.

(In fact Research Unix was stricter than modern Unixes, as it looks like you couldn't delete a program's file on disk if it was running. That section of the kernel code for unlink() gets specifically commented out no later than 3BSD, cf.)

PS: the 'text' in 'text file' here actually means 'executable code', as seen in, say, the output of size. Of course it's not just the actual executable code that could be dangerous if it changed out from underneath a running program, but there you go.

Sidebar: the way around this if you're updating running programs

To get around this, all you have to do is remove the old file before writing the new file into place. This (normally) doesn't cause any problems; the kernel treats the 'removed but still being used by a running program' executable the same way it treats any 'removed but still open' file. As usual, the file is only actually removed when the last reference goes away, in this case when the last process using the old executable exits.

(Of course NFS throws a small monkey wrench into things, sometimes in more than one way.)
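As a concrete illustration of the 'remove first, then write' approach (the function and names here are mine, not any standard API), a C-level version looks something like this. Tools like install(1) effectively do this dance for you, and writing to a temporary name and then rename()ing it over the old file is a closely related pattern.

/* A sketch of updating an in-use executable: unlink the old name first,
   then create a fresh file under that name. Running processes keep
   their (now nameless) old file; new executions get the new one. */
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

int replace_program(const char *path, const void *image, size_t len)
{
    if (unlink(path) < 0 && errno != ENOENT)
        return -1;
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0755);
    if (fd < 0)
        return -1;
    ssize_t n = write(fd, image, len);
    if (close(fd) < 0 || n != (ssize_t)len)
        return -1;
    return 0;
}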

WhyTextFileBusyError written at 23:05:38; Add Comment

2016-03-09

A sensible surprise (to me) in the Bourne shell's expansion of "$@"

I generally like to think that I'm pretty well up on the odd corners of the Bourne shell, due to having been around Unix for a fair while. Every so often I stumble over something that shows me that I'm wrong.

So let's start with the following, taken from something Jed Davis discovered about Bash:

$ set -- one two three
$ for i in "front $@ back"; do echo $i; done
front one
two
three back
$

When I saw this, my first reaction was basically 'what?', because it didn't seem to make any sense. After I mumbled a bit on Twitter, Jed Davis found the explanation in the Single Unix Specification here:

When the expansion occurs within double-quotes, and where field splitting [...] is performed, each positional parameter shall expand as a separate field, with the provision that the expansion of the first parameter shall still be joined with the beginning part of the original word (assuming that the expanded parameter was embedded within a word), and the expansion of the last parameter shall still be joined with the last part of the original word.

The purpose of "$@" is to preserve arguments that originally have spaces in them as single arguments. So, for example:

$ set -- "one argument" "two argument"
$ for i in "$@"; do echo $i; done
one argument
two argument
$ for i in "$*"; do echo $i; done
one argument two argument
$

This is what the first part of the SuS specification describes (up to 'shall expand as a separate field'). But this definition opens up a question: what is the result of the expansion if you have not a simple "$@" but instead something with additional text inside the double quotes? One answer would be to completely turn off the special splitting and argument-preserving behavior of "$@" (making it identical to "$*" here), but that probably wouldn't be very satisfying. Traditional Unix and thus SuS instead say that you should continue field splitting but pretend that any front text is attached to the first argument and any back text is attached to the last one.

(Since it's still text inside a "...", the front and rear text is not subject to any word splitting; it's attached untouched as a single unit.)

When I saw this, my first and not well thought out expectation was that any leading and trailing text would be subject to regular word splitting and thus be taken as separate, additional arguments. Of course this doesn't actually make sense if I think about it for real, because there is normally no word splitting inside double quotes. Thus, the traditional Unix and SuS behavior is perfectly reasonable here and makes sense from an algorithmic perspective.

Given all this, the result of the following is not really surprising:

$ set -- one two three
$ for i in "$@ $@"; do echo $i; done
one
two
three one
two
three
$

(Writing this entry has been useful in forcing me to confront some of my own fuzzy thinking around the whole area of "$@", as you can tell from the story of my first reaction to this.)

BourneDollarAtExpansionSurprise written at 23:33:06; Add Comment

2016-03-07

Why it makes sense for true and false to ignore their arguments

It's standard when writing Unix command line programs to make them check their arguments and complain if the usage is incorrect. It's reasonably common to do this even for programs that don't take options or positional arguments. After all, if your command is supposed to take no arguments, it's really an error if someone runs it and gives it arguments.

(Not all scripts, programs, and so on actually check this, because you usually have to go at least a little bit out of your way to look at the argument count. But it's the kind of minor nit you might get code review comments about, or an issue report.)

true and false are an exception to this, in that they more or less completely ignore any arguments given to them. Part of this behavior is historical; the V7 /bin/true and /bin/false were extremely minimal, and when you're being minimal it's easiest to not even look at the arguments. But beyond the history, I think that this is perfectly sensible behavior for true and false because it makes them universal substitutes for other commands, for when you want to null out a command so that it does nothing.

Want to make a command do nothing but always succeed? Simple: 'mv command command.real; ln -s /bin/true command'. Want to do the same thing but have the command always fail? Use false instead of true. Sure, you can do the same thing with shell scripts that deliberately ignore the arguments and just do 'exit 0' or 'exit 1', but this is a little bit simpler and matches the historical behavior.

(You can also do this in shell scripts as a way of creating a 'don't actually do anything' mode, but there are probably better patterns there.)

On that note, it's interesting that although GNU true and false have command line options that will cause them to produce output, there is no way to get them to return the wrong exit status. And while they respond to --help and --version, they silently ignore other options (as opposed to, say, reporting a syntax error).

(This entry was sparked by Zev Weiss's mention of true in his comment on this entry.)

Sidebar: true and false in V7

In V7 Unix, true is an empty file and false is a file that is literally just 'exit 1'. Neither has a #! line at the start of the file, because that came in later. That true is empty instead of 'exit 0' saves V7 a disk block, which probably mattered back then.

TrueFalseAndArguments written at 23:13:13; Add Comment

2016-02-17

The many load averages of Unix(es)

It turns out that the meaning of 'load average' on Unixes is rather more divergent than I thought it was. So here's the story as I know it.

In the beginning, by which I mean 3BSD, the load average counted how many processes were runnable or in short term IO wait (in a decaying average). The BSD kernel computed this count periodically by walking over the process table; you can see this in, for example, 4.2BSD's vmtotal() function. Unixes that were derived from 4BSD carried this definition of load average forward, which primarily meant SunOS and Ultrix. Sysadmins using NFS back in those days got very familiar with the 'short term IO wait' part of load average, because if your NFS server stopped responding, all of your NFS clients would accumulate lots of processes in IO waits (which were no longer so short term) and their load averages would go skyrocketing to absurd levels.

(Technically the definition was not 'IO wait', it was 'any process that was sleeping with a non-interruptible priority'. In theory this was only processes in IO wait. Yes, this included processes waiting on NFS IO on NFS mounts marked intr; it's complicated.)
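To make 'decaying average' concrete, here is a small illustrative sketch of the classic exponentially decaying calculation. The real kernels use scaled fixed point arithmetic and differ in exactly which processes they count (which is the whole point of this entry); the constants and names below are just mine.

/* Illustrative exponentially decaying load averages, in the spirit of
   the classic BSD calculation. Called every SAMPLE_INTERVAL seconds
   with a count of the 'eligible' processes, however those are defined. */
#include <math.h>

#define SAMPLE_INTERVAL 5.0                 /* seconds between samples */

static double loadavg[3];                   /* 1, 5, and 15 minute averages */
static const double periods[3] = { 60.0, 300.0, 900.0 };

void update_loadavg(int eligible_procs)
{
    for (int i = 0; i < 3; i++) {
        double decay = exp(-SAMPLE_INTERVAL / periods[i]);
        loadavg[i] = loadavg[i] * decay + eligible_procs * (1.0 - decay);
    }
}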

When Linux implemented the load average (which it did very early, as 0.96c has it), it copied this traditional definition. Linux load average has been 'run queue plus (short term) IO wait' ever since, although the exact mechanics of how it was computed have changed over time to be more efficient.

(Once multiprocessor systems and large numbers of processes showed up, people soon worked out that 'iterate over the entire process table' was not necessarily a good idea.)

When Sun executed the great SunOS 4 to Solaris transition, I'm not quite sure what happened to their definition of the load average. At least some sources claim that it was immediately redefined to drop IO waits (which would mean that a NFS client would maintain a low load average even when the NFS server went away). Exactly how Solaris counted up 'runnable processes' apparently changed somewhat in Solaris 10; in theory I think this is not supposed to affect the results materially. By Solaris 10 it seems definite that Solaris does not count processes in IO wait in the load average, and this has been carried forward into Illumos and derivatives.

(I looked at the Illumos source code very briefly and determined that it was complicated enough that it was too much work to understand it for this entry.)

The situation with the *BSDs is messy. I haven't thoroughly investigated historical source trees, but I can't imagine that 386BSD and then NetBSD people immediately changed the 4BSD definition of the load average to drop processes in IO wait. Certainly the FreeBSD 2.0 sources I have handy access to (via this Github repo) still count processes in IO wait. Then at some point things get very tangled and some of the available information I could find seems to be wrong (eg). The net result is that FreeBSD split apart from OpenBSD and NetBSD in load average calculations, and OpenBSD and NetBSD are somewhat divergent from each other.

As far as I can decode it, the current state of load average calculations on the three is:

  • In FreeBSD, load average counts only runnable processes, not processes in IO wait. The count of runnable processes is maintained on the fly by the scheduler in code that I'm not going to try to link to.

  • In NetBSD, kern/kern_synch.c's sched_pstats() function counts both runnable processes and all sleeping processes that have slept for less than one second so far (at least that's what I think l_slptime is counting).

  • In OpenBSD, uvm/uvm_meter.c's uvm_loadav() function counts both runnable processes and sleeping processes that are in high priority IO wait and have slept for less than one second so far (assuming I understand p_slptime correctly). This is fewer sleeping processes than NetBSD seems to include.

(Don't ask me what Dragonfly BSD does here.)

This is all very messy and contradicts some things knowledgeable OpenBSD people have said. Mind you, they said them in 2009, but on the other hand I can't imagine that OpenBSD would have dropped and then restored counting processes in IO wait (and I can't find any sign of that in their CVS logs).

(I don't know what any other commercial Unixes do here, including Mac OS X. Energetic people are encouraged to do their own research.)

The real moral is that the exact definition of 'load average' is a mess today. If you think you care about load average, you should find out how much IO waiting and general sleeping it includes on your system, ideally via actual experimentation.

ManyLoadAveragesOfUnix written at 02:40:15; Add Comment

2016-02-08

Old Unix filesystems and byte order

It all started with a tweet by @JeffSipek:

illumos/solaris UFS don't use a fixed byte order. SPARC produces structs in BE, x86 writes them out in LE. I was happier before I knew this.

As they say, welcome to old time Unix filesystems. Solaris UFS is far from the only filesystem defined this way; in fact, most old time Unix filesystems are probably defined in host byte order.

Today this strikes us as crazy, but that's because we now exist in a quite different hardware environment than the old days had. Put simply, we now exist in a world where storage devices both can be moved between dissimilar systems and are. In fact, it's an even more radical world than that; it's a world where almost everyone uses the same few storage interconnect technologies and interconnects are common between all sorts of systems. Today we take it for granted that how we connect storage to systems is through some defined, vendor neutral specification that many people implement, but this was not at all the case originally.

(There are all sorts of storage standards: SATA, SAS, NVMe, USB, SD cards, and so on.)

In the beginning, storage was close to 100% system specific. Not only did you not think of moving a disk from a Vax to a Sun, you probably couldn't; the entire peripheral interconnect system was almost always different, from the disk-to-host cabling to the kind of backplane that the controller boards plugged into. Even as some common disk interfaces emerged, larger servers often stayed with faster proprietary interfaces and proprietary disks.

(SCSI is fairly old as a standard, but it was also a slow interface for a long time so it didn't get used on many servers. As late as the early 1990s it still wasn't clear that SCSI was the right choice.)

In this environment of system specific disks, it was no wonder that Unix kernel programmers didn't think about byte order issues in their on disk data structures. Just saying 'everything is in host byte order' was clearly the simplest approach, so that's what people by and large did. When vendors started facing potential bi-endian issues, they tried very hard to duck them (I think that this was one reason endian-switchable RISCs were popular designs).

In theory, vendors could have decided to define their filesystems as being in their current endianness before they introduced another architecture with a different endianness (here Sun, with SPARC, would have defined UFS as BE). In practice I suspect that no vendor wanted to go through filesystem code to make it genuinely fixed endian. It was just simpler to say 'UFS is in host byte order and you can't swap disks between SPARC Solaris and x86 Solaris'.
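To illustrate what 'genuinely fixed endian' would have required, here is a small sketch with a made-up on-disk structure (nothing here is real UFS). A host byte order filesystem just writes its in-memory structures out verbatim; a fixed endian one has to convert every multi-byte field on every trip to and from disk, which is exactly the tedious work no one wanted to retrofit.

/* Illustration only: a made-up on-disk structure. A host byte order
   filesystem writes it out as-is; a fixed (here big-endian) one converts
   each field both ways. htobe32()/be32toh() are from glibc's <endian.h>;
   the BSDs have similar functions in <sys/endian.h>. */
#include <stdint.h>
#include <endian.h>

struct superblock {
    uint32_t magic;
    uint32_t block_count;
};

void superblock_to_disk(const struct superblock *in, struct superblock *out)
{
    out->magic = htobe32(in->magic);
    out->block_count = htobe32(in->block_count);
}

void superblock_from_disk(const struct superblock *in, struct superblock *out)
{
    out->magic = be32toh(in->magic);
    out->block_count = be32toh(in->block_count);
}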

(Since vendors did learn, genuinely new filesystems were much more likely to be specified as having a fixed and host-independent byte order. But filesystems like UFS trace their roots back a very long way.)

OldFilesystemByteOrder written at 23:04:43; Add Comment

2016-01-07

The format of strings in early (pre-C) Unix

The very earliest version of Unix was written before C was created and even after C's creation the whole system wasn't rewritten in it immediately. Courtesy of the Unix Heritage Society, much of the surviving source code from this era is available online. It doesn't make up a complete source tree for any of the early Research Unixes, but it does let us peek back in time to read code and documentation that was written in that pre-C era.

In light of a recent entry on C strings, I became curious about what the format of strings was in Unix back before C existed. Even in the pre-C era, the kernel and assembly language programs needed strings for some things; for example, system calls like creat() and open() have to take filename arguments in some form, and programs often have constant strings for messages that they'll print out. So I went and looked at early Unix source and documentation, for Research V1 (entirely pre-C), Research V2, and Research V3.

I will skip to the punchline:

Unix strings have been null-terminated from the very beginning of Unix, even before C existed.

Unix did not get null-terminated strings from C. Instead, C got null-terminated strings from Unix (specifically, Research V1 Unix). I don't know where V1 Unix got them from, if anywhere.

There are plenty of traces of this in the surviving Research V1 files. For instance, the V1 creat manpage says:

creat creates a new file or prepares to rewrite an existing file called name; name is the address of a null--terminated string. [...]

The V1 shell also contains uses of null-terminated strings. These are written with an interesting notation:

[...]
   bec 1f / branch if no error
   jsr r5,error / error in file name
       <Input not found\n\0>; .even
   sys exit
[...]
qchdir:
   <chdir\0>
glogin:
   <login\0>
[...]

Not all strings in the shell are null-terminated in this way, probably because it was natural to have their lengths just known in the code. If we need more confirmation, the error function specifically comments that a 0 byte is the end of the 'line' (here a string):

error:
   movb  (r5)+,och / pick up diagnostic character
   beq   1f / 0 is end of line
   mov   $1,r0 / set for tty output
   sys   write; och; 1 / print it
   br    error / continue to get characters
1:
   [... goes on ...]
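In later C terms that loop is essentially the following (a rough restatement of the idea, not anything from the actual V1 sources):

/* Walk the string one byte at a time until the terminating 0 byte,
   writing each character to file descriptor 1 as it goes. */
#include <unistd.h>

void error(const char *msg)
{
    while (*msg != '\0')
        write(1, msg++, 1);
}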

I suspect that one reason this format for strings was adopted was simply that it was easy to express and support in the assembler. Based on the usage here, a string was simply a '<....>' block that supported some escape sequences, including \0 for a null byte; presumably this was basically copied straight into the object file after escape translation. There's no need for either the assembler or the programmer to count up the string length and then get that too into the object code somehow.

(It turns out that the V1 as manpage documents all of this.)

PS: it's interesting that although the V1 write system call supports writing many bytes at once, the error code here simply does brute force one character at a time output. Presumably that was just simpler to code.

Update: See the comments for interesting additional information and pointers. Other people have added a bunch of good stuff.

UnixEarlyStrings written at 02:21:26; Add Comment

2016-01-06

A fun Bash buffering bug (apparently on Linux only)

I'll present this in the traditional illustrated form:

$ cat repro
#!/bin/bash
function cleanup() {
   r1=$(/bin/echo one)
   r2=$(echo two)
   #echo $r1 '!' $r2 >>/tmp/logfile
   echo $r1 '!' $r2 1>&2
}
trap cleanup EXIT
sleep 1
echo final

$ ./repro | false
final
one final ! final two
$

Wait, what? This should print merely 'one ! two'. The 'echo final' should be written to stdout, which is a pipe to false (and anyways, it's closed by the time the echo runs, since false will already have exited by the time 'sleep 1' finishes). Then the cleanup function should run on termination, with $r1 winding up being "one" and $r2 winding up being "two" from the command substitutions, and they should then get echoed out to standard error as 'one ! two'.

(Indeed you can get exactly this output with a command line like './repro >/dev/null'.)

What is actually happening here (or at least appears to be happening) is an interaction between IO buffering, our old friend SIGPIPE, and forking children while you have unclean state. Based on strace, the sequence of events is:

  1. Bash gets to 'echo final' and attempts to write it to the now-closed standard output pipe by calling 'write(1, "final\n", 6)'. This gets an immediate SIGPIPE.
  2. Since we've set a general EXIT trap, bash immediately runs our cleanup function (since the script is now exiting due to the SIGPIPE).

  3. Bash forks to run the $(/bin/echo one) command substitution. In the child, it runs /bin/echo and then, just before the child exits, does a 'write(1, "final\n", 6)'. This succeeds, since the child's stdout is connected to the parent's pipe. In the main bash process, it reads back one\n and then final\n from the child process, and turns this into "one final" as the value assigned to $r1.

  4. For '$(echo two)', the child process just winds up calling write(1, "final\ntwo\n", 10). This becomes "final two" for $r2's value.

    (This child doesn't fork to run /bin/echo because we're using the Bash builtin echo instead.)

  5. At last, the main process temporarily duplicates standard error to standard output and calls 'write(1, ....)' to produce all of the output we see here.

What appears to be going on is that when the initial 'echo final' in the main Bash process was interrupted by a SIGPIPE, it left "final\n" in stdout's IO buffer as unflushed output. When Bash forked for each $(...) command substitution, the child inherited this unflushed buffer. In the $r1 case, the child noticed this unflushed buffer as it was exiting and wrote it out at that point; in the $r2 case, the child appended the output of its builtin echo command to the unflushed stdout buffer and then wrote the whole thing out. Then, finally, when the parent ran the echo at the end of cleanup(), it too appended its echo output to the stdout buffer and wrote everything out.

There are two Bash bugs here. First, the output from the initial failed 'echo final' should have been discarded from stdout's IO buffer, not retained to show up later on. Second, the children forked for each $(...) should not have inherited this unflushed IO buffer, because allowing unflushed buffers to make it into children is a well known recipe for having exactly this sort of multiple-flush happen.

(Well, allowing any unflushed or un-cleaned-up resource into children will do this, at least if your children are allowed to then notice it and attempt to clean it up.)
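You can reproduce the underlying hazard directly in C. This is my own illustration of the general stdio plus fork() problem, not Bash's actual code; the point is that output buffered before the fork() gets flushed once by the child at its exit and again by the parent at its exit.

/* Demonstration of unflushed stdio buffers surviving into a forked child.
   'hello ' typically appears twice in the output, once from each process. */
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
    printf("hello ");           /* sits unflushed in the stdio buffer */
    pid_t pid = fork();         /* child inherits the unflushed buffer */
    if (pid == 0) {
        printf("from the child\n");
        return 0;               /* exit flushes: "hello from the child" */
    }
    wait(NULL);
    printf("from the parent\n");
    return 0;                   /* exit flushes: "hello from the parent" */
}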

I don't know why this bug seems to be Linux specific. Perhaps Bash is using stdio, and Linux's stdio is the only version that behaves this way (either on the initial write that gets SIGPIPE, or in allowing the unflushed buffer state to propagate into forked children). If this is Linux stdio at work, I don't know if the semantics are legal according to POSIX/SUS and in a way it doesn't matter, as stdio libraries with this behavior are on a lot of deployed machines so your code had better be prepared for it.

(Regardless of what POSIX and SUS say, in practice 'standard' Unix is mostly defined this way. Code that you want to be portable has to cope with existing realities, however non-compliant they may be.)

By the way, finding this Bash issue from the initial, rather drastic symptoms that manifested in a complex environment was what they call a lot of fun (and it was Didier Spezia who did the vital work of identifying where the failure was; I just ground away from there).

PS: If you want to see some extra fun, switch which version of the final echo is run in cleanup() and then watch the main Bash process with strace.

BashBufferingForkBug written at 01:37:13; Add Comment

2015-12-30

Some notes on entering unusual characters in various X applications

Once upon a time it was perfectly fine to only type plain ASCII. But these days even people writing in English (like me) have a steadily increasing desire to occasionally write with accented characters, special symbols like right arrows, and so on. As it happens you can do this in X, although how to do it is not entirely easy or natural unless (and until) you do it a lot.

The first thing you want to do is to make some key your Compose key. If you have a Windows key or two on your keyboard, they make a handy donor key for this purpose and you can set this up with setxkbmap:

setxkbmap -option compose:rwin

This makes my right Windows key into a Compose key.

The basic use of the Compose key is to either tap or hold it and then enter a two-character sequence to get some unusual character. You might ask 'what two-character sequence gets what', and that is a good question. X being X, and Gnome being Gnome, people have made this complicated.

For normal X applications, which I believe includes KDE applications, all of the available key sequences are defined in a Compose file. The standard default file is /usr/share/X11/locale/<locale>/Compose, where the locale is, for example, en_US.UTF-8. You can also have a personal version of this file as $HOME/.XCompose, which may be useful if you want to add additional characters or shuffle things to be more useful. One example of this is this Github repo. Wikipedia also has a guide of common compose combinations.
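As an illustration of the format (the particular sequences here are just examples I made up, not anything you have to use), a personal ~/.XCompose usually starts by pulling in the system defaults and then adds or overrides sequences:

include "%L"

# Compose, then '-' and '>', gives a right arrow
<Multi_key> <minus> <greater>   : "→"   U2192
# Compose, then ' and e, gives an accented e
<Multi_key> <apostrophe> <e>    : "é"   eacute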

As covered in the Ubuntu wiki page on the Compose key, GTK-based applications are gratuitously different by defaulting to a hard-coded list that is derived from some version of the default X Compose file. This is probably going to be the same for most common keys, but obviously it'll differ if you have a custom .XCompose file. You can apparently force GTK to use the normal X mechanism if you want, although I haven't tried this.

One advantage of sticking with the default GTK setup is that GTK will then let you directly enter Unicode code points. How this is done is described on the Ubuntu wiki page, although not all GTK applications in all environments will display any indication that this entry mode is active. Having tested it, I can say that this works in Firefox and in gnome-terminal. Unfortunately this is GTK only; it doesn't work in KDE or in general X applications, although there is an upstream bug on this. Don't hold your breath, though.

(Some X applications may have specific support for Unicode code entry and other frills, but you'll have to check their documentation and then maybe experiment to see if it works for you or if it clashes with some other part of your environment.)

Given the GTK situation, my conclusion is that I don't want to bother trying to customize a Compose file for my own use. If I do start wanting unusual special characters on a regular basis (and thus a custom .XCompose), I expect I'll force GTK to use the standard X mechanism and live without the ability to enter Unicode code points. Since I didn't know about it until recently and haven't actually used it, I doubt I'll miss it very much.

(See also this and this.)

UsingComposeKeyInX written at 03:05:42; Add Comment

2015-12-14

Getting xterm and modern X applications to do cut and paste together

I was all set to write a somewhat grumpy blog entry about getting xterm and Chrome to play nice with each other for cut and paste when I decided to re-test this systematically. Let me tell you, systematic testing and investigation changes quite a lot of blog entries around here, and this one appears to be no exception.

The basic background on this is that X basically has two levels of cut and paste, which I'll call a casual level and a serious level. The casual level is making ordinary selections and pasting them with the middle mouse button (in most X applications), while the serious level is done through explicit Copy and Paste actions, generally found in menu entries. In X jargon, the 'casual level' is the PRIMARY selection and the 'serious level' is the CLIPBOARD selection.

(The history behind why both of these exist is long and tangled and basically goes back to the history of cut and paste in X and xterm's different cut and paste model.)

Xterm works fine at the ordinary selection level. Selections you make in xterm can be pasted into modern X applications like Firefox and Chrome, and selections that you sweep out in Chrome or Firefox can be pasted into xterm. But if you look at xterm, you will search more or less in vain for a Copy or a Paste menu item. Modern terminal programs like gnome-terminal have these entries right there and they do what you expect, but xterm does not.

There are two ways to get Copy in xterm. The first is the confusingly named 'Select to Clipboard' entry on the xterm menu you get with Control-middle mouse button; this makes all of your ordinary xterm selections be Copy-level selections instead of ordinary selections (ie, CLIPBOARD instead of PRIMARY, hence the name of the menu item). You can no longer paste them into most other things with the middle mouse button, but you can Paste them. The second is to add a keyboard binding for Copy in your X resources:

XTerm*VT100.Translations: #override \
     Ctrl Shift <KeyPress> C: copy-selection(CLIPBOARD)

The advantage of a keyboard binding is that regular selection stuff continues to work as it normally does; you just explicitly invoke the Copy operation whenever you need it. Among other things, this means you don't have to remember the 'Select to ...' mode that any particular xterm is in (or forget and be puzzled about why you may not be able to middle mouse button paste a selection in one xterm into something else).

Getting Paste in xterm is much more confusing, so I will give you the conclusion; if you want a real Paste, you need to add another keyboard override:

XTerm*VT100.Translations: #override \
     Ctrl Shift <KeyPress> C: copy-selection(CLIPBOARD) \n\
     Ctrl Shift <KeyPress> V: insert-selection(CLIPBOARD)

Xterm tries very hard to do something smart when you use the middle mouse button to paste things. As part of this smartness, it normally pastes the basic selection from whatever program has one; however, if there is no basic selection at the moment and there is a Copy-level selection, it will paste that instead. The time when you need an explicit Paste operation (as provided here with Shift-Control-V) is when the basic selection is different from what got Copy'd and you specifically want to paste the Copy'd stuff.

(In the jargon, xterm normally tries to paste the PRIMARY selection but if that doesn't exist, it's willing to try the CLIPBOARD. Our new key binding does an explicit insert from CLIPBOARD.)

XtermModernCutAndPaste written at 01:24:19; Add Comment

