Wandering Thoughts archives

2018-04-18

The sensible way to use Bourne shell 'here documents' in pipelines

I was recently considering a shell script where I might want to feed a Bourne shell 'here document' to a shell pipeline. This is certainly possible and years ago I wrote an entry on the rules for combining things with here documents, where I carefully wrote down how to do this and the general rule involved. This time around, I realized that I wanted to use a much simpler and more straightforward approach, one that is obviously correct and is going to be clear to everyone. Namely, putting the production of the here document in a subshell.

(
cat <<EOF
your here document goes here
with as much as you want.
EOF
) | sed | whatever

This is not as neat and nominally elegant as taking advantage of the full power of the Bourne shell's arcane rules, and it's probably not as efficient (in at least some sh implementations, you may get an extra process), but I've come around to feeling that that doesn't matter. This may be the brute force solution, but what matters is that I can look at this code and immediately follow it, and I'm going to be able to do that in six months or a year when I come back to the script.

(Here documents are already kind of confusing as it stands without adding extra strangeness.)

Of course you can put multiple things inside the (...) subshell, such as several here documents that you output only conditionally (or chunks of always present static text mixed with text you have to make more decisions about). If you want to process the entire text you produce in some way, you might well generate it all inside the subshell for convenience.

Perhaps you're wondering why you'd want to run a here document through a pipe to something. The case that frequently comes up for me is that I want to generate some text with variable substitution but I also want the text to flow naturally with natural line lengths, and the expansion will have variable length. Here, the natural way out is to use fmt:

(
cat <<EOF
My message to $NAME goes here.
It concerns $HOST, where $PROG
died unexpectedly.
EOF
) | fmt

Using fmt reflows the text regardless of how long the variables expand out to. Depending on the text I'm generating, I may be fine with reflowing all of it (which means that I can put all of the text inside the subshell), or I may have some fixed formatting that I don't want passed through fmt (so I have to have a mix of fmt'd subshells and regular text).

Having written that out, I've just come to the obvious realization that for simple cases I can just directly use fmt with a here document:

fmt <<EOF
My message to $NAME goes here.
It concerns $HOST, where $PROG
died unexpectedly.
EOF

This doesn't work well if there's some paragraphs that I want to include only some of the time, though; then I should still be using a subshell.

(For whatever reason I apparently have a little blind spot about using here documents as direct input to programs, although there's no reason for it.)

SaneHereDocumentsPipelines written at 23:05:30; Add Comment

2018-04-16

Some notes and issues from trying out urxvt as an xterm replacement

I've been using xterm for a very long time, but I'm also aware that it's not a perfect terminal emulator (especially in today's Unicode world, my hacks notwithstanding). Years ago I wrote up what I wanted added to xterm, and the recommendation I've received over the years (both on that entry and elsewhere) is for urxvt (aka rxvt-unicode). I've made off and on experiments with urxvt, but for various reasons I've recently been trying a bit more seriously to use it regularly and to evaluate it as a serious alternative to xterm for me.

One of my crucial needs in an xterm replacement is an equivalent of xterm's ziconbeep feature, which I use to see when an iconified xterm has new output. Fortunately that need was met a long time ago through a urxvt Perl extension written by Leah Neukirchen; you can get the extension itself here. In my version I took out the audible bell. Without this, urxvt wouldn't be a particularly viable option for me, so I'm glad that it exists.

Urxvt's big draw as an xterm replacement is that it will reflow lines as you widen and narrow it. However, for a long time this didn't seem to work for me, or didn't seem to work reliably. Back in last September I finally discovered that the issue is that urxvt only reflows lines after a resize if it's already scrolled text in the window. This is the case both for resizing wider and for resizing narrower, which can be especially annoying (since resizing wider can sometimes 'un-scroll' a window). This is something that I can sort of work around; these days I often make it a point to start out my urxvt windows in their basic 80x24 size, dump out the output that I'll want, and only then resize them to read the long lines. This mostly works but it's kind of irritating.

(I'm not sure if this is a urxvt bug or a deliberate design decision. Perhaps I should try reporting it to find out.)

Another difference is that xterm has relatively complicated behavior on double-clicks for what it considers to be separate 'words'; you can read the full details in the manpage's section on character classes. Urxvt has somewhat simpler behavior based on delimiter characters, and its default set of delimiters make it select bigger 'words' than xterm does. For instance, a standard urxvt setup will consider all of a full path to be one word, because / is not a delimiter character (neither is :, so all of your $PATH is one word as far as urxvt is concerned). I'm highly accustomed to xterm's behavior and I prefer smaller words here, because it's much easier to widen a selection than it is to narrow it. You can customize some of this behavior with urxvt's cutchars resource (see the urxvt manpage). Currently I'm using:

! requires magic quoting for reasons.
URxvt*cutchars:   "\\`\"'&()*,;<=>?@[]^{|}.#%+!/:-"

This improves the situation in urxvt but isn't perfect; in practice I see various glitches, generally when several of these delimiters happen in a row (eg given 'a...', a double-click in urxvt may select up to the entire thing). Since I'm using the default selection Perl extension, possibly I could improve things by writing some complicated regular expressions (or replace the selection extension entirely with a more controllable version where I understand exactly what it's doing). If I want to exactly duplicate xterm's behavior, a Perl extension is probably the only way to achieve it.

(I'm not entirely allergic to writing Perl extensions for urxvt, but it's been a long time since I wrote Perl and I'm not familiar with the urxvt extensions API, so at a minimum it's going to be a pain.)

Given these issues I'm not throwing myself into a complete replacement of my xterm usage with urxvt, but I am reaching for it reasonably frequently and I've taken steps to make it easier to use in my environment. This involves both making it as conveniently accessible as xterm and also teaching various bits of my window manager configuration and scripting that urxvt is a terminal window and should be treated like xterm.

This whole thing has been an interesting experience overall. It's taught me both how much I'm attuned to very specific xterm behaviors and how deeply xterm has become embedded into my overall X environment.

UrxvtNotes written at 00:26:59; Add Comment


Page tools: See As Normal.
Search:
Login: Password:
Atom Syndication: Recent Pages, Recent Comments.

This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.