2011-03-31
A really annoying gap in system observability
The other day I had a problem: new gnome-session processes weren't
working right on one of our major login servers. They didn't quite
hang outright; instead, strace showed that they seemed to spend all
of their time talking very slowly to one file descriptor instead of
responding to other Gnome processes that were trying to talk to them.
(When this happens, the other Gnome processes are not very happy and your entire Gnome session basically hangs.)
This left me with a big question: what was on the other end of that file descriptor?
Answering this question turned out to be absurdly difficult, and that is the problem. At least theoretically, 'observability' of systems is one of the next big things; everyone is burning with enthusiasm for tools like Solaris's DTrace and Linux's SystemTap. Yet vendors (and Linux people) have almost completely neglected basic observability tools for tasks like simply seeing what a process is connected to.
On Linux, lsof can be said to be officially supported and it was able
to tell me that the particular file descriptor was a Unix domain socket;
however, it couldn't tell me what the other end was connected to, and
I'm not sure that that information is exported by the kernel. On other
Unixes like Solaris, lsof isn't even officially supported by the
vendor; to the extent that it works (it's often incomplete), it works
only because people have put heroic amounts of effort into reverse
engineering portions of the Solaris kernel and obtaining information by
force and trickery.
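
To make the gap concrete, here is a minimal Python sketch of roughly what can be dug out of /proc on Linux by hand. It assumes the usual /proc/<pid>/fd symlinks of the form socket:[inode] and the column layout of /proc/net/unix (Num, RefCount, Protocol, Flags, Type, St, Inode, Path); the function names are made up for illustration and this is not how lsof itself is implemented. It can say that a descriptor is a socket and, if that socket was bound, what its path is, but neither of these two sources names the peer.

```python
import os


def parse_proc_net_unix(text):
    """Map socket inode -> bound path ('' if unnamed).

    `text` is assumed to be the contents of /proc/net/unix, whose first
    line is a header and whose optional eighth column is the path.
    """
    sockets = {}
    for line in text.splitlines()[1:]:  # skip the header line
        fields = line.strip().split(None, 7)
        if len(fields) < 7:
            continue
        try:
            inode = int(fields[6])
        except ValueError:
            continue
        sockets[inode] = fields[7] if len(fields) > 7 else ""
    return sockets


def socket_inodes(pid):
    """Return {fd: inode} for every socket descriptor of a process."""
    result = {}
    fd_dir = "/proc/%d/fd" % pid
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # the descriptor went away while we were looking
        if target.startswith("socket:[") and target.endswith("]"):
            result[int(name)] = int(target[len("socket:["):-1])
    return result


def describe_unix_sockets(pid):
    """Return {fd: path} for the process's Unix domain sockets only."""
    with open("/proc/net/unix") as f:
        known = parse_proc_net_unix(f.read())
    return {fd: known[inode]
            for fd, inode in socket_inodes(pid).items()
            if inode in known}
```

Calling describe_unix_sockets() with the process ID of a stuck process would give the local side of each Unix socket. For a client socket that was only connect()'d, the path is normally empty, which is exactly the unhelpful answer described above.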
Frankly, this is absurd. Tools like lsof and lslk have been a vital
part of the sysadmin arsenal for more than fifteen years. Yet it's still
the case that no one (or at least almost no one) officially supports
them and makes sure that they can get the complete information that
sysadmins need, or even makes sure that sysadmins can get the same
information through other tools.
In 2011, in the era of observability as a big thing, it should be trivial for sysadmins to find out information like 'what is this process talking to' or 'what is using resource X', or even 'who is using what resources of type Y'. That it is not says sad things about vendors (and open source developers).
Sidebar: how I answered my question
I used brute force. I hacked my own X environment to strace my
gnome-session from the moment it was started; this let me see the
moment when the relevant file descriptor was created and connect()'d
(it turned out to be talking to the system DBus daemon). However, this
workaround was only possible because I could start the program on demand
and it hung reliably; had I been dealing with a long-running daemon that
was malfunctioning like this, I would have been out of luck.
PS: it turns out that you do not want to restart the system DBus
daemon out from underneath gnome-session. If you do, all existing
gnome-session processes immediately exit, taking every user's session
with them.
2011-03-11
What tools I use to deal with email
The most important thing I do to deal with my email is that I am not on any high-volume mailing lists. As far as I'm concerned, that's what gmane.org (or some equivalent) is for, plus a good newsreader. I sometimes temporarily subscribe to such mailing lists, but only when I'm going to send email to them. This one decision means that I don't need a mail environment that's designed to cope with a huge volume of email, which in turn means that I have a lot more choices and options.
I sort incoming email with procmail, mostly to divert spam and certain
routine 'things are fine' administrative reports off to separate
files. Everything else goes into my inbox.
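
As a rough illustration of this kind of rule-based sorting (written in Python rather than as an actual procmail recipe), the sketch below picks a destination file from a message's headers. The rules, header values, and folder names are all hypothetical; the real rules are not given here.

```python
import email
from email import policy

# Hypothetical rules: (header name, substring to look for, destination).
# These are assumptions for illustration, not the real procmail recipes.
RULES = [
    ("X-Spam-Flag", "yes", "spam"),
    ("Subject", "all is well", "routine-reports"),
]


def choose_folder(raw_bytes, rules=RULES, default="inbox"):
    """Return the destination for a raw message; first matching rule wins."""
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    for header, needle, folder in rules:
        value = str(msg.get(header, ""))
        if needle.lower() in value.lower():
            return folder
    return default  # everything else goes to the inbox
```

The point of the design is the fall-through: only a few narrow categories get diverted, and anything that matches no rule lands in the inbox.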
I read my email using MH, in its current modern form of nmh. MH is natively a command line environment, but there are graphical interfaces for it as well. At the office, I primarily use MH through exmh, because I like good graphical interfaces and because it's the best option for showing me various sorts of MIME-encrusted messages. My exmh environment is fairly extensively customized to fit what I like.
(I have been using exmh for a very long time, more or less since it was first announced publicly, so I have all sorts of bits accreted on top of it.)
At home, I use MH through its command line interface. Well, I don't use
the raw MH commands as-is; instead, I have a large suite of aliases
and shortened names for things and so on (for example, loading my MH
environment defines a 'd' alias to delete the current message, clear
the screen, and show the next one). Some of my customizations make MH
less verbose and do things like show fewer message headers; for obscure
reasons I do this in peculiar ways instead of using the built-in MH
facilities that cover some of it.
One important part of this command line environment is a replacement
for the stock nmh way of displaying MIME-encoded messages. The problem
with the stock version is that it wants to quiz you and pause and flail
around and so on (as anyone who has ever used it knows). My version's
goal in life is simply to print out a plain text representation of as
much of the message as possible, doing a sensible job with things like
HTML. This can then be run through the pager of your choice
to actually see things (I use a very simple one that just pauses every
screen's worth for me to hit return; most people would use less).
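
The general idea of such a non-interactive displayer can be sketched in Python with the standard email package. This is not the actual replacement program; the function names, the choice of headers to show, and the crude HTML handling are all assumptions for illustration.

```python
import email
from email import policy
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect the visible text of an HTML document, very crudely."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0  # depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag in ("br", "p", "div"):
            self.chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def html_to_text(html):
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip()


def message_to_text(raw_bytes):
    """Render a raw message as plain text without asking any questions."""
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    out = []
    for header in ("From", "Subject", "Date"):  # assumed short header list
        if msg[header]:
            out.append("%s: %s" % (header, msg[header]))
    out.append("")
    # Prefer a text/plain body; fall back to HTML converted to text.
    body = msg.get_body(preferencelist=("plain", "html"))
    if body is not None:
        text = body.get_content()
        if body.get_content_type() == "text/html":
            text = html_to_text(text)
        out.append(text.rstrip())
    # Anything else is only listed, never opened or prompted about.
    for part in msg.iter_attachments():
        out.append("[attachment: %s, %s]"
                   % (part.get_filename() or "unnamed", part.get_content_type()))
    return "\n".join(out)
```

The output is a single string, so it can be written to standard output and piped through whatever pager is preferred.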
Even at work I sometimes drop into the MH command line environment; sometimes it's the easiest way to do something quick and simple, and sometimes it's the best way to do some more sophisticated operation. Exmh does not have a great interface for picking out messages, among other things; I do sometimes envy Thunderbird's really convenient UI for this.
(And of course I am not always in front of my desktop that has exmh running. If I am checking email from a laptop or the like, I am effectively 'at home' as far as MH usage is concerned.)
2011-03-07
What terminal emulators I use when
Recently I mentioned in passing that I use several different X terminal emulators, depending on the circumstances. A commenter sensibly asked what the circumstances were. I routinely use three different terminal emulators: xterm, 9term (which is more of a non-terminal emulator), and gnome-terminal.
Gnome-terminal gets used when I need something that is completely set up
for UTF-8 or modern character graphics. I don't like it as much as the
other two alternatives (and the Gnome people keep making questionable
user interface choices), but sometimes it's what I need, warts and all.
One common need for modern character graphics is apt-get's periodic
dialog boxes for questions, and another is various menu-based serial
console management interfaces for things like switches.
9term is normally my first choice for many things, basically any time I don't need either actual terminal emulation (for, e.g., vi or su) or easy copy and paste support. Unfortunately, I haven't been using it as much lately due to a Linux kernel issue.
Xterm is my true default terminal emulator, the one that I start if I don't want to think about which terminal emulator I want. There are just too many seductive little attractions to it, and besides I've been using xterm for decades so I am completely acclimatized to how it behaves.
(If I start 9term I have to be certain that I'm not going to want to
run vi or something else that needs cursor addressing. If I start
gnome-terminal, I have to go to extra work and then I have to put up
with it, including how it is different from xterm. So xterm is the easy
choice.)
Finally, if I'm being honest I have to admit that there have always been a number of little irritations and bits of extra work with using 9term instead of xterm, even in situations where 9term is usable. This has fairly often made xterm my lazy choice even when I could use 9term. This sort of makes me unhappy, because intellectually I like 9term better.
(9term versus xterm is thus sort of like the BSDs versus Linux. It feels a little sleazy and lazy to use xterm instead of 9term, but I do it anyways because it's so convenient. In theory I should like the purity and vision of 9term; in practice, well, xterm again.)