Wandering Thoughts archives

2018-07-30

My own configuration files don't have to be dotfiles in $HOME

Back when I started with Unix (a long time ago), programs had a simple approach to where to look for or put little files that they needed; they went into your $HOME as dotfiles, or if the program was going to have a bunch of them it might create a dot-directory for itself. This started with shells (eg $HOME/.profile) and spread steadily from there, especially for early open source programs. When I started writing shell scripts, setup scripts for my X environment, and other bits and pieces that needed configuration files or state files, the natural, automatic thing to do was to imitate this and put my own dotfiles and dot-directories in my $HOME. The entirely unsurprising outcome of this is that my home directories have a lot of dotfiles (some of them very old, which can cause problems). How many is a lot? Well, in my oldest actively used $HOME, I have 380 of them.

(Because dotfiles are normally invisible, it's really easy for them to build up and build up to absurd levels. Not that my $HOME is neat in general, but I have many fewer non-dotfiles cluttering it up.)

Recently it slowly dawned on me that my automatic reflex to put things in $HOME as dotfiles is both not necessary and not really a good idea. It's not necessary because I can make my own code look wherever I want it to, and it's not a good idea because $HOME's dotfiles are a jumbled mess where it's very hard to keep track of things or even to see them. Instead I'm better off if I put my own files in non-dotfile directory hierarchies somewhere else, with sensible names and sensible separation into different subdirectories and all of that.
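(As a minimal sketch of "making my own code look wherever I want it to": a script could prefer a visible, sensibly named location and fall back to an old dotfile only if the new one doesn't exist yet. The function name and both example paths here are hypothetical, made up purely for illustration.)

```python
from pathlib import Path
from typing import Optional


def find_config(preferred: str, legacy: str,
                home: Optional[Path] = None) -> Optional[Path]:
    """Return the first existing file among the preferred and legacy paths.

    Both paths are relative to the home directory. The names passed in are
    up to the caller; nothing here is a standard location.
    """
    base = home if home is not None else Path.home()
    for rel in (preferred, legacy):
        candidate = base / rel
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # Hypothetical names: a visible directory first, the old dotfile second.
    print(find_config("share/mytool/config", ".mytoolrc"))
```

The fallback means an old dotfile keeps working until the day it actually gets moved.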

(I'm not quite sure when and why this started to crystallize for me, but it might have been when I was revising my X resources and X setup stuff on my laptop and realized that there was no particular reason to put them in $HOME/.X<something> the way I had on my regular machines.)

I'm probably not going to rip apart my current $HOME and its collection of dotfiles. Although the idea of a scorched earth campaign is vaguely attractive, it'd be a lot of hassle for no visible change. Instead, I've decided that any time I need to make any substantial change to things that are currently dotfiles, I'll take the opportunity to move them out of $HOME.

(The first thing I did this with was my X resources, which had to change on my home machine due to a new and rather different monitor. Since I was basically gutting them to start with, I decided it made no sense to do it in place in $HOME.)

PS: Modern Unix (mostly Linux) has the XDG Base Directory Specification, which tries to move a lot of things under $HOME/.config, $HOME/.local/share, and $HOME/.cache. In theory I could move my own things under there too. In practice I'm not particularly interested in hiding them away that way; I'd rather put them somewhere more obvious, such as $HOME/share/X11/resources.
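(For reference, here is a rough sketch of how a program might resolve the three XDG base directories mentioned above. The environment variable names and defaults come from the XDG Base Directory Specification, which also says that relative paths in these variables should be ignored; the helper name xdg_dir is my own invention.)

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Defaults relative to $HOME, as given in the XDG Base Directory Specification.
_XDG_DEFAULTS = {
    "XDG_CONFIG_HOME": ".config",
    "XDG_DATA_HOME": ".local/share",
    "XDG_CACHE_HOME": ".cache",
}


def xdg_dir(var: str, env: Optional[Mapping[str, str]] = None) -> Path:
    """Resolve one XDG base directory, falling back to its default.

    An unset, empty, or relative value is treated as invalid and the
    default under $HOME is used instead.
    """
    env = os.environ if env is None else env
    value = env.get(var, "")
    if value and os.path.isabs(value):
        return Path(value)
    home = env.get("HOME") or str(Path.home())
    return Path(home) / _XDG_DEFAULTS[var]


if __name__ == "__main__":
    for name in _XDG_DEFAULTS:
        print(name, "->", xdg_dir(name))
```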

unix/MovingOutOfHOME written at 21:36:41

Being reminded that an obvious problem isn't necessarily obvious

The other day we had a problem with one of our NFS fileservers, where a ZFS filesystem filled up to its quota limit, people kept writing to the filesystem at high volume, and the fileserver got unhappy. This nice neat description hides the fact that it took me some time to notice that the one filesystem that our DTrace scripts were pointing to as having all of the slow NFS IO was a full filesystem. Then and only then did the penny finally start dropping (which led me to a temporary fix).

(I should note that we had Amanda backups and a ZFS pool scrub happening on the fileserver at the time, so there were a number of ways it could have been overwhelmed.)

In the immediate aftermath, I felt a bit silly for missing such an obvious issue. I'm pretty sure we've seen the 'full filesystem plus ongoing writes leads to problems' issue, and we've certainly seen similar problems with full pools. In fact four years ago I wrote an entry about remembering to check for this sort of stuff in a crisis. Then I thought about it more and kicked myself for hindsight bias.

The reality of sysadmin life is that in many situations, there are too many obvious problem causes to keep track of them all. We will remember common 'obvious' things, by which I mean things that keep happening to us. But fallible humans with limited memories simply can't keep track of infrequent things that are merely easy to spot if you remember where to look. These things are 'obvious' in a technical sense, but they are not in a practical sense.

This is one reason why having a pre-written list of things to check is so potentially useful; it effectively remembers all of these obvious problem causes for you. You could just write down the problem causes by themselves, but generally you might as well start each entry by describing what to check and only then say 'if this check is positive ...'. You can also turn these checks (or some of them) into a script that you run and that reports anything it finds, or create a dashboard in your monitoring and alert system. There are lots of options.
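(As an illustration of the 'script that reports anything it finds' option, here is a minimal sketch of one such check: flagging filesystems that are at or near full. It assumes you feed it tab-separated name, used, and available columns in bytes, which is what something like 'zfs list -H -p -o name,used,avail' should produce; treat that command line, the function name, and the 98% threshold as assumptions rather than anything we actually run.)

```python
from typing import List, Tuple


def nearly_full(zfs_list_output: str,
                threshold: float = 0.98) -> List[Tuple[str, float]]:
    """Return (name, fraction used) for filesystems at or above threshold.

    Expects lines of 'name<TAB>used<TAB>avail' with sizes in bytes.
    used / (used + avail) is only an approximation of fullness, but
    avail does drop to zero when a filesystem hits its quota.
    """
    findings = []
    for line in zfs_list_output.splitlines():
        fields = line.split("\t")
        if len(fields) != 3:
            continue  # skip blank or unexpected lines
        name, used_s, avail_s = fields
        try:
            used, avail = int(used_s), int(avail_s)
        except ValueError:
            continue  # skip non-numeric values such as '-'
        total = used + avail
        if total == 0:
            continue
        fraction = used / total
        if fraction >= threshold:
            findings.append((name, fraction))
    return findings


if __name__ == "__main__":
    # Made-up sample data, not real output from our fileservers.
    sample = "tank/homes/a\t1000\t0\ntank/homes/b\t500\t500\n"
    for name, fraction in nearly_full(sample):
        print(f"{name}: {fraction:.0%} full")
```

A real diagnosis script would string several checks like this together and print only the ones that find something.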

(Will we try to create such a checklist or diagnosis script? Probably not for our current fileservers, since they're getting replaced with a completely different OS in hopefully not too much time. Instead we'll just hope that we don't have more problems over their remaining lifetime, and probably I'll remember to check for full filesystems if this happens again in the near future.)

Sidebar: Why our (limited) alerting system didn't tell us anything

The simple version is that our system can't alert us only on the combination of a full filesystem, NFS problems with that fileserver, and perhaps an observed high write volume to it. Instead the best it can do is alert us on full filesystems alone, and that happens too often to be useful (especially since it's not something we can do anything about).
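(To make the 'combination' concrete, here is a sketch of the sort of rule I have in mind, written as plain Python rather than in any real alerting system's language. The field names and the write-volume threshold are invented for illustration; our actual system cannot express this.)

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FileserverState:
    """Hypothetical snapshot of what monitoring knows about one fileserver."""
    full_filesystems: List[str] = field(default_factory=list)
    nfs_slow: bool = False
    write_bytes_per_sec: float = 0.0


def should_alert(state: FileserverState,
                 write_threshold: float = 50e6) -> bool:
    """Alert only when all three conditions hold at the same time."""
    return (bool(state.full_filesystems)
            and state.nfs_slow
            and state.write_bytes_per_sec >= write_threshold)


if __name__ == "__main__":
    # A full filesystem by itself is routine and stays quiet.
    print(should_alert(FileserverState(full_filesystems=["tank/homes/a"])))
    # A full filesystem plus slow NFS plus heavy writes is worth waking up for.
    print(should_alert(FileserverState(["tank/homes/a"], True, 80e6)))
```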

sysadmin/ObviousNotAlwaysObvious written at 00:59:57


