2015-03-20
Unix's mistake with rm and directories
Welcome to Unix, land of:
    ; rm thing
    rm: cannot remove 'thing': Is a directory
    ; rmdir thing
    ;
(And also rm may only be telling you half the story, because
you can have your rmdir fail with 'rmdir: failed to remove
'thing': Directory not empty'. Gee thanks both of you.)
Let me be blunt here: this is Unix exercising robot logic. Unix knows
perfectly well what you want to do, it's perfectly safe to do so, and
yet Unix refuses to do it (or tell you the full problem) because you
didn't use the right command. Rm will even remove directories if you
just tell it 'rm -r thing', although this is more dangerous than
rmdir.
Once upon a time rm had almost
no choice but to do this because removing directories took special
magic and special permissions (as '.' and '..' and the directory
tree were maintained in user space). Those days are long over, and
with them all of the logic that would have justified keeping this
rm (mis)feature. It lingers on only as another piece of Unix
fossilization.
(This restriction is not even truly Unixy; per Norman Wilson, Research Unix's 8th edition removed the restriction, so the very heart of Unix fixed this. Sadly very little from Research Unix V8, V9, and V10 ever made it out into the world.)
PS: Some people will now say that the Single Unix Specification
(and POSIX) does not permit rm to behave this way. My view is
'nuts to the SUS on this'. Many parts of real Unixes are already
not strictly POSIX compliant, so if you really have to have this
you can add code to rm to behave in a strictly POSIX compliant
mode if some environment variable is set. (This leads into another
rant.)
(I will reluctantly concede that having unlink(2) still fail on
directories instead of turning into rmdir(2) is probably safest,
even if I don't entirely like it either. Some program is probably
counting on the behavior and there's not too much reason to change
it. Rm is different in part because it is used by people; unlink(2)
is not directly. Yes, I'm waving my hands a bit.)
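As a rough illustration of the behavior argued for here, this is a minimal Python sketch of an rm-like helper that quietly uses rmdir on directories while leaving unlink alone. It is not how any real rm is implemented. The function name rm_one and the environment variable RM_POSIX_STRICT are both made up for the example; the entry only says 'some environment variable'.

```python
import errno
import os
import stat

# Hypothetical variable name, invented for this sketch.
STRICT_ENV = "RM_POSIX_STRICT"


def rm_one(path, environ=os.environ):
    """Remove a non-directory, or an empty directory, in one command."""
    st = os.lstat(path)  # lstat so a symlink to a directory is not followed
    if not stat.S_ISDIR(st.st_mode):
        os.unlink(path)  # files, symlinks, and so on: unchanged behavior
        return
    if environ.get(STRICT_ENV):
        # Strict mode: refuse directories, as rm without -r does today.
        raise IsADirectoryError(errno.EISDIR, "Is a directory", path)
    # rmdir fails with ENOTEMPTY on a non-empty directory, so this is
    # no more dangerous than typing rmdir by hand.
    os.rmdir(path)
```

A non-empty directory still produces an error (now the 'Directory not empty' one, which is the full story), and 'rm -r' remains the only way to remove a populated tree.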
2015-03-19
A brief history of fiddling with Unix directories
In the beginning (say V7 Unix), Unix directories were remarkably
non-special. They were basically files that
the kernel knew a bit about. In particular, there was no mkdir(2)
system call and the . and .. entries in each directory were
real directory entries (and real hardlinks), created by hand by
the mkdir program.
Similarly there was no rmdir() system call and rmdir
directly called unlink() on dir/.., dir/., and dir itself.
To avoid the possibility of users accidentally damaging the directory
tree in various ways, calling link(2) and unlink(2) on directories
was restricted to the superuser.
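To make the mechanics concrete, here is a toy in-memory model in Python of what is described above. It is only a sketch: the Inode class and the function names are invented for illustration, it ignores permissions and on-disk layout, and it is not derived from the V7 source. It shows '.' and '..' being ordinary hard links that user-level mkdir and rmdir add and remove one at a time.

```python
class Inode:
    def __init__(self, is_dir):
        self.is_dir = is_dir
        self.nlink = 0
        self.entries = {} if is_dir else None  # name -> Inode


def link(inode, directory, name):
    """Model of link(2): add an entry pointing at an existing inode."""
    if name in directory.entries:
        raise FileExistsError(name)
    directory.entries[name] = inode
    inode.nlink += 1


def unlink(directory, name):
    """Model of unlink(2): drop one entry, with no other checks."""
    inode = directory.entries.pop(name)
    inode.nlink -= 1


def mkdir(parent, name):
    """What the mkdir program did by hand."""
    d = Inode(is_dir=True)   # stands in for mknod(2) making the inode ...
    link(d, parent, name)    # ... and its entry in the parent
    link(d, d, ".")          # '.' is a real hard link to the directory
    link(parent, d, "..")    # '..' is a real hard link to the parent
    return d


def rmdir(parent, name):
    """What the rmdir program did: unlink dir/.., dir/., then dir."""
    d = parent.entries[name]
    if set(d.entries) - {".", ".."}:
        raise OSError("Directory not empty")
    unlink(d, "..")
    unlink(d, ".")
    unlink(parent, name)


def make_root():
    root = Inode(is_dir=True)
    link(root, root, ".")
    link(root, root, "..")
    return root
```

Because each step is a separate operation, anything that interrupts mkdir or rmdir partway through leaves the tree in an inconsistent state, which is one reason the raw calls were kept away from ordinary users.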
(In part to save the superuser from themselves, commands like ln
and rm then generally refused to operate on directories at all,
explicitly checking for 'is this a directory' and erroring out if
it was. V7 rm would remove directories with 'rm -r', but it
deferred to rmdir to do the actual work. Only V7 mv had
special handling for directories; it knew how to actually rename
them by manipulating hardlinks to them, although this only worked
when mv was run by the superuser.)
It took until 4.1 BSD or so for the kernel to take over the work
of creating and deleting directories, with real mkdir() and
rmdir() system calls. The kernel also picked up a rename()
system call at the same time, instead of requiring mv to do the
work with link(2) and unlink(2) calls; this rename() also
worked on directories. This was the point, not coincidentally,
where BSD directories themselves became more complicated. Interestingly, even in 4.2 BSD link(2) and
unlink(2) would work on directories if you were root and mknod(2)
could still be used to create them (again, if you were root),
although I suspect no user level programs made use of this (and
certainly rm still rejected directories as before).
(As a surprising bit of trivia, it appears that the 4.2 BSD ln
lacked a specific 'is the source a directory' guard and so a superuser
probably could accidentally use it to make extra hardlinks to a
directory, thereby doing bad things to directory tree integrity.)
To my further surprise, raw link(2) and unlink(2) continued to
work on directories as late as 4.4 BSD; it was left for other Unixes
to reject this outright. Since the early Linux kernel source is
relatively simple to read, I can say that Linux rejected this from very
early on. Other Unixes, I have no idea about. (I assume but don't know for
sure that modern *BSD derived Unixes do reject this at the kernel
level.)
(I've written other entries on aspects of Unix directories and their history.)
PS: Yes, this does mean that V7 mkdir and rmdir were setuid
root, as far as I know. They did do their own permission checking
in a perfectly V7-appropriate way, but in general, well, you really
don't want to think too hard about V7, directory creation and
deletion, and concurrency races.
In general and despite what I say about it sometimes, V7 made decisions that were appropriate for its time and its job of being a minimal system on a relatively small machine that was being operated in what was ultimately a friendly environment. Delegating proper maintenance of a core filesystem property like directory tree integrity to user code may sound very wrong to us now but I'm sure it made sense at the time (and it did things like reduce the kernel size a bit).
2015-03-03
The latest xterm versions mangle $SHELL in annoying ways
As of patch #301 (and
with changes since then), the canonical version of xterm has some
unfortunate behavior changes surrounding the $SHELL environment
variable and how xterm interacts with it. The full details are
in the xterm manpage
in the OPTIONS section, but the summary is that xterm now clears
or changes $SHELL if the $SHELL value is not in /etc/shells,
and sometimes even if it is. As far as I can tell, the decision
tree goes like this:
- if xterm is (explicitly) running something that is in /etc/shells (as 'xterm /some/thing', not 'xterm -e /some/thing'), $SHELL will be rewritten to that thing.
- if xterm is running anything (including running $SHELL itself via being invoked as just 'xterm') and $SHELL is not in /etc/shells but your login shell is, $SHELL will be reset to your login shell.
- otherwise $SHELL will be removed from the environment, resulting in a shell environment with $SHELL unset. This happens even if you run plain 'xterm' and so xterm is running $SHELL.
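The decision tree above can be written out as a small Python function. This is a sketch of the description in this entry, not of xterm's actual source, and the function and parameter names are invented. One case is an assumption on my part: when $SHELL is already in /etc/shells and the first rule does not apply, the sketch leaves it alone, which is what the summary implies but the list does not spell out.

```python
def parse_etc_shells(text):
    """Return the set of shell paths listed in /etc/shells-style text."""
    shells = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if line:
            shells.add(line)
    return shells


def xterm_shell(explicit_cmd, shell_env, login_shell, etc_shells):
    """Return the $SHELL the child process sees, or None if it is unset.

    explicit_cmd: the program from 'xterm /some/thing', or None for
                  plain 'xterm' and for 'xterm -e ...'.
    shell_env:    the incoming $SHELL value (or None).
    """
    # Rule 1: explicitly running something listed in /etc/shells.
    if explicit_cmd is not None and explicit_cmd in etc_shells:
        return explicit_cmd
    # Assumption: a $SHELL that is listed is left untouched.
    if shell_env in etc_shells:
        return shell_env
    # Rule 2: $SHELL is not listed but the login shell is.
    if login_shell in etc_shells:
        return login_shell
    # Rule 3: otherwise $SHELL is removed from the environment.
    return None
```

For example, with only /bin/sh and /bin/bash listed, a login shell and $SHELL that both point at a personal shell in your home directory fall through to the last case and come out unset.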
It is difficult for me to summarize concisely how wrong this is and
how many ways it can cause problems. For a start, this is a misuse
of /etc/shells, per my entry on what it is and isn't;
/etc/shells is in no way a complete list of all of the shells (or
all of the good shells) that are in use on the system. You cannot
validate the contents of $SHELL against /etc/shells because
that is not what /etc/shells is there for.
This xterm change causes significant problems for anyone with
their shell set to something that is not in /etc/shells, anyone
using an alternate personal shell (which
is not in /etc/shells for obvious reasons), any program that
assumes $SHELL is always set (historically a safe assumption),
and any environment that assumes $SHELL is not reset when set to
something non-standard such as a captive or special purpose 'shell'.
(Not all versions of chsh restrict you to what's in /etc/shells,
for that matter; some will let you set other things if you really
ask them to.)
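For programs that assume $SHELL is always set, one defensive option is to fall back to something else when it is missing. The following is a minimal Python sketch under that assumption. The function name user_shell and the fallback order (passwd entry, then /bin/sh) are my own choices for illustration, not anything xterm or any standard prescribes.

```python
import os
import pwd


def user_shell(environ=os.environ):
    """Pick a shell to run even when $SHELL is unset or empty."""
    shell = environ.get("SHELL")
    if shell:
        return shell
    try:
        # Fall back to the login shell from the passwd database.
        shell = pwd.getpwuid(os.getuid()).pw_shell
    except KeyError:
        shell = ""  # no passwd entry for this UID
    return shell or "/bin/sh"
```

This does nothing for people whose preferred shell is deliberately different from their login shell; it only keeps the program from falling over on an unset variable.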
If you fall into one or more of these categories and you use xterm,
you're going to need to change your environment at some point.
Unfortunately it seems unlikely that this change will be reverted, so
if your version of Unix updates xterm at all you're going to have it
sooner or later (so far only a few Linux distributions are recent enough
to have it).
PS: Perhaps this should be my cue to switch to urxvt. However my
almost-default configuration of it is still just enough different
from xterm to be irritating for me, although maybe I could fix
that with enough customization work. For example, I really want
its double-click selection behavior to exactly match xterm because
that's what my reflexes expect and demand by now.
PPS: Yes, I do get quite irritated at abrupt incompatible changes in the behavior of long-standing Unix programs, at least when they affect me.