2015-03-19
A brief history of fiddling with Unix directories
In the beginning (say V7 Unix), Unix directories were remarkably
non-special. They were basically files that the kernel knew a bit
about. In particular, there was no mkdir(2) system call and the '.'
and '..' entries in each directory were real directory entries (and
real hardlinks), created by hand by the mkdir program. Similarly
there was no rmdir() system call and rmdir directly called unlink()
on dir/.., dir/., and dir itself. To avoid the possibility of users
accidentally damaging the directory tree in various ways, calling
link(2) and unlink(2) on directories was restricted to the superuser.
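To make the sequence concrete, here is a small Python sketch that only lists the calls involved instead of making them (a modern kernel would refuse them anyway). This is not the V7 source. The function names are mine, and I am assuming the empty directory inode itself was made with mknod(2) and mode 0777; the link and unlink steps follow the description above.

```python
import os
import stat


def v7_mkdir_steps(path):
    """List the calls a user-level mkdir would issue for path (a sketch)."""
    path = path.rstrip("/")
    parent = os.path.dirname(path) or "."
    return [
        # Assumption: the bare directory inode is created with mknod(2).
        ("mknod", path, stat.S_IFDIR | 0o777),
        # '.' is a hard link from the new directory to itself.
        ("link", path, path + "/."),
        # '..' is a hard link to the parent directory.
        ("link", parent, path + "/.."),
    ]


def v7_rmdir_steps(path):
    """List the unlink() calls a user-level rmdir would issue, in order."""
    path = path.rstrip("/")
    return [
        ("unlink", path + "/.."),
        ("unlink", path + "/."),
        ("unlink", path),
    ]


if __name__ == "__main__":
    for step in v7_mkdir_steps("usr/tmp") + v7_rmdir_steps("usr/tmp"):
        print(step)
```

Nothing ties these steps together, so if the program is interrupted partway through, the directory is left half-built or half-removed.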
(In part to save the superuser from themselves, commands like ln and
rm then generally refused to operate on directories at all,
explicitly checking for 'is this a directory' and erroring out if it
was. V7 rm would remove directories with 'rm -r', but it deferred to
rmdir to do the actual work. Only V7 mv had special handling for
directories; it knew how to actually rename them by manipulating
hardlinks to them, although this only worked when mv was run by the
superuser.)
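The user-level 'is this a directory' guard is simple enough to sketch. The following is my own illustration in Python, not the logic of any historical ln; guarded_link is a hypothetical name, and raising ValueError is just a convenient way to show the refusal.

```python
import os
import stat


def guarded_link(src, dst):
    """Hard-link src to dst, refusing up front if src is a directory."""
    # The check lives in the program, not the kernel: stat the source
    # and bail out before link() is ever attempted.
    if stat.S_ISDIR(os.stat(src).st_mode):
        raise ValueError(f"{src} is a directory")
    os.link(src, dst)
```

The point of doing this in the command is that it protects even a caller whose link() call the kernel would have allowed.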
It took until 4.1 BSD or so for the kernel to take over the work of
creating and deleting directories, with real mkdir() and rmdir()
system calls. The kernel also picked up a rename() system call at the
same time, instead of requiring mv to do the work with link(2) and
unlink(2) calls; this rename() also worked on directories. This was
the point, not coincidentally, where BSD directories themselves became
more complicated. Interestingly, even in 4.2 BSD link(2) and unlink(2)
would work on directories if you were root and mknod(2) could still be
used to create them (again, if you were root), although I suspect no
user level programs made use of this (and certainly rm still rejected
directories as before).
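For regular files you can still do the pre-rename() dance today. This is a minimal sketch of the general link-then-unlink technique, not a reconstruction of what any particular mv did; link_unlink_rename is a name I made up.

```python
import os


def link_unlink_rename(old, new):
    """Rename a regular file using link() followed by unlink()."""
    # Step 1: give the inode a second name.
    os.link(old, new)
    # Step 2: drop the original name; the inode lives on under `new`.
    os.unlink(old)
```

Unlike rename(), this is two separate steps, so there is a window where both names exist, and link() fails outright if the new name is already taken.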
(As a surprising bit of trivia, it appears that the 4.2 BSD ln
lacked a specific 'is the source a directory' guard and so a superuser
probably could accidentally use it to make extra hardlinks to a
directory, thereby doing bad things to directory tree integrity.)
To my further surprise, raw link(2) and unlink(2) continued to work
on directories as late as 4.4 BSD; it was left for other Unixes to
reject this outright. Since the early Linux kernel source is
relatively simple to read, I can say that Linux rejected it from very
early on. Other Unixes, I have no idea about. (I assume but don't know
for sure that modern *BSD derived Unixes do reject this at the kernel
level.)
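If you want to see what your own system does, a probe along these lines should tell you. It is a sketch with names of my own invention, and the answers will depend on the OS and possibly the filesystem; on a current Linux I would expect both calls to come back with an error code rather than 'allowed'.

```python
import errno
import os
import tempfile


def probe_directory_link_calls():
    """Try link() and unlink() on a scratch directory; report the outcome."""
    results = {}
    with tempfile.TemporaryDirectory() as top:
        target = os.path.join(top, "d")
        os.mkdir(target)
        attempts = (
            ("link", lambda: os.link(target, os.path.join(top, "d2"))),
            ("unlink", lambda: os.unlink(target)),
        )
        for name, call in attempts:
            try:
                call()
                results[name] = "allowed"
            except OSError as exc:
                # Record the symbolic errno name, e.g. 'EPERM'.
                results[name] = errno.errorcode.get(exc.errno, str(exc.errno))
    return results


if __name__ == "__main__":
    print(probe_directory_link_calls())
```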
(I've written other entries on aspects of Unix directories and their history: 1, 2, 3, 4.)
PS: Yes, this does mean that V7 mkdir and rmdir were setuid root, as
far as I know. They did do their own permission checking in a
perfectly V7-appropriate way, but in general, well, you really don't
want to think too hard about V7, directory creation and deletion, and
concurrency races.
In general and despite what I say about it sometimes, V7 made decisions that were appropriate for its time and its job of being a minimal system on a relatively small machine that was being operated in what was ultimately a friendly environment. Delegating proper maintenance of a core filesystem property like directory tree integrity to user code may sound very wrong to us now but I'm sure it made sense at the time (and it did things like reduce the kernel size a bit).