The history of readdir()

In the old days of V7 Unix, directories weren't quite files, but they
were close enough that you could open and read() them directly, and
they had a simple enough structure that there was no library routine to
parse their contents; programs like the V7 ls just did it themselves. A
good part of the reason for this was that filenames were short (14
characters at most), so directory entries could be fixed-size objects.
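
To make "fixed-size objects" concrete, here is a minimal sketch of
decoding one raw V7-style directory entry. It assumes the layout I
recall from V7 (a 2-byte inode number followed by a 14-byte name,
16 bytes in all, with the PDP-11's little-endian byte order); the
helper name v7_decode_entry and the V7_* constants are made up for
illustration and are not from any real header.

```c
#include <stddef.h>
#include <string.h>

#define V7_DIRSIZ     14   /* maximum filename length (assumed V7 value) */
#define V7_ENTRY_SIZE 16   /* 2-byte inode number + 14-byte name */

/* Hypothetical helper: decode entry number `index` from a raw directory
 * image. Returns the inode number (0 marks an unused slot) and copies
 * the name into `name`, which must hold V7_DIRSIZ + 1 bytes. */
static unsigned v7_decode_entry(const unsigned char *raw, size_t index,
                                char name[V7_DIRSIZ + 1])
{
    const unsigned char *entry = raw + index * V7_ENTRY_SIZE;

    /* Little-endian 16-bit inode number: low byte first. */
    unsigned ino = (unsigned)entry[0] | ((unsigned)entry[1] << 8);

    /* Short names are NUL-padded on disk, but a full 14-character name
     * has no terminator, so we always add one ourselves. */
    memcpy(name, entry + 2, V7_DIRSIZ);
    name[V7_DIRSIZ] = '\0';
    return ino;
}
```

Because every entry is the same size, finding entry N is plain
arithmetic on the offset, which is why programs could get away without
a library routine.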

In 4BSD, Berkeley expanded the maximum length of filenames from 14
characters to something much larger, and since most filenames were
still short, they opted to save disk space by turning directory entries
into variable-length objects. This made reading directory entries a
sufficiently complicated job that they introduced the readdir() C
library function to do it for you; however, under the hood the C
library still used read() on the directory as if it were a file,
getting the raw filesystem data. Because this is what's most useful for
most programs, readdir() returned one directory entry at a time.
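
The one-entry-at-a-time interface is still what programs use today. As
a small sketch using the standard opendir(), readdir() and closedir()
calls (the helper name count_entries is made up for illustration):

```c
#include <dirent.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the entries in `path`, skipping "." and
 * "..". Returns -1 if the directory cannot be opened. */
static long count_entries(const char *path)
{
    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    long count = 0;
    struct dirent *entry;

    /* Each readdir() call hands back exactly one entry; it returns
     * NULL at the end of the directory (or on error). */
    while ((entry = readdir(dir)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0)
            continue;
        count++;
    }

    closedir(dir);
    return count;
}
```

The caller never sees the on-disk format or the variable-length
records; it only sees a d_name for each entry.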

I believe that Sun is responsible for the next step, which came when
they created NFS. Sun realized that having user-level code know the
on-disk format of directory entries wasn't appropriate for a true
network filesystem, so they introduced a new system call, getdirents(),
to get directory entries in a filesystem-independent format. Although
this was the only way to get directory entries from NFS filesystems,
you could still directly read() directories on local filesystems.

(Sun couldn't just make readdir() a system call because it was already
fixed at returning only one entry per call, which is usually considered
too inefficient for a system call. As its name suggests, getdirents()
returns a bunch of entries: however many fit into the buffer that you
provide.)
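
To illustrate the batching, here is a Linux-only sketch that makes a
single call to the modern Linux descendant of this interface,
getdents64, and counts how many entries came back in one buffer. This
is not Sun's original getdirents(); the record layout is the one
described in the Linux getdents(2) manual page, glibc has no wrapper
for the raw call so it goes through syscall(), and the helper name
entries_in_one_call is made up for illustration.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* One record as returned by Linux getdents64 (per getdents(2)). */
struct linux_dirent64 {
    uint64_t       d_ino;     /* inode number */
    int64_t        d_off;     /* offset to the next record */
    unsigned short d_reclen;  /* length of this record in bytes */
    unsigned char  d_type;    /* file type */
    char           d_name[];  /* NUL-terminated filename */
};

/* Hypothetical helper: issue ONE getdents64 call on `path` and count
 * the records it returned. Returns -1 on any error. */
static long entries_in_one_call(const char *path)
{
    uint64_t storage[512];            /* 4096 bytes, 8-byte aligned */
    char *buf = (char *)storage;

    int fd = open(path, O_RDONLY | O_DIRECTORY);
    if (fd < 0)
        return -1;

    long nread = syscall(SYS_getdents64, fd, buf, sizeof storage);
    close(fd);
    if (nread < 0)
        return -1;

    /* The kernel packed as many variable-length records as fit;
     * step through them using each record's own length. */
    long count = 0;
    for (long pos = 0; pos < nread; ) {
        struct linux_dirent64 *d = (struct linux_dirent64 *)(buf + pos);
        pos += d->d_reclen;
        count++;
    }
    return count;
}
```

A readdir() implementation built on top of this would make one such
call, hand out the buffered entries one at a time, and only go back to
the kernel when the buffer runs dry.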

Sun's good idea was gradually picked up by other people, including the
main BSD line of development that resulted in 4.4BSD. (Note that some
Unixes use the name getdents() for the actual system call instead of
getdirents(). Amusingly, this now includes Solaris, which doesn't even
have a getdirents() compatibility routine.)

At some point, Linux took the extra step of forbidding read() on
directories, forcing you to use the system call (or, more likely, to
use readdir() and let it worry about things). This had the useful
result that you could no longer accidentally cat a directory and get
all sorts of gibberish spewed on your screen, without requiring cat
(and everything else that reads files) to explicitly refuse to touch
directories. This feature does not seem to have spread to Solaris or
the *BSDs, at least as far as I can see.
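
You can see the refusal directly with a few lines of C. This is a
minimal sketch (the helper name try_raw_directory_read is made up for
illustration); on Linux I would expect it to report EISDIR, since
opening a directory read-only is allowed but read() on it is not.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: try to read() raw bytes from `path`.
 * Returns 0 if the read succeeded, otherwise the errno value from
 * whichever call failed. */
static int try_raw_directory_read(const char *path)
{
    char buf[512];

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return errno;

    int result = 0;
    if (read(fd, buf, sizeof buf) < 0)
        result = errno;       /* save it before close() can change it */

    close(fd);
    return result;
}
```

On a system that still allows raw directory reads, the same call would
return 0 and leave filesystem-format data in the buffer.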

(I was inspired to write this by the recent report of fixing a
long-standing seekdir() bug.)