The history of file type information being available in Unix directories
The two things that Unix directory entries absolutely have to have
are the name of the directory entry and its 'inode', by which we
generically mean some stable kernel identifier for the file that
will persist if it gets renamed, linked to other directories, and
so on. Unsurprisingly, directory entries have had these since the
days when you read the raw bytes of directories with read(),
and for a long time that was all they had; if you
wanted more than the name and the inode number, you had to stat()
the file, not just read the directory. Then, well, I'll quote
myself from an old entry on a find optimization:
[...], Unix filesystem developers realized that it was very common for programs reading directories to need to know a bit more about directory entries than just their names, especially their file types (find is the obvious case, but also consider things like 'ls -F'). Given that the type of an active inode never changes, it's possible to embed this information straight in the directory entry and then return this to user level, and that's what developers did; on some systems, readdir(3) will now return directory entries with an additional d_type field that has the directory entry's type.
On Twitter, I recently grumbled about Illumos not having this d_type field.
The ensuing conversation wound up with me curious about exactly where
d_type came from and how far back it went. The answer turns out to
be a bit surprising due to there being two sides of d_type.
On the kernel side, d_type appears to have shown up in 4.4 BSD.
The 4.4 BSD /usr/src/sys/dirent.h has a struct dirent that has a
d_type field, but the field isn't documented in either the comments
in the file or in the getdirentries(2) manpage; both of those admit
only to the traditional BSD dirent fields. This 4.4 BSD d_type was
carried through to things that inherited from 4.4 BSD (Lite),
specifically FreeBSD, but it continued to be undocumented for at
least a while.
(In FreeBSD, the most convenient history I can find is here, and the d_type field
is present in sys/dirent.h as far back as FreeBSD 2.0, which
seems to be as far as the repo goes for releases.)
Documentation for d_type appeared in the getdirentries(2)
manpage in FreeBSD 2.2.0, where the manpage itself claims to have
been updated on May 3rd 1995 (cf).
In FreeBSD, this appears to have been part of merging 4.4 BSD
'Lite2', which seems to have been done in 1997. I stumbled over a
repo of UCB BSD commit history,
and in it the documentation appears in this May 3rd 1995 change,
which at least has the same date. It appears that FreeBSD 2.2.0 was
released some time in 1997, which is when this would have appeared
in an official release.
In Linux, it seems that a dirent structure with a d_type member
appeared only just before 2.4.0, which was released at the start
of 2001. Linux took this long because the d_type field only
appeared in the 64-bit 'large file support' version of the dirent
structure, and so was only returned by the new 64-bit getdents64()
system call. This would have been a few years after FreeBSD
officially documented d_type, and probably many years after it was
actually available if you peeked at the structure definition.
(See here for an overview of where to get ancient Linux kernel history from.)
As far as I can tell, d_type is present on Linux, FreeBSD,
OpenBSD, NetBSD, Dragonfly BSD, and Darwin (aka MacOS or OS X).
It's not present on Solaris and thus Illumos. As far as other
commercial Unixes go, you're on your own; all the links to manpages
for things like AIX from my old entry on the remaining Unixes appear to have rotted away.
Sidebar: The filesystem also matters on modern Unixes
Even if your Unix supports d_type in directory entries, it
doesn't mean that it's supported by the filesystem of any specific
directory. As far as I know, every Unix with d_type support has
support for it in their normal local filesystems, but it's not
guaranteed to be in all filesystems, especially non-Unix ones like
FAT32. Your code should always be prepared to deal with a file type
of DT_UNKNOWN.
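To make the DT_UNKNOWN point concrete, here is a minimal sketch in C of the usual pattern: trust d_type when the filesystem filled it in, and fall back to a stat-family call only when it didn't. The helper names entry_type() and count_subdirs() are made up for illustration, and the code assumes a Unix that has d_type and the DT_* constants at all (so it won't build as-is on Solaris or Illumos).

```c
#define _DEFAULT_SOURCE /* on glibc, exposes d_type, DT_* and fstatat() */
#include <assert.h>
#include <dirent.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: return the DT_* type of a directory entry,
 * calling fstatat() only if the filesystem left d_type as DT_UNKNOWN.
 * Symbolic links are reported as DT_LNK, not followed. */
unsigned char entry_type(int dfd, const struct dirent *de)
{
    struct stat st;

    if (de->d_type != DT_UNKNOWN)
        return de->d_type;          /* the cheap case: no extra syscall */
    if (fstatat(dfd, de->d_name, &st, AT_SYMLINK_NOFOLLOW) != 0)
        return DT_UNKNOWN;          /* e.g. the entry vanished under us */

    if (S_ISDIR(st.st_mode))  return DT_DIR;
    if (S_ISREG(st.st_mode))  return DT_REG;
    if (S_ISLNK(st.st_mode))  return DT_LNK;
    if (S_ISCHR(st.st_mode))  return DT_CHR;
    if (S_ISBLK(st.st_mode))  return DT_BLK;
    if (S_ISFIFO(st.st_mode)) return DT_FIFO;
    if (S_ISSOCK(st.st_mode)) return DT_SOCK;
    return DT_UNKNOWN;
}

/* Hypothetical example use: count the immediate subdirectories of
 * path. Returns -1 if the directory can't be opened. */
long count_subdirs(const char *path)
{
    DIR *dir;
    struct dirent *de;
    long n = 0;

    assert(path != NULL);
    dir = opendir(path);
    if (dir == NULL)
        return -1;
    while ((de = readdir(dir)) != NULL) {
        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;
        if (entry_type(dirfd(dir), de) == DT_DIR)
            n++;
    }
    closedir(dir);
    return n;
}
```

On a filesystem that fills in d_type, count_subdirs() never stats anything; on one that doesn't, it quietly degrades to one fstatat() per entry, which is what you'd have had to do before d_type existed.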
(Filesystems can implement support for file type information in directory entries in a number of different ways. The actual on disk format of directory entries is filesystem specific.)
It's also possible to have things the other way around, where you have a filesystem with support for file type information in directories that's on a Unix that doesn't support it. There are a number of plausible reasons for this to happen, but they're either obvious or beyond the scope of this entry.