== An interesting filesystem corruption problem

Today we had a fun problem created by a combination of entirely rational _find_ optimizations and a corrupted, damaged filesystem.

An important Linux server took some kind of hit that turned some files into directories (with contents, presumably stolen from some other poor directory). We found some but were pretty sure there were others lurking out there too, and wanted to do our best to find them. (If only to figure out what we needed to restore from the last good backups.)

As it happens, most of the actual files on this filesystem have some sort of extension, and pretty much all of the directories don't. So I made the obvious attempt:

> _find /hier -name '*.*' -type d -print_

Much to my surprise, this didn't report anything, not even the files we already knew about in _/hier/foo/bar_.

Okay, first guess: I happened to know that _find_ optimizes directory traversals based on its knowledge of directory link counts, so if the count is off, _find_ will miss seeing directories. A quick check showed that _/hier/foo/bar_ had the wrong link count (it only had two links, despite now having subdirectories). Usefully, _find_ has a '_-noleaf_' option to turn this optimization off (it's usually used to deal with non-Unix filesystems that don't necessarily follow the directory link count convention). But that didn't work either.

Fortunately I happened to know about the other optimization modern Unixes do for _find_: they have a field in directory entries called '((d_type))', which holds the type of the file (although not its permissions). If files had gotten corrupted into directories, it would make sense that the ((d_type)) information in their directory entries would still show their old type and make _find_ skip them. A quick ((d_type)) dumper program showed that this was indeed the case.

This also gave us a good way to hunt these files down: walk the filesystem, looking for entries with a mismatch between ((d_type)) and what _stat(2)_ returned. (There's a sketch of such a walker at the end of this entry.)

In retrospect, I have to thank _find_ for biting us with these optimizations; it led me to a better way to find the problem spots than I otherwise would have had. (And writing a brute force file tree walker, even in C, turns out to be not as much work as I thought it would be.)

This is of course a great example of [[leaky abstractions http://www.joelonsoftware.com/articles/LeakyAbstractions.html]] and of how knowing the low-level details can really matter. If I hadn't been well read enough about Unix geek stuff, I wouldn't have known about either _find_ optimization and things would have been a lot more hairy. (I might have found _-noleaf_ with sufficient study of the manpage, but that wouldn't have been enough.)
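Sidebar: for the curious, here is a minimal sketch of the sort of mismatch-hunting walker I mean. This is not the program I actually wrote; it's a deliberately simple illustration (fixed path buffer, basic error handling) that recurses based on what _lstat(2)_ says and reports any entry where the directory's ((d_type)) disagrees with the inode.

    /* Sketch: report directory entries whose d_type disagrees with lstat(2).
     * Entries with DT_UNKNOWN are skipped, since some filesystems don't
     * fill in d_type at all. */
    #define _DEFAULT_SOURCE
    #include <dirent.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/stat.h>
    #include <sys/types.h>

    /* Map an st_mode file type to the corresponding DT_* value. */
    static unsigned char mode_to_dtype(mode_t m)
    {
        if (S_ISREG(m))  return DT_REG;
        if (S_ISDIR(m))  return DT_DIR;
        if (S_ISLNK(m))  return DT_LNK;
        if (S_ISCHR(m))  return DT_CHR;
        if (S_ISBLK(m))  return DT_BLK;
        if (S_ISFIFO(m)) return DT_FIFO;
        if (S_ISSOCK(m)) return DT_SOCK;
        return DT_UNKNOWN;
    }

    static void walk(const char *dir)
    {
        DIR *dp = opendir(dir);
        struct dirent *de;
        char path[4096];
        struct stat st;

        if (!dp) {
            perror(dir);
            return;
        }
        while ((de = readdir(dp)) != NULL) {
            if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
                continue;
            snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);

            /* Ask the inode itself what it is; lstat so symlinks aren't followed. */
            if (lstat(path, &st) == -1) {
                perror(path);
                continue;
            }
            /* Report any disagreement between the directory entry and the inode. */
            if (de->d_type != DT_UNKNOWN &&
                de->d_type != mode_to_dtype(st.st_mode))
                printf("MISMATCH: %s (d_type %d, inode says %d)\n",
                       path, de->d_type, mode_to_dtype(st.st_mode));

            /* Recurse based on what the inode says, not on d_type,
             * since d_type is exactly what we don't trust here. */
            if (S_ISDIR(st.st_mode))
                walk(path);
        }
        closedir(dp);
    }

    int main(int argc, char **argv)
    {
        for (int i = 1; i < argc; i++)
            walk(argv[i]);
        return 0;
    }

Run it as, say, '_./walker /hier_'. Note that it trusts _lstat(2)_ over the directory entry on purpose; the whole point is that the directory entries are the suspect party.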