A handy diff argument handling feature that's actually very old

October 7, 2020

Some time ago I stumbled over a useful feature in the diff on our Linux machines (i.e., GNU diff): 'diff exim4.conf /etc/exim4/' is the same as 'diff exim4.conf /etc/exim4/exim4.conf'. As a sysadmin, I routinely diff versions of configuration files to do things like verify that my intended changes are actually the only changes, so this feature regularly saves me from having to repeat the file name. I was all set to write a Wandering Thoughts entry about how this was a handy GNU diff addition, even if it's not quite pure in the Unix way, and then I decided to check what the Unix standard had to say, just to be sure. To my surprise, the standard's manpage for diff explicitly requires this behavior. Then I looked at the history of diff and got another surprise.

The standard describes it in the "Operands" section, in the usual sort of standards language:

If only one of file1 and file2 is a directory, diff shall be applied to the non-directory file and the file contained in the directory file with a filename that is the same as the last component of the non-directory file.
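To make the standard's wording concrete, here is a minimal Python sketch of the operand rule as I read it. This is not how GNU diff or any historical diff actually implements it; the function name resolve_diff_operands is made up for illustration, and the sketch ignores special cases such as '-' for standard input.

```python
import os


def resolve_diff_operands(file1, file2):
    """Return the pair of paths that would be compared under the
    'only one operand is a directory' rule quoted above.

    Hypothetical helper for illustration, not part of any diff.
    """
    dir1 = os.path.isdir(file1)
    dir2 = os.path.isdir(file2)
    if dir1 and not dir2:
        # file1 is the directory: use the file inside it whose name
        # matches the last component of file2.
        file1 = os.path.join(file1, os.path.basename(file2))
    elif dir2 and not dir1:
        # file2 is the directory: use the file inside it whose name
        # matches the last component of file1.
        file2 = os.path.join(file2, os.path.basename(file1))
    # Two plain files or two directories: the rule does not apply,
    # so the operands are returned unchanged.
    return file1, file2
```

Under this sketch, if /etc/exim4/ exists as a directory, resolve_diff_operands('exim4.conf', '/etc/exim4/') gives back 'exim4.conf' and '/etc/exim4/exim4.conf', which matches the behavior I stumbled over.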

Once I looked, this diff behavior turned out to go back quite far in Unix history, much further than I thought. This behavior is first specifically mentioned in the V7 diff manpage:

If file1 (file2) is a directory, then a file in that directory whose file-name is the same as the file-name of file2 (file1) is used.

Diff itself seems to appear in V5 Unix (there's no diff manpage in the V4 manuals that tuhs.org has). However, the V5 and V6 manpages don't mention this behavior and the V6 diff source code doesn't seem to contain it on a casual look; it just directly opens the files you gave it and that's it.

(There are Unix V6 emulators online that run in your browser, and trying diff out in one of them suggests that V6 diff really does just open whatever you give it. You can get some odd results, because you can actually read() directories in early Unixes.)

On the one hand, I'm amused and pleased that this handy feature of diff goes as far back as it does, all the way to V7. On the other hand, I wish that I'd noticed it earlier, since it's been there all this time.

(And this is a useful reminder to me that not all of the little nice convenience features found in modern Unix come from GNU.)

