Wandering Thoughts archives

2020-10-08

Sorting out what the Single Unix Specification is and covers

I've linked to the Single Unix Specification any number of times, for various versions of it (when I first linked to it, it was at issue 6, in 2006; it's now up to a 2018 edition). But I've never been quite clear what it covered and didn't cover, and how it related to POSIX and similar things. After yesterday's entry got me looking at the SuS site again, I decided to try to sort this out once and for all.

My primary resource on this is the Wikipedia page (the SuS FAQ claims to be updated recently but is clearly out of date in important respects). Also useful is the page of the Austin Common Standards Revision Group. The Wikipedia page has a helpful rundown of the history of the 'Single Unix Specification' and some things related to it.

As stated by various places, the core of the Single Unix Specification is POSIX, which is formally an IEEE standard and also an international ISO/IEC standard (IEEE 1003 and ISO/IEC 9945 respectively). POSIX incorporates by reference some vintage of ANSI C (I believe C99), since the Unix APIs it specifies are specified in C. The POSIX standard covers C library APIs, commands that are executed through the shell (which is also specified in POSIX), and I believe things like some file paths. As far as I can tell, the only other standard in the Single Unix Specification is Curses, which is not part of POSIX.

(See eg here, Unix standards, the FAQ, and Wikipedia.)

This implies that if a Unix command or a non-Curses API is in the Single Unix Specification, it's also in POSIX. This matches what I've seen in the online Single Unix Specification that I keep linking to bits of; I've only ever noticed it talking about POSIX (aka IEEE 1003.1). For most purposes, then, I can just talk about 'POSIX' or 'Single Unix Specification' interchangeably, which is somewhat different than how I used to think it was.

(I originally thought that the SuS was a superset of POSIX that added significant extra commands and requirements that were not in POSIX. This appears to not be the case.)

Sidebar: Where my misunderstanding of SuS came from

How I thought the story went was that POSIX was a relatively minimal standard for 'Unix' that did not go far enough in practice, for various political reasons. This caused actual Unix vendors to get together and agree on an additional layer of things on top of POSIX that made up 'Unix in practice', creating the Single Unix Specification. Systems that were in no way Unix derived could be POSIX compliant if they tried (and so could be candidates for US government contracts that required 'POSIX', per the origins of POSIX as I learned them), but could not be Unix™, which was something that was defined by the Single Unix Specification.

Obviously this is not actually the case, or at least is not the case in modern versions of the SuS. This goes to show me, once again, the power of folklore (especially since I fell for it).

SingleUnixSpecificationWhat written at 00:38:03

2020-10-07

A handy diff argument handling feature that's actually very old

Some time ago I stumbled over a useful feature in the diff on our Linux machines (ie, GNU diff), where 'diff exim4.conf /etc/exim4/' is the same as 'diff exim4.conf /etc/exim4/exim4.conf'. As a sysadmin, I routinely diff versions of configuration files to do things like verify that my intended new changes are actually the only changes, so this feature routinely saves me from having to repeat the file name. I was all set to write a Wandering Thoughts entry about how this was a handy GNU diff addition, even if it's not quite pure in the Unix way, and then I decided to check what the Unix standard had to say, just to be sure. To my surprise, the standard's manpage for diff explicitly requires this behavior. Then I looked at the history of diff and got another surprise.

The standard describes it in the "Operands" section, in the usual sort of standards language:

If only one of file1 and file2 is a directory, diff shall be applied to the non-directory file and the file contained in the directory file with a filename that is the same as the last component of the non-directory file.
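
(As an illustration, here is a minimal Python sketch of how I read that rule. The function name resolve_diff_operands is made up for this example and this is not how any real diff is implemented; it only shows which two paths end up being compared.)

```python
import os


def resolve_diff_operands(file1, file2):
    """Return the two paths that would be compared under the quoted rule.

    Hypothetical helper for illustration only; it is not taken from any
    actual diff implementation.
    """
    dir1 = os.path.isdir(file1)
    dir2 = os.path.isdir(file2)
    if dir1 and not dir2:
        # Only file1 is a directory: look inside it for a file named
        # after the last component of file2.
        return os.path.join(file1, os.path.basename(file2)), file2
    if dir2 and not dir1:
        # Only file2 is a directory: look inside it for a file named
        # after the last component of file1.
        return file1, os.path.join(file2, os.path.basename(file1))
    # Two files, or two directories: the rule does not apply, so the
    # operands are left alone.
    return file1, file2
```

On a machine where /etc/exim4/ is a directory, resolve_diff_operands('exim4.conf', '/etc/exim4/') would give back 'exim4.conf' and '/etc/exim4/exim4.conf', which is the pair from my example above.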

Once I looked, this diff behavior turned out to go back quite far in Unix history, much further than I thought. This behavior is first specifically mentioned in the V7 diff manpage:

If file1 (file2) is a directory, then a file in that directory whose file-name is the same as the file-name of file2 (file1) is used.

Diff itself seems to appear in V5 Unix (there's no diff manpage in the V4 manuals that tuhs.org has). However, the V5 and V6 manpages don't mention this behavior and the V6 diff source code doesn't seem to contain it on a casual look; it just directly opens the files you gave it and that's it.

(There are Unix V6 emulators online that run in your browser, and trying diff out in one of them suggests that this is how it really works. You can get some odd results, because you can actually read() directories in early Unixes.)

On the one hand, I'm amused and pleased that this handy feature of diff goes as far back as it does, all the way to V7. On the other hand, I wish that I'd noticed it earlier, since it's been there all this time.

(And this is a useful reminder to me that not all of the little nice convenience features found in modern Unix come from GNU.)

DiffOldArgumentsFeature written at 00:28:56

