Unix's 'test' program and the V7 Bourne shell

November 23, 2023

Recently I read Julio Merino's "test, [, and [[" (via), which is in part about there being a real '[' binary and a 'test' binary to go along with it. As part of that, Merino wonders why the name 'test' exists at all. I don't have any specific insight into this, but I can talk a bit about the history, which turns out to be more tangled and peculiar than I thought.

The existence of 'test' goes back to V7 Unix, which is also where the Bourne shell was introduced. In V7, the manual page for the program is test(1), which has no mention of the '[' alternate name, and the source is cmd/test.c, which has a comment at the start about the '[' usage and code to support it. While 'test' is a much easier name to deal with in Unix than '[', there seems to be more to this than just convenience. There are a number of shell scripts, Makefiles, and so on in V7 Unix, and as far as I can tell all of them use 'test' and none of them use '['.

(For example, bin/nohup, bin/calendar, bin/lookbib, and usr/src/cmd/learn/makefile.)

Another source of information is S. R. Bourne's An Introduction to the Unix Shell (also PDF version and the V7 troff sources). In section 2.5, Bourne introduces the 'test' command under that name, and then goes on to use it with 'while' (section 2.6) and 'if' (section 2.7). As far as I can see there's no mention of the '[' alternate name.

In trawling through various sources of information, I can't actually find any clear sign that V7 ever had a '[' hard link for 'test'. The test source code is definitely ready for this, but such a hard link doesn't exist. 4BSD has a src/cmd/DESTINATIONS file that suggests that /usr/bin/[ existed at this point (alongside /usr/bin/test), but that's the earliest trace I could find. In 4.1c BSD we finally have clear evidence of /usr/bin/[ in the form of src/bin/Makefile, which explicitly creates it as a hard link to /usr/bin/test.
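To illustrate what "ready for this" means in general terms, here is a minimal sketch of a program that changes its behavior based on the name it was invoked under. This is not the V7 test.c code; check_invocation is a made-up helper that takes the invocation name as its first argument (standing in for what a real program would find in argv[0]), and stripping the directory part is my own assumption about how one might do it.

```shell
#!/bin/sh
# Sketch only: a stand-in for name-based dispatch, NOT the real V7 test.c.

# Hypothetical helper: $1 is the name we were "invoked" as, the rest
# are the arguments that would follow it on the command line.
check_invocation() {
    name=${1##*/}          # strip any leading directory (an assumption)
    shift
    if [ "$name" = "[" ]; then
        last=
        for last in "$@"; do :; done    # leaves the final argument in $last
        if [ "$last" != "]" ]; then
            echo "] missing"
            return 2
        fi
        echo "invoked as [ with closing ]"
    else
        echo "invoked as $name"
    fi
}

check_invocation test -f /etc/passwd
check_invocation "/usr/bin/[" -f /etc/passwd "]"
check_invocation "[" -f /etc/passwd || echo "exit status $?"
```

With a hard link in place, the same binary would see a different invocation name depending on which name you ran, which is all the '[' support needs.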

However, there's something rather interesting in the V7 Bourne shell source code, in the form of vestigial, disabled support for a '[' builtin. In msg.c, there is a commented out section toward the bottom:

[...]
SYSTAB  commands {
      {"cd",          SYSCD},
      {"read",        SYSREAD},
/*
      {"[",           SYSTST},
*/
      {"set",         SYSSET},
[...]

Then in xec.c there's commented out code that would have handled SYSTST in the execute() function:

[...]
    case SYSREAD:
        exitval=readvar(&com[1]);
        break;

/*
    case SYSTST:
        exitval=testcmd(com);
        break;
*/
[...]

There's no actual 'testcmd()' function in the V7 Bourne shell source code, but we can guess what it might have done.

Given this disabled code and that the V7 'test' itself supported being used as '[', it seems possible that this syntax was Bourne's preference. It's possible that the builtin '[' was implemented and then removed in favor of '[' being a hard link to 'test', and then for whatever reason other people in Bell Labs didn't use it and V7 wasn't distributed with such a hard link set up (although individual installs could make it themselves, and it appears that the result would work). However, it may have been the other way around, per this HN comment, with Bourne preferring the 'test' form over the '[' form.

As it happens, I don't think the 'test' command (and its syntax) appeared from nowhere in V7; instead I believe we can trace it to antecedents in V6 Unix. But that's going to take another entry to discuss, since this one is already long enough.


Comments on this page:

There is a story about this that is not very well understood: the Bourne shell is a language to handle values like processes, return codes, pipes, and rather secondarily strings, even if less so than the v6 shell.

So 'if' and 'while' test process return codes, not numbers or strings. Nearly all operations other than those on processes, return codes, and pipes were supposed to be done using processes, so mainly test and expr (and grep, sed, awk; that is, shell scripts are oriented toward processing bulk streams with pipelines, otherwise use Perl as the programming language).

The string exception is because of environment variables, and its major manifestation is the builtin case statement. I still cringe when I see silly things like:

if [ x$N -eq xyes ]; then
  ...
fi

instead of:

case "$N" in
[Yy][Ee][Ss]) ...;;
[Nn][Oo])     ...;;
 *) echo 1>&2 "$0: option N must be 'yes' or 'no' not '$N'"
    exit 1;;
 esac

Example: http://www.sabi.co.uk/Cfg/shell/sabidhclinux

By Anonymous at 2023-11-26 13:25:49:

if [ x$N = xyes ]; then

Not really on topic here, but I've seen this frequently, mostly from people who are apparently not very familiar with double quotes. You can rewrite it as:

if [ "$N" = "yes" ]; then

which gives you the same (or better) results, but without the silly and confusing need to add the two extra 'x' characters. (It is also simpler than your 'case' example, which might be unneeded depending on what you are trying to do.)

PS: And I think that in your example, you might have meant '$var = yes' which is for string comparison, and not '$var -eq yes' which is for integer comparison.
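As a rough sketch of the difference quoting makes (is_yes is a made-up helper name for illustration, and the exact error an unquoted empty value produces varies between shells):

```shell
#!/bin/sh
# Hypothetical helper: reports whether its argument is exactly "yes".
is_yes() {
    if [ "$1" = "yes" ]; then
        echo "match"
    else
        echo "no match"
    fi
}

is_yes yes      # prints "match"
is_yes ""       # prints "no match": the quoted empty string is still one argument
is_yes "a b"    # prints "no match": quoting also keeps embedded spaces together

N=""
# Unquoted: $N expands to nothing, so '[' only sees '=' and 'yes'.
# Shells commonly report an error here rather than a clean "false".
if [ $N = yes ] 2>/dev/null; then
    echo "unquoted: match"
else
    echo "unquoted: not a match (or an error)"
fi
```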

By John Marshall at 2023-11-26 16:35:31:

I had misremembered the xyes thing as being a workaround for the case when $N is empty, so some implementations would treat this as an erroneous two-argument form. But it's not very plausible that even old broken shells would skip the "" entirely rather than parse it as an empty argument.

The autoconf manual (a useful compendium of shell lore) lists this as instead being a workaround for $N values like ! or -n which would otherwise lead to parsing confusion. Which is a reasonable concern if you might see such (malicious) values in your input.

My big problem in this area is remembering which of test $a = $b and test $a == $b is portable and which is Bash-specific.

By Anonymous at 2023-11-27 08:11:43:

I had misremembered the xyes thing as being a workaround for the case when $N is empty

Actually, I think this is exactly why people do this; it just seems they do not know that you don't need to do it when you double-quote things.

Also, the reference to the autoconf manual seems interesting (and I think I'll read it), thanks for mentioning that.

By avih at 2024-12-09 07:33:49:

According to POSIX (the documentation of "test", in the "APPLICATION USAGE" section), the practice of prefixing x to the values is not about guarding against the value being empty (if unquoted), but rather because early versions could apparently get confused if the first argument begins with "-":

https://pubs.opengroup.org/onlinepubs/9799919799/utilities/test.html

------------------>8----------------

Historical systems have also been unreliable given the common construct:

test "$response" = "expected string"

One of the following is a more reliable form:

test "X$response" = "Xexpected string"
test "expected string" = "$response"

Note that the second form assumes that expected string could not be confused with any unary primary. If expected string starts with '-', '(', '!', or even '=', the first form should be used instead. Using the preceding rules, any of the three comparison forms is reliable, given any input. (However, note that the strings are quoted in all cases.)
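A small sketch of the X-prefixed form from the quoted text, applied to values that look like operators (matches_expected is a made-up helper name for illustration; modern shells generally handle the plain quoted form correctly too, so this mainly shows that the prefixed form gives the expected answers):

```shell
#!/bin/sh
# Hypothetical helper: $1 is the response, $2 is the expected string.
# The leading X means neither operand can look like a test operator.
matches_expected() {
    if test "X$1" = "X$2"; then
        echo "equal"
    else
        echo "different"
    fi
}

matches_expected "-n" "-n"     # prints "equal"
matches_expected "!" "yes"     # prints "different"
matches_expected "(" "("       # prints "equal"
matches_expected "" "yes"      # prints "different"
```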

Written on 23 November 2023.
Last modified: Thu Nov 23 23:24:29 2023