Wandering Thoughts archives

2023-11-24

A peculiarity of the GNU Coreutils version of 'test' and '['

Famously, '[' is a program, not a piece of shell syntax, and it's also known as 'test' (which was the original name for it). On many systems, this was and is implemented by '[' being a hardlink to 'test' (generally 'test' was the primary name for various reasons). However, today I found out that GNU Coreutils is an exception. Although the two names are built from the same source code (src/test.c), they're different binaries and the '[' binary is larger than the 'test' binary. What is ultimately going on here is a piece of 'test' behavior that I had forgotten about: what it means to run 'test' with a single argument.

The POSIX specification for test is straightforward. A single argument is taken as a string, and the behavior is the same as for -n, although POSIX phrases it differently:

string
True if the string string is not the null string; otherwise, false.
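
As a quick illustration of the single-argument rule, here is a minimal sketch that uses whatever 'test' your shell provides (usually a builtin, which follows the same POSIX rule). The helper name 'check' is made up for this example.

```shell
# check is a hypothetical helper: it runs test with exactly one argument
# and reports the result as a word.
check() {
    if test "$1"; then
        echo true
    else
        echo false
    fi
}

check foo       # a non-empty string: prints "true"
check ""        # the null string: prints "false"
check -n        # a lone '-n' is just a non-empty string: prints "true"
check --help    # so is '--help', with no help text shown: prints "true"
```

The last two cases are the interesting ones. With only one argument there is no room for an operator or an option, so anything that looks like one is treated as an ordinary string.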

The problem for GNU Coreutils is that GNU programs like to support options like --help and --version. The POSIX single-argument rule effectively forbids these for 'test'; 'test --help' and 'test --version' are each just a test of a non-empty string, so both must be silently true. However, POSIX doesn't forbid them for '[' if '[' is invoked without the closing ']':

$ [ --version
[ (GNU coreutils) 9.1
[...]
$ [ foo
[: missing ‘]’
$ [ --version ] && echo true
true

As we can see here, invoking 'test' as '[' without the closing ']' as an argument is an error, and GNU Coreutils is thus allowed to interpret the results of your error however it likes, including making '[ --version' and so on work.
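
If you want to try this yourself, keep in mind that most shells have their own builtin '[' and 'test', so you have to run the external program explicitly. The following is a sketch that assumes the '[' that 'env' finds on your $PATH is the GNU Coreutils one; with other implementations the '--version' case will behave differently. The wrapper name 'ext_bracket' is made up for this example.

```shell
# Run the external '[' found on $PATH, bypassing any shell builtin.
# Assumption: that program is the GNU Coreutils version.
ext_bracket() {
    env '[' "$@"
}

# No closing ']', so GNU Coreutils is free to treat this as an option.
ext_bracket --version | head -n 1

# Well formed: '--version' is just a non-empty string, so this is true.
ext_bracket --version ] && echo true

# No closing ']' and not a recognized option: an error and a failure status.
ext_bracket foo 2>/dev/null || echo "failed with status $?"
```

Going through 'env' is one way to skip the builtin; giving the full path to the binary (often /usr/bin/[) should work just as well.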

(There's a comment about it in test.c.)

The binary size difference is presumably because the 'test' binary omits the version and help text, along with the code to display it. But if you look at the Coreutils test.c code, the relevant section isn't disabled with an #ifdef. Instead, LBRACKET is #defined to 0 when compiling the 'test' binary. So it seems that modern C compilers are doing dead code elimination on the 'if (LBRACKET) { ... }' section, which is a well established optimization, and then going on to notice that called functions like 'usage()' are never invoked and dropping them from the binary. (Possibly this is set up with some special link time magic flags.)

PS: This handling of a single argument for test goes all the way back to V7, where test was actually pretty smart. If I'm reading the V7 test(1) manual page correctly, this behavior was also documented.

PPS: In theory GNU Coreutils is portable and you might find it on any Unix. In practice I believe it's only really used on Linux.

linux/CoreutilsTestPeculiarity written at 22:46:09