2018-12-13
Some new-to-me features in POSIX (or Single Unix Specification) Bourne shells
As I mentioned when I found Bourne shell arithmetic to be pretty pleasant, I haven't really paid attention to what things are now standard POSIX Bourne shell features. In fact, it's more than that; I don't think I've really appreciated that POSIX and then the Single Unix Specification actually added very much to the venerable Bourne shell. I knew that shell functions were standard, and then there was POSIX command substitution, but then I sort of stopped. In light of discovering that shell arithmetic is now POSIX standard, this view is clearly out of date, so I've decided to actually skim through the POSIX/SUS shell specification and see what new things I want to remember to use in the future. In the process I found an addition that surprises me.
First (and perhaps obviously), the various character classes such
as [:alnum:] are officially supported in shell wildcard expansion
and matching. I'm not a fan of writing, say, '[[:upper:]]' instead
of '[A-Z]', but the latter has some dangerous traps in some
shells, including shells that are commonly found as /bin/sh in some
environments.
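As a small sketch of what this looks like in practice, here is a case statement that uses bracket character classes instead of ranges. The function name classify_first is made up for illustration and is not from any real script.

```shell
# classify_first is a hypothetical helper: report the class of the
# first character of its argument using POSIX character classes.
classify_first() {
    case "$1" in
        [[:upper:]]*) echo upper ;;
        [[:lower:]]*) echo lower ;;
        [[:digit:]]*) echo digit ;;
        *)            echo other ;;
    esac
}

classify_first "Makefile"   # prints: upper
classify_first "readme"     # prints: lower
classify_first "42.txt"     # prints: digit
```

The same classes work in filename globs, for example '[[:upper:]]*' to match names that start with an upper case letter.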
The big new feature that I should probably plan to make use of
is the various prefix and suffix pattern substitutions, such as
'${var%%word}'. To a fair extent these let you do in shell things
that you previously had to turn to programs like basename and
dirname for. For instance, in a recent script I wanted the bare
program name without any full path, so I used:
prog="${0##*/}"
This feels one part clever and half a part perhaps too clever, but
I hope it's an idiom. Another use of this is to perform inline
pattern matching in an if statement, for example to check if a
parameter is a decimal number:
# The expansion is empty if $OPTARG is empty or contains any non-digit.
if [ -z "${OPTARG##*[!0-9]*}" ]; then
    echo "$prog: argument not a number" 1>&2
    exit 1
fi
I previously would have turned to case statements for this, which
is more awkward. Again, hopefully this is not too clever.
(I learned this trick from Stack Overflow answers, perhaps this one or this one.)
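To make the logic of the digit check concrete, here is a sketch that wraps the parameter expansion form and the older case statement form as two functions. The names is_decimal_expand and is_decimal_case are hypothetical. The key point is that '${1##*[!0-9]*}' strips everything when the value contains any non-digit, so the result is non-empty only for a non-empty string of digits.

```shell
# Hypothetical helpers; both succeed (exit 0) only for a non-empty
# string made up entirely of the digits 0-9.

is_decimal_expand() {
    # If $1 contains a non-digit, the pattern *[!0-9]* matches the whole
    # string and the expansion is empty. An empty $1 is also empty.
    [ -n "${1##*[!0-9]*}" ]
}

is_decimal_case() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
        *)           return 0 ;;
    esac
}

is_decimal_expand 123 && echo "123 is a number"
is_decimal_case 12a || echo "12a is not a number"
```

Neither version accepts a leading sign or a decimal point, which is usually what you want for something like an option argument.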
The Single Unix Specification actually has some useful and interesting examples for the prefix and suffix pattern substitutions, along with some of the other substitutions.
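In the same spirit, here is a short sketch of the four prefix and suffix forms applied to a path. The path and variable names are invented for the example.

```shell
path="/usr/local/bin/myprog.sh"   # made-up example path

base="${path##*/}"   # remove longest prefix matching '*/'  -> myprog.sh
dir="${path%/*}"     # remove shortest suffix matching '/*' -> /usr/local/bin
stem="${base%.*}"    # remove shortest suffix matching '.*' -> myprog
ext="${base##*.}"    # remove longest prefix matching '*.'  -> sh

echo "$dir / $base / $stem / $ext"
```

These are not exact replacements for basename and dirname in edge cases. For a path with no slash, such as 'foo', '${path%/*}' leaves it as 'foo' where dirname would print '.', and for '/foo' it produces an empty string where dirname would print '/'.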
Next, as pointed out in a comment back in 2011 here, POSIX arithmetic
supports hex numbers with a leading 0x, which means that it can be
used as a quick hex to decimal converter in addition to hex math
calculations. I don't know if there's any way to do decimal to hex
output with builtins alone; I suspect that the best way is with
printf. The arithmetic operators available are actually pretty
extensive, including 'a ? b : c' for straightforward conditionals.
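A quick sketch of these arithmetic features, with made-up variable names. Note that POSIX specifies printf as a utility rather than a builtin, although many shells do build it in.

```shell
dec=$((0xff))                       # hex to decimal: 255
sum=$((0x10 + 0x20))                # hex math, result in decimal: 48
hex=$(printf '%x' 255)              # decimal to hex via printf: ff

n=7
parity=$(( n % 2 == 0 ? 0 : 1 ))    # conditional operator: 1 (7 is odd)

echo "$dec $sum $hex $parity"       # prints: 255 48 ff 1
```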
Unfortunately, while POSIX sh has string length (with '${#var}'),
it doesn't seem to have either a way to count the number of $IFS
separated words in a variable or to trim off an arbitrary number
of leading or trailing spaces from one. You can get both through
brute force with simple shell functions, but I'm probably better
off avoiding situations where I need either.
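For what it's worth, here is one possible shape for such brute force functions. The names count_words and trim are hypothetical, and both use a subshell function body so that their option and variable changes don't leak into the caller. This is a sketch, not a tested library.

```shell
# count_words: print the number of $IFS-separated words in $1.
count_words() (
    set -f          # disable globbing so '*' is not expanded to filenames
    set -- $1       # deliberately unquoted: field splitting does the work
    echo "$#"
)

# trim: print $1 with leading and trailing whitespace removed.
trim() (
    s=$1
    lead="${s%%[![:space:]]*}"    # the run of leading whitespace
    s=${s#"$lead"}                # quoted, so it is removed literally
    trail="${s##*[![:space:]]}"   # the run of trailing whitespace
    s=${s%"$trail"}
    printf '%s\n' "$s"
)

count_words "one two   three"    # prints: 3
trim "   hi there   "            # prints: hi there
```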
The one feature in POSIX sh that genuinely surprises me is tilde
username expansion. I knew this was popular for interactive use in
shells but I would have expected POSIX to not care and to primarily
focus on shell scripts, where I at least have the impression it's
not very common. But there it is, and the description doesn't
restrict it to interactive sessions either; you can use
'~<someuser>' in your shell scripts if you want to. I probably
won't, though, especially since we have our own local solution for
this.
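A minimal sketch of how this behaves in a script, assuming a user named root exists (as it does on most Unix systems). The variable names are made up.

```shell
# Assumption: the login name "root" exists on this system.
home_of_root=~root     # unquoted tilde-prefix is expanded, even in a script
literal="~root"        # quoted, so no tilde expansion happens

echo "$home_of_root"   # root's home directory, e.g. /root on Linux
echo "$literal"        # prints: ~root
```

The expansion only applies to an unquoted tilde at the start of a word (or after '=' or ':' in an assignment), so quoting is enough to turn it off.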
(The version of the Single Unix Specification that I'm looking at here is the 2017 version, which will no doubt go out of date just as the 2008, 2013, and 2016 versions did.)