Wandering Thoughts archives

2018-12-13

Some new-to-me features in POSIX (or Single Unix Specification) Bourne shells

As I mentioned when I found Bourne shell arithmetic to be pretty pleasant, I haven't really paid attention to what things are now standard POSIX Bourne shell features. In fact, it's more than that; I don't think I've really appreciated that POSIX and then the Single Unix Specification actually added very much to the venerable Bourne shell. I knew that shell functions were standard, and then there was POSIX command substitution, but then I sort of stopped. In light of discovering that shell arithmetic is now POSIX standard, this view is clearly out of date, so I've decided to actually skim through the POSIX/SUS shell specification and see what new things I want to remember to use in the future. In the process I found an addition that surprises me.

First (and perhaps obviously), the various character classes such as [:alnum:] are officially supported in shell wildcard expansion and matching. I'm not a fan of writing, say, '[[:upper:]]' instead of '[A-Z]', but the latter has some dangerous traps in some shells, including shells that are commonly found as /bin/sh in some environments.
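
As a minimal sketch of what this looks like in practice, here is a character class used in a case pattern. The function name starts_upper is made up for illustration, and what counts as an uppercase letter depends on the current locale.

```shell
#!/bin/sh
# starts_upper is a hypothetical helper: it succeeds if its first
# argument begins with an uppercase letter in the current locale.
starts_upper() {
    case "$1" in
        [[:upper:]]*) return 0 ;;   # at least one character, and it is uppercase
        *)            return 1 ;;   # anything else, including the empty string
    esac
}

for word in Alpha beta; do
    if starts_upper "$word"; then
        echo "$word: starts with an uppercase letter"
    else
        echo "$word: does not"
    fi
done
```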

The big new feature that I should probably plan to make use of is the various prefix and suffix pattern substitutions, such as '${var%%word}'. To a fair extent these let you do things in the shell that you previously had to turn to programs like basename and dirname for. For instance, in a recent script I wanted the bare program name without any full path, so I used:

prog="${0##*/}"

This feels one part clever and half a part perhaps too clever, but I hope it's an idiom. Another use of this is to perform inline pattern matching in an if statement, for example to check if a parameter is a decimal number:

if [ -z "${OPTARG##*[!0-9]*}" ]; then
  echo "$prog: argument not a number" 1>&2
  exit 1
fi

I previously would have turned to case statements for this, which is more awkward. Again, hopefully this is not too clever.

(I learned this trick from Stackoverflow answers, perhaps this one or this one.)
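
For comparison, a case statement version of the same check might look like the following sketch. The name is_decimal is a hypothetical helper, not something from the script above.

```shell
#!/bin/sh
# is_decimal is a hypothetical helper: it succeeds only if its first
# argument is non-empty and consists entirely of the digits 0-9.
is_decimal() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
        *)           return 0 ;;
    esac
}

for arg in 42 4x2; do
    if is_decimal "$arg"; then
        echo "$arg: a decimal number"
    else
        echo "$arg: not a decimal number"
    fi
done
```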

The Single Unix Specification actually has some useful and interesting examples for the prefix and suffix pattern substitutions, along with some of the other substitutions.
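
To keep the four operators straight, here is a small sketch of my own (not one of the specification's examples) that applies each of them to a made-up path.

```shell
#!/bin/sh
# 'path' is a made-up example value.
path="/usr/local/lib/libfoo.so.1.2"

dir="${path%/*}"     # remove shortest suffix matching '/*' (roughly dirname)
base="${path##*/}"   # remove longest prefix matching '*/' (roughly basename)
stem="${base%%.*}"   # remove longest suffix matching '.*'
ext="${base#*.}"     # remove shortest prefix matching '*.'

echo "dir=$dir base=$base stem=$stem ext=$ext"
```

These are only rough stand-ins for basename and dirname. For example, if the value has no slash in it at all, '${path%/*}' leaves it unchanged, where dirname would print '.'.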

Next, as pointed out in a comment back in 2011 here, POSIX arithmetic supports hex numbers with a leading 0x, which means that it can be used as a quick hex to decimal converter in addition to hex math calculations. I don't know if there's any way to do decimal to hex output with builtins alone; I suspect that the best way is with printf. The arithmetic operators available are actually pretty extensive, including 'a ? b : c' for straightforward conditionals.
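
A short sketch of these in action, with all of the values made up for illustration. Note that printf is a builtin in many shells but POSIX does not require it to be one.

```shell
#!/bin/sh
dec=$((0xff))               # hex to decimal conversion
sum=$((0x10 + 0x20))        # hex math; the result comes out in decimal
hex=$(printf '%x' 255)      # decimal to hex goes through printf

a=3
b=7
max=$(( a > b ? a : b ))    # the conditional operator picks the larger value

echo "dec=$dec sum=$sum hex=$hex max=$max"
```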

Unfortunately, while POSIX sh has string length (with '${#var}'), it doesn't seem to have either a way to count the number of $IFS-separated words in a variable or a way to trim off an arbitrary number of leading or trailing spaces from one. You can get both through brute force with simple shell functions, but I'm probably better off avoiding situations where I need either.
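
One possible sketch of such brute force functions follows. The names count_words and trim are hypothetical, and there are certainly other ways to write them. Both use a subshell as the function body so that 'set -f' and the scratch variable don't leak into the caller.

```shell
#!/bin/sh
# count_words (hypothetical name): print the number of $IFS-separated
# words in the first argument.
count_words() (
    set -f        # turn off filename expansion so a '*' word stays one word
    set -- $1     # deliberately unquoted, so the shell splits it on $IFS
    echo "$#"
)

# trim (hypothetical name): print the first argument with leading and
# trailing whitespace removed.
trim() (
    s=$1
    s="${s#"${s%%[![:space:]]*}"}"   # inner expansion is the leading whitespace
    s="${s%"${s##*[![:space:]]}"}"   # inner expansion is the trailing whitespace
    printf '%s\n' "$s"
)

count_words "one two   three"
trim "   some text   "
```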

The one feature in POSIX sh that genuinely surprises me is tilde username expansion. I knew this was popular for interactive use in shells but I would have expected POSIX to not care and to primarily focus on shell scripts, where I at least have the impression it's not very common. But there it is, and the description doesn't restrict it to interactive sessions either; you can use '~<someuser>' in your shell scripts if you want to. I probably won't, though, especially since we have our own local solution for this.
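
For illustration only, this is roughly what it looks like in a script. The sketch assumes that a 'root' account exists, which is true on most Unix systems but is still an assumption.

```shell
#!/bin/sh
# Assumes a 'root' login exists; tilde expansion also applies in assignments.
roothome=~root
echo "root's home directory is $roothome"
```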

(The version of the Single Unix Specification that I'm looking at here is the 2017 version, which will no doubt go out of date just as the 2008, 2013, and 2016 versions did.)

unix/PosixShellNewFeatures written at 00:12:16;

