Why I have a little C program to filter a $PATH (more or less)

February 15, 2025

I use a non-standard shell and have for a long time, which means that I have to write and maintain my own set of dotfiles (which sometimes has advantages). In the long ago days when I started doing this, I had a bunch of accounts on different Unixes around the university (as was the fashion at the time, especially if you were a sysadmin). So I decided that I was going to simplify my life by having one set of dotfiles for rc that I used on all of my accounts, across a wide variety of Unixes and Unix environments. That way, when I made an improvement in a shell function I used, I could get it everywhere by just pushing out a new version of my dotfiles.

(This was long enough ago that my dotfile propagation was mostly manual, although I believe I used rdist for some of it.)

In the old days, one of the problems you faced if you wanted a common set of dotfiles across a wide variety of Unixes was that there were a lot of things that potentially could be in your $PATH. Different Unixes had different sets of standard directories, and local groups put local programs (that I definitely wanted access to) in different places. I could have put everything in $PATH (giving me a gigantic one) or tried to carefully scope out what system environment I was on and set an appropriate $PATH for each one, but I decided to take a more brute force approach. I started with a giant potential $PATH that listed every last directory that could appear in $PATH on any system I had an account on, and then I had a C program that filtered that potential $PATH down to only the directories that existed on the local system. Because it was written in C and had to stat() things anyways, I made it also keep track of what concrete directories it had seen and filter out duplicates, so that if there were symlinks from one name to another, I wouldn't get the same directory twice in my $PATH.

(Looking at historical copies of the source code for this program, the filtering of duplicates was added a bit later; the very first version only cared about whether a directory existed or not.)
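As an illustration of the approach (this is not the original source, which I'm not reproducing here), a minimal sketch of the filtering might look like the following. The function name filter_dirs and the dir_id structure are made up for this example, and it assumes that a directory's identity can be taken as its (device, inode) pair from stat().

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical names for this sketch; not taken from the real program. */
struct dir_id {
    dev_t dev;
    ino_t ino;
};

/*
 * Copy into out[] the entries of dirs[] that are existing directories,
 * skipping any whose (device, inode) pair has already been kept.
 * out[] must have room for n pointers. Returns the number kept.
 * If memory for the bookkeeping can't be allocated, nothing is kept.
 */
size_t filter_dirs(size_t n, const char *const dirs[], const char *out[])
{
    assert(dirs != NULL && out != NULL);

    struct dir_id *seen = malloc((n ? n : 1) * sizeof *seen);
    size_t kept = 0;

    if (seen == NULL)
        return 0;

    for (size_t i = 0; i < n; i++) {
        struct stat st;
        size_t j;

        /* stat() follows symlinks, so a symlink to a directory reports
         * the same device and inode as the directory it points to. */
        if (stat(dirs[i], &st) != 0 || !S_ISDIR(st.st_mode))
            continue;

        /* Linear search is fine for the handful of entries in a $PATH. */
        for (j = 0; j < kept; j++)
            if (seen[j].dev == st.st_dev && seen[j].ino == st.st_ino)
                break;
        if (j < kept)
            continue;   /* already have this directory under another name */

        seen[kept].dev = st.st_dev;
        seen[kept].ino = st.st_ino;
        out[kept++] = dirs[i];
    }

    free(seen);
    return kept;
}
```

A main() wrapped around this could pass its argument list straight in (with a cast to keep the const qualifiers happy) and print the kept names separated by spaces. The first name seen for a directory wins, which keeps the result in the original order.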

The reason I wrote a C program for this (imaginatively called 'isdirs') instead of using shell builtins to do this filtering (which is entirely possible) is primarily that this was so long ago that running a C program was definitely faster than using shell builtins in my shell. I did have a fallback shell builtin version in case my C program hadn't been compiled for the current system and architecture, although it didn't do the filtering of duplicates.

(Rc uses a real list for its equivalent of $PATH instead of the awkward ':' separated pseudo-list that other Unix shells use, so both my C program and my shell builtin could simply take a conventional argument list of directories rather than having to try to crack a $PATH apart.)

(This entry was inspired by Ben Zanin's trick(s) to filter out duplicate $PATH entries, which prompted me to mention my program.)

PS: rc technically only has one dotfile, .rcrc, but I split my version up into several files that did different parts of the work. One reason for this split was so that I could source only some parts to set up my environment in a non-interactive context.

Sidebar: the rc builtin version

Rc has very few builtins and those builtins don't include test, so this is a bit convoluted:

path=`{tpath=() pe=() {
        for (pe in $path)
           builtin cd $pe >[1=] >[2=] && tpath=($tpath $pe)
        echo $tpath
       } >[2]/dev/null}

In a conventional shell with a test builtin, you would just use 'test -d' to see if directories were there. In rc, the only way to find out whether a directory exists using builtins alone is to try to cd to it. That we change directories is harmless because everything is running inside the equivalent of a Bourne shell $(...).

Keen-eyed people will have noticed that this version doesn't work if anything in $path has a space in it, because we pass the result back as a whitespace-separated string. This is a limitation shared with how I used the C program, but I never had to use a Unix where one of my $PATH entries needed a space in it.


