Our small tools for running commands on multiple machines
A while back I wrote about the personal shell scripts I had for running commands on multiple machines. At the time, they were only personal scripts that I used myself; however, over time they kept informally creeping into worklog entries that documented what we actually did, and even into some of the shell scripts we pre-write to hold the commands for convoluted operations like migrating ZFS filesystems from server to server. Eventually we decided to adopt them as actual official scripts, put in our central location for such scripts.
My own versions were sort of slapped together, especially the machines
script to print out the names of machines that fall into various
categories, so making them into production-worthy tools meant cleaning
that up. The oneach script needed only moderate reforms and as a result
the new version is only slightly improved over my old personal version;
in day to day usage, I probably couldn't notice any difference if I
switched back to using my old one.
(The big difference is that the production version has more options
for things like extra verbosity and a dryrun mode that just reports
the ssh commands that would be run.)
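To make the dryrun idea concrete, here is a minimal sketch in Python. It is not the real oneach, which is a shell script; the function names, the `dryrun` and `verbose` parameters, and the host name `apps1` are all assumptions made up for illustration.

```python
import shlex
import subprocess


def ssh_command(host, command):
    """Build the argument list for running `command` on `host` over ssh."""
    return ["ssh", host, command]


def oneach(hosts, command, dryrun=False, verbose=False):
    """Run `command` on each host in turn and return the argv lists used.

    With dryrun=True nothing is executed; each ssh command is only printed.
    With verbose=True each command is printed before it is run.
    """
    planned = []
    for host in hosts:
        argv = ssh_command(host, command)
        planned.append(argv)
        if dryrun or verbose:
            # shlex.join quotes arguments so the line can be pasted into a shell.
            print(shlex.join(argv))
        if not dryrun:
            # check=False: keep going even if one machine fails.
            subprocess.run(argv, check=False)
    return planned


if __name__ == "__main__":
    # 'apps1' is a hypothetical host name.
    oneach(["apps0", "apps1"], "uptime", dryrun=True)
```

The useful property of a dryrun mode like this is that it goes through exactly the same code path that builds the commands, so what it reports is what would actually be run.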
The machines command got completely redone from scratch, because
I realized that my hack approach just wouldn't work. For a start,
I couldn't ask my co-workers to edit a script every time we added
a machine; there would have been a revolt. So I wrote a new version
in Python that parsed a
configuration file. This new production version is a drastic
improvement over my shell script hack; because I wrote it in Python,
I was able to include significantly more features, in addition to
making it more convenient and regular (since it's parsing a
configuration file). The most important one is support for 'AND'
and 'EXCEPT' operations, so you can express machine categories like
'all machines with some feature that are also Ubuntu 16.04 machines'
or 'all Ubuntu 14.04 machines except ...'. This is supported both
in the configuration file, where it sees a little bit of use, and
on the command line, where I take advantage of it periodically.
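As a rough illustration of why these operations are cheap to add once categories are sets of host names, here is a small Python sketch. The token syntax (a category followed by `AND` or `EXCEPT` and another category, evaluated left to right), the `select` function, and the hosts `hostb` and `hostc` are assumptions for the example, not the actual syntax or code of machines.

```python
def select(categories, tokens):
    """Evaluate tokens such as ['allnfs', 'AND', 'ubuntu1604'] left to right.

    `categories` maps a category name to the set of hosts in it.
    'AND' keeps only hosts that are also in the next category;
    'EXCEPT' drops hosts that are in the next category.
    """
    if not tokens:
        return set()
    result = set(categories[tokens[0]])  # copy, so the input is not modified
    rest = tokens[1:]
    if len(rest) % 2 != 0:
        raise ValueError("expected OPERATOR CATEGORY pairs")
    for op, name in zip(rest[0::2], rest[1::2]):
        members = categories[name]
        if op == "AND":
            result &= members      # set intersection
        elif op == "EXCEPT":
            result -= members      # set difference
        else:
            raise ValueError(f"unknown operator: {op}")
    return result


# Hypothetical category data; only 'apps0' appears in the real examples.
categories = {
    "ubuntu1604": {"apps0", "hostb"},
    "ubuntu1404": {"hostc"},
    "allnfs": {"apps0", "hostc"},
}

if __name__ == "__main__":
    print(sorted(select(categories, ["allnfs", "AND", "ubuntu1604"])))
    print(sorted(select(categories, ["allnfs", "EXCEPT", "ubuntu1604"])))
```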
(The configuration file format is nothing special and basically duplicates what I've seen other similar programs use. Although I didn't consciously set out to duplicate their approach, it feels like we wound up in the same spot because there's only so many good solutions for the problem.)
Using a configuration file doesn't just make things more convenient and maintainable; it also makes them more consistent, in several senses. It's now much harder for me to accidentally forget to add machines to categories they should be in (or to forget to remove them from categories that no longer apply). A good part of the reason is that the configuration file is mostly inverted from how my script used to do it. Rather than list machines that are in categories, it mostly lists the categories that a machine is in:
apps0 apps ubuntu1604 allnfs users
There are a few categories that are explicitly specified, but even then they tend to be in terms of other categories:
all=ubuntu1604 ubuntu1404
This approach wouldn't have been feasible in my original simple shell script, but it's a natural one once you have a configuration file (especially if you want to make adding new machines easy and obvious; for the most part you can copy an existing line and change the initial host name).
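A sketch of how such an inverted file could be turned back into category-to-hosts sets, in Python. This is a guess at the shape of the parsing, not the real machines code: the `parse_config` name, the handling of `#` comments and blank lines, the rule that explicit `name=...` lines are resolved after all host lines, and the host `old0` are all assumptions.

```python
from collections import defaultdict


def parse_config(text):
    """Parse 'host cat1 cat2 ...' and 'name=cat1 cat2 ...' lines.

    Returns a dict mapping each category name to the set of hosts in it.
    """
    categories = defaultdict(set)
    definitions = []  # (name, [referenced categories]) in file order
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # assumed comment syntax
        if not line:
            continue
        if "=" in line:
            name, _, rhs = line.partition("=")
            definitions.append((name.strip(), rhs.split()))
        else:
            # Inverted form: the host first, then every category it is in.
            host, *cats = line.split()
            for cat in cats:
                categories[cat].add(host)
    # Resolve explicit categories last, so they can refer to categories
    # built up from host lines or to earlier explicit definitions.
    for name, parts in definitions:
        for part in parts:
            categories[name] |= categories.get(part, set())
    return dict(categories)


if __name__ == "__main__":
    sample = """\
apps0 apps ubuntu1604 allnfs users
old0 ubuntu1404
all=ubuntu1604 ubuntu1404
"""
    cats = parse_config(sample)
    print(sorted(cats["all"]))  # ['apps0', 'old0']
```

Because each host line carries all of its categories, adding a machine is a one-line change, and derived categories like `all` pick it up without being edited.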
In theory I could have done all of these improvements in my own
personal versions, and writing the Python version of machines
didn't take too long (even writing a Go version for my own use only added a modest amount of
time). In practice it took the push of knowing that these had to
now be generally usable and maintainable by my co-workers to get
me to spend the time. Would it have been worth spending the time
on this when they were just personal scripts? Probably not, and even
if it was, I doubt I could have persuaded myself of that. After all,
they worked well enough as they were originally.