== Another piece of my environment: running commands on multiple machines

It's my belief that every sysadmin who has a middling number of machines (more than one or two and less than a large fleet) sooner or later winds up with a set of tools to let them run commands on each of those machines or on subsets of those machines. I am no exception, and over the course of my career I have adopted, used, and built several iterations of this.

(People with large fleets of machines generally never run commands on them by hand but have some sort of automated fleet management system based around Puppet, Chef, CFEngine, Salt, Ansible, or the like. Sometimes these systems come with the ability to do this for you.)

My current iteration is based around simple shell scripts. The starting point is a shell script called _machines_; it prints out the names of machines that fall into various categories. The categorization of machines is entirely hand maintained, which has both problems and advantages. As a result the whole thing looks like:

.pn prewrap on
> mach() {
>   for i in "$@"; do
>     case "$i" in
>     apps) echo apps0 apps1 apps2 apps3 testapps;;
>     ....
>     ubuntu) echo `mach apps comps ...` ...;;
>     ....
>     *) echo "$i";;
>     esac
>   done
> }
>
> mach "$@" | tr '\012' ' '; echo

(I put all of the work in a shell function so that I could call it recursively, for classes that are defined partly in terms of other classes. The 'ubuntu' class here is an example of that.)

So far we have few enough machines and few enough categories of machines that I'm interested in that this approach has not become unwieldy.

(There is also a script called _mminus_ for doing subtractive set operations, so I can express 'all X machines except Y machines' or the like. This comes in handy periodically.)

The main script for actually doing things is called _oneach_, which does what you might think: given a list of machines it runs a command line on each of them via _ssh_.
You can ask it to run the command in a pseudo-tty and without any special output handling, but normally it runs the command just with '_ssh machine command_' and it prefixes all output with the name of the machine; you can see an example of that in [[this _awk_-based reformatting problem ../unix/AbusingAwkOnTheFly]] (an _oneach_ run produced the input for my problem). Because I like neat formatting, _oneach_ has an option to align the starting column of all output and I usually use that option (via a cover script called _onea_, because I'm lazy). The _oneach_ script doesn't try to do anything fancy with concurrent execution or the like; it just does one _ssh_ after the other.

Finally, I've found it useful to have another script that I call _replicate_. It uses _rsync_ to copy one or more files to destination machines or machine classes (it can also use _scp_ for some obscure cases). _replicate_ is handy for little things like pushing changes to dotfiles or scripts out to all of the machines where I have copies of them.

As a side note, _machines_ has become a part of [[my _dmenu_ environment ToolsDmenu]]. I use its list of machines as one of the inputs to _dmenu_'s autocompletion (both for normal logins and for special '@' logins as root), which makes it really quick and convenient to log into most of our machines (this large list of machines is part of the things I hide from _dmenu_'s initial display of completion in [[a little UI tweak that turned out to be quite important for me ../programming/SmallUITweaksImportance]]).

Note that I don't necessarily suggest that you adopt my approach for running commands on your machines, which is one reason I'm not currently planning to put these scripts up in public. There are a lot of ways to solve this particular problem, many of them better and more scalable than what I have.
I just think that you should get something better than manual _for_ loops (which is what I was doing before I gave in and wrote _machines_, _oneach_, and so on).
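
If you want a starting point that is one step up from a manual _for_ loop, something like the following might do. To be clear, this is not my actual _oneach_; it is a minimal sketch of the same general idea (one _ssh_ after another, every output line prefixed with the machine name). The function name _oneach_sketch_ and the _RUNNER_ variable are made up for illustration, and _RUNNER_ exists only so that the thing can be tried out without real machines.

```shell
#!/bin/sh
# Hypothetical sketch of an oneach-style runner; not the real script.
#
# usage: oneach_sketch "host1 host2 ..." command [args...]
# Runs the command on each host in turn (no concurrency) and prefixes
# every line of output with the host name.
oneach_sketch() {
    hosts=$1
    shift
    # $hosts is deliberately unquoted so that it splits on whitespace.
    for h in $hosts; do
        # RUNNER is an assumption added for testing; it defaults to
        # 'ssh -n', where -n stops ssh from reading our standard input.
        ${RUNNER:-ssh -n} "$h" "$@" 2>&1 |
        while IFS= read -r line || [ -n "$line" ]; do
            printf '%s: %s\n' "$h" "$line"
        done
    done
}

# Example (assumes a 'machines' script like the one described above):
#   oneach_sketch "$(machines apps)" uptime
```

Merging standard error into standard output is a simplification I've made here so that error messages also get the machine name in front of them; a real tool might well want to keep the two apart.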