Python programs as wrappers versus filters of other Unix programs
Sometimes I wind up in a situation, such as using smartctl's JSON output, where I want to use a Python program to process and transform the output from another Unix command. In a situation like this, there are two ways of structuring things. I can have the Python program run the other command as a subprocess, capture its output, and process it, or I can have a surrounding script run the other command and pipe its output to the Python program, with the Python program acting as a (Unix) filter. I've written programs using both approaches, depending on the situation.
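To make the two structures concrete, here is a minimal sketch of the same processing code used both ways. The function names are hypothetical, and the JSON keys it reads ("model_name" and "smart_status") are assumptions about smartctl's JSON layout rather than something this entry establishes, so check them against real output before relying on them.

```python
import json
import subprocess
import sys


def summarize(data):
    """Reduce one parsed smartctl JSON report to a one-line summary."""
    # Assumption: these key names match smartctl's JSON output.
    model = data.get("model_name", "unknown")
    passed = data.get("smart_status", {}).get("passed")
    status = {True: "PASSED", False: "FAILED"}.get(passed, "unknown")
    return f"{model}: {status}"


def run_as_wrapper(device):
    """Wrapper style: run smartctl ourselves and capture its output."""
    proc = subprocess.run(
        ["smartctl", "--json", "-a", device],
        capture_output=True,
        text=True,
        # smartctl's exit status is a bitmask and can be non-zero even
        # when it produced usable output, so don't raise on it here.
        check=False,
    )
    return summarize(json.loads(proc.stdout))


def run_as_filter(stream):
    """Filter style: something else ran smartctl; we only read its JSON."""
    return summarize(json.load(stream))


def main(argv):
    # With a device argument, act as a wrapper; with none, act as a
    # filter reading standard input.
    if len(argv) > 1:
        print(run_as_wrapper(argv[1]))
    else:
        print(run_as_filter(sys.stdin))
```

With main(sys.argv) called at the bottom of the file, the wrapper form would be run with a device name as its argument, while the filter form would sit at the end of a pipeline that a shell script sets up. The processing in summarize() is identical either way; the only difference is who runs smartctl.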
Which raises the question of what sort of situation makes me choose one option or the other. One reason for choosing the wrapper approach is how easy the result is to copy around; a Python wrapper is only one self-contained thing to copy to our systems, while a shell script that runs a Python filter is at least two things (and then the shell script has to know where to find the Python program). In general, a Python wrapper program also makes the whole thing feel like it has fewer moving parts (that it starts by running another Unix command is sort of an implementation detail that people don't have to think about).
(The self-contained nature of wrappers pushes me toward wrappers for things that I expect to copy to systems only on an 'as needed' basis, instead of having them installed as part of system setup or the like.)
One reason I reach for the filter approach is if I have a certain amount of logic that's most easily expressed in a shell script, for example selecting what disks to report SMART data on and then iterating over them. Shell scripts make expanding file name glob patterns very easy; Python requires more work for this. I have to admit that how the idea evolved also plays a role; if I started out thinking I had a simple job of reformatting output that could be done entirely in a shell script, I'm most likely to write the Python as a filter that drops into it, rather than throw the shell script away and write a Python wrapper. Things that are clearly complex from the start are more likely to be a Python wrapper instead of a filter used by a shell script.
(The corollary of this is if I'm running the other command once with more or less constant arguments, I'm much more likely to write a wrapper program instead of a filter.)
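For comparison, here is a rough sketch of what the 'select disks by glob pattern and iterate over them' logic looks like when it moves into a Python wrapper. The /dev/sd[a-z] pattern and the function names are illustrative assumptions, not something taken from a real script of mine.

```python
import glob
import subprocess


def select_disks(pattern="/dev/sd[a-z]"):
    """Expand a shell-style glob pattern into a sorted list of paths."""
    # glob.glob() returns an empty list when nothing matches, unlike a
    # POSIX shell, which by default leaves an unmatched pattern as-is.
    return sorted(glob.glob(pattern))


def smartctl_json(device):
    """Run smartctl on one device and return its raw JSON text."""
    proc = subprocess.run(
        ["smartctl", "--json", "-a", device],
        capture_output=True,
        text=True,
        check=False,  # non-zero exit does not always mean no output
    )
    return proc.stdout


def report_all(pattern="/dev/sd[a-z]"):
    """Yield (device, JSON text) pairs for every matching disk."""
    for device in select_disks(pattern):
        yield device, smartctl_json(device)
```

This isn't a lot of code, but it is noticeably more than a one-line 'for' loop over a glob pattern in a shell script, which is part of why the shell script version tends to win when it comes first.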
I believe that there are (third party) Python packages that are intended to make it easy to write shell-script-like things in Python (and I think I was even pointed at one once, although I can't find the reference now). In theory I could use these and native Python facilities to write more Python programs as wrappers; in practice, I'm probably going to take the path of least resistance and continue to do a variety of things as shell scripts with Python programs as filters.
I don't know if writing this entry is going to get me to be more systematic and conscious about making this choice between a wrapper and a filter, but I can hope so.
PS: Another aspect of the choice is that it feels easier (and better known) to adjust the settings of a shell script by changing commented environment variables at the top of the script than making the equivalent changes to global variables in the Python program. I suspect that this is mostly a cultural issue; if we were more into Python, it would probably feel completely natural to us to do this to Python programs (and we'd have lots of experience with it).