2018-04-18
The sensible way to use Bourne shell 'here documents' in pipelines
I was recently considering a shell script where I might want to feed a Bourne shell 'here document' to a shell pipeline. This is certainly possible and years ago I wrote an entry on the rules for combining things with here documents, where I carefully wrote down how to do this and the general rule involved. This time around, I realized that I wanted to use a much simpler and more straightforward approach, one that is obviously correct and is going to be clear to everyone. Namely, putting the production of the here document in a subshell.
( cat <<EOF
your here document goes here
with as much as you want.
EOF
) | sed | whatever
This is not as neat and nominally elegant as taking advantage of the full power of the Bourne shell's arcane rules, and it's probably not as efficient (in at least some sh implementations, you may get an extra process), but I've come around to feeling that that doesn't matter. This may be the brute force solution, but what matters is that I can look at this code and immediately follow it, and I'm going to be able to do that in six months or a year when I come back to the script.
(Here documents are already kind of confusing as it stands without adding extra strangeness.)
Of course you can put multiple things inside the (...) subshell, such as several here documents that you output only conditionally (or chunks of always present static text mixed with text you have to make more decisions about). If you want to process the entire text you produce in some way, you might well generate it all inside the subshell for convenience.
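As a minimal sketch of that idea, here is a subshell that always emits two here documents and emits a third only when a condition holds, with everything flowing through one pipeline. The function name report, the variables HOST and DISK_FULL, and the sed filter are all made up for illustration; any downstream command would do.

```shell
#!/bin/sh
# Hypothetical example: report, HOST and DISK_FULL are illustrative names.
report() {
    (
    cat <<EOF
Status report for $HOST.
EOF
    # This here document is only produced some of the time.
    if [ "$DISK_FULL" = yes ]; then
        cat <<EOF
Warning: the disk is full.
EOF
    fi
    cat <<EOF
End of report.
EOF
    ) | sed 's/^/> /'    # stand-in for whatever processes the whole text
}

HOST=web1
DISK_FULL=yes
report
```

Because everything inside the parentheses writes to the same standard output, the pipeline sees one continuous stream no matter how many here documents ended up contributing to it.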
Perhaps you're wondering why you'd want to run a here document through a pipe to something. The case that frequently comes up for me is that I want to generate some text with variable substitution but I also want the text to flow naturally with natural line lengths, and the expansion will have variable length. Here, the natural way out is to use fmt:
( cat <<EOF
My message to $NAME goes here.
It concerns $HOST, where $PROG
died unexpectedly.
EOF
) | fmt
Using fmt reflows the text regardless of how long the variables expand out to. Depending on the text I'm generating, I may be fine with reflowing all of it (which means that I can put all of the text inside the subshell), or I may have some fixed formatting that I don't want passed through fmt (so I have to have a mix of fmt'd subshells and regular text).
Having written that out, I've just come to the obvious realization that for simple cases I can just directly use fmt with a here document:
fmt <<EOF
My message to $NAME goes here.
It concerns $HOST, where $PROG
died unexpectedly.
EOF
This doesn't work well if there are some paragraphs that I want to include only some of the time, though; then I should still be using a subshell.
(For whatever reason I apparently have a little blind spot about using here documents as direct input to programs, although there's no reason for it.)
A CPU's TDP is a misleading headline number
The AMD Ryzen 1800X in my work machine and the Intel Core i7-8700K in my home machine are both 95 watt TDP processors. Before I started measuring things with the actual hardware, I would have confidently guessed that they would have almost the same thermal load and power draw, and that the impact of a 95W TDP CPU over a 65W TDP CPU would be clearly obvious (you can see traces of this in my earlier entry on my hardware plans). Since it's commonly said that AMD CPUs run hotter than Intel ones, I'd expect the Ryzen to be somewhat higher than the Intel, but how much difference would I really expect from two CPUs with the same TDP?
Then I actually measured the power draws of the two machines, both at idle and under various different sorts of load. The result is not even close; the Intel is clearly using less power even after accounting for the 10 watts of extra power the AMD's Radeon RX 550 graphics card draws when it's lit up. It's ahead at idle, and it's also ahead under full load when the CPU should be at maximum power draw. Two processors that I would have expected to be fundamentally the same at full CPU usage are roughly 8% different in measured power draw; at idle they're even further apart on a proportional basis.
(Another way that TDP is misleading to the innocent is that it's not actually a measure of CPU power draw, it's a measure of CPU heat generation; see this informative reddit comment. Generally I'd expect the two to be strongly correlated (that heat has to come from somewhere), but it's possible that something that I don't understand is going on.)
Intellectually, I may have known that a processor's rated TDP was merely a measure of how much heat it could generate at maximum and didn't predict either its power draw when idle or its power draw under load. But in practice I thought that TDP was roughly TDP, and every 95 watt TDP (or 65 watt TDP) processor would be about the same as every other one. My experience with these two machines has usefully smacked me in the face with how this is very much not so. In practice, TDP apparently tells you how big a heatsink you need to be safe and that's it.
(There are all sorts of odd things about the relative power draws of the Ryzen and the Intel under various different sorts of CPU load, but that's going to be for another entry. My capsule summary is that modern CPUs are clearly weird and unpredictable beasts, and AMD and Intel must be designing their power-related internals fairly differently.)
PS: TDP also doesn't necessarily predict your actual observed CPU temperature under various conditions. Some of the difference will be due to BIOS decisions about fan control; for example, my Ryzen work machine appears to be more aggressive about speeding up the CPU fan, and possibly as a result it seems to report lower CPU temperatures under high load and power draw.
(Really, modern PCs are weird beasts. I'm not sure you can do more than putting in good cooling and hoping for the best.)