Wget is not welcome here any more (sort of)
Today, someone at a large chipmaker that will go unnamed decided
(or apparently decided) that they would like their own archived
copy of Wandering Thoughts. So they did what one does
here; they got out wget,
pointed it at the front page of the blog, and let it go. I was lucky
in a way; they started this at 18:05 EST and I coincidentally looked
at my logs around 19:25, at which point they had already made around
3,000 requests because that's what wget does when you turn it loose.
This is not the first time that people have had the bright idea to
just turn to wget to copy part or all of Wandering Thoughts (someone
else did it in early October, for example), and it will not be the
last time. However, it will be the last time they're going to be
even partially successful, because I've now blocked wget's default
User-Agent.
I'm not doing this because I'm under any illusions that this will stop people from grabbing a copy of Wandering Thoughts, and in fact I don't care if people do that; if nothing else, there are plenty of alternatives to wget (starting with, say, curl). I'm doing this because wget's spidering options are dangerous by default. If you do the most simple, most obvious thing with wget, you flood your target site and perhaps even spill over from it to other sites. And, to be clear and in line with my general views, these unfortunate results aren't the fault of the people using wget. The people using wget to copy Wandering Thoughts are following the obvious path of least resistance, and it is not their fault that this is actually a bad idea.
(I could hope that someday wget will change its defaults so
that they're not dangerous, but given the discussion in its manual
about options like --random-wait, I am not going to hold my breath
on that one.)
Wget is a power tool without adequate safeguards for today's web, so if you are going to use it on Wandering Thoughts, all I can do is force you to at least slow down, go out of your way a little bit, and perhaps think about what you're doing. This doesn't guarantee that people who want to use wget on Wandering Thoughts will actually set it up right so that it behaves well, but there is now at least a chance. And if they configure wget so that it works but don't make it behave well, I'm going to feel much less charitable about the situation; these people will have chosen to deliberately climb over a fence, even if it is a low fence.
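As a rough illustration of what "behaves well" could mean in practice, here is a minimal Python sketch of the pacing that wget's --wait and --random-wait options ask for, where --random-wait varies each pause between 0.5 and 1.5 times the --wait value. The two second wait and the helper name `request_delays` are assumptions made up for this example; this models only the arithmetic of the pauses and is not wget's own code.

```python
import random


def request_delays(n_requests, wait=2.0, random_wait=True, seed=None):
    """Return the pause (in seconds) taken before each request after the first.

    `wait` plays the role of wget's --wait; `random_wait` mimics --random-wait,
    which varies each pause between 0.5x and 1.5x of the wait value.
    """
    rng = random.Random(seed)
    delays = []
    for _ in range(max(n_requests - 1, 0)):
        if random_wait:
            delays.append(rng.uniform(0.5 * wait, 1.5 * wait))
        else:
            delays.append(wait)
    return delays


# Hypothetical crawl: the same 3,000 requests, but paced.
delays = request_delays(3000, wait=2.0, seed=1)
total_hours = sum(delays) / 3600
print(f"{len(delays)} pauses, about {total_hours:.1f} hours in total")
```

With pacing like this, 3,000 requests are spread over well more than an hour instead of arriving as fast as the server will answer them.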
As a side note, one reason that I'm willing to do this at all is that I've checked the logs here going back a reasonable amount of time and found basically no non-spidering use of wget. There is a trace amount of it, and I am sorry for the people behind that trace amount, but please just switch to curl.
(I've considered making my wget block send a redirect to a page
that explains the situation, but that would take more energy and
more wrestling with Apache .htaccess than I currently have.
Perhaps if it comes up a lot.)
PS: The people responsible for the October incident actually emailed me and were quite apologetic about how their wget usage had gotten away from them. That it did get away from them despite them trying to do a reasonable job shows just how sharp-edged a tool wget can be.
PPS: I'm somewhat goring my own ox with this, because I have a set of little wget-based tools and now I'm going to have to figure out what I want to do with them to keep them working here.
Linux disk IO stats in Prometheus
Suppose, not hypothetically, that you have a shiny new Prometheus setup and you are running the Prometheus host agent on your Linux machines, some of which have disks whose IO statistics might actually matter (for example, we once had a Linux Amanda backup server with a very slow disk). The Prometheus host agent provides a collection of disk IO stats, but it is not entirely clear where they come from and what they mean.
The good news is that the Prometheus host agent gives you the raw Linux kernel disk statistics and they're essentially unaltered. You get statistics only for whole disks, not partitions, but the host agent includes stats for software RAID devices and other disk level things. I've written about what these stats cover in my entry on what stats you get and also on what information you can calculate from them, which includes an aside on disk stats for software RAID devices and LVM devices on modern Linux kernels.
(The current version of the host agent makes two alterations to the stats; it converts the time based ones from milliseconds into seconds, and it converts the sector-based ones into bytes using the standard Linux kernel thing where one sector is 512 bytes. Both of these are much more convenient in a Prometheus environment.)
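The two conversions are simple enough to write down directly. This is only a sketch of the arithmetic described above, not the host agent's actual code; the function names and the sample raw values are invented for illustration.

```python
SECTOR_BYTES = 512  # the kernel's disk stats count sectors of 512 bytes


def ms_to_seconds(milliseconds):
    """Convert a kernel time-based stat from milliseconds to seconds."""
    return milliseconds / 1000.0


def sectors_to_bytes(sectors):
    """Convert a kernel sector-based stat to bytes."""
    return sectors * SECTOR_BYTES


# Hypothetical raw kernel values for one disk.
raw_sectors_read = 2048
raw_ms_doing_io = 1500

print(sectors_to_bytes(raw_sectors_read))  # 1048576
print(ms_to_seconds(raw_ms_doing_io))      # 1.5
```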
The mapping between Linux kernel statistics and Prometheus metric names is fortunately straightforward, and it is easy to follow because the host agent's help text for all of the stats is pretty much their description in the kernel's Documentation/iostats.txt. There are a few changes, but they are pretty obvious. For example, the kernel description of field 10 is '# of milliseconds spent doing I/Os'; the host agent's corresponding description of node_disk_io_time_seconds_total is 'Total seconds spent doing I/Os'.
(In the current host agent the help text is somewhat inconsistent here; for instance, some of it talks about 'milliseconds'. This will probably be fixed in the future.)
Since Prometheus exposes all of the Linux kernel disk stats, you
can generate all of the derived stats that I discussed in my entry
on this. Actually calculating them will involve a lot of use of
rate(); for pretty much every stat, you'll have to start out
calculations by taking the rate() of it and then performing the
relevant calculations from there. This is a bit annoying for several
reasons, but Prometheus is Prometheus.
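To make the "rate first, then calculate" pattern concrete, here is a small Python sketch that does by hand what rate() does between two samples of the counters and then derives a few stats from the results. In Prometheus itself you would write this in PromQL; the Python is only there to show the arithmetic. The dictionary keys are hypothetical short names, and apart from node_disk_io_time_seconds_total the metric names in the comments are my assumption about what the host agent calls them. The sample values are invented.

```python
def rate(earlier, later, seconds):
    """Per-second increase of a counter between two samples (no reset handling)."""
    return (later - earlier) / seconds


def derived_stats(s0, s1, seconds):
    """Derive some per-disk stats from two counter samples `seconds` apart."""
    # Assumed metric: node_disk_reads_completed_total
    reads = rate(s0["reads_completed"], s1["reads_completed"], seconds)
    # Assumed metric: node_disk_read_time_seconds_total
    read_time = rate(s0["read_time_seconds"], s1["read_time_seconds"], seconds)
    # Assumed metric: node_disk_read_bytes_total
    read_bytes = rate(s0["read_bytes"], s1["read_bytes"], seconds)
    # node_disk_io_time_seconds_total (named in the text above)
    io_time = rate(s0["io_time_seconds"], s1["io_time_seconds"], seconds)
    return {
        "reads_per_second": reads,
        "read_bytes_per_second": read_bytes,
        # Average time per completed read; undefined if there were no reads.
        "avg_read_latency_seconds": read_time / reads if reads else None,
        # Seconds spent doing IO per second, i.e. a 0..1 utilization figure.
        "utilization": io_time,
    }


# Two invented samples taken 15 seconds apart.
sample0 = {"reads_completed": 1000, "read_time_seconds": 10.0,
           "read_bytes": 0, "io_time_seconds": 50.0}
sample1 = {"reads_completed": 1300, "read_time_seconds": 11.5,
           "read_bytes": 15 * 1048576, "io_time_seconds": 53.0}

stats = derived_stats(sample0, sample1, 15)
for name, value in stats.items():
    print(name, value)
```

Every derived number here is a ratio or a direct use of rates, which is why the rate() step can't be skipped: the raw counters only ever go up and mean little on their own.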
There are two limitations of these stats. First, as always, they're
averages with everything that that implies (see here and here).
Second, they're going to be averages over appreciable periods of
time. At the limit, you're unlikely to be pulling stats from the
Prometheus host agent more than once every 10 or 15 seconds, and
sometimes less frequently than that. Very short high activity bursts
will thus get smeared out into lower averages over your 10 or 15
or 30 second sample resolution. To get a second by second view that
captures very short events, you're going to need to sit there on
the server with a tool like mxiostat, or iostat if you must.
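The smearing effect is easy to see with made-up numbers. The following Python sketch takes a hypothetical 30 seconds of per-second disk utilization, where the disk is idle except for a two second burst at 100%, and averages it over 15 second windows the way a 15 second scrape interval effectively would. The function name and the data are invented for illustration.

```python
def utilization_per_window(per_second, window):
    """Average per-second busy fractions over consecutive windows."""
    averages = []
    for start in range(0, len(per_second), window):
        chunk = per_second[start:start + window]
        averages.append(sum(chunk) / len(chunk))
    return averages


# Hypothetical 30 seconds: idle except for two seconds at 100% busy.
per_second = [0.0] * 30
per_second[7] = 1.0
per_second[8] = 1.0

windows = utilization_per_window(per_second, 15)
print("peak per-second utilization:", max(per_second))
print("15-second averages:", [round(w, 3) for w in windows])
```

The second by second view shows a fully saturated disk for two seconds; the 15 second view shows a disk that was roughly 13% busy in one window and idle in the other.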
You can get around at least the issue of averages with something like the Cloudflare eBPF exporter (see also Cloudflare's blog post on it). If other burst events matter to you, you could probably build some infrastructure that would capture them in histograms in a similar way.
(Histograms that capture down to single exceptional events are really the way to go if you care a lot about this, because even a second by second view is still an average over that second. However you're a lot more likely to see things in a second by second view than in a 15, 30, or 60 second one, assuming that you can spot the exceptions as they flow by.)
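As a small illustration of why histograms catch what averages hide, here is a Python sketch that puts a set of invented IO latencies into buckets. The bucket boundaries and latencies are assumptions chosen for the example, and the counts are per bucket rather than cumulative as real Prometheus histograms are; this is not how the eBPF exporter is implemented.

```python
import bisect

# Hypothetical bucket upper bounds, in seconds.
BUCKETS = [0.001, 0.01, 0.1, 1.0]


def histogram(latencies, bounds=BUCKETS):
    """Count latencies per bucket; the final slot holds anything above the last bound."""
    counts = [0] * (len(bounds) + 1)
    for latency in latencies:
        # bisect_left puts a value equal to a bound into that bound's bucket.
        counts[bisect.bisect_left(bounds, latency)] += 1
    return counts


# Invented data: 999 fast IOs at 2 ms and a single very slow one at 500 ms.
latencies = [0.002] * 999 + [0.5]

mean_latency = sum(latencies) / len(latencies)
counts = histogram(latencies)
print(f"mean latency: {mean_latency * 1000:.2f} ms")
print("bucket counts:", counts)
```

The mean comes out at about 2.5 ms, which looks entirely healthy, while the histogram still has one IO sitting by itself in the bucket between 0.1 and 1 second.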