Wandering Thoughts archives

2021-06-17

Apache directory indexes will notice and exclude blocked URLs in them

Today I learned about an interesting and nice little Apache feature. If you have Apache generating its own automatic index pages for filesystem directories, and you block access to some things in a directory with <Location> blocks, Apache's generated index won't include what you blocked. It's as if the object doesn't exist. This is what you want (since attempts to access those things will fail), but it's more than I expected.

That sounds abstract, so let me make it concrete. We have an old legacy FTP site, which we've recently made available as an HTTPS site because browsers are removing support for FTP. For historical reasons, this FTP site has some symbolic links that create recursive structures; for example, it has a public_html symlink in the root that points to '.' (the current directory). Unfortunately, web spiders just love recursive structures and will crawl through them incessantly, with ever-lengthening URLs.

(Web spider operators will probably tell you that they don't like recursive link situations like this. I have to go by observed behavior, which is that any number of web spiders don't appear to notice that /public_html/ is exactly the same content as / and /public_html/public_html/ and so on.)

We don't want to remove the symbolic links from the actual directory tree that's the FTP site, for various reasons (maybe they're there for some necessary reason, or at least have become embedded in historical FTP URLs). But the HTTPS site is new and we can drop whatever URLs we want from it. So I did the obvious simple thing:

<Location "/public_html">
  Order deny,allow
  Deny from all
</Location>

When I was verifying that this worked, I noticed that the top level index page for the FTP site no longer showed any public_html entry. Testing showed that this happened for other entries in the directory as well, if I temporarily added them.
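A check like this can also be scripted. The following is a minimal sketch, assuming you have already fetched the index page's HTML by some means; the helper names and the sample HTML are made up for illustration, and real mod_autoindex output is laid out differently.

```python
from html.parser import HTMLParser


class _LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def index_entries(html_text):
    """Return the link targets found in an index page, in page order."""
    parser = _LinkCollector()
    parser.feed(html_text)
    return parser.hrefs


def is_listed(html_text, name):
    """True if some link target is `name`, ignoring a trailing slash."""
    wanted = name.rstrip("/")
    return any(href.rstrip("/") == wanted for href in index_entries(html_text))


# Hypothetical, heavily simplified index page (not real Apache output).
SAMPLE_INDEX = """
<html><body><h1>Index of /</h1>
<ul><li><a href="pub/">pub/</a></li>
<li><a href="README">README</a></li></ul>
</body></html>
"""

listed = is_listed(SAMPLE_INDEX, "public_html")
```

With the block in place, is_listed() on the real top level index should come back false for public_html while other entries still show up.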

The mod_autoindex documentation suggests that this is a standard feature that does general permission checks, based on the documentation for the ShowForbidden option to the IndexOptions directive. However, I haven't tested this with more complex situations, such as <Directory> instead of <Location> or more complicated permissions.

web/ApacheIndexesSeeBlocks written at 23:47:20

In Prometheus queries, on and ignoring don't drop labels from the result

Today I learned that one of the areas of PromQL (the query language for Prometheus) that I'm still a bit weak on is when labels will and won't get dropped from metrics as you manipulate them in a query. So I'll start with the story.

Today I wrote an alert rule to make sure that the network interfaces on our servers hadn't unexpectedly dropped down to 100 Mbit/second (instead of 1Gbit/s or for some servers 10Gbit/s). We have a couple of interfaces on a couple of servers that legitimately are at 100M (or as legitimately as a 100M connection can be in 2021), and I needed to exclude them. The speed of network interfaces is reported by node_exporter in node_network_speed_bytes, so I first wrote an expression using unless and all of the labels involved:

node_network_speed_bytes == 12500000 unless
  ( node_network_speed_bytes{host="host1",device="eno2",...} or
    node_network_speed_bytes{host="host2",device="eno1",...} )

However, most of the standard labels you get on metrics from the host agent (such as job, instance, and so on) are irrelevant and even potentially harmful to include (the full set of labels might have to change someday). The labels I really care about are the host and the device. So I rewrote this as:

node_network_speed_bytes == 12500000 unless on(host,device) [....]

When I wrote this expression I wasn't sure if it was going to drop all other labels besides host and device from the filtered end result of the PromQL expression. It turns out that it didn't; the full set of labels for node_network_speed_bytes is passed through, even though we're only matching on some of them in the unless.

(The host and the device are all that I needed for the alert message so it wouldn't have been fatal if the other labels were dropped. But it's better to retain them just in case.)
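To make the behavior concrete, here is a toy Python model of 'unless on(host,device)'. It is only a sketch of the matching semantics for the unless set operator, not Prometheus code; other binary operators are not covered, and the job and instance values are made up.

```python
# Toy model of PromQL's `unless on(...)`; each sample is a dict of labels.
# This sketches the matching semantics only and is not Prometheus code.

def unless_on(left, right, on_labels):
    """Keep left-hand samples that have no right-hand match on `on_labels`.

    Surviving samples are returned with their full label sets untouched.
    """
    def key(labels):
        # A missing label is treated like an empty value.
        return tuple(labels.get(name, "") for name in on_labels)

    excluded = {key(labels) for labels in right}
    return [labels for labels in left if key(labels) not in excluded]


# Hypothetical series; the job and instance values are invented.
left = [
    {"host": "host1", "device": "eno2", "job": "node", "instance": "host1:9100"},
    {"host": "host3", "device": "eno1", "job": "node", "instance": "host3:9100"},
]
# The right-hand side only needs the labels being matched on.
right = [{"host": "host1", "device": "eno2"}]

result = unless_on(left, right, ["host", "device"])
```

The one surviving sample (host3, eno1) still carries job and instance, because the matching labels only decide which samples are filtered out, not what those samples look like afterward.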

Aggregation operators discard all labels unless you use without or by to say which ones to keep, as covered by their documentation (although it's not phrased that way), since aggregating over labels is their purpose. As I've found out, careless use of aggregation operators can lose labels that are valuable for alerts (which may be what left me jumpy about this case). Aggregation over time keeps all labels, though, because it's aggregating over time instead of over some or all labels. But as I was reminded today (since I'm sure I've seen it before), vector matching using on and ignoring doesn't drop labels; it merely restricts what labels are used in the matching (and then it's up to you to make sure you still have a one to one vector match or at least a match that you expect; I've made mistakes there).
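For contrast, here is a toy Python model of how sum with by and without treats labels. It is a sketch under simplified assumptions (samples are label dicts paired with a value, and the series are hypothetical), not how Prometheus implements aggregation.

```python
from collections import defaultdict


def sum_by(samples, by_labels):
    """Toy `sum by (...)`: group on `by_labels` and drop every other label."""
    totals = defaultdict(float)
    for labels, value in samples:
        group = tuple((name, labels[name]) for name in by_labels if name in labels)
        totals[group] += value
    return [(dict(group), total) for group, total in totals.items()]


def sum_without(samples, without_labels):
    """Toy `sum without (...)`: drop only `without_labels`, keep the rest."""
    totals = defaultdict(float)
    for labels, value in samples:
        group = tuple(sorted(
            (name, val) for name, val in labels.items()
            if name not in without_labels
        ))
        totals[group] += value
    return [(dict(group), total) for group, total in totals.items()]


# Hypothetical samples: one host with a 1G and a 100M interface.
samples = [
    ({"host": "host1", "device": "eno1", "job": "node"}, 125000000.0),
    ({"host": "host1", "device": "eno2", "job": "node"}, 12500000.0),
]

plain = sum_by(samples, [])                 # like sum(...): no labels survive
by_host = sum_by(samples, ["host"])         # only host survives
no_device = sum_without(samples, ["device"])  # host and job survive
```

In this model, a plain sum leaves an empty label set, by (host) leaves only host, and without (device) leaves host and job, which is the sort of label loss that can hurt in alert messages.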

(You can also explicitly pull in additional labels from other metrics.)

There may be other cases in PromQL where labels are dropped, but if so I can't think of them right now. My overall moral is that I still need to test my assumptions and guesses in order to be sure about this stuff.

Sidebar: Why I used unless (... or ...) in this query

In many cases, the obvious way to exclude some things from an alert rule expression is to use negative label matches. However, these can't match on the combination of several labels instead of the value of a single label. As far as I know, if you want to exclude only certain label combinations (here 'host1 and eno2' and 'host2 and eno1') where the individual label elements can occur separately (so host1 and host2 both have other network interfaces, and other hosts have eno1 and eno2 interfaces), you're stuck with the more awkward construction I used. This construction is unfortunately somewhat brute force.
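The difference can be sketched with a toy Python model. The functions and sample series here are hypothetical illustrations of the semantics, not PromQL itself: negative matchers are ANDed together, so a selector like {host!="host1",device!="eno2"} throws out everything on host1 and every eno2 anywhere.

```python
def negative_match(samples, **excluded):
    """Toy `{label!="value", ...}`: every negative matcher must hold,
    so a sample is dropped if ANY listed label has the excluded value."""
    return [
        labels for labels in samples
        if all(labels.get(name, "") != value for name, value in excluded.items())
    ]


def exclude_pairs(samples, pairs):
    """Drop only samples whose (host, device) combination is in `pairs`."""
    return [
        labels for labels in samples
        if (labels.get("host", ""), labels.get("device", "")) not in pairs
    ]


# Hypothetical interfaces; only host1's eno2 should be excluded.
samples = [
    {"host": "host1", "device": "eno1"},
    {"host": "host1", "device": "eno2"},
    {"host": "host3", "device": "eno2"},
]

too_broad = negative_match(samples, host="host1", device="eno2")
precise = exclude_pairs(samples, {("host1", "eno2")})
```

Here too_broad ends up empty, because every sample is on host1 or is an eno2, while precise keeps host1's eno1 and host3's eno2. The pair-based exclusion corresponds to what the unless (... or ...) construction does.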

sysadmin/PrometheusOnIgnoringAndLabels written at 00:40:10

