Web servers should refuse requests for random, unnecessary URLs

July 4, 2023

We periodically check our own networks with an (open source) vulnerability scanner, whose rules get updated from time to time. Recently a scan report lit up with a lot of findings to the effect of 'a home directory is accessible via this web server' for our machines. The web servers in question were all on port 9100, and the security scanner raised this alert because it could successfully request a couple of URLs like '/.bash_history' from them.

As you might guess, this is a false positive. On our machines, TCP port 9100 is where the Prometheus host agent listens so that it can be scraped by our Prometheus server, and it definitely wasn't serving anyone's home directories (although the host agent is an HTTP server, because HTTP is basically the universal protocol at this point). What was happening instead is that the Prometheus host agent's HTTP server code will give you an HTTP 200 answer (with a generic front page) for any URL except the special URL for its metrics endpoint. Since the security scanner asked for various URLs like '/.bash_history' and got an HTTP 200 response, it decided each of the machines it checked on port 9100 had that vulnerability.

Neither party is exactly wrong here, but the result is not ideal. Given that security scanners and other things like them aren't uncommon, my view is that web servers should try to be more selective. A web server like this can actually be selective without even changing the HTML served; all it would need to do is give an HTTP 200 response only for '/' and a 404 (with the same HTML) for everything else that it currently answers with the generic front page. This would have the same functional result (visitors would get a page with the URL of the metrics endpoint), but avoid false positives from security scanners and anything else poking around.

(In practice, web browsers and people mostly don't care about or notice the HTTP return code. The browser presentation of the HTML of an HTTP error page is generally identical to the presentation of the same HTML from a URL that had an HTTP 200 success reply.)

Ideally, the APIs of web service libraries would make it easy to do this. Here, the Go net/http ServeMux API is less than ideal, since there's no simple way to register something that handles only the root URL and not everything under it. Instead, your request handler has to specifically check for this case (as covered in the example for ServeMux's Handle() method).

PS: Security scanners and other tools could adopt various heuristics to detect this sort of situation and reduce false positives, but ultimately they're only heuristics, which means they'll always be incomplete and sometimes may be wrong. Dealing with this in the web server is the better way.


Comments on this page:

FWIW, there is a discussion about making the Go ServeMux easier to use in this regard: https://github.com/golang/go/discussions/60227

Neither party is exactly wrong here, but the result is not ideal.

To me it seems each party is wrong here. 🙂

all it would need to do is only give a HTTP 200 response for '/' and then a 404 (with the same HTML) for everything else

Why even respond 200 at “/” though? The only reason I can think of is if you specifically wish for the links on the default response page to get crawled by search engines, which does not seem to me to apply here. So the fix amounts to a change of one constant.

By cks at 2023-07-05 17:03:35:

In the case of the Prometheus host agent, the front page tells any human visitor what's on the port, what the running version is, and gives a link to the metrics endpoint URL. I think these are all sensible things to provide; as a system administrator I certainly appreciate being able to identify 'what HTTP thing is on this port on server X' by asking it.

(Some of the Prometheus agents report more elaborate things on their front page, but announcing what you are seems to be relatively standard behavior (although not universal).)

This reminds me of a personal bugbear with the RHEL httpd package, which is the inverse situation: OOTB it’s configured to serve a “hello” page on / via an error handler, so you get an error code for a success.

Written on 04 July 2023.


Last modified: Tue Jul 4 22:28:53 2023