Web servers should refuse requests for random, unnecessary URLs

July 4, 2023

We periodically check our own networks with an (open source) vulnerability scanner, whose rules get updated from time to time. Recently a scan report lit up with many findings to the effect of 'a home directory is accessible via this web server' for our machines. The web servers in question were all on port 9100, and the scanner raised this alert because it could successfully request a couple of URLs like '/.bash_history' from them.

As you might guess, this is a false positive. On our machines, TCP port 9100 is where the Prometheus host agent listens so that it can be scraped by our Prometheus server, and it definitely wasn't serving anyone's home directories (although the host agent is an HTTP server, because HTTP is basically the universal protocol at this point). What was happening instead is that the Prometheus host agent's HTTP server code will give you an HTTP 200 answer (with a generic front page) for any URL except the special URL for its metrics endpoint. Since the security scanner asked for various URLs like '/.bash_history' and got an HTTP 200 response each time, it decided that each of the machines it checked on port 9100 had that vulnerability.

Neither party is exactly wrong here, but the result is not ideal. Given that security scanners and other things like them aren't uncommon, my view is that web servers should be more selective about which URLs they answer with an HTTP 200. A web server like this can be selective without even changing the HTML it serves; all it needs to do is give an HTTP 200 response only for '/' and a 404 (with the same HTML) for every other URL that it currently answers with the generic front page. This would have the same functional result (visitors would still get a page with the URL of the metrics endpoint), but it would avoid false positives from security scanners and anything else poking around.
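As a rough illustration of the idea, here is a minimal Go sketch of a handler that sends the same HTML for every URL but only answers '/' with a 200. This is not the Prometheus host agent's actual code; the frontPage contents and the serveFrontPage and statusFor names are made up for this example, and it exercises the handler in memory with net/http/httptest instead of listening on a port.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// frontPage is a hypothetical stand-in for the generic page that
// points visitors at the metrics endpoint.
const frontPage = `<html><body><p>Metrics are at <a href="/metrics">/metrics</a>.</p></body></html>`

// serveFrontPage sends the same HTML for every URL, but only "/"
// gets a 200; everything else gets a 404 status.
func serveFrontPage(w http.ResponseWriter, r *http.Request) {
	status := http.StatusNotFound
	if r.URL.Path == "/" {
		status = http.StatusOK
	}
	w.Header().Set("Content-Type", "text/html; charset=utf-8")
	w.WriteHeader(status)
	fmt.Fprint(w, frontPage)
}

// statusFor runs one GET request through the handler in memory
// and reports the HTTP status code it answered with.
func statusFor(path string) int {
	rec := httptest.NewRecorder()
	serveFrontPage(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	for _, path := range []string{"/", "/.bash_history"} {
		fmt.Println(path, statusFor(path))
	}
}
```

A person who mistypes a URL still sees the pointer to the metrics endpoint, while a scanner asking for '/.bash_history' gets a 404 and has nothing to flag.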

(In practice, web browsers and people mostly don't care about or notice the HTTP return code. The browser presentation of the HTML of an HTTP error page is generally identical to the presentation of the same HTML from a URL that had an HTTP 200 success reply.)

Ideally, the APIs of web service libraries would make it easy to do this. Here, the Go net/http ServeMux API is less than ideal, since there's no simple way to register something that handles only the root URL and not everything under it. Instead, your request handler has to specifically check for this case (as covered in the example for ServeMux's Handle() method).
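To make the ServeMux situation concrete, here is a small sketch in the style of that documentation example. The "/" pattern matches every path that no other pattern claims, so the handler itself has to reject anything that isn't exactly the root. The newMux and probe names, the '/metrics' path, and the response bodies are placeholders of mine, and http.NotFound replies with a short plain-text 404 instead of reusing the front page HTML.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newMux builds a ServeMux with a (placeholder) metrics endpoint
// and a front page that is only served for the root URL.
func newMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "# metrics would go here")
	})
	// "/" is a catch-all pattern, so the handler must check the
	// path itself and turn everything except the root into a 404.
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path != "/" {
			http.NotFound(w, r)
			return
		}
		fmt.Fprintln(w, "Metrics are at /metrics")
	})
	return mux
}

// probe sends one in-memory GET request through the mux and
// returns the status code of the response.
func probe(mux *http.ServeMux, path string) int {
	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	mux := newMux()
	for _, path := range []string{"/", "/metrics", "/.bash_history"} {
		fmt.Println(path, probe(mux, path))
	}
}
```

Without the explicit path check, the handler registered for "/" would happily answer '/.bash_history' with a 200, which is exactly the behavior that tripped the scanner.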

PS: Security scanners and other tools could adopt various heuristics to detect this sort of situation and reduce false positives, but ultimately they're only heuristics, which means they'll always be incomplete and sometimes may be wrong. Dealing with this in the web server is the better way.
