
2024-10-28

The question of whether to still allow HTTP/1.0 requests or block them

Recently, I discovered something and noted it on the Fediverse:

There are still a small number of things making HTTP/1.0 requests to my techblog. Many of them claim to be 'Chrome/124.<something>'. You know, I don't think I believe you, and I'm not sure my techblog should still accept HTTP/1.0 requests if all or almost all of them are malicious and/or forged.

The pure, standards-compliant answer is that of course you should still allow HTTP/1.0 requests. It remains a valid standard, some software apparently still defaults to it, and part of the web's strength is its backward compatibility.

The pragmatic answer starts with the observation that HTTP/1.1 is now 25 years old, and any software that is talking HTTPS to you is demonstrably able to deal with standards more recent than that (generally much more recent, as sites require TLS 1.2 or better). And as a practical matter, pure HTTP/1.0 clients can't talk to many websites, because those websites are name-based virtual hosts where the web server software absolutely requires an HTTP Host header before it will serve the website to you (the Host header is an HTTP/1.1 requirement; a pure 1.0 client doesn't send one). If you leave out the Host header, at best you will get some random default site, perhaps a stub site.
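If you're curious what your own web server does with such requests, here's a minimal Python sketch that sends a Host-less HTTP/1.0 request over HTTPS (the hostname is a placeholder; note that Python's ssl module still sends TLS SNI, so this exercises only the missing Host header):

    import socket, ssl

    host = "example.org"  # placeholder: the site you want to test

    ctx = ssl.create_default_context()
    with socket.create_connection((host, 443)) as sock:
        # server_hostname sets TLS SNI; a genuinely ancient client
        # wouldn't send that either, but here we only simulate the
        # missing Host header.
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            # A bare HTTP/1.0 request with no Host header at all.
            tls.sendall(b"GET / HTTP/1.0\r\n\r\n")
            print(tls.recv(4096).decode("latin-1").split("\r\n")[0])

Whether you get the right site, a default stub, or an outright error is up to your server's configuration.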

(In an HTTPS context, web servers will also require TLS SNI, and some will give you errors if the HTTP Host doesn't match the TLS SNI or is missing entirely. This also makes HTTP/0.9 requests, which can't carry any headers at all, not very useful these days.)
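As a companion sketch, you can also send a Host header that deliberately disagrees with the TLS SNI and see how a server reacts; depending on the software you may get a 421 Misdirected Request, some other 4xx error, or simply the default site (both hostnames are placeholders):

    import socket, ssl

    host = "example.org"        # placeholder: the site you're testing
    bad_host = "wrong.invalid"  # placeholder: deliberately mismatched Host

    ctx = ssl.create_default_context()
    with socket.create_connection((host, 443)) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            req = ("GET / HTTP/1.1\r\nHost: %s\r\n"
                   "Connection: close\r\n\r\n" % bad_host)
            tls.sendall(req.encode("ascii"))
            print(tls.recv(4096).decode("latin-1").split("\r\n")[0])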

If HTTP/1.0 requests were merely somewhere between a partial lie (in that everything that worked was actually supplying a Host header too) and useless (for things that didn't supply a Host), you could simply leave them be, especially if the volume was low. But my examination suggests strongly that approximately everything that is making HTTP/1.0 requests to Wandering Thoughts is actually up to no good; at a minimum they're some form of badly coded stealth spiders, quite possibly from would-be comment spammers that are trawling for targets. On a spot check, this seems to be true of another web server as well.

(A lot of the IPs making HTTP/1.0 requests provide claimed User-Agent headers that include ' Not-A.Brand/99 '. That token appears to come from the GREASE-style brand randomization Chrome performs in its Sec-CH-UA client hints headers, not from the User-Agent header; real Chrome user-agent strings don't include it, which is one more sign that these are forgeries.)

My own answer is that for now at least, I've blocked HTTP/1.0 requests to Wandering Thoughts. I'm monitoring what User-Agents get blocked, partly so I can perhaps exempt some if I need to, and it's possible I'll rethink the block entirely.
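Wandering Thoughts is served by a Python WSGI application, so purely as an illustration, here's roughly what such a block could look like as WSGI middleware. This is a sketch with my own names, not DWiki's actual code; it logs the claimed User-Agent of everything it rejects, which is the monitoring part:

    import sys

    def block_http10(app):
        # WSGI middleware that rejects HTTP/1.0 requests outright and
        # logs the claimed User-Agent so exemptions can be considered.
        def middleware(environ, start_response):
            if environ.get("SERVER_PROTOCOL") == "HTTP/1.0":
                print("blocked HTTP/1.0 from %s, User-Agent: %s"
                      % (environ.get("REMOTE_ADDR", "?"),
                         environ.get("HTTP_USER_AGENT", "-")),
                      file=sys.stderr)
                start_response("403 Forbidden",
                               [("Content-Type", "text/plain")])
                return [b"HTTP/1.0 requests are not accepted here.\n"]
            return app(environ, start_response)
        return middleware

If your site sits behind Apache or nginx instead, both can match on the request protocol and reject such requests before they ever reach your application.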

(Before you do this, you should certainly look at your own logs. I wouldn't expect there to be very many real HTTP/1.0 clients still out there, but the web has surprised me before.)
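One quick way to take that look is to tally the claimed User-Agents of HTTP/1.0 requests in your access logs. Here's a sketch for a 'combined'-format log; the log path and exact field layout are assumptions you'll want to adjust:

    import collections, re

    logpath = "/var/log/nginx/access.log"  # assumption: your access log
    pat = re.compile(r'"\w+ \S+ HTTP/1\.0" .*"([^"]*)"\s*$')

    counts = collections.Counter()
    with open(logpath, encoding="latin-1") as f:
        for line in f:
            m = pat.search(line)
            if m:
                counts[m.group(1)] += 1  # tally by claimed User-Agent
    for ua, n in counts.most_common(10):
        print("%6d  %s" % (n, ua))

Any real volume, or a User-Agent you recognize as legitimate, is a reason to hesitate before putting a block in place.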

web/HTTP10BlockQuestion written at 22:28:11;

