2016-11-04
Caution is a mistake in modern web servers and apps
When I wrote DWiki (the code behind Wandering Thoughts), I made it pretty cautious and conservative about what it accepted instead of rejecting, and how it handled any number of things. POSTs to GET-only URLs, or GETs of POST-only URLs? Rejected with errors. Unexpected query parameters on GETs? Rejected with an error. An If-Modified-Since time that didn't exactly match the resource's current modification timestamp? Well, better declare that a conditional GET miss and give you the full resource (cf). And so on.
If I was doing a web app from scratch today, I wouldn't do that. Send me the wrong sort of operation or a nonsensical one that I don't understand? Send me query parameters I've never heard of? Whatever, have an HTTP redirection to the canonical URL. Maybe this is not what you actually wanted, but if so it's not my problem; you're the one sending the broken request, and I'm giving you an answer that's more useful (to a person) than a 4xx or a 5xx. Send me a random If-Modified-Since? I'll do a time-based comparison on it if I can, and if that results in you not realizing there's a new version of the page, well, you could have used an ETag-based check instead.
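As a rough illustration of the permissive approach described above, here is a minimal Python sketch. It is not DWiki's code. The function names, the KNOWN_PARAMS set, and the choice of 303 and 302 as redirect statuses are all assumptions made for the example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical: the only query parameters this imaginary app understands.
KNOWN_PARAMS = {"page", "format"}


def canonical_url(url):
    """Drop query parameters we have never heard of (and any fragment)."""
    parts = urlsplit(url)
    kept = [(k, v)
            for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k in KNOWN_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), ""))


def lenient_dispatch(method, url):
    """Return (status, location) for a page that only supports GET/HEAD.

    Instead of answering odd requests with a 4xx, redirect the client
    to the canonical URL.
    """
    canon = canonical_url(url)
    if method.upper() not in ("GET", "HEAD"):
        # Wrong sort of operation: 303 asks the client to GET the page.
        return 303, canon
    if canon != url:
        # Unknown query parameters: send the client to the clean URL.
        return 302, canon
    return 200, None


def is_not_modified(ims_header, last_modified):
    """Time-based If-Modified-Since check rather than an exact match.

    last_modified must be a timezone-aware datetime. Returns True when
    a 304 could be sent; an unparseable header is treated as a miss.
    """
    if not ims_header:
        return False
    try:
        since = parsedate_to_datetime(ims_header)
    except (TypeError, ValueError):
        return False
    if since.tzinfo is None:
        # Assumption: treat a zone-less date as UTC.
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds.
    return last_modified.replace(microsecond=0) <= since
```

With these assumptions, a POST to /blog/entry gets a 303 back to /blog/entry, a GET of /blog/entry?bogus=1 gets a 302 to /blog/entry, and any If-Modified-Since date at or after the page's modification time counts as a hit.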
You might wonder why I say this. Well, sadly it's simple. The reality is that on the modern web, being cautious about what you accept is a mistake.
The modern web is a mess in practice and part of that mess is that people will cheerfully write and distribute software that shoves all sorts of crazy, sloppy, stupid things at your server. Sometimes it's just an accident, which goes unnoticed and unfixed for the usual reason (namely that almost nothing else notices or complains). Sometimes it's a deliberate choice because they can usually get away with it and use it for something useful (to them). Sometimes it's folklore that people are blindly following. And honestly, it doesn't really matter why it happens, just that it does and it affects real software used by real people.
(Some of it is the malicious attempting to attack your server, but so what? Everyone gets attacked all the time on today's Internet.)
As you might guess, there is a precipitating incident that led me to write this entry. To wit, today I saw some POST requests (to GET-only pages) with the content type of text/ping. This is apparently a new (proposed) standard.
Yes, really.
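For concreteness, a server could recognize requests like these with something along the following lines. This is only a sketch: the function name is made up, and it assumes the request headers arrive as a plain dict of strings.

```python
def is_text_ping(method, headers):
    """Return True for a POST whose Content-Type is text/ping.

    headers is assumed to be a plain dict; header names are matched
    case-insensitively and any parameters after ';' are ignored.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    content_type = lowered.get("content-type", "")
    media_type = content_type.split(";", 1)[0].strip().lower()
    return method.upper() == "POST" and media_type == "text/ping"
```

What to do once such a request is recognized is a separate decision. The check only separates these POSTs from ordinary form submissions.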
I could write a lot here, as I have in the past (I'll spare you links to the entries), but there's no point. I give up. You win, modern web, or more exactly you have simply run everyone over like a giant lawnmower. Given that there are browsers out there implementing this today and sending these POST requests to my pages here, my views are irrelevant.
(With that said, I don't have any plans to change DWiki's current cautious behaviors. That would be more work than just leaving it alone.)