What you should do about extra query parameters on your URLs

September 8, 2020

My entry on how web server laxness created a de facto requirement to accept arbitrary query parameters on your URLs got a number of good comments, and I want to agree with and amplify their suggestion about what to do with these parameters. First off, you shouldn't reject web page requests with extra query parameters. I also believe that you shouldn't just ignore them and serve the regular version of your web page. Instead, as several commentators said, you should answer with an HTTP redirect to the canonical URL of the web page, which will be stripped of at least the extra query parameters.

(I think that this should be a permanent HTTP redirect instead of a temporary one for reasons that don't fit within the margins of this entry. Also, this assumes that you're dealing with a GET or a HEAD request.)
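As a rough illustration of the redirect approach, here is a minimal Python sketch. It is not DWiki's actual code. The function name `canonical_redirect` and the set `KNOWN_PARAMS` are made up for this example, and it assumes that 'canonical' simply means 'only the query parameters the page itself understands'.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical: the query parameters this page actually understands.
KNOWN_PARAMS = {"page", "showcomments"}


def canonical_redirect(method, url, known=KNOWN_PARAMS):
    """Return (status, location) if we should redirect, else None.

    None means 'handle the request normally'.
    """
    # Only GET and HEAD requests are safe to bounce to another URL.
    if method not in ("GET", "HEAD"):
        return None
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    # Keep the parameters we recognize, in their original order.
    kept = [(k, v) for k, v in pairs if k in known]
    if len(kept) == len(pairs):
        # Nothing extra was tacked on, so this is already canonical.
        return None
    canonical = urlunsplit(parts._replace(query=urlencode(kept)))
    # 301 is a permanent redirect, per the note above.
    return (301, canonical)
```

Dropping only the unrecognized parameters means that a legitimate `?page=2` survives the redirect while things like `fbclid` and `utm_source` go away.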

Answering with an HTTP redirect instead of the page has two useful effects, as pointed out by commentators on that entry. First, any web search engines that are following those altered links won't index duplicate versions of your pages and get confused about which is the canonical one (or downrank you in results for having duplicate content). Second, people who copy and reshare the URL from their browser will be sharing the canonical URL, not the messed-up version with tracking identifiers and other gunk. This assumes that you don't care about those tracking identifiers, but I think this is true for most of my readers.

(In addition, you can't count on other people's tracking identifiers to be preserved by third parties when your URLs get re-shared. If you want to track that sort of stuff, you probably need to add your own tracking identifier. You might care about this if, for example, you wanted to see how widely a link posted on Facebook spread.)

However, this only applies to web pages, not to API endpoints. Your API endpoints (even GET ones) should probably error out on extra query parameters unless there is some plausible reason they would ever be usefully shared through social media. If your API endpoints never respond with useful HTML to bare GETs, that exception probably doesn't apply to you. If you nevertheless see a lot of requests with extra query parameters hitting your endpoints, you might make them answer with HTTP redirects to your API documentation or something like that instead of some 4xx error status.

(But you probably should also try to figure out why people are sharing the URLs of your API endpoints on social media, and other people are copying them. You may have a documentation issue.)
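For the API side, the strict approach might look something like the following sketch. The name `check_api_query` is invented for illustration, and picking 400 as the specific 4xx status is my assumption.

```python
from urllib.parse import parse_qsl


def check_api_query(query_string, allowed):
    """Return (status, message) for an API request's query string.

    'allowed' is the set of parameter names this endpoint accepts.
    """
    names = {k for k, _ in parse_qsl(query_string, keep_blank_values=True)}
    extra = sorted(names - set(allowed))
    if extra:
        # Error out instead of silently ignoring the parameters.
        return (400, "unexpected query parameters: " + ", ".join(extra))
    return (200, "ok")
```

Naming the offending parameters in the error message gives whoever is debugging the client some idea of what went wrong.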

PS: As you might suspect, this is what DWiki does, at least for the extra query parameters that it specifically recognizes.

