Redirecting paths that start with two slashes in Apache

July 8, 2021

Suppose that you have a dynamic web application of some sort that sits behind Apache, and an URL for one of your application's pages where the path starts with two slashes instead of one starts going around. In other words, people are sharing 'https://example.org//app/page', instead of the version with one slash at the start of the path. You can support this in your web app (with some potential cautions), but you would prefer to have Apache redirect the non-ideal URL to the canonical URL.

Under normal circumstances, this sort of selective redirection should be straightforward using mod_rewrite. If you wanted to rewrite only a single specific bad path instead of all potential ones, I'd expect something like the following to work:

RewriteCond %{REQUEST_URI} "=//app/page"
RewriteRule ^.* https://example.org/app/page

(This is one of the cases where exact string matching in RewriteCond is a useful thing.)

However, this appears not to work in at least a .htaccess file, and I suspect it won't work in the Apache configuration file either. Although Apache's %{REQUEST_URI} gives you the full URL path even in a .htaccess, it appears that Apache canonicalizes it for the purposes of things like RewriteCond and so turns the two leading slashes into one. This canonicalization isn't passed through to CGIs, though; they will see the original "//app/page" version.

(This canonicalization appears to apply to any / in the URL path, not just the first ones. If you write a condition for "/app/dir/page", it will match for URLs with any amount of additional slashes, eg "//app////dir///page" will match.)

Instead, the only way I found to do this was with Apache's special %{THE_REQUEST} variable for the full HTTP request line. As the mod_rewrite documentation covers, this value has not been escaped, which may cause you heartburn if people get clever or you want to do general matching. So the rule I wound up with looks like:

RewriteCond %{THE_REQUEST} "^GET //app/page "
RewriteRule ^.* https://example.org/app/page

This match is very specific, since we're doing a very specific HTTP redirection. You'd want to be more complicated if you need to handle query variables, for example. But it works, unlike the other option.

Possibly I'm missing a clever trick that enables a better version of this. I don't really like matching things so specifically, but it seems to be what you have to reach for in this unusual situation.

Written on 08 July 2021.
« A semi-surprise with Python's urllib.parse and partial URLs
University computer accounts are often surprisingly complicated »

Page tools: View Source, Add Comment.
Search:
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Thu Jul 8 23:57:15 2021
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.