Some notes on rewrites in Apache .htaccess files

May 19, 2009

Since I keep rediscovering this every so often, here's what I know about rewrite rules in .htaccess files so that I can just read it here the next time around.

Some basics:

  • you need a 'RewriteEngine on' statement, even if the rewrite engine is already on in the main configuration.

  • the 'URLs' that you match against in RewriteRule are relative to the directory the .htaccess is in. However, the Apache variables that you use in RewriteCond are not relative to the directory: %{REQUEST_URI} is the full URL path, and %{REQUEST_FILENAME} is the full filesystem path. This makes sense, but it does mean that you have to keep track of which is which.
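
As a rough model of the relative-versus-full distinction, here is a small Python sketch. It assumes the .htaccess lives in a directory served at the hypothetical URL prefix /app/, and it only imitates the prefix stripping; it is not how Apache actually implements it. The function name rule_input is made up for illustration.

```python
def rule_input(request_uri, dir_url_prefix):
    """Return the path a per-directory RewriteRule pattern would see.

    request_uri    -- the full URL path, as in Apache's %{REQUEST_URI}
    dir_url_prefix -- hypothetical URL prefix of the directory that
                      holds the .htaccess, with a trailing slash
    """
    if not request_uri.startswith(dir_url_prefix):
        raise ValueError("request is outside this directory")
    # The directory prefix is stripped, so patterns start at 'foo', not '/app/foo'.
    return request_uri[len(dir_url_prefix):]


# RewriteRule would be matched against 'foo/bar', while a RewriteCond on
# %{REQUEST_URI} would still see '/app/foo/bar'.
print(rule_input("/app/foo/bar", "/app/"))
```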

Suppose that you want to have a 'directory' that is actually a CGI-BIN. There are two ways to do this:

  • make an actual directory, and put a .htaccess in it that has:
    RewriteRule ^(.*)$ /cgis/my-cgi/$1 [PT]

    Apache itself will then handle generating a redirect for people who ask for the directory without the trailing slash; your CGI-BIN does not have to worry about it.

  • put a .htaccess in the directory that is one level up. This should have something like:
    RewriteRule ^foo$ /cgis/my-cgi [PT]
    RewriteRule ^foo/(.*)$ /cgis/my-cgi/$1 [PT]

    Your CGI will have to generate the redirect when people ask for the directory without the trailing slash (or, well, do whatever you want with their requests); Apache won't do anything special for you.

It is common to implement the latter approach with a single rewrite rule:

RewriteRule ^foo(.*)$ /cgis/my-cgi/$1 [PT]

However, this is incorrect because it matches too much; it will send any URL in that directory that starts with foo off to your CGI-BIN, including things like a request for 'foobar'.

(You may not care about this. I do, partly because I don't like handing my CGIs URLs that they're not actually supposed to be handling.)
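
To see the difference concretely, here is a Python sketch that imitates the pattern matching and $N substitution of the two-rule version and of the single-rule version. It is only an approximation: Python's re module stands in for Apache's regular expressions, the rewrite and first_match helpers are made up for illustration, and nothing here models the [PT] flag or anything Apache does after the substitution.

```python
import re


def rewrite(pattern, substitution, path):
    """Apply one RewriteRule-like rule; return the new path, or None if no match."""
    m = re.search(pattern, path)
    if m is None:
        return None
    # Expand $N backreferences; an unmatched optional group becomes "".
    return re.sub(r"\$(\d)", lambda ref: m.group(int(ref.group(1))) or "", substitution)


def first_match(rules, path):
    """Return the result of the first rule that matches, like a rule list."""
    for pattern, substitution in rules:
        result = rewrite(pattern, substitution, path)
        if result is not None:
            return result
    return None


TWO_RULES = [(r"^foo$", "/cgis/my-cgi"), (r"^foo/(.*)$", "/cgis/my-cgi/$1")]
ONE_RULE = [(r"^foo(.*)$", "/cgis/my-cgi/$1")]

for path in ("foo", "foo/", "foo/bar", "foobar"):
    print(path, first_match(TWO_RULES, path), first_match(ONE_RULE, path))
```

In this model the two-rule version leaves 'foobar' alone (no rule matches), while the single rule turns it into '/cgis/my-cgi/bar'. The single rule also produces a doubled slash for 'foo/bar', because the captured text already starts with a slash.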

PS: the very similar looking destination '/cgis/my-cgi$1' is very much not what you want; in fact, I believe that it's a security risk, as I think it means that Apache can be tricked into running things like '/cgis/my-cgi.old' with a suitable request.
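
A small Python sketch of why that destination worries me. It only imitates the string substitution for the rule '^foo(.*)$' with the destination '/cgis/my-cgi$1'; whether Apache would then actually run the resulting file depends on your configuration, which this does not model. The rewrite helper is made up for illustration.

```python
import re


def rewrite(pattern, substitution, path):
    """Apply one RewriteRule-like rule; return the new path, or None if no match."""
    m = re.search(pattern, path)
    if m is None:
        return None
    # Expand $N backreferences; an unmatched optional group becomes "".
    return re.sub(r"\$(\d)", lambda ref: m.group(int(ref.group(1))) or "", substitution)


# Without the slash before $1, whatever follows 'foo' in the request is
# glued directly onto the name of the CGI.
print(rewrite(r"^foo(.*)$", "/cgis/my-cgi$1", "foo/bar"))
print(rewrite(r"^foo(.*)$", "/cgis/my-cgi$1", "foo.old"))
```

The second request comes out as '/cgis/my-cgi.old', which names a different file than the CGI you meant to expose.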


Comments on this page:

From 78.35.25.18 at 2009-05-19 04:37:56:
RewriteRule ^foo(/(.*))?$ /cgis/my-cgi/$2 [PT]

Aristotle Pagaltzis

By cks at 2009-05-19 11:29:10:

That's a nicely clever rewrite rule. It does mean that $PATH_INFO will always have a /, even if the user didn't supply one, but you probably don't want to look at $PATH_INFO anyways, since I think that $REQUEST_URI is both superior and always present.

From 78.35.25.18 at 2009-05-19 13:04:26:

Yes; I see that PATH_INFO canonicalisation as a feature. If you don’t need/want that, the rule actually gets simpler:

RewriteRule ^foo(/.*)?$ /cgis/my-cgi$1 [PT]

Aristotle Pagaltzis
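
(A rough Python sketch comparing the two rules from these comments. It imitates only the pattern match and $N substitution, and it assumes that an unmatched optional group expands to an empty string; the rewrite helper is made up for illustration.)

```python
import re


def rewrite(pattern, substitution, path):
    """Apply one RewriteRule-like rule; return the new path, or None if no match."""
    m = re.search(pattern, path)
    if m is None:
        return None
    # Expand $N backreferences; an unmatched optional group becomes "".
    return re.sub(r"\$(\d)", lambda ref: m.group(int(ref.group(1))) or "", substitution)


CANONICAL = (r"^foo(/(.*))?$", "/cgis/my-cgi/$2")  # always adds the slash
SIMPLER = (r"^foo(/.*)?$", "/cgis/my-cgi$1")       # keeps what the user sent

for path in ("foo", "foo/", "foo/bar", "foobar"):
    print(path, rewrite(*CANONICAL, path), rewrite(*SIMPLER, path))
```

In this model neither rule matches 'foobar'. For a bare 'foo', the first rule yields '/cgis/my-cgi/' and the second yields '/cgis/my-cgi', which is the PATH_INFO difference discussed above.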

By cks at 2009-05-19 17:11:33:

I actively want my CGI applications that imitate directories to force a redirection if the user asks for a URL without the slash, instead of just presenting the normal top page.

(In some situations I think that you have to, in order to get relative URLs in your generated page to act right, assuming you use relative URLs at all.)
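
A minimal Python sketch of that redirect check, assuming the CGI imitates a directory at the hypothetical URL path /foo and that the server passes REQUEST_URI, HTTP_HOST and (for SSL) HTTPS in the CGI environment. The function names are made up for illustration, and this is one way to do it rather than the only one.

```python
import os


def redirect_target(request_uri, base="/foo"):
    """Return the slash-terminated URL path to redirect to, or None."""
    path, sep, query = request_uri.partition("?")
    if path == base:
        # Keep any query string when adding the trailing slash.
        return base + "/" + sep + query
    return None


def cgi_redirect(environ, base="/foo"):
    """Return CGI response headers for the redirect, or None if not needed."""
    target = redirect_target(environ.get("REQUEST_URI", ""), base)
    if target is None:
        return None
    scheme = "https" if environ.get("HTTPS") == "on" else "http"
    host = environ.get("HTTP_HOST", "localhost")
    return ("Status: 301 Moved Permanently\r\n"
            f"Location: {scheme}://{host}{target}\r\n\r\n")


if __name__ == "__main__":
    headers = cgi_redirect(os.environ)
    if headers is not None:
        print(headers, end="")
```

A real CGI would go on to generate its normal page when cgi_redirect returns None.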

Written on 19 May 2009.
Last modified: Tue May 19 00:37:56 2009