My mistake in forgetting how Apache .htaccess files are checked
Every so often I get to have a valuable learning experience about
some aspect of configuring and operating Apache. Yesterday I got
to re-learn that Apache .htaccess files are checked and evaluated
in multiple steps, not strictly top to bottom, directive by directive.
This means that certain directives can block some later directives
while other later directives still work, depending on what sort of
directives they are.
(This is the same as the main Apache configuration file, but it's easy to lose sight of this for various reasons, including that Apache has a complicated evaluation order.)
This sounds abstract, so let me tell you the practical story.
Wandering Thoughts sits behind an Apache .htaccess file,
which originally was for rewriting the directory hierarchy to a
CGI-BIN but then grew to also be used for blocking
various sorts of significantly undesirable things. I also have some Apache redirects to
fix a few terrible mistakes in URLs that I accidentally made.
(All of this place is indeed run through a CGI-BIN in a complicated setup.)
Over time, my .htaccess grew bigger and bigger as I added new
rules, almost always at the bottom of the file (more or less).
Things like bad web spiders are mostly recognized and blocked through
Apache rewriting rules, but I've also got a bunch of 'Deny from ...'
rules because that's the easy way to block IP addresses and
IP ranges.
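
As an illustration only, the two styles of block look roughly like the following. The user agent string and the address range are invented placeholders, not the actual rules from this site's .htaccess.

```apache
# Illustrative .htaccess fragment; every value here is a hypothetical placeholder.
RewriteEngine On

# Rewrite-based block (mod_rewrite): refuse any request whose User-Agent
# contains the given string, matched case-insensitively.
RewriteCond %{HTTP_USER_AGENT} ExampleBadSpider [NC]
RewriteRule ^ - [F]

# Address-based block: refuse a whole IP range. 192.0.2.0/24 is a range
# reserved for documentation, used here as a stand-in.
Deny from 192.0.2.0/24
```

The point to notice is that these are handled by different modules. The RewriteCond and RewriteRule lines belong to mod_rewrite, while 'Deny from' is an access control directive (mod_authz_host in Apache 2.2, mod_access_compat in 2.4), so the two are not evaluated as one top-to-bottom list.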
Recently I discovered that a new rewrite-based block that I had added
wasn't working. At first I thought I had some aspect of the syntax
wrong, but in the process of testing I discovered that some other
(rewrite-based) blocks also weren't working, although some definitely
were. Specifically, early blocks in my .htaccess were working but
not later ones. So I started testing block rules from top to bottom,
reading through the file in the process, and came to a line in the
middle:
RewriteRule ^(.*)?$ /path/to/cwiki/$1 [PT]
This is my main CGI-BIN rewrite rule, which matches everything. So of course no rewrite-based rules after it were working because the rewriting process never got to them.
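
A minimal sketch of the problematic ordering, assuming a hypothetical later block on a bogus referer (the hostname is a placeholder). As far as I understand the mod_rewrite documentation, the [PT] flag implies [L], so once the catch-all rule matches, rewrite processing for the request stops there.

```apache
RewriteEngine On

# An early rewrite-based block: evaluated before the catch-all, so it works.
RewriteCond %{HTTP_USER_AGENT} ExampleBadSpider [NC]
RewriteRule ^ - [F]

# The main CGI-BIN rule. It matches every request, and [PT] ends
# rewrite processing for that request.
RewriteRule ^(.*)?$ /path/to/cwiki/$1 [PT]

# A later rewrite-based block (hypothetical): mod_rewrite never gets
# this far, so this rule has no effect.
RewriteCond %{HTTP_REFERER} bogus\.example\.invalid [NC]
RewriteRule ^ - [F]
```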
You might ask why I didn't notice this earlier. Part of the answer
is that not everything in my .htaccess after this line failed to
take effect. I had both 'Deny from ...' and 'RedirectMatch'
rules after this line, and all of those were working fine; it was
only the rewrite-based rules that were failing. So every so often
I had the reassuring experience of adding a new block and looking
at the access logs to see it immediately rejecting an active bad
source of traffic or the like.
(My fix was to move my general rewrite rule to the bottom and then put in a big comment about it, so that hopefully I don't accidentally start adding blocking rules below it again in the future.)
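
The corrected layout can be sketched roughly as follows. This is an assumed arrangement with placeholder values and invented comment wording, not a copy of the real file.

```apache
RewriteEngine On

# All rewrite-based blocks go above the catch-all rule.
RewriteCond %{HTTP_USER_AGENT} ExampleBadSpider [NC]
RewriteRule ^ - [F]

RewriteCond %{HTTP_REFERER} bogus\.example\.invalid [NC]
RewriteRule ^ - [F]

# Address-based blocks are not part of rewrite processing, so their
# position relative to the catch-all does not matter.
Deny from 192.0.2.0/24

# ----------------------------------------------------------------
# WARNING: this rule matches everything and ends rewrite processing.
# Do NOT add RewriteCond/RewriteRule blocks below this line; they
# will never be evaluated.
# ----------------------------------------------------------------
RewriteRule ^(.*)?$ /path/to/cwiki/$1 [PT]
```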
PS: It looks like for a while the only blocks I added below my
CGI-BIN rewrite rule were 'Deny from' blocks. Then at some point
I blocked a bad source both by IP address and by its (bogus)
HTTP referer in a rewrite rule, and at that point the gun was
pointed at my foot.