Apache directory indexes notice and exclude URLs that you've blocked

June 17, 2021

Today I learned about an interesting and nice little Apache feature. If you have Apache generating its own automatic index pages for filesystem directories, and you block access to some things in a directory with <Location> blocks, Apache's generated index won't include what you blocked. It's as if the object doesn't exist. This is what you want (since attempts to access those things will fail), but it's more than I expected.

That sounds abstract, so let me make it concrete. We have an old legacy FTP site, which we've recently made available as an HTTPS site because browsers are removing support for FTP. For historical reasons, this FTP site has some symbolic links that create recursive structures; for example, it has a public_html symlink in the root that points to '.' (the current directory). Unfortunately, web spiders just love recursive structures and will crawl through them incessantly, with ever-lengthening URLs.

(Web spider operators will probably tell you that they don't like recursive link situations like this. I have to go by observed behavior, which is that any number of web spiders don't appear to notice that /public_html/ is exactly the same content as / and /public_html/public_html/ and so on.)

We don't want to remove the symbolic links from the actual directory tree that's the FTP site, for various reasons (maybe they're there for some necessary reason, or at least have become embedded in historical FTP URLs). But the HTTPS site is new and we can drop whatever URLs we want from it. So I did the obvious simple thing:

<Location "/public_html">
  Order deny,allow
  Deny from all
</Location>

When I was verifying that this worked, I noticed that the top-level index page for the FTP site no longer showed any public_html entry. Testing showed that the same thing happened for other entries in the directory, if I temporarily added <Location> blocks for them as well.
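
(The Order and Deny directives used here are the older Apache 2.2 style of access control, which Apache 2.4 only provides through mod_access_compat. On a 2.4 server that relies on mod_authz_core instead, the equivalent block would probably look like the sketch below. This variant is an assumption and was not part of the testing described here, so whether the index hides the entry in the same way has not been verified.)

```apache
# Apache 2.4 style access control (mod_authz_core); this stands in for
# the Order/Deny pair from mod_access_compat.
<Location "/public_html">
  Require all denied
</Location>
```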

The mod_autoindex documentation suggests that this is a standard feature that does general permission checks, based on the documentation for the ShowForbidden option to the IndexOptions directive. However, I haven't tested this with more complex situations, such as <Directory> instead of <Location> or more complicated permissions.
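
For illustration, here is a minimal sketch of what turning this behavior off might look like with ShowForbidden. The /srv/ftp path is a hypothetical stand-in for wherever the FTP tree actually lives, and this wasn't something tried on the site described here.

```apache
# Hypothetical filesystem location of the FTP tree; substitute the real path.
<Directory "/srv/ftp">
  # Generate automatic index pages for directories that have no index file.
  Options +Indexes
  # List entries even when the access check for them comes back
  # as unauthorized or forbidden.
  IndexOptions +ShowForbidden
</Directory>
```

With this in place, a blocked entry such as public_html should show up in the generated index again, although requests for it would still be refused.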

Written on 17 June 2021.

Last modified: Thu Jun 17 23:47:20 2021