2009-08-23
You should not use HTTP request parameters as filenames
One corollary of the danger of over-powerful introspection is that you should not use HTTP request parameters directly as filenames for any reason, not just to avoid hard-coding allowed commands in your web application. Any time that you use a request parameter as a filename, you create an opportunity for an attacker to escape from whatever directory your application is supposed to get its files from and thus to rummage all over your system.
(In fact, you should not directly use the URL as a filename either, because someday an attacker may feed you an HTTP request for a URL with embedded /../ components. In theory your general web server environment should reject such URLs, or fix them up, before they reach your application. In practice this may not happen in some server environments, so it's unwise to count on it in general.)
It pretty much doesn't matter exactly what you're doing with the file; an attacker can probably find some way of exploiting whatever you're doing. (If nothing else, there are usually files that cause undesirable side effects when opened or read from, including blocking your application. Are you prepared to deal with an endless stream of random data, for example?)
The reasonably cautious way to deal with this is to carefully validate the request parameter, rejecting bad things like /../. Part of the difficulty is doing this consistently (you may want to put the check in some low-level 'open a file' routine), and part of it is making sure that you catch all of the special cases on all of the platforms that you support (or that your code will get used on).
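As an illustration, here is a minimal sketch of this sort of validation in Python; the base directory and the function name are assumptions made up for the example, not anything from a real application:

    import os
    import posixpath

    BASEDIR = "/srv/app/files"    # hypothetical directory the files live in

    def safe_open(param):
        # Reject empty names, anything containing a path separator, and
        # anything starting with a dot (which covers '.' and '..').
        if not param or param != posixpath.basename(param) or param.startswith('.'):
            raise ValueError("bad filename parameter: %r" % param)
        path = os.path.join(BASEDIR, param)
        # Belt and braces: check that the resolved path is still inside
        # BASEDIR (this also catches symlinks pointing elsewhere).
        if not os.path.realpath(path).startswith(BASEDIR + os.sep):
            raise ValueError("parameter escapes the base directory: %r" % param)
        return open(path, "rb")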
The really cautious approach is to work the other way. Instead of ultimately validating the request parameter by trying to open a filename based on it, get a directory listing of the directory where all the files are and check that the request parameter matches one of the entries. Although it's more work, this automatically defeats pretty much any weirdness that an attacker can attempt.
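A minimal sketch of this reversed approach, again with a made-up base directory and function name:

    import os

    BASEDIR = "/srv/app/files"    # hypothetical directory of allowed files

    def open_by_listing(param):
        # os.listdir() returns bare file names, so a parameter with '/'
        # or '..' components can never match an actual directory entry.
        if param not in os.listdir(BASEDIR):
            raise ValueError("no such file: %r" % param)
        return open(os.path.join(BASEDIR, param), "rb")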
2009-08-21
The danger of powerful generality, illustrated
When you are coding the framework for a web application, there is a lot of annoying component registration boilerplate. It is tempting to get rid of the boilerplate by using some introspection-based abstraction, and in some ways it's the right thing to do; 'don't repeat yourself' is one of the tenets of modern programming.
When you do this, it's tempting to make it straightforwardly file-based. For example, a web application with various operations could have the framework work out the operation for each request and then attempt to load a file of the same name from a directory somewhere. To add a new operation, just code up all of the necessary routines in an appropriately named file in the directory and you're done, with no additional configuration needed. General, powerful, easy to use, what could be better?
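In Python, the sort of dispatch being described might look roughly like this sketch; the directory, the names, and the exec-based loading are all illustrative assumptions, not any particular framework's code:

    import os

    OPSDIR = "/srv/app/operations"    # hypothetical directory of operation files

    def load_operation(opname):
        # The operation name from the request is used directly as a
        # filename; this is the dangerously general step.
        with open(os.path.join(OPSDIR, opname)) as f:
            code = f.read()
        namespace = {}
        exec(code, namespace)             # run the operation's code
        return namespace["do_request"]    # hypothetical per-operation entry point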
Then some joker on the Internet sends you an HTTP request that specifies an operation of '../../../../proc/self/environ%00', and your framework dutifully goes off, uses it as the filename to look up, loads the file, and does something useful (for the attacker) with it.
The problem (or one of the problems) is that it is too easy to be too general this way. You don't really want to use arbitrary files, you want to restrict it to just files in a specific directory, but to do this takes extra code, and to write the extra code you have to realize that the vulnerability exists in the first place.
(One might think that people would never make this mistake, but our web server logs strongly suggest otherwise; we've been getting a bunch of (failed) requests like this lately. The ones we've seen give the operation as a query parameter.)
2009-08-04
A downside to syndication feed readers respecting permanent HTTP redirects
In theory, well-behaved syndication feed readers are supposed to notice when the URL that they are polling changes; when they see a permanent HTTP redirect, they should update their stored feed URL. I recently ran into a downside to this behavior.
One of the online webcomics sites that I read bits of recently failed to renew its domain name in time. While the domain name was overdue for renewal, the registrar redirected all URLs on the domain's website to a more or less generic domain parking and advertising page on a different domain. They apparently used permanent redirects, because my feed reader dutifully updated all of its feed URLs (and then complained about the new URL being an unparseable feed).
When the domain was renewed and everything started working again, this left me with a pretty annoying situation to straighten out; I had to hunt down the original (real) URL for each feed and re-enter it in my feed reader by hand.
I'm not sure what the right solution to this is, especially because all of the candidate solutions seem likely to require more complicated code (and data storage). The full-bore solution is to keep a history of apparent feed URLs, with some sort of ordering for which one is the presumed current canonical one; at a minimum you wouldn't switch to a new feed URL until it actually gave you a valid (or semi-valid) feed.
(The smaller solution is to not switch the feed URL, despite any redirections, until the new URL gives you a valid feed. This has issues in some obscure situations, but it is an obvious check on the reasonableness of a permanent redirection.)
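As a sketch of that smaller solution (using the feedparser module; the function and the way it reports the URL to store are assumptions for illustration, not my feed reader's actual code):

    import feedparser

    def poll_feed(stored_url):
        # Fetch the feed and return the parsed result plus the URL to store.
        d = feedparser.parse(stored_url)
        new_url = stored_url
        # feedparser reports a permanent redirect with a status of 301 and
        # puts the final URL in d.href.
        if getattr(d, 'status', None) == 301 and d.href != stored_url:
            # Only adopt the new URL if what came back actually looks like
            # a feed; a domain parking page will generally fail this test.
            if d.version and d.entries:
                new_url = d.href
        return d, new_url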
PS: this situation has bonus irony, because my feed reader didn't use to respect redirections, due to a coding bug. Except that I fixed the bug in my own copy when I noticed it during some other debugging I was doing.
(The sense of the code's 'update URL on redirection' test was reversed, so it was 'updating' the URLs for feeds when they weren't being redirected and reporting this in the debug log, which made me scratch my head and go digging.)