How not to use Apache's ProxyPass directive

March 7, 2012

Periodically we need to set up reverse proxies with Apache's ProxyPass directive (to support our solution to the multiuser PHP problem). On the surface, doing this is fairly simple and straightforward; however, the devil is in this highlighted bit of the documentation:

If the first argument ends with a trailing /, the second argument should also end with a trailing / and vice versa. Otherwise the resulting requests to the backend may miss some needed slashes and do not deliver the expected results.

Since I have now stubbed my toe on this thoroughly, here are several ways to not use ProxyPass for this, all of which fall afoul of the above warning (some in less than obvious ways).

To start with, the basic template of ProxyPass is 'ProxyPass /a/path http://somewhere/else'. When Apache sees any URL that starts with /a/path, it removes /a/path from the front of the URL, puts whatever remains on the end of the second URL, and tries to fetch the resulting URL.
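As a rough illustration of that description, here is a deliberately naive Python model of the substitution. The function name proxy_target is made up for this sketch, and it is not how Apache is implemented; real Apache does more careful prefix matching than a plain startswith(). It only shows the "strip the prefix, glue the remainder onto the second argument" behaviour, using the paths from the template above.

```python
from typing import Optional


def proxy_target(request_path: str, prefix: str, backend: str) -> Optional[str]:
    """Naive model of ProxyPass: pure text substitution, no slash fixing.

    Returns the URL that would be fetched, or None if the request path
    does not start with the prefix (i.e. it is not proxied at all).
    """
    if not request_path.startswith(prefix):
        return None
    # Whatever follows the prefix is appended verbatim to the backend URL.
    remainder = request_path[len(prefix):]
    return backend + remainder


# 'ProxyPass /a/path http://somewhere/else'
print(proxy_target("/a/path/page.html", "/a/path", "http://somewhere/else"))
print(proxy_target("/a/path", "/a/path", "http://somewhere/else"))
print(proxy_target("/elsewhere", "/a/path", "http://somewhere/else"))
```

Nothing in this model adds or removes a / for you, which is the root of every mistake that follows.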

In all of the following examples, we want /url/ to be reverse proxied as a directory; the target has a page at the top level with a relative link to a.html.

First mistake:

ProxyPass /url/ http://localhost:8080

The top-level page works and the link to a.html shows as a link to /url/a.html, but attempts to follow the link fail with Apache being unable to fetch the URL http://localhost:8080a.html. This shows that Apache is effectively forming the URL by text substitution and then interpreting it later; because there is no / at the end of the second argument, it simply glued the raw text of everything past /url/ onto it and the result fails badly.

(This also doesn't do anything to handle a request for just '/url', but one can get around that with other tricks.)

Second mistake:

ProxyPass /url http://localhost:8080

If you request /url/ everything works. But if you request just /url you still get the page (instead of a redirection to the version with a / on the end) and the relative link to a.html comes out as a link to /a.html (which doesn't exist and in any case is not reverse proxied) instead of /url/a.html, because your browser sees /url as a page in / instead of a directory.
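The browser's side of this can be reproduced with Python's standard urljoin(), which resolves relative links the same way a browser does. The host name example.org is just a placeholder for whatever front-end server is doing the proxying.

```python
from urllib.parse import urljoin

# Page fetched as /url (no trailing slash): the browser treats 'url' as a
# file in /, so the relative link resolves against /.
print(urljoin("http://example.org/url", "a.html"))
# http://example.org/a.html

# Page fetched as /url/ (trailing slash): 'url' is a directory, so the
# relative link stays underneath it.
print(urljoin("http://example.org/url/", "a.html"))
# http://example.org/url/a.html
```

This is why the missing redirect from /url to /url/ matters: the same page content produces different absolute links depending on which of the two URLs the browser thinks it is at.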

This case is the tricky case because it's not obvious that we're breaking the rule from the documentation; after all, everything looks right since neither argument ends with a /. The problem is that when you make a bare request for http://localhost:8080, as you do when you ask for '/url', Apache implicitly adds a / on the end (because it has to; it must GET something from the server at localhost:8080). This implicit / means you have a / on the end of the second argument but not on the end of the first argument and have thus broken the rule.

My belief is that there is no simple way for whatever is behind the reverse proxy to fix this. Without peeking at special request headers that Apache reverse proxying supplies, it cannot tell whether a request for / is from someone who asked for '/url/' (and is okay) or someone who asked for '/url' (and should get redirected to /url/).

Third mistake:

ProxyPass /url http://localhost:8080/

If you ask for /url/ or anything under /url/, the reverse proxied web server receives a request for the (local) URL // or something that starts with that. Many web servers are unhappy about this. If you ask for just /url you get a page, but the relative links on the page are broken as before because it's still not redirected to /url/.

(However, now a suitably crazy web app can actually tell the difference between the two requests.)

As far as I can tell the only proper way to use ProxyPass in this situation is as follows:

ProxyPass /url/ http://localhost:8080/

This follows the rules and does not result in doubled /'s. It doesn't handle requests for /url at all, but I believe that you can arrange for /url to be redirected to /url/ by having a real /url directory in an appropriate place in your filesystem.

(In our environment most of these redirections are for user home pages, where /~user will already get redirected appropriately.)


Last modified: Wed Mar 7 02:24:29 2012