How software makes reverse proxying hard

Our user-run web servers rely on the ability to run the various web applications that people want to use behind a reverse proxy. Well, the theoretical ability, because it turns out that there are a couple of things that programs do to make reverse proxying hard (and that they could do differently to make things easier).

The first is that they should be willing to use the HTTP proxy headers added by Apache to get certain bits of information about the request, most notably the origin IP address. Because these headers can be forged by anyone who can talk to the application directly, they should do this only when specifically configured to do so. (Possibly there is an Apache setting for lying to CGIs and other applications about this sort of stuff, but if so we haven't stumbled across it.)

The less obvious thing is that applications need to distinguish between what I will call 'input' URLs (or at least an input URL prefix) and 'output' URLs. Input URLs are what you see on requests after they have been remapped by the proxying process; output URLs are the external, pre-proxying, public URLs that should appear in your output (in HTML, in redirects, in Atom feeds, etc).

Applications with no such distinction are, unfortunately, very common. We've tried a couple of ways to hack around it:
(This works even when the absolute path has a '/~user/' component.
If you disable UserDirs, Apache is perfectly happy to have a literal
I'm honestly surprised that more web applications don't make it easy to use them behind a reverse proxy; I had the impression that various forms of reverse proxies were relatively common in high load environments. Maybe they're deliberately set up to be more transparent than ours is, to look more like load balancers than actual reverse proxies.
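
To make the two requests above a bit more concrete, here is a rough Python sketch of what I would like an application to do. Everything in it is made up for illustration: `ProxyConfig`, `client_ip`, `output_url`, and the example hostname are hypothetical names, not taken from any real program. It assumes a WSGI-style `environ` dictionary, a single trusted Apache in front of the application, and that Apache supplies the usual `X-Forwarded-For` header.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProxyConfig:
    """Hypothetical per-application settings for running behind a proxy."""
    trust_proxy_headers: bool = False    # off unless explicitly configured
    trusted_proxy_ip: str = "127.0.0.1"  # only believe headers from this peer
    input_prefix: str = "/"              # URL prefix seen on incoming requests
    output_prefix: str = ""              # public URL prefix used in output


def client_ip(environ: dict, cfg: ProxyConfig) -> str:
    """Return the origin IP, using X-Forwarded-For only when configured to."""
    peer = environ.get("REMOTE_ADDR", "")
    if not cfg.trust_proxy_headers or peer != cfg.trusted_proxy_ip:
        # Not configured, or not talking to our proxy: ignore the header.
        return peer
    forwarded = environ.get("HTTP_X_FORWARDED_FOR", "")
    # A proxy appends the address it got the request from, so with one
    # trusted proxy in front, the last entry is the one that proxy added.
    hops = [h.strip() for h in forwarded.split(",") if h.strip()]
    return hops[-1] if hops else peer


def output_url(input_path: str, cfg: ProxyConfig) -> str:
    """Map a path as seen on a request to the public URL to emit."""
    prefix = cfg.input_prefix.rstrip("/")
    # Require a match on a path-segment boundary, so that an input prefix
    # of /wiki does not accidentally match /wikifoo.
    if input_path != prefix and not input_path.startswith(prefix + "/"):
        raise ValueError(f"{input_path!r} is not under {cfg.input_prefix!r}")
    return cfg.output_prefix.rstrip("/") + input_path[len(prefix):]
```

With `input_prefix="/wiki"` and `output_prefix="https://example.org/~user/wiki"`, a request that arrives for `/wiki/FrontPage` would generate links and redirects that start with `https://example.org/~user/wiki/`. The important part is not the code, which is trivial, but that both prefixes are settings instead of something the application guesses from the request.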
This is part of CSpace, and is written by ChrisSiebenmann.