2009-07-24
The usefulness of a syndication feed of your blog's comments
One of the accidentally smart decisions that I made when I was writing DWiki's code for syndication feeds was to create syndication feeds for comments as well as entries. (This built on an earlier decision that I should have some way of getting a list of comments in chronological order, just as I had one for the standard wiki 'recently edited pages' feature.)
It isn't that I think this is an important feature to offer other people; as far as I know, none of the visitors to WanderingThoughts have ever paid it any attention. Instead, it (and the equivalent actual page) has turned out to be very handy for me, because it has two very useful effects. First and most obviously, it means that I can easily see all new comments on anything, anywhere, even if the entry is an old one. This is handy for more than spam patrol, since every so often real people leave useful comments on old entries.
(It also drastically reduces any temptation to close old entries to comments, the way that some blog systems do.)
Second, I can shove the syndication feed into a feed reader and use all of its features to keep track of comments that need replies when I have the time and energy, or that I want to read carefully (or read what they've pointed me to carefully). This thoroughly beats all the alternative methods of keeping track of this stuff; if I had to do it by hand, I probably wouldn't do it at all.
(Of course, this doesn't mean that I actually write replies. As some people may have noticed, I'm generally terrible at writing comment replies that require substantive amounts of thought and effort, partly because of limited time.)
2009-07-03
How software makes reverse proxying hard
Our user-run webservers rely on the ability to run various web applications that people want to use behind a reverse proxy. Well, the theoretical ability, because it turns out that there are a couple of things that programs do to make reverse proxying hard (and that they could do differently to make things easier).
The first is that applications should be willing to use the HTTP proxy headers added by Apache to get certain bits of information about the request, most notably the origin IP address (from X-Forwarded-For). For obvious reasons, they should do this only when specifically configured to do so.
(Possibly there is an Apache setting for lying to CGIs and other applications about this sort of stuff, but if so we haven't stumbled across it.)
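As an illustration, here is a minimal Python sketch of what this looks like from the application side; the configuration knob and the names here are hypothetical, not anything from a real application:

    # Hypothetical sketch: only believe X-Forwarded-For when we have been
    # explicitly configured to trust a reverse proxy in front of us.
    TRUSTED_PROXIES = {"127.0.0.1"}   # would come from a config file

    def origin_ip(environ, trust_proxy_headers=False):
        """Return the origin IP address for a CGI/WSGI style environment."""
        direct = environ.get("REMOTE_ADDR", "")
        fwd = environ.get("HTTP_X_FORWARDED_FOR")
        if trust_proxy_headers and direct in TRUSTED_PROXIES and fwd:
            # The header can be a comma-separated chain; the last entry
            # is the address that our own (trusted) proxy saw the
            # request come from.
            return fwd.split(",")[-1].strip()
        return direct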
The less obvious thing is that applications need to distinguish between what I will call 'input' URLs and 'output' URLs (or at least between input and output URL prefixes). Input URLs are what the application sees on requests after they have been remapped by the proxying process; output URLs are the external, pre-proxying, public URLs that should appear in its output (in HTML, in redirects, in Atom feeds, etc).
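To make the distinction concrete, here is a hypothetical Python sketch of the sort of configuration and URL generation I mean (all of the names and values are made up for illustration):

    # Hypothetical configuration: what the application sees on requests
    # versus what the outside world should use in URLs.
    INPUT_PREFIX = "/app/"                                # post-proxying
    OUTPUT_PREFIX = "http://www.example.org/~user/app/"   # public, pre-proxying

    def external_url(request_path):
        """Map the URL path the application actually received to the
        public URL that should appear in HTML, redirects, Atom feeds,
        and so on."""
        assert request_path.startswith(INPUT_PREFIX)
        return OUTPUT_PREFIX + request_path[len(INPUT_PREFIX):]

An application that routes all of its URL generation through something like this can sit behind any reverse proxy mapping just by changing two configuration values.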
Applications with no such distinction are, unfortunately, very common. We've tried a couple of ways to hack around it:
- Apache's ProxyPassReverse directive is a very, very limited attempt to patch up this problem. In my opinion, it actually does more harm than good in most situations, since it papers over only part of the problem; better to have no papering over at all, so that everything breaks immediately.
- one can often make the absolute path on the user-run web server the same as it is on the real web server; this leaves you with just the port being different. If you're willing to do some hacking, you can configure Apache to lie about that too. (There's a configuration sketch of both hacks after this list.)

  (This works even when the absolute path has a '/~user/' component. If you disable UserDirs, Apache is perfectly happy to have a literal ~user/ directory in your document root and to serve things from it.)
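For illustration, here is roughly what the Apache side of both hacks can look like; the hostnames, ports, and paths are made up, not our actual configuration:

    # On the public web server: forward /~user/app/ to the user-run
    # web server, keeping the absolute path identical on both sides.
    # (ProxyPassReverse only rewrites Location and a few other
    # response headers, hence its limits.)
    ProxyPass        /~user/app/ http://inside.example.org:8080/~user/app/
    ProxyPassReverse /~user/app/ http://inside.example.org:8080/~user/app/

    # On the user-run web server: with user directories turned off, a
    # literal ~user directory under the document root makes the request
    # paths match.
    UserDir disabled
    DocumentRoot "/var/www"
    # /var/www/~user/app/ then holds the actual application.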
I'm honestly surprised that more web applications don't make it easy to use them behind a reverse proxy; I had the impression that various forms of reverse proxies were relatively common in high-load environments. Maybe they're deliberately set up to be more transparent than ours is, to look more like load balancers than actual reverse proxies.