The backend for our recent mirroring

March 21, 2006

Since I alluded to it in passing in an earlier entry, I might as well describe what I know about how THEMIS set up their systems to handle the load. (Disclaimer: this is second and third hand.)

To cope with the visitors to their regular web site, THEMIS put their ordinary web servers behind eight Squid proxies, which were in turn behind a load balancer box. This apparently held up very well to the millions of extra visitors from Google Mars. The main movie page was on their regular site (and thus behind the Squid proxies), but the links to the movies pointed to video.mars.asu.edu.

All video.mars.asu.edu did was serve up HTTP redirections to the URLs of the various mirror locations, more or less rotating through them to distribute the load. To be as fast and light as possible, its web server didn't bother to look at the HTTP request at all, so to mirror a second file the THEMIS people had to run a second server, which they did by making the .wmv format movie live on video.mars.asu.edu:81.
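
I don't know what the redirector software actually was, but the idea is simple enough that a minimal sketch fits in a page of Python. Everything here (the mirror URLs, the port) is made up for illustration; the one faithful detail is that it answers every connection with a redirect to the next mirror in rotation without ever reading the request:

    # A sketch of a 'dumb' redirector, not THEMIS's actual code: it
    # never reads the request, it just answers every connection with a
    # redirect to the next mirror in rotation. The mirror URLs are
    # made-up placeholders.
    import itertools
    import socket

    MIRRORS = itertools.cycle([
        "http://mirror1.example.com/mars-movie.mov",
        "http://mirror2.example.com/mars-movie.mov",
        "http://mirror3.example.com/mars-movie.mov",
    ])

    def serve(port):
        srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind(("", port))
        srv.listen(128)
        while True:
            conn, _ = srv.accept()
            target = next(MIRRORS)
            # Write the redirect without looking at what the client sent.
            conn.sendall(("HTTP/1.0 302 Found\r\n"
                          "Location: %s\r\n"
                          "Connection: close\r\n"
                          "\r\n" % target).encode("ascii"))
            conn.close()

    if __name__ == "__main__":
        serve(8080)

Since the request is never parsed, one such server can only ever hand out one file, which is why the .wmv movie needed its own instance on port 81.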

THEMIS ran an automated monitoring system to detect overloaded or dead mirrors. It worked by running through the list of mirror URLs every so often, making HEAD requests to each; if there wasn't a good response fast enough, that URL got left out of the list used by the redirector until it came back to life.
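
Again, this is second hand and I haven't seen the actual code, but the check loop is straightforward; a hedged sketch in Python (with invented mirror URLs, timeout, and interval) might look like:

    # A sketch of the mirror health check, not THEMIS's actual code:
    # every so often, HEAD each mirror URL with a short timeout and keep
    # only the ones that answer promptly with a 2xx status.
    import http.client
    import time
    import urllib.parse

    ALL_MIRRORS = [
        "http://mirror1.example.com/mars-movie.mov",
        "http://mirror2.example.com/mars-movie.mov",
    ]

    def is_alive(url, timeout=5):
        parts = urllib.parse.urlsplit(url)
        try:
            conn = http.client.HTTPConnection(parts.hostname,
                                              parts.port or 80,
                                              timeout=timeout)
            conn.request("HEAD", parts.path or "/")
            resp = conn.getresponse()
            conn.close()
            return 200 <= resp.status < 300
        except (OSError, http.client.HTTPException):
            return False

    def monitor(interval=60):
        while True:
            live = [url for url in ALL_MIRRORS if is_alive(url)]
            # The real system fed the live list to the redirector
            # somehow; I don't know how, so this just prints it.
            print("live mirrors:", live)
            time.sleep(interval)

How the live list actually got handed to the redirector is one of the details I don't know.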

Using HTTP redirects meant that the mirroring could be very simple. It didn't need to worry about DNS round robin or having people set up virtual hosts or anything; all it needed was a list of current mirror URLs. (The disadvantage of HTTP redirects is that the mirroring is semi-exposed to your visitors. I don't think THEMIS cared under the circumstances; they were more concerned that demand for the movies would overwhelm ASU's Internet connection.)

Sidebar: why such a primitive HTTP redirector?

Why not parse the HTTP requests on video.mars, instead of having to run two servers and so on? The THEMIS people were concerned that video.mars would see a huge connection rate and wanted it to be as lightweight and reliable as possible. You could build a pretty lightweight solution with something like lighttpd's built-in FastCGI gateway talking to a local FastCGI server, but it would have had more moving parts and thus been riskier to build on the spot.
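
To illustrate what I mean, here is a hedged sketch of that request-parsing alternative as a plain WSGI application; the file names and mirror URLs are invented, and a real deployment would sit behind lighttpd's FastCGI gateway via a FastCGI-to-WSGI adapter rather than wsgiref:

    # A sketch of the 'parse the request' alternative, not what THEMIS
    # built: one application that looks at the request path and
    # redirects to the right set of mirrors for that file. File names
    # and mirror URLs are invented for illustration.
    import itertools
    from wsgiref.simple_server import make_server

    MIRRORS = {
        "/mars-movie.mov": itertools.cycle([
            "http://mirror1.example.com/mars-movie.mov",
            "http://mirror2.example.com/mars-movie.mov",
        ]),
        "/mars-movie.wmv": itertools.cycle([
            "http://mirror1.example.com/mars-movie.wmv",
            "http://mirror2.example.com/mars-movie.wmv",
        ]),
    }

    def redirector(environ, start_response):
        mirrors = MIRRORS.get(environ.get("PATH_INFO", ""))
        if mirrors is None:
            start_response("404 Not Found",
                           [("Content-Type", "text/plain")])
            return [b"no such file\n"]
        start_response("302 Found", [("Location", next(mirrors))])
        return [b""]

    if __name__ == "__main__":
        # wsgiref is enough for a demonstration; it is not what you
        # would run under a huge connection rate.
        make_server("", 8080, redirector).serve_forever()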
