Preparing a high-load web mirror setup
I spent a chunk of this weekend preparing a mirror for a high-load
environment. The mirror only needs to serve a couple of large
video clips, but they're going to be linked from a high-traffic
website, so we expect a lot of simultaneous connections and a lot of
outgoing bandwidth.
I made a generic mirror URL, using a new hostname that I had to put into
a new sub-zone in our DNS. Right now it has two
records, each with a five-minute TTL. Each points to an address on a
different 100 Mbit/s subnet; because our subnet uplinks to the university
backbone are only 100 Mbit/s, this is the only way I can push more than
100 Mbit/s outbound.
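To make the bandwidth arithmetic concrete, here is a rough Python sketch. It assumes DNS spreads clients evenly across the records, which is an idealization, and the client count is a made-up number for illustration; the function names are mine.

```python
def aggregate_outbound_mbit(uplinks_mbit):
    """Upper bound on total outbound rate when each mirror has its own uplink."""
    return sum(uplinks_mbit)


def per_client_mbit(uplinks_mbit, clients):
    """Even share per client, assuming clients are spread evenly over the mirrors."""
    if clients <= 0:
        raise ValueError("clients must be positive")
    return aggregate_outbound_mbit(uplinks_mbit) / clients


# Two mirrors, each behind a separate 100 Mbit/s subnet uplink.
uplinks = [100, 100]
print(aggregate_outbound_mbit(uplinks))   # 200

# Hypothetical load: 400 simultaneous downloads.
print(per_client_mbit(uplinks, 400))      # 0.5
```

The real split depends on how resolvers and clients pick between the records, so treat these as best-case numbers.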
The machines run lighttpd, serving just the mirrored files, with enough memory to keep the files in cache. Lighttpd is small and easy to install (nice when you're in a rush), and as a single-process server with no threads or forking, it can't really kill a machine no matter how many simultaneous connections it gets. (I chose lighttpd over thttpd for reasons I may go into later.)
Looking at this after writing it up, it's surprising to me how little stuff is actually involved. Hopefully this setup will work fine in practice; I'll likely find out soon.
(The setup passed stress tests, but that's not the same as having real load show up.)
Sidebar: lighttpd configuration
Lighttpd has a helpful web page on performance improvements. I turned keep-alives off; as far as I can see, keep-alives are useless for serving unrelated static files and I wanted to maximize how many simultaneous connections I could handle.
In testing, I discovered I had to raise lighttpd's
server.max-fds parameter above 1024. A little thought led me to a d'oh
moment about this: of course I needed to bump it, because every incoming
connection needs two file descriptors: one for the network socket and
one for the file being sent out. So with 1024 file descriptors the web
server could only handle about 500 simultaneous connections.
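The file descriptor budget is simple enough to sketch in Python. The two-descriptors-per-connection figure is from the reasoning above; the 'reserved' allowance for things like the listening socket and log files is my assumption, and 16 is an arbitrary illustrative value, not a measured one.

```python
FDS_PER_CONNECTION = 2  # one for the network socket, one for the file being sent


def max_connections(fd_limit, reserved=0):
    """Connections that fit in fd_limit descriptors, after setting some aside."""
    return max(0, (fd_limit - reserved) // FDS_PER_CONNECTION)


def fds_needed(connections, reserved=0):
    """Descriptor limit required to serve this many simultaneous connections."""
    return connections * FDS_PER_CONNECTION + reserved


print(max_connections(1024))               # 512
print(max_connections(1024, reserved=16))  # 504, i.e. "about 500"
print(fds_needed(2000, reserved=16))       # 4016
```

Working backwards with fds_needed() gives a starting point for what to set server.max-fds to for a target connection count.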