An irony of web serving
One of the small paradoxes of the web is that it is often the connections with the least bandwidth that put the largest load on your web server.
This is because each connection consumes a certain amount of server resources, ranging from kernel data structures for socket buffers up to an entire thread or process on a dynamic website. The slower someone's connection, the longer they tie up this stuff on your end as you slowly feed them data. Conversely, people on fast connections get in, get their data, and get out fast, letting you release those resources.
Among other little effects, this means that it's not enough for load testing to pick a connections-per-second rate that you should be able to deal with; you also have to consider how long each connection lasts. Ten new connections a second where each client takes a tenth of a second to download its content is rather easier to deal with than ten connections a second where each client takes ten seconds. (And if you test across a local LAN you are far more likely to get the former than the latter.)
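A rough way to quantify this is Little's law: the average number of connections held open is the arrival rate times the average time each connection lasts. A small sketch, using the numbers from the example above:

```python
# Little's law: average concurrent connections =
#   (new connections per second) x (seconds each connection lasts)
def concurrent_connections(arrivals_per_sec, seconds_per_client):
    return arrivals_per_sec * seconds_per_client

# Same arrival rate, very different load on the server:
print(concurrent_connections(10, 0.1))  # fast clients: 1.0 connection open on average
print(concurrent_connections(10, 10))   # slow clients: 100 connections open on average
```

A hundred simultaneously open connections is a very different resource picture from one, even though both servers are handling 'ten connections a second'.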
There are some ways around parts of this effect:
- web servers based around asynchronous IO generally have far lower
per-connection overhead, which is one reason they're so popular.
- reverse proxy web servers (including in a way Apache running CGI
programs) offer you a way of rapidly sucking the content out of
your high-overhead dynamic website system and parking it in a
low-overhead frontend web server while it trickles out to the
slow clients, instead of having the slow clients hold down an
expensive connection directly with the dynamic website bits.
(This only works well if your generated content is small enough to get sucked completely into the frontend, but this is the usual case.)
- some websites just disconnect clients after a certain amount of time, whether or not they are still transferring data. This is most popular for bulk downloads, where it's cheap for the server to start again if (or when) the client reconnects to resume the transfer.
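For the reverse proxy approach, the usual knob in a frontend like nginx is response buffering, which makes the frontend absorb the backend's response and then handle the slow trickle to the client itself. This fragment is a hypothetical sketch (the backend address and buffer sizes are made up), not a drop-in configuration:

```nginx
server {
    listen 80;
    location / {
        proxy_pass http://127.0.0.1:8000;  # the expensive dynamic backend
        proxy_buffering on;                # absorb the backend's response...
        proxy_buffers 16 8k;               # ...into frontend memory buffers
        proxy_max_temp_file_size 1m;       # spill larger responses to disk
    }
}
```

With buffering on, the backend connection is released as soon as the response fits in the frontend's buffers, which is the 'sucking the content out' effect described above.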
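The 'disconnect after a certain amount of time' strategy is easy to sketch in an asynchronous server. This is a hypothetical helper (the function name and the time budgets are mine, not from any particular server), using Python's asyncio:

```python
import asyncio

# Hypothetical helper: run a transfer coroutine under a hard time budget.
# Returns True if the transfer finished, False if the client was too slow
# and got cut off (it can reconnect and resume, e.g. with a Range request).
async def transfer_with_deadline(transfer, budget):
    try:
        await asyncio.wait_for(transfer, timeout=budget)
        return True
    except asyncio.TimeoutError:
        return False

async def demo():
    # asyncio.sleep() stands in for actually feeding data to a client.
    fast = await transfer_with_deadline(asyncio.sleep(0.01), budget=1.0)
    slow = await transfer_with_deadline(asyncio.sleep(1.0), budget=0.05)
    return fast, slow

print(asyncio.run(demo()))  # prints (True, False)
```

The point of the time budget is exactly the resource math above: it puts a hard cap on how long any one client can hold your per-connection resources.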