Apache 2.4's event MPM and oddities with
I mentioned recently that we were hopefully going to move away from the Apache prefork MPM when we upgraded our primary web server from Ubuntu 18.04 to Ubuntu 22.04. We had tried to switch over to the event MPM back in 18.04, but had run into problems and had reverted to the prefork MPM as a quick fix. Specifically, we had run into Apache stopping serving requests and reporting the following in the error log:
[mpm_event:error] [pid <nnn>:tid <large number>] AH03490: scoreboard is full, not at MaxRequestWorkers.Increase ServerLimit.
(The message really doesn't have a space between the two sentences. Maybe someday it will.)
As covered in the MaxRequestWorkers documentation, the starting or minimum number of server processes you need is MaxRequestWorkers (MRW) divided by your ThreadsPerChild (TPC) value. If you set MRW to 300 and TPC to 25 (as we do), this is 12 server processes. However, the ServerLimit documentation also says:
With event, increase this directive if the process number defined by your MaxRequestWorkers and ThreadsPerChild settings, plus the number of gracefully shutting down processes, is more than 16 server processes (default).
How many gracefully shutting down processes are you likely to have? Who knows, although it may be (strongly) influenced by your setting for MaxConnectionsPerChild, if you have one. This is one possible source of our problem with the event MPM on Ubuntu 18.04, although at the time we also found a suggestive Apache bug and generally didn't trust the event MPM enough to keep trying with it.
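To put rough numbers on this, here is a minimal Python sketch of the arithmetic the documentation describes. The function name scoreboard_headroom is made up for illustration. It assumes MaxRequestWorkers is an exact multiple of ThreadsPerChild, and it treats ServerLimit as a simple cap on regular plus gracefully exiting processes. That is a simplification of the documentation's wording, not a model of how Apache's scoreboard really works.

```python
def scoreboard_headroom(max_request_workers, threads_per_child, server_limit=16):
    """Return (base_processes, spare_slots) for an event MPM configuration.

    base_processes: processes needed to provide MaxRequestWorkers threads.
    spare_slots: how many gracefully shutting down processes still fit
                 under ServerLimit (a simplifying assumption, see above).
    """
    if max_request_workers % threads_per_child != 0:
        # Kept simple on purpose: this sketch only handles exact multiples.
        raise ValueError("MaxRequestWorkers must be a multiple of ThreadsPerChild")
    base_processes = max_request_workers // threads_per_child
    spare_slots = server_limit - base_processes
    return base_processes, spare_slots


# Our settings (MRW 300, TPC 25) with the default ServerLimit of 16.
print(*scoreboard_headroom(300, 25))      # prints "12 4"

# The same settings with ServerLimit doubled to 32.
print(*scoreboard_headroom(300, 25, 32))  # prints "12 20"
```

By this reckoning, the default ServerLimit leaves room for only four lingering processes on top of our twelve regular ones, while a ServerLimit of 32 would leave room for twenty.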
Yesterday, when we upgraded our central web server to Ubuntu 22.04 and an Apache configuration that uses the event MPM, we got an unpleasant surprise. Our problem from 18.04 came back overnight, and in fact the error message I quoted above comes from our 22.04 error.log. This time around we haven't reverted to the prefork MPM; instead we're trying various things to make the event MPM work.
One thing we're trying is that, well, maybe the event MPM doesn't have a bug here, it just has more processes that are shutting down gracefully than we expect. So we've raised ServerLimit from the default of 16 to 32. The other thing we're trying is that we've turned off our use of Apache's mod_qos. Although there have been other reasons for using it in the past, today we use it to deal with our file serving problem, where we have a few sets of large files that are requested in bulk by clients that are often slow. One of the reasons we wanted to switch to the event MPM is that it should handle these requests much better than the prefork MPM (which must use a process for each of them). If our theory is correct, we can afford to operate without the rate limits from mod_qos.
(If our theory is wrong, our alerts are going to let us know the next time these files are unusually popular, as opposed to their regular 25 Mbyte/second level of popularity.)