One problem with "shared Unix hosting" was the lack of resource limits

February 18, 2025

I recently read Comments on Shared Unix Hosting vs. the Cloud (via), which I will summarize as being sad about how old-fashioned shared hosting on a (shared) Unix system has basically died out, and along with it web server technology like CGI. As it happens, I have a system administrator's view of why shared Unix hosting always had problems and was a down-market thing with various limitations, and why even today people aren't very happy with providing it. In my view, a big part of the issue was the lack of resource limits.

The problem with sharing a Unix machine with other people is that by default, those other people can starve you out. They can take up all of the available CPU time, memory, process slots, disk IO, and so on. On an unprotected shared web server, all it takes is one person's runaway 'CGI' code (which might be PHP code or the like) or even an unusually popular dynamic site, and all of the other people wind up having a bad time. Life gets worse if you allow people to log in, run things in the background, run things from cron, and so on, because all of these can add extra load. To make shared hosting reliable and good, you need some way of forcing a fair sharing of resources and limiting how much of the machine's resources a given customer can use.

Unfortunately, for much of the practical life of shared Unix hosting, Unixes did not have that. Some Unixes could create various sorts of security boundaries, but generally not resource usage limits that applied to an entire group of processes. Even once this became possible to some degree in Linux through cgroup(s), the kernel features took some time to mature, and then it took even longer for common software to support running things in isolated and resource-controlled cgroups. Even today it's still not necessarily entirely there for things like running CGIs from your web server, never mind a potential shared database server to support everyone's database-backed blog.
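
For concreteness, here's a minimal sketch in C of what the cgroup v2 version of this looks like: create a child cgroup, cap its CPU and memory, and move a process into it so that everything it runs afterward is covered. It assumes cgroup v2 is mounted at /sys/fs/cgroup, that you're allowed to create a group there, and that the cpu and memory controllers are enabled for child groups; the group name 'customer-demo' and the specific limits are made up for illustration.

    /* Sketch: put this process (and everything it then runs) into a cgroup
     * v2 group with CPU and memory caps.  Assumes cgroup v2 at
     * /sys/fs/cgroup, permission to create a child group there, and the
     * cpu and memory controllers enabled for children; the name
     * "customer-demo" and the limits are illustrative. */
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/stat.h>
    #include <unistd.h>

    static void write_file(const char *path, const char *value)
    {
        int fd = open(path, O_WRONLY);
        if (fd < 0 || write(fd, value, strlen(value)) < 0) {
            perror(path);
            exit(1);
        }
        close(fd);
    }

    int main(int argc, char **argv)
    {
        char buf[32];

        if (mkdir("/sys/fs/cgroup/customer-demo", 0755) != 0 && errno != EEXIST) {
            perror("mkdir");
            return 1;
        }
        /* At most 20% of one CPU: 200ms of CPU time per 1s period. */
        write_file("/sys/fs/cgroup/customer-demo/cpu.max", "200000 1000000");
        /* A hard 256 MiB memory cap for the whole group of processes. */
        write_file("/sys/fs/cgroup/customer-demo/memory.max", "268435456");
        /* Move ourselves in; child processes inherit the membership. */
        snprintf(buf, sizeof(buf), "%d", (int)getpid());
        write_file("/sys/fs/cgroup/customer-demo/cgroup.procs", buf);

        /* Anything we exec from here on runs under the limits above. */
        if (argc > 1)
            execvp(argv[1], argv + 1);
        return 0;
    }

The mechanics are simple enough; the hard part for a hosting provider is doing this automatically, per customer, for web servers, CGIs, cron jobs, and login sessions alike, which is exactly the part that took common software a long time to grow.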

(A shared database server needs to implement its own internal resource limits for each customer, otherwise you have to worry about a customer gumming it up with expensive queries, a flood of queries, and so on. If customers need separate database servers for isolation and resource control, the provider now needs more server resources.)

My impression is that the lack of kernel-supported resource limits forced shared hosting providers to roll their own ad-hoc ways of limiting how much of the machine's resources their customers could use. In turn this created the array of restrictions that you used to see on such providers, with things like 'no background processes', 'your CGI can only run for so long before being terminated', 'your shell session is closed after N minutes', and so on. If shared hosting providers had been able to put real limits on each of their customers, this wouldn't have been as necessary; you could go more toward letting each customer blow themselves up if they over-used resources.
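
As a sketch of the ad-hoc style of limit this produced, here is roughly what a 'your CGI can only run for so long before being terminated' wrapper might look like in C: run the command, then kill it once a fixed wall-clock budget runs out. The ten second budget and the use of SIGKILL are arbitrary illustrative choices, not anything a particular provider actually did.

    /* Sketch of an ad-hoc "only run for so long" wrapper: run a command
     * and kill it after a fixed wall-clock budget.  The 10-second budget
     * and SIGKILL are illustrative choices. */
    #include <errno.h>
    #include <signal.h>
    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    static volatile pid_t child;

    static void on_alarm(int sig)
    {
        (void)sig;
        kill(child, SIGKILL);     /* kill() is async-signal-safe */
    }

    int main(int argc, char **argv)
    {
        int status = 0;

        if (argc < 2) {
            fprintf(stderr, "usage: %s command [args...]\n", argv[0]);
            return 2;
        }
        child = fork();
        if (child < 0)
            return 1;
        if (child == 0) {
            execvp(argv[1], argv + 1);
            _exit(127);
        }
        signal(SIGALRM, on_alarm);
        alarm(10);                /* the arbitrary time budget */
        while (waitpid(child, &status, 0) < 0 && errno == EINTR)
            ;                     /* interrupted by our own SIGALRM; retry */
        return WIFEXITED(status) ? WEXITSTATUS(status) : 1;
    }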

(How many resources to give each customer is also a problem, but that's another entry.)


Comments on this page:

There kind of was. There was setrlimit, which you could control with ulimit in the shell.

https://man.freebsd.org/cgi/man.cgi?query=getrlimit&sektion=2&apropos=0&manpath=4.3BSD+Reno

Configuring that for a system was complicated IIRC. I remember at my university an admin set the maximum number of processes a user could run, which really didn't work well with the nn USENET news reader.
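
For anyone who hasn't poked at this interface, here's a small C sketch of the setrlimit() side of it, using RLIMIT_NPROC (the per-user process count limit that 'ulimit -u' adjusts in many shells) on systems that have it. The new value is made up; the point is that the limit is per-process state that children inherit, rather than something an administrator sets on a user as such.

    /* Sketch: inspect and lower RLIMIT_NPROC, the per-user process count
     * limit, on systems that have it.  The new soft limit is made up. */
    #include <stdio.h>
    #include <sys/resource.h>

    int main(void)
    {
        struct rlimit rl;

        if (getrlimit(RLIMIT_NPROC, &rl) != 0) {
            perror("getrlimit");
            return 1;
        }
        printf("processes: soft %llu, hard %llu\n",
               (unsigned long long)rl.rlim_cur,
               (unsigned long long)rl.rlim_max);

        /* Lower the soft limit; this process and its children inherit it,
         * and fork() starts failing with EAGAIN once the user is at the
         * cap -- presumably what made a low setting awkward for programs
         * that routinely run sub-programs. */
        rl.rlim_cur = 64;
        if (setrlimit(RLIMIT_NPROC, &rl) != 0) {
            perror("setrlimit");
            return 1;
        }
        return 0;
    }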

By kfischer at 2025-02-19 11:46:31:

"Configuring that for a system was complicated IIRC."

Hell, configuring for a single process is complicated. It's just a broken interface.

RLIMIT_CPU limits the CPU time to a specified number of seconds; the hard limit will kill the process, whereas the soft limit will trigger a SIGXCPU. (And few people can handle signals safely. Pre-2024 POSIX didn't even allow handlers to read global variables: "the behavior is undefined if the signal handler refers to any object other than errno with static storage duration other than by assigning a value to an object declared as volatile sig_atomic_t". This raises the question of how one was meant to use the ostensibly-kind-of-async-signal-safe longjmp() from a handler.)
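
A small sketch of that soft/hard split in practice, assuming roughly Linux-like behaviour: a soft limit of one CPU second (delivers SIGXCPU) and a hard limit of three (the kernel kills the process outright), with a handler that only sets a volatile sig_atomic_t flag, in keeping with the restriction quoted above.

    /* Sketch: the soft RLIMIT_CPU delivers SIGXCPU, the hard limit kills
     * us.  The handler only sets a volatile sig_atomic_t flag. */
    #include <signal.h>
    #include <stdio.h>
    #include <sys/resource.h>

    static volatile sig_atomic_t got_xcpu;

    static void on_xcpu(int sig)
    {
        (void)sig;
        got_xcpu = 1;
    }

    int main(void)
    {
        struct rlimit rl = { .rlim_cur = 1, .rlim_max = 3 };  /* CPU seconds */

        setrlimit(RLIMIT_CPU, &rl);
        signal(SIGXCPU, on_xcpu);

        for (;;) {                /* burn CPU until the limits bite */
            if (got_xcpu) {
                fprintf(stderr, "soft limit: got SIGXCPU\n");
                got_xcpu = 0;     /* on Linux, SIGXCPU repeats each second */
            }
        }
        /* Around three CPU seconds the hard limit kills the process. */
    }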

Anyway, who knows how many seconds anything should take? What if you want sub-second resolution? How does any of this apply to child, child-descendant, and re-parented processes? But I guess this is where "your CGI can only run for so long before being terminated" came from.

Then there's RLIMIT_STACK: "the maximum size, in bytes, of the stack segment for a process". Kind of useless, because CPUs haven't used (non-trivial) segments for like 40 years. I can malloc() a stack and change my stack pointer to reference it, completely ignoring that limit. RLIMIT_DATA has the same problem: who needs sbrk() anymore? mmap(NULL, len, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANON, -1, 0) will bypass that check.

Maybe RLIMIT_RSS is a bit better, but some processes, like Webkit—probably not relevant in a hosting environment though—over-commit memory and will fail if restricted; it's also unclear whether memory-mapped files (including those from shm_open()) should or will count. POSIX adds RLIMIT_AS, but who really cares how much address space a process uses? I guess it's an indirect way to limit space used for page tables.
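
For what it's worth, RLIMIT_AS is the one limit here that a plain anonymous mmap() does run into; a sketch, assuming a 64-bit system where a small program's existing address space fits under the made-up 64 MiB cap:

    /* Sketch: cap the address space at 64 MiB, then ask mmap() for 128 MiB
     * of anonymous memory; the request should fail with ENOMEM. */
    #include <errno.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <sys/resource.h>

    int main(void)
    {
        struct rlimit rl = { .rlim_cur = 64UL << 20, .rlim_max = 64UL << 20 };
        void *p;

        if (setrlimit(RLIMIT_AS, &rl) != 0) {
            perror("setrlimit");
            return 1;
        }
        p = mmap(NULL, 128UL << 20, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANON, -1, 0);
        if (p == MAP_FAILED)
            fprintf(stderr, "mmap refused: %s\n", strerror(errno));
        return 0;
    }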

RLIMIT_FSIZE and RLIMIT_CORE are fine to avoid accidental wastes of disk space, but there's no limit to the number of maximum-size files created. So, 0 is the only setting with security value. RLIMIT_NOFILE can be set to 0 to prevent processes from opening files, which makes it useful for sandboxing; otherwise, the number's more for compatibility with fixed-size select() buffers.
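
A sketch of the 'set it to 0 for sandboxing' idea: file descriptors that are already open keep working, but nothing new can be opened.

    /* Sketch: with RLIMIT_NOFILE at 0, every new open() fails, while fds
     * opened earlier (stdin/stdout/stderr here) keep working. */
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/resource.h>

    int main(void)
    {
        struct rlimit rl = { .rlim_cur = 0, .rlim_max = 0 };

        if (setrlimit(RLIMIT_NOFILE, &rl) != 0) {
            perror("setrlimit");
            return 1;
        }
        if (open("/etc/passwd", O_RDONLY) < 0)     /* any path will do */
            fprintf(stderr, "open failed as expected: %s\n", strerror(errno));
        return 0;
    }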

If all of the above were fixed, we'd likely still want to add a way to apply limits to existing processes.

One frustrating thing is that it's actually really easy to write a fair-share CPU scheduler. I and everyone in my university operating systems class did so for Linux (dividing equally by user ID, special-casing 0), as probably countless others did. But then there are memory usage and I/O bandwidth to be handled. Memory bandwidth can probably be ignored. Most Unix-like systems did have some form of disk quota. Still, in the end, it seems to have been mostly easier to build various forms of virtual machines than to change existing kernels.

By Anonymous at 2025-02-19 12:24:02:

IBM's AIX has the 'maxuproc' kernel parameter, which controls the maximum number of processes per user, but it is a system-wide setting; you cannot change it on a per-user basis. The setting has been there since at least AIX 4.3.3, released in 1999; I can't speak to older versions since I never had to administer them.
