Issues in fair share scheduling of RAM via resident set sizes
Yesterday I talked about how fair share allocation of things needs a dynamic situation and how memory was not necessarily all that dynamic and flow-based. One possible approach to fair share allocation of memory is to do it on Resident Set Size. If you look at things from the right angle, RSS is sort of a flow in that the kernel and user programs already push it back and forth dynamically.
(Let's ignore all of the complications introduced on modern systems by memory sharing.)
While there has been some work on various fair share approaches to RSS, I think that one issue limiting the appeal here is that significantly constraining RSS often has significant undesirable side effects. Every program has a 'natural' RSS, which is the RSS at which it only infrequently or rarely has to ask for something that's been removed from its set of active memory. If you clamp a program's RSS below this value (and actually evict things from RAM), the program will start trying to page memory back in at a steadily increasing rate. Eventually you can clamp the program's RSS so low that it makes very little forward progress in between all of the page-ins of things it needs.
Up until very recently, all of this page-in activity had another serious effect: it ate up a lot of IO bandwidth to your disks. More exactly, it tended to eat up your very limited random IO capacity, since these sort of page-ins are often random IO. So if you pushed a program into having a small enough RSS, the resulting paging would kill the ability of pretty much all programs to get IO done. This wasn't technically swap death, but it might as well have been. To escape this, the kernel probably needs to limit not just the RSS but also the paging rate; a program that was paging heavily would wind up going to sleep more and more of the time in order to keep its load impact down.
(These days a SSD based system might have enough IO bandwidth and IOPS to not care about this.)
All of this is doable but it's also complicated, and it doesn't get you the sort of more or less obviously right results that fair share CPU scheduling does. I suspect that this has made fair share RSS allocation much less attractive than simpler things like CPU scheduling.