Why SELinux is inherently complex
The root of SELinux's problems is that SELinux is a complex security mechanism that is hard to get right. Unfortunately this complexity is not just an implementation artifact of the current SELinux code; it's inherent in what SELinux is trying to do.
What SELinux is trying to do is understand 'valid' program behavior and confine programs to it at a fine-grained level in an environment where all of the following are true:
- Programs are large, complex, and can legitimately do many things
(this is especially so because we are really talking about entire
assemblages of programs, not just single binaries). After all,
SELinux is intended to secure things like web servers, database
engines, and mailers, all of which have huge amounts of functionality.
- Programs legitimately access things that are spread all over the
system and intermingled tightly with things that they should not
be able to touch. This requires fine-grained selectivity about
what programs can and cannot access.
- Programs use and rely on outside libraries that can have unpredictable, opaque, and undocumented internal behavior, including in what resources those libraries access. Since we're trying to confine all of the program's observed behavior, this necessarily includes the behavior of the libraries that it uses.
All of this means that thoroughly understanding program behavior is very hard, yet such a thorough understanding is the core prerequisite for an SELinux policy that is both correct and secure. Even once you have a thorough understanding, the issue with libraries means that it can be kicked out from underneath you by a library update.
(Such insufficient understanding of program behavior is almost certainly the root cause of a great many of the SELinux issues that got fixed here.)
This complexity is inherent in trying to understand program behavior
in the unconfined environment of a general Unix system, where
programs can touch devices in /dev, configuration files under /etc,
run code from libraries in /lib, run helper programs from /usr/bin,
poke around in files in various places in /var, maybe read things from
network calls to various services, and so on. All the while they're
not supposed to be able to look at many things from those places
or do many 'wrong' operations. Your program that does DNS lookups
likely needs to be able to make TCP connections to port 53, but you
probably don't want it to be able to make TCP connections to port
25 (or 22). And maybe it needs to make some additional connections
to local services, depending on what NSS libraries got loaded by
glibc when it parsed
(Cryptography libraries have historically done some really creative
and crazy things on startup in the name of trying to get some
additional randomness, including reading /etc/passwd and running
netstat. Yes, really (via).)
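To make the DNS lookup example a bit more concrete, here is a toy Python model of the decision involved. This is not SELinux policy syntax and not how SELinux is implemented; the `Connection` type, the module table, and the LDAP entry are all made up for illustration. What it tries to show is that the set of 'valid' connections depends on which NSS modules happen to be loaded, which the program's own code does not tell you.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    """An outbound connection attempt: protocol plus destination port."""
    proto: str
    port: int


# Hypothetical allow-list for a program that "only does DNS lookups".
BASE_ALLOWED = {Connection("tcp", 53), Connection("udp", 53)}

# Hypothetical extra needs pulled in by NSS modules. The entries are
# illustrative assumptions, not an inventory of what real modules do.
NSS_MODULE_NEEDS = {
    "dns": set(),
    "ldap": {Connection("tcp", 389)},
}


def allowed_connections(loaded_modules):
    """Union of the base allow-list and whatever each loaded module needs."""
    allowed = set(BASE_ALLOWED)
    for module in loaded_modules:
        allowed |= NSS_MODULE_NEEDS.get(module, set())
    return allowed


def is_allowed(conn, loaded_modules):
    """True if conn is permitted given the currently loaded modules."""
    return conn in allowed_connections(loaded_modules)
```

With only the `dns` module loaded, a connection to TCP port 25 is refused, as you'd want. The uncomfortable part is that a connection to TCP port 389 is either a policy violation or entirely legitimate depending on a configuration choice made somewhere else.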
SELinux can be simple, but it requires massive reorganization of a
typical Linux system and application stack. For example, life would
be much simpler if all confined services ran inside defined directory
trees and had no access to anything outside their tree (ie everything
chroot()'d or close to it); then you could write really simple file
access rules (or at least start with them).
Similar things could be done with services provided to applications
(for example, 'all logging must be done through this interface'),
requirements to explicitly document required incoming and outgoing
network traffic, and so on.
(What all of these do is make it easier to understand expected program behavior, either by limiting what programs can do to start with or by requiring them to explicitly document their behavior in order to have it work at all.)
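As a rough sketch of how simple a file access rule could get in that hypothetical world, here is the entire check in Python, assuming each service is confined to a single directory tree. The function name and the paths used with it are invented for illustration, and the check is purely lexical; a real mechanism would also have to deal with symlinks, bind mounts, and so on.

```python
from pathlib import PurePosixPath


def inside_tree(path, root):
    """Return True if path is root itself or lies somewhere beneath it.

    This is a lexical check only. Relative paths and paths containing
    '..' are rejected outright instead of being resolved, and symlinks
    are not followed.
    """
    p = PurePosixPath(path)
    r = PurePosixPath(root)
    if not p.is_absolute() or ".." in p.parts:
        return False
    return p == r or r in p.parents
```

For a service confined to a hypothetical /srv/web tree, `inside_tree("/srv/web/htdocs/index.html", "/srv/web")` is true and `inside_tree("/etc/passwd", "/srv/web")` is false. Compare this single rule with what is needed when the same service legitimately reads from /etc, /lib, /usr/bin, and /var.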
Sidebar: the configuration change problem
The problem gets much worse when you allow system administrators to substantially change the behavior of programs in unpredictable ways by changing their configurations. There is no scalable automated way to parse program configuration files and determine what they 'should' be doing or accessing based on the configuration, so now you're back to requiring people to recreate that understanding of program behavior, or at least a fragment of it (the part that their configuration changes affected).
This generously assumes that all points where sysadmins can change program configuration come prominently marked with 'if you touch this, you need to do this to the SELinux setup'. As you can experimentally determine today, this is not the case.
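Here is a deliberately unrealistic Python sketch of the mismatch. It assumes a program whose whole configuration is a flat mapping of setting names to paths, and a policy that was written against the default configuration; the setting name, the paths, and the prefix list are all hypothetical. Real configuration files are nothing like this tidy, which is the point: a check like this has to be rebuilt by hand for every program and every configuration format.

```python
from pathlib import PurePosixPath

# Hypothetical: path prefixes the policy allows, written with the
# program's default configuration in mind.
POLICY_READ_PREFIXES = ["/var/www"]


def is_covered(path, prefixes):
    """True if path equals one of the prefixes or lies beneath one."""
    p = PurePosixPath(path)
    return any(
        p == PurePosixPath(prefix) or PurePosixPath(prefix) in p.parents
        for prefix in prefixes
    )


def uncovered_paths(config, prefixes):
    """Return the settings whose path values no policy prefix covers."""
    return {
        key: value
        for key, value in config.items()
        if not is_covered(value, prefixes)
    }
```

With a default configuration of `{"document_root": "/var/www/html"}` nothing is flagged. Once a sysadmin changes that to `{"document_root": "/srv/web"}`, the setting shows up as uncovered, and someone has to work out what the policy should now allow.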