Why upstreams can't document their program's behavior for us

July 16, 2017

In reaction to SELinux's problem of keeping up with app development, one obvious suggestion is to have upstreams do this work instead. A variant of this idea is what DrScriptt suggested in a comment on that entry:

I would be interested in up stream app developers publishing things about their application, including what it should be doing. [...]

Setting aside the practical issue that upstream developers are not interested in spending their time on this, I happen to believe that there are serious and probably unsolvable problems with this idea even in theory.

The first issue is that the behavior of a sophisticated modern application (the kind we most care about confining well) is actually a composite of at least four different sources of behavior and behavior changes: the program itself, the libraries it uses, how a particular distribution configures and builds both of these, and how individual systems are configured. Oh, and as covered, this is really not 'the program' and 'the libraries', but 'the version of the program and the libraries used by a particular distribution' (or whatever versions were around when the app was built locally).

On most Linux systems, even simple-looking operations can go very deep here. Does your program call gethostbyname()? If so, what files it will access and what network resources it will attempt to contact cannot be predicted in advance without knowing how nsswitch.conf (and other things) are configured on the specific system it's running on. The only useful thing that the upstream developers can possibly tell you is 'this calls gethostbyname(), you figure out what that means'. The same is true for calls like getpwuid() or getpwnam(), as well as any number of other things.
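To make this concrete, here is a minimal sketch in Python of how the answer to 'what does gethostbyname() consult' depends on the local nsswitch.conf. The function name parse_nsswitch and the two sample configurations are hypothetical and made up for illustration; they aren't taken from any particular distribution, and the parser deliberately ignores the bracketed action items instead of interpreting them.

```python
import re


def parse_nsswitch(text):
    """Map each database name to its ordered list of lookup sources.

    Hypothetical helper for illustration; bracketed action items such as
    [NOTFOUND=return] are dropped rather than interpreted.
    """
    databases = {}
    for raw_line in text.splitlines():
        # Strip comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        name, _, rest = line.partition(":")
        # Remove [STATUS=ACTION ...] items, keeping only the source names.
        rest = re.sub(r"\[[^\]]*\]", " ", rest)
        databases[name.strip()] = rest.split()
    return databases


# Two made-up configurations for two imaginary machines.
SAMPLE_SERVER = """
# plain files-then-DNS setup
passwd: files
hosts:  files dns
"""

SAMPLE_DESKTOP = """
passwd: files sss
hosts:  files mdns4_minimal [NOTFOUND=return] dns myhostname
"""

if __name__ == "__main__":
    for label, text in (("server", SAMPLE_SERVER), ("desktop", SAMPLE_DESKTOP)):
        conf = parse_nsswitch(text)
        print(label, "hosts lookups go through:", conf.get("hosts", []))
        print(label, "passwd lookups go through:", conf.get("passwd", []))
```

The same program making the same gethostbyname() or getpwnam() call would go through different sources on these two machines, and each source typically brings its own files, sockets, and network traffic along with it. None of that is something the upstream can know ahead of time.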

The other significant issue is that when prepared by an upstream, this information is essentially a form of code comments. Without a way for upstreams to test and verify the information, it's more or less guaranteed to be incomplete and sometimes outright wrong (just as comments are incomplete and periodically wrong). So we're asking upstreams to create security sensitive documentation that can be predicted in advance to be partly incorrect, and we'd also like it to be detailed and comprehensive (since we want to use this information as the basis for a fine-grained policy on things like what files the app will be allowed access to).

(I'm completely ignoring the very large question of what format this information would be in. I don't think there's any current machine-readable format that would do, which means either trying to invent a new one or having people eventually translate ad-hoc human-readable documentation into SELinux policies and other things. Don't expect the documentation to be written with specification-level rigor, either; if nothing else, producing that grade of documentation is fairly expensive and time-consuming.)

Written on 16 July 2017.

Last modified: Sun Jul 16 01:18:05 2017