Wandering Thoughts archives

2007-12-19

Why setuid scripts are fundamentally a bad idea

The real problem with setuid scripts on Unix is not that writing secure shell scripts is challenging and obscure; it is that they are fundamentally insecure because of how the kernel runs them. While the kernel runs programs by directly loading them into memory, it runs scripts by running the script's interpreter with the filename of the script, leaving it up to the interpreter to read and execute the script itself. As is normal on Unix, nothing guarantees that the filename still refers to the same file between these two steps.

In other words, there is no way to guarantee that what the interpreter reads is the same script that the kernel gave setuid permissions to; it might be some other script that an attacker put in place in the time between the kernel starting the (setuid) interpreter and the interpreter opening and reading the file.
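As a harmless illustration of that gap, here is a rough Python sketch (not an exploit, and nothing setuid is involved). The file names and the `demo_path_swap` function are made up for the example; step 1 merely stands in for the kernel examining the script, and step 2 for the interpreter opening the same filename later.

```python
import os
import tempfile


def demo_path_swap():
    """Show that one filename can name different files at two moments.

    Returns (same_file, contents): whether the file checked in step 1 is
    the file the name refers to after the swap, and what step 2 read.
    All names here are hypothetical.
    """
    with tempfile.TemporaryDirectory() as d:
        trusted = os.path.join(d, "trusted.sh")
        evil = os.path.join(d, "evil.sh")
        link = os.path.join(d, "script")
        with open(trusted, "w") as f:
            f.write("echo trusted\n")
        with open(evil, "w") as f:
            f.write("echo evil\n")

        os.symlink(trusted, link)
        # Step 1: the "kernel" looks at the file the name points to now.
        checked_inode = os.stat(link).st_ino

        # The attacker's window: atomically re-point the name.
        tmp = link + ".new"
        os.symlink(evil, tmp)
        os.replace(tmp, link)

        # Step 2: the "interpreter" opens the same name later.
        with open(link) as f:
            contents = f.read()
        opened_inode = os.stat(link).st_ino
        return checked_inode == opened_inode, contents
```

The name `script` never changes, but the file behind it does, which is exactly the property a setuid script cannot tolerate.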

Since this is a direct consequence of sensible and long-standing decisions about how to run scripts, Unix can't work around the problem in general without creating incompatibilities. Nor can the problem be fixed in the interpreters alone by having them fstat() the opened script's file descriptor and refuse to work unless the file has the appropriate setuid permissions, because this breaks exec()'ing scripts from a setuid program (there the interpreter also starts out with elevated privileges, but the script itself is not setuid).
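To make the interpreter-side check concrete, here is a minimal Python sketch of it. The function name `open_script_checked` and the exact policy (setuid bit set, and file owner equal to our effective UID) are my assumptions for illustration; a real interpreter would presumably only apply such a check when it finds itself running with elevated privileges.

```python
import os
import stat


def open_script_checked(path):
    """Open a script, then check the opened file itself with fstat().

    Returns an open file descriptor if the file is setuid and owned by
    our effective UID; otherwise raises PermissionError. The policy is
    an assumption made for this sketch.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        st = os.fstat(fd)  # examines the open file, not the name
        is_setuid = bool(st.st_mode & stat.S_ISUID)
        if not (is_setuid and st.st_uid == os.geteuid()):
            raise PermissionError(f"{path}: not setuid to our effective UID")
    except BaseException:
        os.close(fd)
        raise
    return fd
```

Because the check is made on the descriptor, it cannot be fooled by re-pointing the name. The trouble is the case where it refuses: a setuid program that exec()s an ordinary, non-setuid script would have that script rejected too.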

The best solution would be for the kernel to directly pass the file descriptor of the script that it already has to the interpreter. The command line filename would remain, but in fd-aware interpreters would only be used for $0 or the equivalent. However, this would require new fd-aware interpreters, which would be specific to the Unix variant that did this, and the demand for general setuid script support is low (to put it one way).
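The reason passing the descriptor works is that an open file descriptor stays bound to the file it was opened on, whatever later happens to the name. A small Python sketch of this, with made-up file names and hypothetical helper functions (`read_script_from_fd` plays the part of an fd-aware interpreter):

```python
import os
import tempfile


def read_script_from_fd(fd):
    """Read a whole script from an already-open descriptor."""
    chunks = []
    while True:
        chunk = os.read(fd, 65536)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)


def demo_fd_survives_swap():
    """Return (bytes read via the fd, bytes read via the name)."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "script")
        other = os.path.join(d, "other")
        with open(path, "wb") as f:
            f.write(b"echo trusted\n")
        with open(other, "wb") as f:
            f.write(b"echo evil\n")

        fd = os.open(path, os.O_RDONLY)  # the "kernel" opens the script once
        try:
            os.replace(other, path)      # the name now refers to another file
            via_fd = read_script_from_fd(fd)
        finally:
            os.close(fd)
        with open(path, "rb") as f:
            via_name = f.read()
        return via_fd, via_name
```

Reading through the descriptor still yields the original script, while reading through the name yields the replacement.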

unix/WhyNotSetuidScripts written at 23:16:27

How x86 Linux executes ELF programs

Yesterday I said that the kernel directly executes programs in place. Because I feel like walking through the details, here is what the kernel does to start ELF programs on x86 Linux; for simplicity, I'm going to talk about 32-bit programs.

  • First, the kernel maps the program's text, data, and BSS into memory. Almost all programs require these to be mapped at fixed addresses starting from 0x08048000 (just over 128 MB) and going on up.

  • If the program is dynamically linked, the kernel also maps the dynamic linker's text, data, and BSS into memory. Dynamic linkers are generally willing to be loaded anywhere in memory, so they get wedged into the first spot the kernel considers available.

    (ELF executables specify the full path of their dynamic linker, which is confusingly called the 'ELF interpreter' in various places.)

  • The kernel sticks an 'auxiliary table' of various information at the top of the stack.
  • The environment and the arguments are copied onto the stack.
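The dynamic linker path mentioned above lives in a program header of type PT_INTERP. As a rough sketch, here is how one could pull it out of a 32-bit little-endian ELF image in Python; the function name `elf32_interp` is mine, and the offsets are the standard ELF32 header and program header layout.

```python
import struct

PT_INTERP = 3  # program header type holding the dynamic linker's path


def elf32_interp(data):
    """Return the PT_INTERP path from a little-endian 32-bit ELF image,
    or None if there is none (as in a statically linked program)."""
    if data[:4] != b"\x7fELF" or data[4] != 1 or data[5] != 1:
        raise ValueError("not a little-endian 32-bit ELF file")
    # ELF32 header: e_phoff at offset 28, e_phentsize at 42, e_phnum at 44.
    (e_phoff,) = struct.unpack_from("<I", data, 28)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 42)
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        # ELF32 program header starts: p_type, p_offset, p_vaddr,
        # p_paddr, p_filesz (each a 32-bit word).
        p_type, p_offset, _vaddr, _paddr, p_filesz = struct.unpack_from(
            "<IIIII", data, off
        )
        if p_type == PT_INTERP:
            raw = data[p_offset:p_offset + p_filesz]
            return raw.rstrip(b"\0").decode()
    return None
```

Fed the bytes of a dynamically linked 32-bit x86 executable, this should return the same path that `readelf -l` reports as the requested program interpreter.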

If the program is statically linked, the kernel sets the user-level program counter for the process to the start address in the program's ELF header, which is somewhere after 0x08048000. When the kernel returns to user space, the program will wind up running directly.

(What the start address is depends on how much stuff has to go at the start of the program's text area, so it varies from program to program.)
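The start address is the `e_entry` field of the ELF header. A minimal Python sketch for reading it from a 32-bit little-endian ELF file follows; the function name `elf32_entry` is made up for the example.

```python
import struct


def elf32_entry(data):
    """Return e_entry (the start address) from a little-endian ELF32 header."""
    if data[:4] != b"\x7fELF" or data[4] != 1 or data[5] != 1:
        raise ValueError("not a little-endian 32-bit ELF file")
    # In the ELF32 header, e_entry is the 32-bit word at offset 24.
    (e_entry,) = struct.unpack_from("<I", data, 24)
    return e_entry
```

Given the first 52 bytes of a 32-bit x86 executable, this returns the value that `readelf -h` shows as the entry point address.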

If the program is dynamically linked, the kernel instead sets the program counter to the start address of the dynamic linker, and the process will start running the dynamic linker's code directly. The dynamic linker uses information in the auxiliary table to find the real program's code and data, and eventually start it.
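On Linux a process can look at its own auxiliary table through /proc/self/auxv, which makes it easy to see what the dynamic linker has to work with. The following Python sketch decodes it; the helper names are mine, and the key numbers are the AT_* constants from <elf.h>. AT_PHDR and AT_ENTRY describe the real program, while AT_BASE is where the dynamic linker itself was loaded.

```python
import struct

# A few auxiliary vector keys, as defined in <elf.h>.
AT_NULL, AT_PHDR, AT_BASE, AT_ENTRY = 0, 3, 7, 9


def parse_auxv(data):
    """Decode raw auxiliary vector bytes into a {key: value} dict.

    Each entry is a pair of native unsigned longs; AT_NULL ends the table.
    """
    entry = struct.Struct("@LL")
    usable = len(data) - len(data) % entry.size
    result = {}
    for key, value in entry.iter_unpack(data[:usable]):
        if key == AT_NULL:
            break
        result[key] = value
    return result


def read_own_auxv():
    """Linux-specific: decode this process's own auxiliary vector."""
    with open("/proc/self/auxv", "rb") as f:
        return parse_auxv(f.read())
```

On a Linux machine, something like `hex(read_own_auxv()[AT_ENTRY])` should print the start address of the Python binary itself.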

(From this we can see how calling the dynamic linker an 'interpreter' is a misnomer; it works nothing like an interpreter for a script, although it is a regular ELF executable.)

Technically, you could make dynamically linked ELF executables that contained no actual machine code but instead had a 'dynamic linker' that actually was an interpreter. However, this would be tricky to pull off, because dynamic linkers cannot themselves be dynamically linked, so your interpreter would need either to not use any shared libraries (including the normal C and Unix runtime) or to bootstrap the regular dynamic linker somehow.

linux/HowProgramsExecute written at 00:53:10

