2007-12-24
The difference between shells that do job control and shells that don't
Apart from having job control, the difference between a job control shell and a non-job-control shell is that job control shells put each command into a different process group. They do this even for commands running in the foreground, because you may later want to ^Z the command and push it into the background, and process groups are the core mechanism of job control.
(Process groups do two things in job control. First, they let the shell
reliably send stop or start signals to all of the processes spawned
by a command. Second, they control what processes can do IO with the
terminal; only processes in the foreground process group can read from
the terminal, and you can use 'stty tostop' to make it so that only
the foreground process group can write to the terminal.)
Shells that don't do job control just ignore process groups and thus leave all commands in the current process group (which is also the foreground process group).
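To make the process group difference concrete, here is a minimal sketch of the first thing a job control shell does for a command. The function name run_in_own_group is made up for illustration, and the sketch leaves out what a real shell also does (calling setpgid() in the parent as well to avoid a race, handing the terminal over with tcsetpgrp(), and actually exec()'ing a command).

```python
import os


def run_in_own_group():
    """Fork a child, move it into its own process group, and report the IDs.

    Returns (parent_pgid, child_pid, child_pgid).
    """
    parent_pgid = os.getpgrp()
    read_end, write_end = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: this is the step a non job control shell skips.
        try:
            os.close(read_end)
            os.setpgid(0, 0)  # new process group, ID equal to our PID
            os.write(write_end, str(os.getpgrp()).encode())
        finally:
            os._exit(0)

    # Parent: learn which process group the child ended up in.
    os.close(write_end)
    child_pgid = int(os.read(read_end, 64).decode())
    os.close(read_end)
    os.waitpid(pid, 0)
    return parent_pgid, pid, child_pgid
```

Without the setpgid() call the child would simply report the parent's process group, which is the non job control behaviour.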
This matters because of what happens when you end a session. When a
terminal closes (either a pty or a real terminal), only the current
foreground process group is sent a SIGHUP; other process groups are
left alone. Thus, background processes started by commands run from a
job control shell are insulated from SIGHUP, because they inherit the
command's process group and, once the command finishes, that is no longer
the foreground process group. This allows commands to be somewhat
sloppy in how they start long-running background processes.
(So how do suspended processes get cleaned up when you log out? There's
a second mechanism: orphaned process groups that have one or more
suspended processes are sent SIGHUP and then SIGCONT.)
Since there are very few shells these days that don't do job control,
this may be a hard error for people to see in testing. But it's really
not hard to see in code; the rule is that if you want a process to
survive the session ending, you must take explicit steps to insulate
it from SIGHUP et al. If you don't, you're being lazy and counting on
the side effects of running under a job control shell.
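As an illustration of one such explicit step (not the only one; ignoring SIGHUP in the child is another), here is a minimal sketch that starts a background process in its own session, so it has no controlling terminal to be hung up on. The name start_detached is hypothetical.

```python
import os
import subprocess
import sys


def start_detached(argv):
    """Start argv in a new session, detached from our controlling terminal."""
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # the child calls setsid() before exec
    )
```

A caller might use it as start_detached([sys.executable, "-c", "import time; time.sleep(60)"]). Because setsid() also makes the child a process group leader, this works the same way whether or not the parent was started from a job control shell.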
(A recent correction got me interested in all of this.)
2007-12-19
Why setuid scripts are fundamentally a bad idea
The real problem with setuid scripts on Unix is not that writing secure shell scripts is challenging and obscure; it is that they are fundamentally insecure because of how the kernel runs them. While the kernel runs programs by directly loading them into memory, it runs scripts by running the script's interpreter with the filename of the script, leaving it up to the interpreter to read and execute the script itself. As is normal on Unix, nothing guarantees that the filename still points to the same file between these two steps.
In other words, there is no way to guarantee that what the interpreter reads is the same script that the kernel gave setuid permissions to; it might be some other script that an attacker put in place in the time between the kernel starting the (setuid) interpreter and the interpreter opening and reading the file.
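The window can be imitated in ordinary user code. The following is only a sketch: the os.stat() call stands in for the kernel examining the file at exec() time, the later open() stands in for the interpreter, and the function name and file contents are invented for the example.

```python
import os
import tempfile


def demonstrate_path_race():
    """Show that a path can name a different file by the time it is opened.

    Returns (same_file, body_read_by_interpreter).
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        script = os.path.join(tmpdir, "script")
        with open(script, "w") as f:
            f.write("#!/bin/sh\necho trusted\n")

        # Step 1: stand-in for the kernel checking the file it was asked to run.
        checked = os.stat(script)

        # In between: an attacker atomically swaps another file into the name.
        replacement = os.path.join(tmpdir, "replacement")
        with open(replacement, "w") as f:
            f.write("#!/bin/sh\necho attacker\n")
        os.replace(replacement, script)

        # Step 2: stand-in for the interpreter opening the script by filename.
        with open(script) as f:
            opened = os.fstat(f.fileno())
            body = f.read()

        same_file = (checked.st_dev, checked.st_ino) == (opened.st_dev, opened.st_ino)
        return same_file, body
```

The file that was checked and the file that was read have different inodes, even though the filename never changed.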
Since this is a direct consequence of sensible and long-standing
decisions about how to run scripts, Unix can't work around the problem
in general without creating incompatibilities. Nor can the problem be
fixed in the interpreters alone by having them fstat() the opened
script's file descriptor and refuse to work unless the file has the
appropriate permissions, because this breaks exec()'ing scripts from a
setuid program.
The best solution would be for the kernel to directly pass the file
descriptor of the script that it already has to the interpreter. The
command line filename would remain, but in fd-aware interpreters would
only be used for $0 or the equivalent. However, this would require new
fd-aware interpreters, which would be specific to the Unix variant that
did this, and the demand for general setuid script support is low (to
put it one way).
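A sketch of why passing the descriptor closes the hole: once a file is open, the descriptor keeps referring to that file no matter what happens to the name afterwards. This runs inside a single process purely for illustration; how a kernel would actually hand the descriptor to the interpreter across exec() is not shown, and the function name is hypothetical.

```python
import os
import tempfile


def read_via_held_descriptor():
    """Open a script, swap the file behind its name, then read the held fd."""
    with tempfile.TemporaryDirectory() as tmpdir:
        script = os.path.join(tmpdir, "script")
        with open(script, "w") as f:
            f.write("#!/bin/sh\necho trusted\n")

        # Stand-in for the kernel: it already has the script open.
        fd = os.open(script, os.O_RDONLY)

        # The attacker's swap now happens too late to matter.
        replacement = os.path.join(tmpdir, "replacement")
        with open(replacement, "w") as f:
            f.write("#!/bin/sh\necho attacker\n")
        os.replace(replacement, script)

        # Stand-in for an fd-aware interpreter: read the descriptor, not the name.
        with os.fdopen(fd) as f:
            return f.read()
```

The content that comes back is the original script's, because the read never goes through the filename again.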
2007-12-17
What is a script language on Unix
There's a lot of argument in general about what is (merely) a 'scripting language' and what is a fully fledged programming language, deserving to hold its head up high beside grown-up languages like C, Java, and Pascal.
Unix is a simpler place, because it has a simple and very clear
definition of what is a script versus what is a program. To wit: if the
kernel can directly exec() you in place as is, you are a program. If
not, you are a script (sometimes called an 'interpreter script') and
actually get processed by your interpreter, not the kernel.
(Note that in-place execution doesn't preclude the use of a helper; almost every program you run on a modern Unix requires the help of the dynamic loader, which is not part of each executable.)
While technical and somewhat picky, this distinction is important. Among other issues, there are a number of places where a program can be used that a script cannot be.
As an immediate corollary, on Unix a language that cannot be used to make programs is a script language, and there are some things that code written in that language will never be able to do directly. Note that this says nothing about such a language's suitability for serious jobs.
(While scripts usually start with '#! <interpreter>', some Unixes
have ways to run scripts without that; Linux has binfmt_misc, for example.)
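As a rough illustration of the distinction, here is a sketch that guesses which side of the line a file falls on by looking at its first bytes. It is only a heuristic under stated assumptions: it recognizes just '#!' and the ELF magic number, so it knows nothing about other native executable formats or about mechanisms like binfmt_misc, and the names classify and demo are made up.

```python
import os
import tempfile

ELF_MAGIC = b"\x7fELF"  # first four bytes of an ELF executable


def classify(path):
    """Guess 'script', 'program', or 'unknown' from a file's leading bytes."""
    with open(path, "rb") as f:
        head = f.read(4)
    if head.startswith(b"#!"):
        return "script"   # the kernel will run the named interpreter instead
    if head == ELF_MAGIC:
        return "program"  # the kernel can load this directly
    return "unknown"


def demo():
    """Classify three small sample files written to a temporary directory."""
    samples = {
        "a-script": b"#!/bin/sh\necho hi\n",
        "a-program": ELF_MAGIC + b"\x00" * 12,  # fake header, not runnable
        "a-text-file": b"just some text\n",
    }
    results = {}
    with tempfile.TemporaryDirectory() as tmpdir:
        for name, content in samples.items():
            path = os.path.join(tmpdir, name)
            with open(path, "wb") as f:
                f.write(content)
            results[name] = classify(path)
    return results
```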