2009-07-31
Using SystemTap to trace the system calls of setuid programs on Linux
Suppose that you have a setuid program that is failing mysteriously and you want to see what it's doing. With normal programs you can use strace, but not even root can strace a setuid program (if you try, the program runs non-setuid).
(Yes, strace has the -u option, but it doesn't help if the setuid program is being run as part of a whole chain of processes in a specific environment and you can't just run it directly. It would be nice if root could use 'strace -f ...' for this, but alas it doesn't work.)
On a Solaris system you could use DTrace for this. SystemTap is the rough Linux equivalent and, although much less polished and not as well documented, it does work. Here is the crude SystemTap script that I used:
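    # On entry to any system call made by the traced program, print the
    # call, its decoded arguments, and the real (and effective) UID.
    probe syscall.* {
        en = execname();
        ui = uid();
        eui = euid();
        if (en == "<redacted>") {
            printf("%s(%d): %s(%s)", en, pid(), name, argstr);
            if (ui != eui) {
                printf(" as %d/%d ", ui, eui);
            } else {
                printf(" as %d ", ui);
            }
        }
    }

    # On return, finish the line with the decoded return value.
    probe syscall.*.return {
        en = execname();
        if (en == "<redacted>") {
            printf("= %s\n", retstr);
        }
    }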
This produces output with system call arguments and return values helpfully decoded for you; it looks like:
<redacted>(14087): open("/etc/passwd", O_RDONLY) as 2315/0 = 3
[...]
<redacted>(14087): close(1) as 2315/0 = -9 (EBADF)
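If you want to filter or summarize this output afterwards, the line format is regular enough to parse. What follows is only a rough Python sketch, assuming the output looks exactly like the sample lines here; the names LINE_RE and parse_line are made up for illustration and are not part of SystemTap.

```python
import re

# Matches the two line shapes that the script's printf() calls produce:
#   name(pid): call(args) as uid/euid = ret    (uid != euid)
#   name(pid): call(args) as uid = ret         (uid == euid)
LINE_RE = re.compile(
    r"^(?P<exe>.+?)\((?P<pid>\d+)\): "
    r"(?P<call>\w+)\((?P<args>.*)\) as "
    r"(?P<uid>\d+)(?:/(?P<euid>\d+))? = (?P<ret>.*)$"
)


def parse_line(line):
    """Split one trace line into a dict, or return None if it doesn't match."""
    m = LINE_RE.match(line.rstrip("\n"))
    if m is None:
        return None
    rec = m.groupdict()
    rec["pid"] = int(rec["pid"])
    rec["uid"] = int(rec["uid"])
    # When only one UID was printed, the real and effective UIDs were equal.
    rec["euid"] = int(rec["euid"]) if rec["euid"] is not None else rec["uid"]
    return rec


if __name__ == "__main__":
    samples = [
        '<redacted>(14087): open("/etc/passwd", O_RDONLY) as 2315/0 = 3',
        "<redacted>(14087): close(1) as 2315/0 = -9 (EBADF)",
    ]
    for s in samples:
        print(parse_line(s))
```

Lines that don't fit the pattern (for example, output from two processes mixed together on one line) come back as None instead of being guessed at.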
(In some ways this is nicer than DTrace. But the lack of documentation on what sort of information you can get about system calls and so on really hurts; I had to read the source for the syscall tapset in order to find out about name, argstr, retstr, and so on.)
Note that, despite the presence of the PID in the output, this isn't really useful for tracing if more than one instance of the program is running at once. That would take more SystemTap magic than I know so far (or worse output and some postprocessing). Also, since stap is kind of slow, you'll want to run it with the -v flag so that you know when it's actually finished checking, compiling, and enabling your tracing.
One of the things that the documentation isn't very clear about is that the execname() function returns the bare command name of the current process and not its full path. (There is probably a way to extract the full path if you need it. I didn't, so I didn't go digging.)
All in all, I would have to score my first real exposure to SystemTap as a reasonably pleasant experience. Although there were a bunch of frustrating bits, it did work, it gave me what I wanted to know, and it wasn't particularly difficult to do or to work out how to do it (and it didn't take particularly long).
How fast various ssh ciphers are
Periodically it surprises people to learn this, but ssh is not necessarily very fast (in the bandwidth sense). It's plenty fast for normal interactive use, but this speed issue can matter if you are making large transfers with scp, rsync, or the like; depending on your environment, ssh can go significantly slower than wire speed.
Ssh is slow because it has to encrypt and decrypt everything that goes over the wire, and this is a CPU-bound operation. How much time this takes depends on how fast the machines at each end are (the faster the better) and on which cipher ssh picks, because they vary significantly in speed.
Citing numbers is dangerous since yours are going to vary a lot, but here are some representative ones from Dell 2950s running 32-bit Ubuntu 8.04 with gigabit Ethernet:
- the fastest cipher is arcfour, at a transfer rate of about 90 Mbytes/sec; arcfour128 and arcfour256 are about as fast within the probable margins for error of my testing.
  (This is still less than 80% of the full TCP/IP wire speed, and you can get gigabit wire speed on machines with much less CPU power than 2950s.)
- the slowest cipher is 3des-cbc, at 19 Mbytes/sec. aes128-cbc, the normal OpenSSH default cipher, is reasonably fast at 75 Mbytes/sec; this is the fastest non-arcfour speed.
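To put these rates next to the "less than 80% of wire speed" remark, here is a back-of-the-envelope Python sketch. It rests on assumptions that are not from the measurements themselves: standard 1500-byte Ethernet frames, TCP with the timestamp option turned on, and "Mbytes" meaning 10^6 bytes. Different assumptions will shift the percentages a bit.

```python
# Rough estimate of gigabit Ethernet TCP payload throughput, and how the
# measured ssh cipher rates compare to it. All framing figures below are
# assumptions (standard 1500-byte MTU, TCP timestamps enabled).

LINE_RATE_BITS = 1_000_000_000  # gigabit Ethernet, bits per second

# Bytes on the wire per full-sized frame: MTU + Ethernet header + FCS
# + preamble + inter-frame gap.
WIRE_BYTES_PER_FRAME = 1500 + 14 + 4 + 8 + 12

# TCP payload per frame: MTU minus IP header, TCP header, and the
# 12 bytes of TCP timestamp option.
TCP_PAYLOAD_PER_FRAME = 1500 - 20 - 20 - 12


def tcp_wire_speed_mbytes():
    """Approximate maximum TCP payload rate in (decimal) Mbytes/sec."""
    bytes_per_sec = LINE_RATE_BITS / 8
    return bytes_per_sec * TCP_PAYLOAD_PER_FRAME / WIRE_BYTES_PER_FRAME / 1e6


def fraction_of_wire_speed(mbytes_per_sec):
    """How much of the estimated TCP wire speed a measured rate achieves."""
    return mbytes_per_sec / tcp_wire_speed_mbytes()


if __name__ == "__main__":
    print("estimated TCP wire speed: %.1f Mbytes/sec" % tcp_wire_speed_mbytes())
    for cipher, rate in [("arcfour", 90), ("aes128-cbc", 75), ("3des-cbc", 19)]:
        print("%-10s %3d Mbytes/sec = %.0f%% of wire speed"
              % (cipher, rate, 100 * fraction_of_wire_speed(rate)))
```

Under these assumptions the estimate lands a bit under 118 Mbytes/sec, which puts arcfour's 90 Mbytes/sec somewhat below the 80% mark.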
That ssh's default cipher is among the fastest ones means that you can probably not worry about this unless you are transferring a lot of data and need it to go as fast as possible (in which case you should explicitly use arcfour).
(And of course all of this is relevant only if the rest of the system can read and write the data fast enough.)
All of this is with no compression. Since compression trades CPU usage for lower bandwidth, you should only turn it on if you're bandwidth-constrained to start with. (And on a multi-core machine you should consider doing the compression yourself, so that one core can be compressing while ssh is using the other core to do the ciphering.)
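As an illustration of doing the compression yourself, here is a rough Python sketch that runs gzip as its own process and pipes its output into ssh, so the kernel can schedule the two on different cores. The host name, the paths, and the function names are hypothetical, and it assumes gzip exists on both ends; the -c option is ssh's usual way to pick a cipher, and you should substitute whichever cipher your ssh actually supports.

```python
import shlex
import subprocess


def build_pipeline(local_path, host, remote_path, cipher=None):
    """Return (compress_argv, ssh_argv) for 'gzip -c file | ssh host ...'."""
    compress = ["gzip", "-c", local_path]
    ssh = ["ssh"]
    if cipher:
        ssh += ["-c", cipher]
    # The remote side decompresses stdin into the destination file.
    remote_cmd = "gzip -dc > " + shlex.quote(remote_path)
    ssh += [host, remote_cmd]
    return compress, ssh


def run_pipeline(compress, ssh):
    """Run the two commands connected by a pipe; return both exit codes."""
    gz = subprocess.Popen(compress, stdout=subprocess.PIPE)
    ssh_proc = subprocess.Popen(ssh, stdin=gz.stdout)
    # Close our copy of the pipe so gzip sees SIGPIPE if ssh exits early.
    gz.stdout.close()
    ssh_rc = ssh_proc.wait()
    gz_rc = gz.wait()
    return gz_rc, ssh_rc


if __name__ == "__main__":
    # Hypothetical file, host, and destination; nothing is run here,
    # the commands are only printed.
    compress, ssh = build_pipeline("big.tar", "example.org", "/tmp/big.tar",
                                   cipher="arcfour")
    print(compress)
    print(ssh)
```

Whether this wins over plain ssh depends on how compressible the data is and on whether gzip itself becomes the bottleneck, so it's worth timing both ways.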