Wandering Thoughts archives

2009-07-31

How fast various ssh ciphers are

Periodically it surprises people to learn this, but ssh is not necessarily very fast (in the bandwidth sense). It's plenty fast for normal interactive use, but this speed issue can matter if you are making large transfers with scp, rsync, or the like; depending on your environment, ssh can go significantly slower than wire speed.

Ssh is slow because it has to encrypt and decrypt everything that goes over the wire, and this is a CPU-bound operation. How much time this takes depends on how fast the machines at each end are (the faster the better) and on which cipher ssh picks, because they vary significantly in speed.

Citing numbers is dangerous since yours are going to vary a lot, but here are some representative ones from Dell 2950s running 32-bit Ubuntu 8.04 with gigabit Ethernet:

  • the fastest cipher is arcfour, at a transfer rate of about 90 Mbytes/sec; arcfour128 and arcfour256 are about as fast, within the probable margin of error of my testing.

    (This is still less than 80% of the full TCP/IP wire speed, and you can get gigabit wire speed on machines with much less CPU power than 2950s.)

  • the slowest cipher is 3des-cbc, at 19 Mbytes/sec.

  • aes128-cbc, the normal OpenSSH default cipher, is reasonably fast at 75 Mbytes/sec; this is the fastest non-arcfour speed.

That ssh's default cipher is among the fastest ones means that you probably don't need to worry about this unless you are transferring a lot of data and need it to go as fast as possible (in which case you should explicitly use arcfour).
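
For example, you can pick the cipher for a single transfer with the -c option that both ssh and scp take (the host and file names here are made up for illustration):

# copy a big file with the fastest cipher from the list above
scp -c arcfour /data/bigfile otherhost:/tmp/

# or time the ssh pipe by itself, without disk IO on either end
dd if=/dev/zero bs=1M count=1000 | ssh -c arcfour otherhost 'cat >/dev/null'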

(And of course all of this is relevant only if the rest of the system can read and write the data fast enough.)

All of this is with no compression. Since compression trades CPU usage for lower bandwidth, you should only turn it on if you're bandwidth-constrained to start with. (And on a multi-core machine you should consider doing the compression yourself, so that one core can be compressing while ssh is using the other core to do the ciphering.)
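
As a sketch of what I mean by doing the compression yourself (with a made-up destination host and directory, and gzip standing in for whatever compressor you prefer):

# compress in a separate process instead of using ssh's -C option,
# so the compression and the encryption can run on different cores
tar cf - somedir | gzip | ssh -o Compression=no desthost 'cat >somedir.tar.gz'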

SshSpeed written at 02:00:15

2009-07-27

Why you should do code reviews for sysadmin scripts

Through my experiences over the past while, I've come around to the view that you should try to have code reviews for sysadmin shell scripts. There are two reasons for this, and they both have to do with the fact that the Bourne shell is not really a programming language.

First, you want code reviews so that other people can convince you that your clever Bourne shell idioms are a little too clever. People's tastes and standards for this vary widely, and you're writing scripts not just for yourself but for your co-workers as well; black-box scripts (ones that no one but you can touch) ultimately don't help anyone.

(And sometimes you have strong disagreements over the best way to do something and need to come to an agreement on what the local style will be.)

Second and more importantly, you want code reviews because there is a lot of Bourne shell arcana (and a lot of beartraps) that people don't know, especially junior people. There are all sorts of clever but not obvious ways of doing things, and conditions that it's not obvious you need to handle. If you don't know the particular trick that you need, you wind up writing shell scripts that do things inefficiently, miss cases, or have subtle bugs.

(It's not just 'junior people' who miss shell idioms; for example, I've learned a number of new-to-me ones through WanderingThoughts, both in writing entries and in the comments people have written.)
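
To give a hypothetical example of the sort of beartrap I mean (this isn't from any real script of ours), consider a cleanup fragment that doesn't quote its variable or check that its cd worked:

# risky: if $scratchdir is unset or the cd fails, the rm runs
# in whatever directory we happen to be in at the time
cd $scratchdir
rm -rf *

# the version a code review would hopefully insist on
cd "$scratchdir" || exit 1
rm -rf ./*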

Code review of people's scripts gives you an opportunity to pass on these tricks and arcana (and in a way that people may find easier to remember than some great big list of 'nifty shell tricks'), and in the process you improve the quality of your local scripts. As a bonus, you may even learn new ones yourself.

(This is especially good if it means that you fix an overlooked condition in a script before it goes off in someone's face. No one likes to make a mistake in a script that causes problems in production, and it's probably especially demoralizing for junior people and, to put it one way, not a great way to convince them that they're competent to write scripts and should keep on doing so.)

ScriptCodeReviews written at 00:39:24

2009-07-16

Another reason to safely update files that are looked at over NFS

Suppose that you are writing a script on one system but testing it on another (perhaps the first system is the one that has your full editing environment set up). You go along in your cycle of edit, save, run, edit, save, run, and then suddenly:

./testscript: Stale NFS file handle

What just happened?

You've run into the issue of safely updating files that are read over NFS, even though you weren't reading the file at the time you saved it.

In theory, every time an NFS client needs to turn a name into an NFS filehandle it should go off and ask the server. In practice, for efficiency NFS clients generally cache this name to filehandle mapping information for some amount of time (how long varies a lot). Usually no one notices, but you got unlucky; when you tried to run the script, the second machine had cached the filehandle for the old version of the file, which no longer exists, and when it tried to read the file the NFS server told it 'go away, that's a stale NFS filehandle'.
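
As a sketch of the sequence (assuming your editor saves by writing a new file and renaming it into place, which is what makes the old file's filehandle go away):

# on the machine where you edit: the save replaces the file's inode
mv testscript.new testscript

# on the machine where you test, while it still has the old
# name-to-filehandle mapping cached:
./testscript
./testscript: Stale NFS file handle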

Running scripts isn't the only thing that can get stale filehandle errors because of cached mappings; it's just one of the more obvious ones, because you actually get error messages. I believe that test is another case (although I haven't yet demonstrated this in a controlled test):

if [ -f /some/nfs/file ]; then
  ...
fi

I believe that this will silently fail if the client's cache is out of date, as the client kernel winds up doing a GETATTR on a now-invalid NFS filehandle (because test will stat() the file to see if it's a regular file or not).

SafelyUpdatingNFSFilesII written at 00:52:21

2009-07-11

What can go wrong in making NFS mounts

Now that we know what goes on in NFS mounts, we can see that there are any number of moving parts that can go wrong:

  • the RPC portmapper refuses to talk to you (possibly because a firewall gets in the way, possibly because it has been set up with tcpwrappers based restrictions).
  • the NFS mount daemon refuses to talk to you, possibly because it insists that clients use a reserved port (as it usually does), or because of another firewall problem, since it uses a different port than the portmapper.

  • the NFS mount daemon thinks that you don't have sufficient permissions, so it refuses to give you an NFS filehandle.

  • the kernel NFS server refuses to talk to you, possibly because of yet another firewall issue.

  • the NFS filehandle you get back is broken.
  • the kernel NFS server refuses to accept the filehandle that the NFS mount daemon gave you. This is especially fun, because sometimes mount will claim that the filesystem was successfully mounted but any attempt to do anything to it will fail or hang.

Most versions of mount will at least give you different error messages in the various sorts of cases, generally of increasing peculiarity and opaqueness as you move down this list.

(Thus, I have seen mount report 'invalid superblock' on NFS mount attempts.)
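
One way to work out which of these layers is the problem is to walk through them by hand with rpcinfo and showmount, something like this (the server name and export path here are made up):

# can we reach the RPC portmapper, and is mountd registered with it?
rpcinfo -p nfsserver

# will the NFS mount daemon talk to us and show us its exports?
showmount -e nfsserver

# finally, try the actual mount and see what error comes back
mount -t nfs nfsserver:/export/home /mnt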

NFSMountMalfunctions written at 02:01:21

2009-07-06

How you could do a shared root directory with NFS

In a previous entry I made an offhand comment that diskless clients still needed a separate / filesystem for each client. This is true for how diskless clients were generally implemented, but technically not true in general; it's possible to build a diskless client environment with even a shared root directory.

The truth is that most of the contents of / are common between all machines; there is just not that much system-specific information in the root filesystem, especially if your diskless machines were generic (which they usually were). So all you need for a shared root is to put all of that system-specific information in a separate filesystem (well, a separate directory hierarchy) and then arrange to mount that filesystem in a fixed place very early on in the boot process.

(Then you make all of the system-specific files in / be symlinks that point into the fixed mountpoint for the system-specific directory.)

How do the generic boot scripts in the generic / know which system's directory to mount? Clearly you need a piece of system-specific information to know what system you are, but fortunately diskless machines already have one, namely their IP address, which they know by the time they can NFS-mount the root filesystem.
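
A hand-waving sketch of what that early boot step might look like (the paths, interface name, and layout here are all invented for illustration, not taken from any real implementation):

# very early in the generic boot scripts on the diskless client:
# figure out our own IP address, then mount our per-host area
myip=`ifconfig le0 | awk '/inet / {print $2}'`
mount -o ro fileserver:/export/perhost/$myip /perhost

# / itself then just holds symlinks into the per-host area, e.g.
#   /etc/hostname -> /perhost/etc/hostname
#   /etc/fstab    -> /perhost/etc/fstab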

I doubt that this is a novel idea, so why didn't any Unix vendor do this back in the days when diskless systems were big? I don't know for sure, but I suspect that it was a combination of there being a number of painful practical issues that would have to be solved, plus there's probably not all that much disk space to be saved. Using separate / filesystems for each diskless client was enough simpler to win.

(You could also get most of the savings with hardlinks and cleverness, although I don't know if any Unix vendor officially supported that.)

SharedNFSRoot written at 01:38:29

