Wandering Thoughts archives

2017-03-13

OpenSSH's IdentityFile directive only ever adds identity files (as of 7.4)

In some complicated scenarios (especially with 2FA devices), even IdentitiesOnly can leave you offering too many identities once relatively generic Host ... entries are combined with host-specific ones. Since you can only push Host ... entries with negated hostnames so far before you wind up with a terrible mess, there are situations where it would be nice to be able to say something like:

Host *.ourdomain
  IdentityFile ...
  IdentityFile ...
  [...]

Host something-picky.ourdomain
  IdentityFile NONE
  IdentityFile /u/cks/.ssh/identities/specific
  IdentitiesOnly yes
  [...]

Here, you want to offer a collection of identities from various sources to most hosts, but there are some hosts that both require very specific identities and will cut your connection off if you offer too many identities (as mentioned in an earlier entry).

I have in the past said that 'as far as I knew' IdentityFile directives were purely cumulative (e.g. in comments on an earlier entry). This held out a small sliver of hope that there was some way of doing this that I either couldn't see in the manpages or that just wasn't documented. As it happens, I recently decided to look at the OpenSSH source code for 7.4 (the latest officially released version) to put this to rest once and for all, and the bad news is that I have to stop qualifying my words. As far as I can tell from the source code, there is absolutely no way of wiping out existing IdentityFile directives that have been added by various matching Host stanzas. There's an array of identities (up to the maximum of 100 that's allowed), and the code only ever adds identities to it. Nothing removes entries or resets the number of valid entries in the array.
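As a rough illustration of what 'purely cumulative' means in practice, here is a minimal Python sketch of the behaviour. This is a toy model and not the OpenSSH code: the function name collect_identity_files, the (pattern, files) stanza representation, and the example paths are all made up for illustration, fnmatch only approximates ssh's host pattern matching, and details such as default identity files are ignored.

```python
from fnmatch import fnmatch

# The limit of 100 mentioned above; the constant name is invented for this sketch.
MAX_IDENTITY_FILES = 100


def collect_identity_files(stanzas, hostname):
    """Return every IdentityFile from every Host stanza that matches hostname.

    stanzas is a list of (host_pattern, [identity_file, ...]) pairs in
    configuration file order. Entries are only ever appended; there is
    deliberately no operation here that removes or resets anything.
    """
    identities = []
    for pattern, files in stanzas:
        if not fnmatch(hostname, pattern):
            continue
        for path in files:
            if len(identities) >= MAX_IDENTITY_FILES:
                raise ValueError("too many identity files")
            identities.append(path)
    return identities


# Hypothetical configuration: a generic stanza plus a host-specific one.
stanzas = [
    ("*.ourdomain", ["~/.ssh/ids/general-a", "~/.ssh/ids/general-b"]),
    ("something-picky.ourdomain", ["~/.ssh/ids/specific"]),
]
picky = collect_identity_files(stanzas, "something-picky.ourdomain")
```

In this model the picky host ends up with all three identity files, because it matches both stanzas and nothing in the second stanza can undo what the first one added.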

Oh well. It would have been nice, and maybe someday the OpenSSH people will add some sort of feature for this.

In the process of reading bits of the OpenSSH code, I ran across an interesting comment in sshconnect2.c's pubkey_prepare():

/*
 * try keys in the following order:
 *	1. certificates listed in the config file
 *	2. other input certificates
 *	3. agent keys that are found in the config file
 *	4. other agent keys
 *	5. keys that are only listed in the config file
 */

(IdentitiesOnly does not appear to affect this order; it merely causes some keys to be excluded.)

To add to an earlier entry of mine, keys supplied with -i fall into the 'in the config file' case, because what that actually means is 'keys from -i, from the user's configuration file, and from the system configuration file, in that order'. They all get added to the list of keys with the same function, add_identity_file(), but -i is processed first.

(This means that my earlier writeup of the SSH identity offering order is a bit incomplete, but at this point I'm sufficiently tired of wrestling with this particular undocumented SSH mess that I'm not going to carefully do a whole bunch of tests to verify what the code comment says here. Having skimmed the code, I believe the comment.)
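To make the ordering in that comment concrete, here is a small Python sketch that sorts a set of keys into the five categories. It is my own model of the comment, not a translation of pubkey_prepare(): the Key class, the category() and offer_order() names, and the example keys are all invented. It also assumes that IdentitiesOnly only drops agent keys that are not in the configuration, which is a simplification I have not verified against the code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Key:
    name: str
    is_cert: bool = False
    in_config: bool = False  # from -i, the user config, or the system config
    in_agent: bool = False


def category(key):
    """Map a key to its position (1 to 5) in the order the comment describes."""
    if key.is_cert:
        return 1 if key.in_config else 2
    if key.in_agent:
        return 3 if key.in_config else 4
    return 5  # keys that are only listed in the config file


def offer_order(keys, identities_only=False):
    """Return keys in offering order; sorted() is stable within a category."""
    if identities_only:
        # Assumption: IdentitiesOnly excludes unconfigured agent keys
        # and leaves the relative order of everything else alone.
        keys = [k for k in keys if category(k) != 4]
    return sorted(keys, key=category)


# Hypothetical keys, deliberately listed out of order.
keys = [
    Key("agent-extra", in_agent=True),
    Key("file-only", in_config=True),
    Key("agent-and-config", in_config=True, in_agent=True),
    Key("cert", is_cert=True, in_config=True),
]
names = [k.name for k in offer_order(keys)]
names_only = [k.name for k in offer_order(keys, identities_only=True)]
```

Because the sort is stable, keys within one category keep the order they were given in, which is where the '-i first, then user configuration, then system configuration' ordering would show up if the input list is built in that order.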

sysadmin/SSHNoIdentityFileOverride written at 23:05:53;

What should it mean for a system call to time out?

I was just reading Evan Klitzke's Unix System Call Timeouts, and among a number of thoughts about it, one thing that struck me is a simple question: what should it mean for a Unix system call to time out?

This question may sound pointlessly philosophical, but it's actually very important because what we expect a system call timeout to mean will make a significant difference in how easy it would be to add system calls with timeouts. So let's sketch out two extreme versions. The first extreme version is that if a timeout occurs, the operation done by the system call is entirely abandoned and undone. For example, if you call rename("a", "b") and the operation times out, the kernel guarantees that the file a has not been renamed to b. This is obviously going to be pretty hard, since the kernel may have to reverse partially complete operations. It's also not always possible, because some operations are genuinely irreversible. If you write() data to a pipe and time out partway through doing so (with some but not all data written), you cannot reach into the pipe and 'unwrite' all of the already sent data; after all, some of it may already have been read by a process on the other side of the pipe.
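The pipe case is easy to demonstrate from user space. The following Python sketch uses a non-blocking write as a stand-in for a write() that times out partway through (there is no actual timeout here, which is an assumption of the sketch, and the function name partial_pipe_write is made up). It shows that a large write can be accepted only partially, and that the reader can consume some of that data right away, at which point there is nothing left to 'unwrite'.

```python
import os


def partial_pipe_write(payload_size=1 << 20):
    """Write more than a pipe can hold and report what actually happened.

    Returns (bytes_written, bytes_consumed_by_reader).
    """
    read_fd, write_fd = os.pipe()
    try:
        # Non-blocking mode stands in for 'give up instead of waiting'.
        os.set_blocking(write_fd, False)
        data = b"x" * payload_size
        # The kernel accepts as much as fits in the pipe buffer and
        # returns a short count instead of blocking for the rest.
        written = os.write(write_fd, data)
        # The other side of the pipe can already read what was sent.
        consumed = os.read(read_fd, 4096)
        return written, len(consumed)
    finally:
        os.close(read_fd)
        os.close(write_fd)


written, consumed = partial_pipe_write()
```

On a typical Unix the write returns far less than the requested size, and the bytes the reader has taken are gone from the pipe for good.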

The second extreme version is that having a system call time out merely causes your process to stop waiting for it to complete, with no effects on the kernel side of things. Effectively, the system call is shunted to a separate thread of control and continues to run; it may complete at some point, or it may error out, but you never have to wait for it to do either. If the system call would normally return a new file descriptor or the like, the new file descriptor will be closed immediately when the system call completes. In practice, implementing a strict version of this would also be relatively hard; you'd need an entire infrastructure for transferring system calls to another kernel context (or more likely, transplanting your user-level process to another kernel context, although that has its own issues). This is also at odds with the existing system calls that take timeouts, which generally result in the operation being abandoned part way through with no guarantees either way about its completion.

(For example, if you make a non-blocking connect() call and then use select() to wait for it with a timeout, the kernel does not guarantee that if the timeout fires the connect() will not be completed. You are in fact in a race between your likely close() of the socket and the connection attempt actually completing.)
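For concreteness, here is roughly what that non-blocking connect() plus select() pattern looks like in Python. This is a minimal sketch for Unix-like systems; the function name connect_with_timeout and its return convention are my own invention, and the example at the end connects to a throwaway listener on the loopback interface purely so that it has something to talk to.

```python
import errno
import select
import socket


def connect_with_timeout(address, timeout):
    """Start a non-blocking connect and wait up to timeout seconds for it.

    Returns (status, sock) where status is "connected", "timeout" or
    "error"; sock is only non-None when status is "connected".
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    err = sock.connect_ex(address)
    if err not in (0, errno.EINPROGRESS):
        sock.close()
        return "error", None
    # A socket becomes writable once the connect attempt has finished,
    # whether it succeeded or failed.
    _, writable, _ = select.select([], [sock], [], timeout)
    if not writable:
        # The kernel is still trying to connect. Closing the socket here
        # races against the connection attempt actually completing.
        sock.close()
        return "timeout", None
    err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err != 0:
        sock.close()
        return "error", None
    return "connected", sock


# Example use against a local listener on an ephemeral port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
status, conn = connect_with_timeout(listener.getsockname(), 1.0)
if conn is not None:
    conn.close()
listener.close()
```

Note that the "timeout" branch can only tell you that you stopped waiting. It says nothing about whether the remote end saw a completed connection in the moment before the close() took effect.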

The easiest thing to implement would probably be a middle version. If a timeout happens, control returns to your user level with a timeout indication, but the operation may be partially complete and it may be either abandoned in the middle of things or completed for you behind your back. This satisfies a desire to be able to bound the time you wait for system calls to complete, but it does leave you with a messy situation where you don't know either what has happened or what will happen when a timeout occurs. If your mkdir() times out, the directory may or may not exist when you look for it, and it may or may not come into existence later on.
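You can get a feel for this messy middle ground with a user-level analogy. In the following Python sketch, a worker thread stands in for the kernel carrying on with the operation after the caller has given up waiting. Everything here is an illustration, not a real timed mkdir(): slow_mkdir, mkdir_with_timeout, and the artificial delay are all invented.

```python
import concurrent.futures
import os
import tempfile
import time


def slow_mkdir(path, delay):
    """Pretend the operation has to wait on something slow before finishing."""
    time.sleep(delay)
    os.mkdir(path)


def mkdir_with_timeout(executor, path, timeout, delay):
    """Wait at most timeout seconds for the mkdir; it keeps running regardless."""
    future = executor.submit(slow_mkdir, path, delay)
    try:
        future.result(timeout=timeout)
        return "done", future
    except concurrent.futures.TimeoutError:
        # We have stopped waiting, but the operation has not been cancelled.
        return "timed out", future


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "newdir")
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as executor:
        status, future = mkdir_with_timeout(executor, target, timeout=0.05, delay=0.5)
        existed_at_timeout = os.path.exists(target)
        future.result()  # wait for the abandoned operation to finish anyway
        existed_later = os.path.isdir(target)
```

Here the caller is told about a timeout while the directory does not exist yet, and then the directory shows up later anyway, which is exactly the 'may or may not come into existence later on' situation.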

(Implementing timeouts in the kernel is difficult for the same reason that asynchronous IO is hard; there is a lot of kernel code that is much simpler if it's written in straight line form, where it doesn't have to worry about abandoning things part way through at essentially any point where it may have to wait for the outside world.)

unix/SystemCallTimeoutMeaning written at 01:03:40;


