2009-04-07
Handling ssh to generic hostnames
(This idea is not mine; it comes from R Francis Smith. It is just sufficiently nifty and wrong that I'm going to write it up for posterity.)
Suppose that you have a generic hostname, a hostname that either is
multiple machines (with multiple IP addresses) or a virtual host that
gets pointed to different physical machines from time to time. Further
suppose that inside your environment, your users ssh to that machine,
or at least want to. The traditional problem with this is that for
good reasons ssh's host key checking will start
screaming about mismatching host keys the moment that you wind up
talking to a different physical machine that has, of course, different
host keys.
So, the ingenious evil solution for this problem is to have a Host
stanza for the generic hostname in your /etc/ssh/ssh_config that
turns off the various ssh host key verification options, so that ssh
never even notices the mismatched host keys and thus never complains
about them. Yes, this is kind of unpleasant, but it is better than
the alternative (which is very close to not having useful generic
hostnames), and you can make it less risky by turning off password-based
authentication methods and other dangerous things.
This is a somewhat limited solution to the problem, since it only works within your systems. But that's probably the only place that you want it to work anyways.
(The simple evil solution to the problem is to give all of the physical hosts for the generic hostname the same host key. You probably don't want to do this.)
Sidebar: how to turn off ssh's host key checking
The options that you want are:
    StrictHostKeyChecking no
    UserKnownHostsFile /dev/null
With these set, ssh can do all of the host key checking it wants to but it's never going to get anywhere, and so never gets in the way.
(I will assume that the generic hostname is not in your global known hosts file, because there is no reason to put it there since it doesn't have a constant key.)
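Put together, the Host stanza might look something like the sketch below. The hostname generic.example.com is a made-up placeholder, and the two authentication lines are only my guess at one way to do the 'turn off password-based authentication' part; adjust them to whatever your environment actually uses.

```
# Sketch of an /etc/ssh/ssh_config stanza; generic.example.com is a
# placeholder for your generic hostname.
Host generic.example.com
    # Accept whatever host key the machine presents...
    StrictHostKeyChecking no
    # ...and never remember it, so there is nothing to mismatch later.
    UserKnownHostsFile /dev/null
    # Assumed hardening: don't offer passwords to a host we can't verify.
    PasswordAuthentication no
    KbdInteractiveAuthentication no
```

Because the stanza is keyed on the generic hostname, ssh to the machines' real hostnames still gets normal host key checking.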
The technical problems with 'sender stores messages' schemes
For some reason, people have an enduring fondness for new email schemes where the sender stores the message until the recipient wants to read it (the best known is D. J. Bernstein's Internet Mail 2000). Such schemes tend to handwave the social problems involved in a transition, but let's set that aside (along with why they won't stop spam) and talk about the practical technical problems, because from my perspective they are pretty bad.
If the sender stores the messages until they are retrieved by the recipient, both sides face a daunting series of problems:
- in a straightforward implementation, the user experience is going to
be terrible. Think of a version of IMAP where message retrieval has
random delays (more so than currently) and sometimes fails entirely.
- with SMTP, a sending machine can control its load when it is sending
out a lot of email; it only sends so many messages at once and so on.
With 'sender stores', the sending machine's load is controlled by all
of the receivers; send out a popular message that everyone wants to
read at once and, well, you have problems. And in general your load
will be much less predictable and controllable.
- the sender has a much harder time load-balancing their mail across
multiple machines; either you need sophisticated reverse proxying
load balancers, or you have to fix the server that specific recipients
will pull the message from at the time that you send the message.
Even this is not perfect load balancing, as a single receiver may
forward on their 'copy' of the message, bound to a specific server,
to more people than you expect or want.
- as a receiver, you lose spam filtering information; you get only
very minimal information until someone actually retrieves the
message. I think that this makes malicious content quite dangerous,
because the sending machine can generate the content on the fly
at the last moment, producing up-to-the-moment customized malware.
(And you really, really don't want the access protocol to carry any information about what client the receiver is using.)
- the sender gains significant information about the habits of the
receivers. At a minimum the sender learns when they read email;
at the worst, the sender can determine their network location
(and in the process 'see through' forwarding). Applications to
phish spam are left as an exercise, especially in an environment
where the sender can generate the message on the fly.
- in general, both receivers and senders expose a much greater attack surface to each other than they do today. The receiver is now talking directly to the sender in more or less real time (tunneling through your firewall in the process), and the sender is now running the rough equivalent of an IMAP or POP3 server that is necessarily exposed to the entire Internet.
Really, in a sender storage world the most sensible thing to do as a receiver is to have your mail server immediately retrieve a copy of everything that you're sent unless it fails basic checks. Those checks are basically equivalent to checking the SMTP envelope information that mail servers have available today. (How equivalent depends on how much additional information, if any, is sent in the initial notification message.)
Another way to put this is that if you want to do anything with a message beyond discarding it based on very minimal information, you must retrieve the message. Thus, at best, all that 'sender stores message' can do is defer the message's transfer, and in the process it makes a bunch of things worse and more complicated.