Wandering Thoughts archives

2016-02-27

Sometimes brute force is the answer, Samba edition

Like many places, we have a Samba server so that users with various sorts of laptops and desktop machines can get at their files. For good reason the actual storage does not live on the Samba server itself but instead on our NFS fileservers. For similarly good reasons, people don't have separate Samba credentials; they use their regular Unix login and password. However, behind the scenes Samba has a separate login and password system, so we are actually creating and maintaining two accounts for people: a Unix one, used for most things, and a Samba one, used for Samba. This means that when we create a Unix account, we must also create a corresponding Samba account, which is done by using 'smbpasswd -a -n' (the password will be set later).

For a long time we've had an erratic problem with this, in that occasionally the smbpasswd -a would fail. Not very often, but often enough to be irritating (since fixing it took noticing and then manual intervention). Our initial theory was that our /etc/passwd propagation system was not managing to update the Samba server's /etc/passwd with the new login by the time we ran smbpasswd. To deal with this we wrote a wrapper around smbpasswd that explicitly waited until the new login was visible in /etc/passwd and dumped out copious information if something (still) went wrong. Surely we had solved the problem, we figured.
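A minimal Python sketch of the waiting step in such a wrapper. This is an illustration, not the actual wrapper; the function name, timeout, and poll interval are all made up. It checks visibility with pwd.getpwnam, which uses the system's normal account lookup rather than reading /etc/passwd directly.

```python
import pwd
import time


def wait_for_login(login, timeout=60.0, interval=1.0, lookup=pwd.getpwnam):
    """Poll until `login` is visible to the account database or we time out.

    Returns True if the login showed up, False if we gave up.
    `lookup` is injectable only so the function can be tested without
    a real account; it must raise KeyError for an unknown login.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            lookup(login)
            return True
        except KeyError:
            # Not visible yet; fall through and maybe wait.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A wrapper would call wait_for_login(newuser) first and only run smbpasswd once it returns True.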

You can guess what happened next: no, we hadn't. Oh, it was clear that some of the problem was /etc/passwd propagation delays, because every so often we could see the wrapper script report that it had needed to wait. But sometimes smbpasswd still failed, reporting simply:

Unable to modify TDB passwd: NT_STATUS_UNSUCCESSFUL!
Failed to add entry for user <newuser>. 

We could have spent a lot of time trying to figure out what was going wrong in the depths of Samba and then how to avoid it, staring at logs, perhaps looking at strace output, maybe reading source, and so on and so forth. But we decided not to do that. Instead we decided to take a much simpler approach. We'd already noticed that every time this happened we could later run the 'smbpasswd -a -n <newuser>' command by hand, so we just updated our wrapper script so that if smbpasswd failed it would wait a second or two and try again.
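The retry logic is simple enough to sketch in a few lines of Python. Again this is only an illustration under assumptions: the function name, the three attempts, and the two second delay are invented, and the only thing taken from above is the idea of waiting briefly and running the same command again.

```python
import subprocess
import time


def run_with_retry(cmd, attempts=3, delay=2.0, runner=subprocess.run):
    """Run `cmd` until it exits 0 or we run out of attempts.

    Returns the exit status of the last attempt. `attempts` must be >= 1.
    `runner` is injectable for testing; it must return an object with
    a `returncode` attribute, as subprocess.run does.
    """
    status = None
    for attempt in range(1, attempts + 1):
        status = runner(cmd).returncode
        if status == 0:
            break
        if attempt < attempts:
            # Brute force: we don't know why it failed, so wait and retry.
            time.sleep(delay)
    return status
```

For the case here the call would be something like run_with_retry(["smbpasswd", "-a", "-n", newuser]), with a non-zero result still reported loudly so that a persistent failure gets noticed.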

This is a brute force solution, or really more like a brute force workaround. We haven't at all identified the cause or what we really need to do to fix it; we've simply identified a workaround that we can execute automatically without actually understanding the real problem. But it works (so far) and it did not involve us staring at Samba for a long time; instead we could immediately move on to productive work.

Sometimes brute force and pragmatics are the right answer, under the circumstances.

(It helps that account creation is a rare event for us.)

BruteForceSambaAccountCreation written at 00:27:37

2016-02-20

My two usage cases for Let's Encrypt certificates

As I mentioned yesterday, we unfortunately can't use Let's Encrypt certificates in production here. That doesn't mean I have no use for LE certificates, though; in fact I have two different uses for them.

My first usage case for LE certificates is as the first stop for temporary certificates for test machines at work. I not infrequently need to set up test versions of TLS-based services for various reasons, including testing configuration changes, operating system upgrades, and even whether or not I can make some random idea actually work. All of these cases need real, valid certificates because an ever increasing amount of software refuses to deal with self-signed certificates (at least in any reasonable way). Since it's very unlikely that I'll run a test server for anywhere close to 90 days, various sorts of LE certificate renewal issues are of little or no importance.

LE's rate limits mean that I may not be able to get a certificate from them when I want one (or renew an existing one if I'm about to recycle one of my generic virtual machines to test something else), but this is more than made up for by the fact that I can try to get a LE certificate in minutes with absolutely no bureaucracy. If it works, great, I can go on with my real work; if not, either I put this particular project on the back burner for a few days or I get us to buy a commercial certificate and forget about the issue for a year.

(And when I can get a LE certificate for a general host name, I'm good for the next 90 days no matter what I'm doing with the host. Even though it's a little bit ugly, there's usually nothing I'm testing that requires a specific host name, or at least nothing that can't be fixed by hand editing a few configuration files for testing purposes.)

My second usage case is as the regular TLS certificates for my personal site, which is basically the canonical Let's Encrypt situation. Here I'm unlikely to run into rate limits and since I'm the only person getting certificates, I can coordinate with myself if it ever comes up. I do care about certificate renewal working smoothly, but on the other hand there are few enough certificates involved that if something doesn't work I can do things by hand and, in an extreme case, even go back to my previous source for free TLS certificates. I'm also willing to run odd software in a custom configuration if it works for me, since I don't have to maintain things across a fleet of machines with co-workers; 'it works here for me' is good enough.

(And, while I care about my personal site, it is not 'production' in the way that work machines are. I can take risks with it that I wouldn't even dream of for work, or simply do things as experiments to see how they pan out. This is partly what Let's Encrypt is for me right now.)

These two usage cases wind up leaving me interested in different Let's Encrypt clients for each of them, but that's once again a subject for another entry.

LetsEncryptMyUsage written at 03:13:04

2016-02-19

We can't use Let's Encrypt on our production systems right now

I really like Let's Encrypt, the new free and automated non-profit TLS Certificate Authority. Free is hard to beat, especially around here, and automatically issued certificates that don't require tedious interaction with websites are handy. And in general I love people who're striking a blow against the traditional CA racket. Unfortunately, despite all of that, there's basically no prospect of us using LE certificates in production around here.

The problem is not any of the traditional ones you might think of. Browsers trust the LE certificates, and that LE only does basic 'Domain Validation' (DV) certificates is not an issue because those are what we use anyways. And I have no qualms about using a free CA; CAs are in a commodity business and LE is easier to deal with than the alternatives due to their automation. It's not even the short 90-day duration on their certificates (although that's a potential issue).

The problem for us is that Let's Encrypt (currently) has relatively low rate limits, and in particular it has a limit of five certificates per domain per week. Even if LE interpreted this very liberally (applying it to just our department's subdomain instead of the entire university), this is probably nowhere near enough for our usage. We have more than five different servers doing TLS ourselves, never mind all of the web servers run by research groups or even individual graduate students. This isn't just an issue of having to carefully schedule asking for certificates (and the resulting certificate renewals); it's also a massive coordination problem among all of the disparate people who could request certificates. As far as I can tell, using LE certificates in production here would mean giving a very large number of people the power to stop us from being able to renew (production) certificates. That's just not a risk we can take, especially since you have to renew LE certificates fairly often.
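To put rough numbers on this, here is a back-of-the-envelope sketch in Python. It assumes that every issuance, renewals included, counts against the five-per-week limit, and that each certificate is renewed every renew_days days. Both are assumptions for illustration, not a statement of LE's exact policy, and the function name is made up.

```python
def max_sustainable_certs(per_week=5, renew_days=60):
    """Upper bound on how many distinct certificates one domain can keep
    renewed indefinitely if each renewal uses one slot of a weekly limit.

    In renew_days days there are renew_days / 7 weeks of issuance slots,
    and every certificate needs exactly one slot per renewal cycle.
    """
    return per_week * renew_days // 7


print(max_sustainable_certs(renew_days=60))  # 42
print(max_sustainable_certs(renew_days=90))  # 64
```

Even renewing at the last possible moment, that is a few dozen certificates for the whole domain, and that only works if everyone requesting them coordinates perfectly.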

(Sure, we'd renew well ahead of time and if there were problems we could buy a commercial TLS certificate to replace the LE one. But if we're going to have problems very often we can save ourselves the heartburn and the fire drill by just buying commercial certificates in the first place. The university may not value staff time very highly in general but our time is still worth some actual money, and commercial certificates are cheap.)

I do feel sad about this, as I'd certainly like to be able to use LE certificates in production here (and I'd prefer to use them, especially with automatically handled renewal). But I suspect that a big university is always going to be a corner case that LE's rate limits simply won't deal with. If the university got seriously into 'TLS for all web sites', we're probably talking about at least thousands of separate servers.

(This doesn't mean that I have no use for LE certificates here. But that's another entry.)

Sidebar: my views on multiple names on the same certificate

TLS certificates can be issued with multiple names by using SANs, which means that you can theoretically cut down the number of distinct certificates you need by cramming a bunch of names on to one certificate. LE is especially generous with how many SANs you can attach to one certificate.

My personal dividing line is that I'm only willing to put multiple names into a TLS certificate when all of the names will be used on the same server. If I'm putting fifteen virtual host names into a certificate that will be used on a single web server, that's fine. If I'm jamming fifteen different web servers into one TLS certificate and so I'm going to have fifteen copies of it (and its key) on fifteen hosts, that's not fine. I should get separate certificates, so that the damage is more limited if one of those hosts gets compromised.

LetsEncryptNoProduction written at 01:49:30

2016-02-11

My current views on using OpenSSH with CA-based host and user authentication

Recent versions of OpenSSH have support for doing host and user authentication via a local CA. Instead of directly listing trusted public keys, you configure a CA and then trust anything signed by the CA. This is explained tersely, primarily in the ssh-keygen manpage, and at somewhat more length in articles like How to Harden SSH with Identities and Certificates (via a comment by Patrick here). As you might guess, I have some opinions on this.

I'm fine with using CA certs to authenticate hosts to users (especially if OpenSSH still saves the host key to your known_hosts, which I haven't tested), because the practical alternative is no initial authentication of hosts at all. Almost no one verifies the SSH keys of new hosts that they're connecting to, so signing host keys and then trusting the CA gives you extra security even in the face of the fundamental problem with the basic CA model.

I very much disagree with using CA certs to sign user keypairs and authenticate users system-wide because it has the weakness of the basic CA model, namely you lose the ability to know what you're trusting. What keys have access? Well, any signed by this CA cert with the right attributes. What are those? Well, you don't know for sure that you know all of them. This is completely different from explicit lists of keys, where you know exactly what you're trusting (although you may not know who has access to those keys).

Using CA certs to sign user keypairs is generally put forward as a solution to the problem of distributing and updating explicit lists of them. However this problem already has any number of solutions, for example using sshd's AuthorizedKeysCommand to query an LDAP directory (see eg this serverfault question). If you're worried about the LDAP server going down, there are workarounds for that. It's difficult for me to come up with an environment where some solution like this isn't feasible, and such solutions retain the advantage that you always have full control over what identities are trusted and you can reliably audit this.

(I would not use CA-signed host keys as part of host-based authentication with /etc/shosts.equiv. It suffers from exactly the same problem as CA-signed user keys; you can never be completely sure what you're trusting.)

Although it is not mentioned much or well documented, you can apparently set up a personal CA for authentication via a cert-authority line in your authorized_keys. I think that this is worse than simply having normal keys listed, but it is at least less damaging than doing it system-wide and you can make an argument that this enables useful security things like frequent key rollover, limited-scope keys, and safer use of keys on potentially exposed devices. If you're doing these, maybe the security improvements are worth being exposed to the CA key-issue risk.

(The idea is that you would keep your personal CA key more or less offline; periodically you would sign a new moderate-duration encrypted keypair and transport it to your online devices via eg a USB memory stick. Restricted-scope keys would be done with special -n arguments to ssh-keygen and then appropriate principals= requirements in your authorized_keys on the restricted systems. There are a bunch of tricks you could play here.)
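A rough sketch of what the signing step could look like, written as a small Python helper that only builds the ssh-keygen command line. The helper name, the file names, the 'backup' principal, and the four week validity are all hypothetical; the -s, -I, -n, and -V options are the ones documented in the ssh-keygen manpage.

```python
def signing_command(ca_key, pubkey, identity, principals, validity="+4w"):
    """Build an ssh-keygen command line that signs `pubkey` with `ca_key`.

    -s  CA private key to sign with
    -I  key identity (recorded in the certificate and logged by the server)
    -n  comma-separated principals the certificate is valid for
    -V  validity interval; "+4w" means from now until four weeks from now
    """
    return [
        "ssh-keygen",
        "-s", ca_key,
        "-I", identity,
        "-n", ",".join(principals),
        "-V", validity,
        pubkey,
    ]


# Hypothetical restricted-scope key that is only good as principal 'backup'.
print(" ".join(signing_command("personal_ca", "id_backup.pub",
                               "laptop-backup", ["backup"])))
```

On the restricted system, the matching authorized_keys entry would then begin with cert-authority,principals="backup" followed by the CA's public key.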

Sidebar: A CA restriction feature I wish OpenSSH had

It would make me happier with CA signing if you could set limits on the duration of (signed) keys that you'd accept. As it stands right now, it is only the CA's signing step (ssh-keygen run with the CA key) that sets any expiry on signed keys; if you can persuade the CA to sign with a key-validity period of ten years, well, you've got a key that's good for ten years unless it gets detected and revoked. It would be better if the consumer of the signed key could say 'I will only accept signatures with a maximum validity period of X weeks', 'I will only accept signatures with a start time after Y', and so on. All of these would act to somewhat limit the damage from a one-time CA key issue, whether or not you detected it.

SSHWithCAAuthenticationViews written at 01:07:55

2016-02-06

You can have many matching stanzas in your ssh_config

When I started writing my ssh_config, years and years ago, I basically assumed that the way you used it was to have a 'Host *' stanza that set defaults and then, for each host, perhaps a specific 'Host <somehost>' stanza (perhaps with some wildcards to group several hosts together). This is the world that looks like:

Host *
   StrictHostKeyChecking no
   ForwardX11 no
   Compression yes

Host github.com
   IdentityFile /u/cks/.ssh/ids/github

And so on (maybe with a generic identity in the default stanza).

What I have only belatedly and slowly come to understand is that stanzas in ssh_config do not have to be used in just this limited way. Any number of stanzas can match and apply settings, not just two of them, and you can exploit this to do interesting things in your ssh_config, including making up for a limitation in the pattern matching that Host supports.

As the ssh_config manpage says explicitly, the first version of an option encountered is the one that's used. Right away this means that you may want to have two 'Host *' stanzas, one at the start to set options that you never, ever want overridden, and one at the end with genuine defaults that other entries might want to override. Of course you can have more 'Host *' stanzas than this; for example, you could have a separate stanza for experimental settings (partly to keep them clearly separate, and partly to make them easy to disable by just changing the '*' to something that won't match).
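The 'first value wins' rule can be modelled in a few lines of Python. This is a toy model for illustration, not how ssh itself is implemented: the stanza contents are invented, option names are treated as case-sensitive, negated patterns are ignored, and fnmatchcase is only an approximation of ssh's pattern matching (it also accepts [...] character classes, which ssh_config patterns do not have).

```python
from fnmatch import fnmatchcase


def resolve(host, stanzas):
    """Return the effective options for `host`.

    `stanzas` is an ordered list of (patterns, options) pairs, in the
    order they appear in the file. For each option, the first value
    obtained wins and later values are ignored.
    """
    effective = {}
    for patterns, options in stanzas:
        if any(fnmatchcase(host, p) for p in patterns):
            for name, value in options.items():
                # setdefault keeps an earlier value and ignores later ones.
                effective.setdefault(name, value)
    return effective


stanzas = [
    # Leading 'Host *': settings that must never be overridden.
    (["*"], {"ForwardX11": "no"}),
    (["github.com"], {"IdentityFile": "~/.ssh/ids/github",
                      "ForwardX11": "yes"}),
    # Trailing 'Host *': genuine defaults that earlier stanzas may override.
    (["*"], {"IdentityFile": "~/.ssh/ids/generic", "Compression": "yes"}),
]

print(resolve("github.com", stanzas))
```

For github.com the result keeps ForwardX11 at 'no' from the leading stanza despite the later 'yes', takes the github identity from the specific stanza, and picks up Compression from the trailing defaults.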

Another use of multiple stanzas is to make up for an annoying limitation of the ssh_config pattern matching. Here's where I present the setup first and explain it later:

Host *.cs *.cs.toronto.edu
  [some options]

Host * !*.*
  [the same options]

Here what I really want is a single Host stanza that applies to 'a hostname with no dots or one in the following (sub)domains'. Unfortunately the current pattern language has no way of expressing this directly, so instead I've split it into two stanzas. I have to repeat the options I'm setting, but this is tolerable if I care enough.
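How the two Host lines divide up hostnames can be checked with a small Python model of Host matching as the ssh_config manpage describes it: a matching negated pattern rules the line out, and otherwise at least one ordinary pattern must match. This is a sketch, not ssh's own code, and 'apps0' is a made-up host name.

```python
import re


def pattern_to_regex(pattern):
    """Translate an ssh_config pattern ('*' and '?' wildcards) to a regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def host_line_matches(host, patterns):
    """True if `host` matches a Host line with these space-separated patterns.

    A matching negated pattern ('!...') rules the line out entirely;
    otherwise at least one ordinary pattern has to match.
    """
    matched = False
    for pattern in patterns.split():
        negated = pattern.startswith("!")
        if negated:
            pattern = pattern[1:]
        if pattern_to_regex(pattern).fullmatch(host):
            if negated:
                return False
            matched = True
    return matched


for name in ("apps0", "apps0.cs", "apps0.cs.toronto.edu", "github.com"):
    print(name,
          host_line_matches(name, "*.cs *.cs.toronto.edu"),
          host_line_matches(name, "* !*.*"))
```

Dotless names only match the second line, names under the two (sub)domains only match the first, and an outside name such as github.com matches neither, which is why the options have to be written out twice.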

(At this point one might suggest that CanonicalizeHostname could be the solution instead. For reasons beyond the scope of this entry I prefer for ssh to leave this to my system's resolver.)

There are undoubtedly other things one can do with multiple Host entries (or multiple Match entries) once you free yourself from the shackles of thinking of them only as default settings plus host specific settings. I know I have to go through my .ssh/config and the ssh_config manpage with an eye to what I can do here.

SSHConfigMultipleStanzas written at 01:19:39

