Wandering Thoughts archives

2016-01-31

The tradeoffs of having ssh-agent hold all of your SSH keys

Once you are following the initial good practices for handling SSH keys, you have a big decision to make: will you access all of your encrypted keys via ssh-agent, or will at least some of them be handled only by ssh? I don't think that this is a slam dunk decision, so I want to write down both sides of this (and then give my views at the end).

The first and biggest thing that might keep you from using ssh-agent for everything is if you need to use SSH agent forwarding. The problem with agent forwarding is twofold. First and obviously, it gives the remote host (where you are forwarding the agent to) the ability to authenticate with all of the keys in ssh-agent, protected only by whatever confirmation requirements you've put on them. This is a lot of access to give to a potentially compromised machine. The second is that it gives the remote host a direct path to your ssh-agent process itself, a path that an attacker may use to attack ssh-agent in order to exploit, say, a buffer overflow or some other memory error.

(Ssh-agent is not supposed to have such vulnerabilities, but then ssh itself wasn't supposed to have CVE-2016-0777.)

In general, there are two advantages of using ssh-agent for everything. The first is that ssh itself never holds unencrypted private keys (and you can arrange for ssh to have no access to even the encrypted form, cf). As we saw in CVE-2016-0777, ssh itself is directly exposed to potentially hostile input from the network, giving an attacker an opportunity to exploit any bugs it has. Ssh-agent is one step removed from this, giving you better security through more isolation.

The second is that ssh-agent makes it more convenient to use encrypted keys and therefore makes it more likely that you'll use them. Without ssh-agent, you must decrypt an encrypted key every time you use it, ie for every ssh and scp and rsync and so on. With ssh-agent, you decrypt it once and ssh-agent holds it until it expires (if ever). Some people are fine with constant password challenges, but others very much aren't (me included). Encrypted keys plus ssh-agent is clearly more secure than unencrypted keys.

The general drawback of putting all your keys into ssh-agent is that ssh-agent holds all of your keys. First, this makes it a basket with a lot of eggs; a compromise of ssh-agent might compromise all keys that are currently loaded, and they would be compromised in unencrypted form. You have to bet and hope that the ssh-agent basket is strong enough, and you might not want to bet all of your keys on that. The only mitigation here is to remove keys from ssh-agent on a regular basis and then reload them when you next need them, but this decreases the convenience of ssh-agent.
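
As a concrete sketch of that mitigation, the removal and reloading is just ssh-add (the key path here is a made-up example):

ssh-add -D                     # drop every loaded key
ssh-add -d ~/.ssh/ids/github   # or drop just one specific key
ssh-add ~/.ssh/ids/github      # reload it when you next need it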

The second drawback is that while ssh-agent holds your keys, anyone who can obtain access to it can authenticate things via any key it has (subject only to any confirmation requirements placed on a given key). Even if you don't forward an agent connection off your machine, you are probably running a surprisingly large amount of code on your machine. This is not just, eg, browser JavaScript but also anything that might be lurking in things like Makefiles and program build scripts and so on.

(I suppose any attacker code can also attempt to dump the memory of the ssh-agent process, since that's going to contain keys in decrypted form. But it might be easier for an attacker to get access to the ssh-agent socket and then talk to it.)

Similar to this, unless you have confirmations on key usage, you yourself can easily do potentially dangerous operations without any challenges. For example, if you have your Github key loaded, publishing something on Github is only a 'git push' away; there is no password challenge to give you a moment of pause. Put more broadly, you've given yourself all the capabilities of all of the keys you load into ssh-agent; they are there all the time, ready to go.

(You can mitigate this in various ways (cf), but you have to go out of your way to do so and it makes ssh-agent less convenient.)

My personal view is that you should hold heavily used keys in ssh-agent for convenience but that there are potentially good reasons for having less used or more dangerous keys kept outside of ssh-agent. For example, if you need your Github key only rarely, there is probably no reason to have it loaded into ssh-agent all the time and it may be easier to never load it, just use it directly via ssh. There is a slight increase in security exposure here, but it's mostly theoretical.

sysadmin/SSHAgentTradeoffs written at 02:37:28; Add Comment

2016-01-30

Some good practices for handling OpenSSH keypairs

It all started with Ryan Zezeski's question on Twitter:

Twitter friends: I want to better manage my SSH keys. E.g. different pairs for different things. Looking for good resources. Links please.

I have opinions on this (of course) but it turns out that I've never actually written them all down for various reasons, including that some of them feel obvious to me by now. So this is my shot at writing up what I see as good practices for OpenSSH keypairs. This is definitely not the absolutely best and most secure practice for various reasons, but I consider it a good starting point (but see my general cautions about this).

There are some basic and essentially universal things to start with. Use multiple SSH keypairs, with at least different keypairs for different services; there is absolutely no reason that your Github keypair should be the keypair that you use for logging in places, and often you should have different keypairs for logging in to different places. The fundamental mechanic for doing this is a .ssh/config with IdentityFile directives inside Host stanzas; here is a simple example.
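
A minimal sketch of what this can look like (the hostnames and key paths here are made-up examples, not a recommendation):

Host github.com
  IdentityFile ~/.ssh/ids/github

Host *.example.org
  IdentityFile ~/.ssh/ids/work-login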

(My personal preference is to have different keypairs for each separate machine I'll ssh from, but this could get hard to manage in a hurry if you need a lot of keypairs to start with. Consider doing this only for keypairs that give you broad access or access to relatively dangerous things.)

Encrypt all of your keys. Exactly what password(s) you should use is a tradeoff between security and convenience, but simply encrypting all keys stops or slows down many attacks. For instance, the recent OpenSSH issue would only have exposed (some of) your encrypted keys, which are hopefully relatively hard to crack.
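
If you have an existing unencrypted key, you don't need to regenerate it; ssh-keygen can add (or change) a passphrase in place. A sketch, with a hypothetical key path:

ssh-keygen -p -f ~/.ssh/ids/github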

Whenever possible, restrict where your keys are accepted from. This is a straightforward way to limit the damage of a key compromise at the cost of some potential inconvenience if you suddenly need to access systems from an abnormal (network) location. In addition, if you have some unencrypted keys because you need some automated or unattended scripts, consider restricting what these keys can do on the server by using a 'command=' setting in their .ssh/authorized_keys line; an example where we do this is here (see also this). You probably also want to set various no-* options, especially disabling port forwarding.
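
A sketch of what such a restricted .ssh/authorized_keys line can look like (the command and key material are made up; the option names are standard OpenSSH ones):

command="/usr/local/bin/run-backup",no-port-forwarding,no-agent-forwarding,no-X11-forwarding,no-pty ssh-ed25519 AAAA... backup@somehost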

At this point we're out of truly universal things, as the path splits depending on whether you will access all of your keys via ssh-agent or whether at least some of them will be handled only by ssh (with passphrase challenges every time you use them). There is no single right answer (and covering the issues needs another entry), but for now I'll assume that you'll access all keys via ssh-agent. In this case you'll definitely want to read the discussion of what identities get offered to a remote server and use IdentitiesOnly to limit this.

If you need to ssh to hosts that are only reachable via intermediate hosts, do not forward ssh-agent to the intermediate hosts. Instead, use ProxyCommand to reach through the intermediates. This is sometimes called SSH jump hosts and there are plenty of guides on how to do it. Note that modern versions of OpenSSH have a -W argument for ssh that makes this easy to set up (you no longer need things like netcat on the jumphost).

(There are some cases that need ssh agent forwarding, but plain 'I have to go through A to get to B' is not one of them.)
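
A sketch of the -W version of a jump host setup, with made-up host names:

Host inner.example.org
  ProxyCommand ssh -W %h:%p jumphost.example.org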

With lots of keys loaded, your ssh-agent is an extremely large basket of eggs. There are several things you can do here to reduce the potential damage of an attacker gaining access to its full authentication power, although all of them come with convenience tradeoffs:

  • Set infrequently used or dangerous keys so that you have to confirm each use, by loading them with ssh-add's -c 'have ssh-agent ask for confirmation' argument.

  • Treat some keys basically like temporary sudo privileges by loading them into ssh-agent with a timeout via ssh-add's -t argument. This will force you to reload the key periodically, much as you have to sudo and then re-sudo periodically.

  • Arrange to wipe all keys from ssh-agent when you suspend, screenlock, or otherwise clearly leave your machine; my setup for this is covered here.

    (This is good practice in general, but it becomes really important when access to ssh-agent is basically the keys to all the kingdoms.)

You'll probably want to script some of these things to make them more convenient; you might have an 'add-key HOST' command or the like that runs ssh-add on the right key with the right -c or -t parameters. Such scripts will make your life a lot easier and thus make you less likely to throw up your hands and add everything to ssh-agent in permanent, unrestricted form.
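
As an illustration, a minimal sketch of such an 'add-key' script (the key layout and the timeout here are made-up choices, not recommendations):

#!/bin/sh
# add-key HOST: load the key for HOST, requiring per-use confirmation
# and expiring it after a while. Assumes keys live in ~/.ssh/ids/ and
# are named after their host.
exec ssh-add -c -t 4h "$HOME/.ssh/ids/$1"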

(Also, check your ssh_config manpage to see if you have support for AddKeysToAgent. This can be used to create various sorts of convenient 'add to ssh-agent on first use' setups. This is not yet in any released version as far as I know but will probably be in OpenSSH 7.2.)
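
If your ssh does support it, the setup might be as simple as this sketch ('confirm' is one of the documented values, along with 'yes' and 'ask'):

Host *
  AddKeysToAgent confirm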

PS: You probably also want to set HashKnownHosts to yes. I feel conflicted about this, but it's hard to argue that it doesn't increase security and most people won't have my annoyances with it.

PPS: My personal views on SSH key types are that you should use ED25519 keys when possible and otherwise RSA keys (I use 4096 bits just because). Avoid DSA and ECDSA keys; the only time you should generate one is if you have to connect to a device that only supports DSA (and then the key should be specific to the device).

sysadmin/SSHKeyGoodPractices written at 01:54:16; Add Comment

2016-01-29

What SSH identities will be offered to a remote server and when

I've already written an entry on what SSH keys in your .ssh/config will be offered to servers, but it wasn't quite complete and I still managed to confuse myself about this recently. So today I'm going to try to write down in one place more or less everything I know about this.

Assuming that you're using ssh-agent and you don't have IdentitiesOnly set anywhere, the following is what keys will be offered to the remote server:

  1. All keys from ssh-agent, in the order they were loaded into ssh-agent.
  2. The key from a -i argument, if any.
  3. Any key(s) from matching Host or Match stanzas in .ssh/config, in the order they are listed (and matched) in the file. Yes, all keys from all matching stanzas; IdentityFile directives are cumulative, which can be a bit surprising.

    (If there are no IdentityFile matches in .ssh/config, OpenSSH will fall back to the default .ssh/id_* keys if they exist.)

(If you aren't using ssh-agent, only #2 and #3 are applicable and you can pretty much ignore the rest of this entry.)

If there is an 'IdentitiesOnly yes' directive in any matching .ssh/config stanza (whether it is in a 'Host *' or a specific 'Host <whatever>'), the only keys from ssh-agent that will be offered in #1 are the keys that would otherwise be offered in both #2 and #3. Unfortunately IdentitiesOnly doesn't change the order that keys are offered in; keys in ssh-agent are still offered first (in #1) and in the order they were loaded into ssh-agent, not the order that they would be offered in if ssh-agent wasn't running.

Where the 'IdentitiesOnly yes' directive comes from makes no difference, as you'd pretty much expect. The only difference between having it in eg 'Host *' versus only (some) specific 'Host <whatever>' entries is how many connections it applies to. This leads to an important observation:

The main effect of a universal IdentitiesOnly directive is to make it harmless to load a ton of keys into your ssh-agent.

OpenSSH servers have a relatively low limit on how many public keys they will let you offer to them; usually it's six or less (technically it's a limit on total authentication 'attempts', which can wind up including eg a regular password). Since OpenSSH normally offers all keys from your ssh-agent, loading too many keys into it can cause authentication problems (how many problems you have depends on how many places you can authenticate to with the first five or six keys loaded). Setting a universal 'IdentitiesOnly yes' means that you can safely load even host-specific keys into ssh-agent and still have everything usable.

(This is the sshd configuration directive MaxAuthTries.)
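
Concretely, the combination I'm describing is a universal IdentitiesOnly plus per-host IdentityFile directives, as in this sketch (the host and path are made-up examples):

Host *
  IdentitiesOnly yes

Host github.com
  IdentityFile ~/.ssh/ids/github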

Note that specifying -i does not help if you're offering too many keys through ssh-agent, because the ssh-agent keys are offered first. You must enable IdentitiesOnly as well, either in .ssh/config or as a command line option. Even this may not be a complete cure if your .ssh/config enables too many IdentityFile directives and those keys are loaded into ssh-agent so that they get offered first.

If the key for -i is loaded into your ssh-agent, OpenSSH will use the ssh-agent version for authentication. This will cause a confirmation check if the key was loaded with 'ssh-add -c' (and yes, this still happens even if the -i key is unencrypted).

(ssh-agent confirmation checks only happen when the key is about to be used to authenticate you, not when it is initially offered to the server.)

PS: you can see what keys you're going to be offering in what order with 'ssh -v -v ...'. Look for the 'debug2: key: ...' lines, and also 'debug1: Offering ...' lines. Note that keys with paths and marked 'explicit' may still come from ssh-agent; that explicit path just means that they're known through an IdentityFile directive.

Sidebar: the drawback of a universal IdentitiesOnly

The short version is 'agent forwarding from elsewhere'. Suppose that you are on machine A, with a ssh-agent collection of keys, and you log into machine B with agent forwarding (for whatever reason). If machine B is set up with a universal IdentitiesOnly, you will be totally unable to use any ssh-agent keys that machine B doesn't know about. This can sort of defeat the purpose of agent forwarding.

There is a potential partial way around this, which is that IdentityFile can be used without the private key file. Given a stanza:

Host *
  IdentitiesOnly yes
  IdentityFile /u/cks/.ssh/ids/key-ed2

If you have a key-ed2.pub file but no key-ed2 private key file, this key will still be offered to servers. If you have key-ed2 loaded into your ssh-agent through some alternate path, SSH can authenticate you to the remote server; otherwise ssh will offer the key, have it accepted by the server, and then discover that it can't authenticate with it because there is no private key. SSH will continue to try any remaining authentication methods, including more identities.

(This is the inverse of how SSH only needs you to decrypt private keys when it's about to use them.)

However, this causes SSH to offer the key all the time, using up some of your MaxAuthTries even in situations where the key is not usable. Unfortunately, as far as I can tell there is no way to tell SSH 'offer this key only if ssh-agent supports it', which is what we really want here.

sysadmin/SSHIdentitiesOffered written at 02:24:59; Add Comment

2016-01-28

Modern Django makes me repeat myself in the name of something

One of the things that basically all web frameworks do is URL routing, where they let you specify how various different URL patterns are handled by various different functions, classes, or whatever. Once you have URL routing, you inevitably wind up wanting reverse URL routing: given a handler function or some abstract name for it (and perhaps some parameters), the framework will generate the actual URL that refers to it. This avoids forcing you to hard-code URLs into both code (for eg HTTP redirections) and templates (for links and so on), which is bad (and annoying) for all sorts of reasons. As a good framework, Django of course has both powerful URL routing and powerful reverse URL generation.

Our Django web app is now about five years old and has not been substantially revised since it was initially written for Django 1.2. Back in Django 1.2, you set up URL routing and reversed routing something like this:

urlpatterns = patterns('',
   (r'^request/$', 'accounts.views.makerequest'),
   [...]

And in templates, you got reverse routing as:

<a href="{% url "accounts.views.makerequest" %}"> ... </a>

Here accounts.views.makerequest is the actual function that handles this particular view. This is close to the minimum amount of information that you have to give the framework, since the framework has to know the URL pattern and what function (or class or etc) handles it.

Starting in Django 1.8 or so, Django changed its mind about how URL reversing should work. The modern Django approach is to require that all of your URL patterns be specifically named, with no default. This means that you now write something like this:

urlpatterns = [
   url(r'^request/$', accounts.views.makerequest, name="makerequest"),
   [...]

And in templates and so on you now use the explicit name, possibly with various levels of namespaces.
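
For example, the template fragment from before now becomes something like this, using the explicit name:

<a href="{% url "makerequest" %}"> ... </a>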

Now, Django has a few decent reasons for wanting support for explicit reverse names on URL patterns; you can have different URL patterns map to the same handler function, for example, and you may care about which one your template reverses to. But in the process of supporting this it has thrown the baby out with the bathwater, because it has made there be no default value for the name. If you want to be able to do reverse URL mapping for a pattern, you must explicitly name it, coming up with a name and adding an extra parameter.

Our Django web app has 38 different URL patterns, and I believe all of them get reversed at some point or another. All of them have unique handler functions, which means that all of them have unique names that are perfectly good. But modern Django has no defaults, so now Django wants me to add a completely repetitive name= parameter to every one of them. In short, modern Django wants me to repeat myself.

(It also wants me to revise all of the templates and other code to use the new names. Thanks, Django.)

As you may guess, I am not too happy about this. I am in fact considering ugly hacks simply because I am that irritated.

(The obvious ugly hack is to make a frontend function for url() that generates the name by looking up the __name__ of the handler function.)
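
To be concrete, a minimal sketch of that hack might look like this (untested, and the wrapper name is made up):

from django.conf.urls import url

def nurl(regex, view, **kwargs):
    # Default the reverse-routing name to the view function's own name
    # so that I don't have to repeat it on every pattern.
    kwargs.setdefault('name', view.__name__)
    return url(regex, view, **kwargs)

# accounts.views is imported as before
urlpatterns = [
   nurl(r'^request/$', accounts.views.makerequest),
   [...]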

PS: Django 1.8 technically still supports the old approach, but it's now officially deprecated and will be removed in Django 1.10.

python/DjangoUrlReversingRepeatingMyself written at 00:12:58; Add Comment

2016-01-26

Why my home backup situation is currently a bit awkward

In this recent entry I mentioned that my home backup strategy is an awkward subject. Today I want to talk about why that is so, which has two or perhaps three sides: the annoyances of hardware, that disks are slow, and that software doesn't just do what I want, partly because I want contradictory things.

In theory, the way to good backups is straightforward. You buy an external disk drive enclosure and a disk for it, connect it to your machine periodically, and 'do a backup' (whatever that is). Ideally you will be disciplined about how frequently you do this. And indeed, relatively early on I set myself up to do this, except that back then I made a mistake; rather than get an external enclosure with both USB and eSATA, I got one with just USB because I had (on my machine at the time) no eSATA ports. To be more precise I got an enclosure with USB 2.0, because that's what was available at the time.

If you know USB 2.0 disk performance, you are now wincing. USB 2.0 disks are dog slow, at least on Linux (I believe I once got a benchmark result on the order of 15 MBytes/sec), and they also usually hammer the responsiveness of your machine into the ground. On top of that I didn't really trust the heat dissipation of the external drive case, which meant that I was nervous about leaving the drive powered on and running overnight or the like. So I didn't do too many backups to that external enclosure and drive. It was just too much of a pain for too long.

With my second external drive case and drive, I learned better (at least in theory); I bought a case with USB and eSATA. Unfortunately only USB 2.0, and then something in the combination of the eSATA port on my new machine and the case didn't work really reliably. I've been able to sort of work around that but the workaround doesn't make me really happy to have the drive connected, there's still a performance impact from backups, and the heat concerns haven't gone away.

(My replacement for the eSATA port is to patch a regular SATA port through the case. This works but makes me nervous and I think I've seen it have some side effects on the machine when the drive connects or disconnects. In general, eSATA is probably not the right technology here.)

This brings me to slow disks. I can't remember how fast my last backup run went, but between the overheads of actually making backups (in walking the filesystem and reading files and so on) and the overheads of writing them out, I'd be surprised if they ran faster than 50 MBytes/sec (and I suspect they went somewhat slower). At that rate, it takes an hour to back up only 175 GB. With current disks and hardware, backups of lots of data are just going to be multi-hour things, which does not encourage me to do them regularly at the best of times.

(Life would be different if I could happily leave the backups to run when I wasn't present, but I don't trust the heat dissipation of the external drive case that much, or for that matter the 'eSATA' connection. Right now I feel I have to actively watch the whole process.)

As I wrote up in more detail here, my ideal backup software would basically let me incrementally make full backups. Lacking something to do that, the low effort system I've wound up with for most things uses dump. Dump captures exact full backups of extN filesystems and can be compressed (and I can keep multiple copies), but it's not something you can do incrementally. Running dump against a filesystem is an all or nothing affair; either you let it run for as many hours as it winds up taking, or you abort it and get nothing. Using dump also requires manually managing the process, including keeping track of old filesystem backups and removing some of them to make space for new ones.

(Life would be somewhat different if my external backup disk was much larger than my system disk, but as it happens it isn't.)
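
For illustration, a manual dump run looks something like this sketch (the file name is made up; -0 is a full dump, -z asks for zlib compression, -f says where to write it):

dump -0 -z -f /backup/home-20160126.dump /home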

This is far from an ideal situation. In theory I could have regular, good backups; in practice there is enough friction in all of the various pieces that I have de facto bad ones, generally only made when something makes me alarmed. Since I'm a sysadmin and I preach the gospel of backups in general, this feels especially embarrassing (and awkward).

(I think I see what I want my situation to look like moving forwards, but this entry is long enough without trying to get into that.)

linux/HomeBackupHeadaches written at 23:18:37; Add Comment

Low level issues can have quite odd high level symptoms (again)

Let's start with my tweet from yesterday:

So the recently released Fedora 22 libreswan update appears to have broken IPSec tunnels for me. Good going. Debugging this will be hell.

This was quite consistent: if I installed the latest Fedora 22 update to libreswan, my IPSec based point to point tunnel stopped working. More specifically, my home end (running Fedora 22) could not do an IKE negotiation with my office machine. If I reverted back to the older libreswan version, everything worked. This is exactly the sort of thing that is hell to debug and hell to get as a bug report.

(Fedora's libreswan update jumped several point versions, from 3.13 to 3.16. There could be a lot of changes in there.)

Today I put some time into trying to narrow down the symptoms and what the new libreswan was doing. It was an odd problem, because tcpdump was claiming that the initial ISAKMP packets were going out from my home machine, but I didn't see them on my office machine or even on our exterior firewall. Given prior experiences I suspected that the new version of libreswan was setting up IPSec security associations that were blocking traffic and making tcpdump mislead me about whether packets were really getting out. But I couldn't see any sign of errant SPD entries and using tcpdump at the PPPoE level suggested very strongly that my ISAKMP packets really were being transmitted. But at the same time I could flip back and forth between libreswan versions, with one working and the other not. So in the end I did the obvious thing: I grabbed tcpdump output from a working session and a non-working session and started staring at them to see if anything looked different.

Reading the packet dumps, my eyes settled on this (non-working first, then working):

PPPoE [ses 0xdf7] IP (tos 0x0, ttl 64, id 10253, offset 0, flags [DF], proto UDP (17), length 1464)
   X.X.X.X.isakmp > Y.Y.Y.Y.isakmp: isakmp 2.0 msgid 00000000: parent_sa ikev2_init[I]:
   [...]

PPPoE [ses 0xdf7] IP (tos 0x0, ttl 64, id 32119, offset 0, flags [DF], proto UDP (17), length 1168)
   X.X.X.X.isakmp > Y.Y.Y.Y.isakmp: isakmp 2.0 msgid 00000000: parent_sa ikev2_init[I]:
   [...]

I noticed that the packet length was different. The working packet was significantly shorter and the non-working one was not too far from the 1492 byte MTU of the PPP link itself. A little light turned on in my head, and some quick tests with ping later I had my answer: my PPPoE PPP MTU was too high, and as a result something in the path between me and the outside world was dropping any too-big packets that my machine generated.

(It's probably the DSL modem and DSL hop, based on some tests with traceroute.)
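
The quick ping tests were along these lines (a reconstruction, not the exact commands); on Linux, '-M do' sets the DF bit and -s gives the ICMP payload size, so the full packet is the payload plus 28 bytes of ICMP and IP headers:

ping -c 3 -M do -s 1464 Y.Y.Y.Y   # a 1492-byte packet, the full PPP MTU
ping -c 3 -M do -s 1400 Y.Y.Y.Y   # smaller; keep shrinking until replies come back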

The reason things broke with the newer libreswan was that the newer version added several more cipher choices, which pushed the size of the initial ISAKMP packet over the actual working MTU. With the DF bit set in the UDP packet, there was basically no chance of the packet getting fragmented when it hit wherever the block was; instead it was just summarily dropped.

(I think I never saw issues with TCP connections because I'd long ago set a PPPoE option to clamp the MSS to 1412 bytes. So only UDP traffic would be affected, and of course I don't really do anything that generates large UDP packets. On the other hand, maybe this was a factor in an earlier mysterious network problem, which I eventually made go away by disabling SPDY in Firefox.)

What this illustrates for me, once again, is that I simply can't predict what the high level symptoms are going to be for a low level network problem. Or, more usefully, given a high level problem I can't even be sure if it's actually due to some low level network issue or if it has a high level cause of its own (like 'code changes between 3.13 and 3.16').

Sidebar: what happens with my office machine's ISAKMP packets

My office machine is running libreswan 3.16 too, so I immediately wondered if its initial ISAKMP packets were also getting dropped because of this (which would mean that my IPSec tunnel would only come up when my home machine initiated it). Looking into this revealed something weird: while my office machine is sending out large UDP ISAKMP packets with the DF bit set, something is stripping DF off and then fragmenting those UDP packets before they get to my home machine. Based on some experimentation, the largest inbound UDP packet I can receive un-fragmented is 1436 bytes. The DF bit gets stripped regardless of the packet size.

(I suspect that my ISP's DSL PPP touchdown points are doing this. It's an obvious thing to do, really. Interestingly, the 1436-byte size restriction is smaller than the outbound MTU I can use.)

sysadmin/IKEAndMTUIssue written at 01:16:22; Add Comment

2016-01-25

A Python wish: an easy, widely supported way to turn a path into a module

Years ago I wrote a grumpy entry about Django 1.4's restructured directory layout and mentioned that I was not reorganizing our Django web app to match. In the time since then, it has become completely obvious that grimly sticking to my guns here is not a viable answer over the long term; sooner or later, ideally sooner, I need to restructure the app into what is now the proper Django directory layout.

One of the reasons that I objected to this (and still do) is the problem of how you make a directory into a module; simply adding the parent directory to $PYTHONPATH has several limitations. Which is where my wish comes in.

What I wish for is a simple and widely supported way to say 'directory /some/thing/here/fred-1 is the module fred'. This should be supported on the Python command line, in things like mod_wsgi, and in Python code itself (so you could write code that programmatically added modules this way, similar to how you can extend sys.path in code). The named module is not imported immediately, but if later code does 'import fred' (or any of its variants) it will be loaded from /some/thing/here/fred-1 (assuming that there is no other fred module found earlier on the import path). All of the usual things work from there, such as importing submodules ('import fred.bob') and so on. The fred-1 directory would have a __init__.py and so on as normal.

(Note that it is not necessary for the final directory name to be the same as the module name. Here we have a directory being fred-1 but the module is fred.)

Given PEP 302, I think it's probably possible to implement this in Python code. However, both PEP 302 and the entire current Python import process make my head hurt so I'm not sure (and there are probably differences between Python 2 and Python 3 here).
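
Something in the general direction of this seems possible with Python 3's importlib, although this sketch imports eagerly instead of lazily and I haven't chased down the corner cases (or the Python 2 situation):

import importlib.util
import sys

def add_module_path(name, directory):
    # Make 'directory' importable as the package 'name' by building a
    # module spec that points at its __init__.py and then registering
    # the resulting module in sys.modules.
    spec = importlib.util.spec_from_file_location(
        name, directory + "/__init__.py",
        submodule_search_locations=[directory])
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    spec.loader.exec_module(module)
    return module

# add_module_path("fred", "/some/thing/here/fred-1")
# import fred.bob    # now resolves under /some/thing/here/fred-1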

(I wrote some notes to myself about Python packaging a while back, which is partly relevant to this quest. I don't think .egg and .zip files let me do what I want here, even if I was willing to pack things up in .zips, since I believe their filenames are bound to the package/module name.)

python/PathIntoModuleWish written at 01:38:42; Add Comment

2016-01-24

Hostile HTTPS interception on the modern web is now increasingly costly and risky

One of the things that HTTPS is vulnerable to is a state level actor armed with enough money that is willing (and able) to compromise a CA and get certificates issued for sites that they want to run a MITM attack against. This is nothing new; it is the core security problem of SSL on the web, namely that any of the hundreds of CAs that are trusted by your browser can generate a certificate for anyone.

Technically, that is still the case. All of the trusted CAs in the world can still issue certificates for, say, gmail.com, and quite a lot of browsers will trust those certificates. But not Chrome. If you are an attacker and you try this against a Chrome user, Chrome will try hard to scream bloody murder about this back to Google (and refuse to go ahead). Then pretty soon Google's security people will get to write another blog post and your nice compromised CA will be lost to you (one way or another).

Chrome has been doing this for a while (this is part of how Google has gotten to write a number of blog posts about this sort of thing), but it is not alone. On the modern web, there are a steadily increasing number of things that are more or less automatically looking for and reporting bogus certificates and an increasing number of ways to block many of them from being useful to attack your site. On the one hand, many of the better things are not included in web browsers by default; on the other hand, many of the people that a state level actor is likely to be most interested in MITM'ing are exactly the sort of people who may install things like HTTPS Everywhere and enable its reporting features.

Based on what I've read from the security circles that I follow, the net effect of all of these changes is that mounting anything but an extremely carefully targeted MITM attack is almost certain to cost you the compromised CA you were able to exploit. Each compromised CA you have is good for exactly one attack, if that.

(See for example the Twitter conversation linked to here.)

This doesn't make HTTPS interception impossible, of course. CAs can still be compromised. But it means that no one is going to do this for anything except very high priority needs, which in practice makes us all safer by reducing how often it happens.

An important contributing factor to the increased chanciness of HTTPS interception is that browsers are increasingly willing to say no. There was a time when you could MITM a significant number of people with a plain old bogus certificate (no CA compromise required, just generate it on the fly in your MITM box). Those days are mostly over, especially for some popular sites, and increasingly even a real certificate from a compromised CA may not work due to things like HPKP.

web/HTTPSInterceptionNowRisky written at 01:54:02; Add Comment

2016-01-22

Browsers are increasingly willing to say no to users over HTTPS issues

One of the quiet sea changes that underpins a significant increase in the security of the modern HTTPS web is that browsers are increasingly willing to say no to users. I happen to think that this is a big change, but it's one that didn't really strike me until recently.

There was a time when the fundamental imperative of browsers was that if the user insisted enough, they could go ahead with operations that the browser was pretty sure were a bad idea; attempts to change this back in the day were met with strong pushback. The inevitable result of those decisions was that attackers who wanted to MITM people's HTTPS connections to places like Facebook could often just present a self-signed certificate generated by their MITM interceptor system and have most people accept it. When attackers couldn't do that, they could often force downgrades to unencrypted HTTP (or just stop upgrades from an initial HTTP connection to a HTTPS one); again, these mostly got accepted. People wrote impassioned security advice that boiled down to 'please don't do that' and tweaked and overhauled security warning UIs, but all of it was ultimately somewhat futile because most users just didn't care. They wanted their Facebook, thanks, and they didn't really care (or even read) beyond that.

(There are any number of rational reasons for this, including the often very high rate of false positives in security alerts.)

Over the past few years that has changed. Yes, most of the changes are opt-in on the part of websites, using things like HSTS and HPKP, but the really big sea change is that browsers mostly do not let users override the website settings. Instead, browsers are now willing to hard-fail connections because of HSTS or HPKP settings even if this angers users because they can't get to Facebook or wherever. Yes, browsers have a defense in that the site told them to do this, but in the past I'm not sure this would have cut enough ice to be accepted by browser developers.

(In the process browsers are now willing to let sites commit HSTS or HPKP suicide, with very little chance to recover from eg key loss or inability to offer HTTPS for a while for some reason.)

Obviously related to this is the increasing willingness of browsers to refuse SSL ciphers and so on that are now considered too weak, again pretty much without user overrides. Given that browsers used to accept pretty much any old SSL crap in the name of backwards compatibility, this is itself a welcome change.

(Despite my past views, I think that browsers are making the right overall choice here even if it's probably going to cause me heartburn sooner or later. I previously threw other people under the HTTPS bus in the name of the greater good, so it's only fair that I get thrown under it too sooner or later, and it behooves me to take it with good grace.)

web/BrowsersAndStrictHTTPS written at 22:37:24; Add Comment

Memory-safe languages and reading very sensitive files

Here is an obvious question: does using modern memory safe languages like Go, Rust, and so on mean that the issues in what I learned from OpenSSH about reading very sensitive files are not all that important? After all, the fundamental problem in OpenSSH came from C's unsafe handling of memory; all of the things I learned just made it worse. As it happens, my view is that if you are writing security sensitive code you should still worry about these things in a memory safe language, because there are things that can still go wrong. So let's talk about them.

The big scary nightmare is a break in the safety of the runtime and thus the fundamental language guarantees, resulting in the language leaking (or overwriting) memory. Of course this is not supposed to happen, but language runtimes (and compilers) are written by people and so can have bugs. In fact we've had a whole series of runtime memory handling bugs in JIT'd language environments that caused serious security issues; there have been ones in JavaScript implementations, the JVM, and in the Flash interpreter. Modern compiled languages may be simpler than these JIT environments, but they have their own complexities where memory bugs may lurk; Go has a multi-threaded concurrent garbage collector, for example.

I'm not saying that there will be a runtime break in your favorite language. I'm just saying that it seems imprudent to base the entire security of your system on an assumption that there will never be one. Prudence suggests defense in depth, just in case.

The more likely version of the nightmare is a runtime break due to bugs in explicitly 'unsafe' code somewhere in your program's dependencies. Unsafe code explicitly can break language guarantees, and it can show up in any number of places and contexts. For example, the memory safety of calls into many C libraries depends on the programmer doing everything right (and on the C libraries themselves not having memory safety bugs). This doesn't need to happen in the code you're writing; instead it could be down in the dependency of a dependency, something that you may have no idea that you're using.

(A variant is that part of the standard library (or package set) either calls C or does something else marked as unsafe. Go code will call the C library to do some sorts of name resolution, for example.)

Finally, even if the runtime is perfectly memory safe it's possible for your code to accidentally leak data from valid objects. Take buffer handling, for example. High performance code often retains and recycles already allocated buffers rather than churning the runtime memory allocator with a stream of new buffer allocations, which opens you up to old buffer contents leaking because things were not completely reset when one was reused. Or maybe someone accidentally used a common, permanently allocated global temporary buffer somewhere, and with the right sequence of actions an attacker can scoop out sensitive data from it. There are all sorts of variants that are at least possible.
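
A contrived sketch of this kind of mistake (in Python purely for illustration; the same shape of bug shows up with recycled byte buffers in Go and company):

_pool = []

def get_buffer():
    return _pool.pop() if _pool else bytearray(64)

def put_buffer(buf):
    _pool.append(buf)          # bug: contents are not cleared on recycle

def respond(data):
    buf = get_buffer()
    buf[:len(data)] = data
    reply = bytes(buf)         # bug: the whole buffer goes out, not just
    put_buffer(buf)            # the len(data) bytes we actually filled
    return reply

respond(b"password=hunter2")
print(respond(b"ok"))          # the tail still holds most of the password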

The good news is that the scope of problems is narrower at this level. Since what you're leaking is the previous state of a specific valid object, an attacker needs the previous state of that sort of object to be something damaging. You don't get the kind of arbitrary crossovers of data that you do with full memory leaks. Still, leaks in a central type of object (such as 'generic byte buffers') could be damaging enough.

The bad news is that your memory safe language cannot save you from this sort of leak, because it's fundamentally an algorithmic mistake. Your code and the language are doing exactly what you told them to, and in a safe way; it is just that what you told them to do is a bad idea.

(Use of unsafe code, C libraries, various sorts of object recycling, and so on is not necessarily obvious, either. With the best of intentions, packages and code that you use may well hide all of this from you in the name of 'it's an implementation detail' (and they may change it over time for the same reason). They're not wrong, either, it's just that because of how you're using their code it's become a security sensitive implementation detail.)

programming/SafeReadingInSafeLanguages written at 02:24:08; Add Comment

