OCSP Stapling always faced a bunch of hard problems

August 1, 2024

One reaction to my entry about how the Online Certificate Status Protocol (OCSP) is basically dead is to ask why OCSP Stapling was abandoned along with OCSP and why it didn't catch on. The answer, which will not please people who liked OCSP Stapling, is that OCSP Stapling was always facing a bunch of hard problems and it's not really a surprise that it failed to overcome them.

If OCSP Stapling was to really deliver serious security improvements, it had to be mandatory in the long run. Otherwise someone who had a stolen and revoked certificate could just use the certificate without any stapling and have you fall back to trusting it. The standards provided a way to make stapling mandatory, in the form of the 'OCSP Must Staple' option (technically an X.509 certificate extension rather than part of OCSP itself) that you or your Certificate Authority could set in the signed TLS certificate. The original plan with OCSP Stapling was that it would just be an optimization to basic OCSP, but since basic OCSP turned out to be a bad idea and is now dead, OCSP Stapling must stand on its own. As a standalone thing, I believe that OCSP Stapling has to eventually require stapling, with CAs normally or always providing TLS certificates that set the 'must staple' option.
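To make the fallback problem concrete, here is a minimal sketch of the client-side decision, assuming a simplified model where a certificate either has the 'must staple' option or doesn't and the server either presents a valid stapled response or presents nothing. The names (StapleStatus, accept_certificate) are invented for illustration and don't correspond to any real TLS library's API.

```python
from enum import Enum
from typing import Optional


class StapleStatus(Enum):
    """Status carried by a (validly signed, unexpired) stapled OCSP response."""
    GOOD = "good"
    REVOKED = "revoked"


def accept_certificate(must_staple: bool, staple: Optional[StapleStatus]) -> bool:
    """Hypothetical client policy; `staple` is None when nothing was stapled."""
    if staple is StapleStatus.REVOKED:
        # A stapled 'revoked' answer is always a rejection.
        return False
    if staple is None:
        # No staple at all: only a 'must staple' certificate gets rejected.
        # Without that option the client falls back to trusting the certificate.
        return not must_staple
    return True


# An attacker with a stolen, revoked certificate simply omits the staple.
print(accept_certificate(must_staple=False, staple=None))  # accepted
print(accept_certificate(must_staple=True, staple=None))   # rejected
```

In this model, optional stapling never causes a revoked certificate to be rejected unless the party presenting it volunteers the bad news.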

Getting a web server to do OCSP Stapling requires both software changes and operational changes. The basic TLS software has to provide stapled OCSP responses, getting them from somewhere, and then there has to be something that fetches signed OCSP responses from the CA periodically and stores them so that the TLS software can use them. There are a lot of potential operational changes here, because your web server may go from a static frozen thing that does not need to contact things in the outside world or store local state to something that needs to do both. Alternately, maybe you need to build an external system to fetch OCSP responses and inject them into the static web server environment, in much the same way that you periodically have to inject new TLS certificates.
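As a rough sketch of the scheduling part of 'something that fetches signed OCSP responses periodically', the following assumes that each stored response carries a validity window (OCSP responses have thisUpdate and nextUpdate times) and that you refetch once some fraction of that window has passed. The function names, the 50% default, and the seven-day window in the example are all assumptions for illustration. The actual fetching, signature checking, and storage are left out.

```python
from datetime import datetime, timedelta, timezone


def refresh_due(this_update: datetime, next_update: datetime,
                now: datetime, fraction: float = 0.5) -> bool:
    """True once `fraction` of the response's validity window has elapsed."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    window = next_update - this_update
    return now >= this_update + window * fraction


def staple_usable(next_update: datetime, now: datetime) -> bool:
    """True while the stored response has not yet expired."""
    return now < next_update


# Example with an assumed seven-day validity window.
issued = datetime(2024, 8, 1, tzinfo=timezone.utc)
expires = issued + timedelta(days=7)
for day in (1, 4, 8):
    now = issued + timedelta(days=day)
    print(day, refresh_due(issued, expires, now), staple_usable(expires, now))
```

Refreshing well before expiry is what gives you room for several failed attempts, but it is also state and outbound network access that a previously static web server did not need.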

(You could also try to handle this at the level of TLS libraries, but things rapidly get challenging and many people will be unhappy if their TLS library starts creating background threads that call out to Certificate Authority sites.)

There's a lot of web server software out there, with its development moving at different speeds, and people then have to get around to deploying the new versions, which may literally take a decade or so. There are also a whole lot of people operating web servers, in a widely varied assortment of environments and with widely varied levels of both technical skill and available time to change how they operate. And in order to get people to do all of this work, you have to persuade them that it's worth it, which was not helped by early OCSP stapling software having various operational issues that could make enabling OCSP stapling worse than not doing so.

(Some of these environments are very challenging to operate in or change. For example, there are environments where what is doing TLS is an appliance which only offers you the ability to manually upload new TLS certificates, and which is completely isolated from the Internet by design. A typical example is server management processors and server BMC networks. Organizations with such environments were simply not going to accept TLS certificates that required a weekly, hands-on process (to load a new set of OCSP responses) or giving BMCs access to the Internet.)

All of this created a situation where OCSP Stapling never gathered a critical mass of adoption. Software for it was slow to appear and balky when it did appear, many people did not bother to set stapling up even when what they were using eventually supported it, and it was pretty clear to everyone that there was little benefit to setting up OCSP stapling (and it was dangerous if you told your CA to give you TLS certificates with OCSP Must Staple set).

Looking back, OCSP Stapling feels like something designed for an earlier Internet, one that was both rather smaller and much more agile about software and software deployment. In the (very) early Internet you really could roll out a change like this and have it work relatively well. But by the time OCSP Stapling was being specified, the Internet was not like that any more.

PS: As noted in the comments on my entry on OCSP's death, another problem with OCSP Stapling is that if used pervasively, it effectively requires CAs to create and sign a large number of mini-certificates on a roughly weekly basis, in the form of (signed) OCSP responses. These signed responses aren't on the critical latency path of web browser requests, but they do have to be reliable. The less reliable CAs are about generating them, the sooner web servers will try to renew them (for extra safety margin if it takes several attempts), adding more load.
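The load effect can be put into back-of-the-envelope arithmetic. This sketch assumes every certificate's web server refetches its OCSP response once a fixed fraction of the response's validity period has passed. The function name and all of the numbers in the example are made up for illustration and are not measurements of any real CA.

```python
def fetches_per_week(num_certificates: int, validity_days: float,
                     refresh_fraction: float) -> float:
    """Estimated OCSP fetches per week across all certificates.

    Each server is assumed to refetch after `refresh_fraction` of the
    response's validity period (0.5 means halfway through).
    """
    if validity_days <= 0 or not 0.0 < refresh_fraction <= 1.0:
        raise ValueError("need validity_days > 0 and 0 < refresh_fraction <= 1")
    interval_days = validity_days * refresh_fraction
    return num_certificates * 7.0 / interval_days


# Hypothetical: one million certificates with seven-day OCSP responses.
print(fetches_per_week(1_000_000, 7, 0.5))   # refetch halfway through
print(fetches_per_week(1_000_000, 7, 0.25))  # more cautious servers
```

Under these assumptions, halving the point at which servers start renewing doubles the number of fetches the CA has to answer.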


Comments on this page:

By r bridges at 2024-08-02 12:35:00:

Personally, I felt like the announcement by Let's Encrypt came out of nowhere. Till I read that, I thought it was "common knowledge" that certificate revocation lists (CRLs; the proposed replacement for OCSP, and literally designed for an earlier internet) were as dead as non-stapled OCSP, and must-staple OCSP was fine.

As I said in my comments on your previous post, most TLS-capable programs (curl, wget, Python, web servers that take client certificates) don't download CRLs, and I'm not aware of any ongoing work to make them do so. CRLs, even if "summarized" as some have proposed, don't solve most of the problems you've noted. They bring back the non-stapled-OCSP problem of what to do if they can't be fetched, and of libraries maybe "wanting" to fetch data in the background.

I've seen comments that Let's Encrypt plans to issue certificates with short lifetimes, like a week or two. If you don't like loading a new OCSP response every week, I doubt you'll like that. And the CAs will still be having to sign stuff on a roughly weekly basis, with the added cost of pushing them to certificate-transparency logs. (But I'm not aware of any officially announced plan; maybe this will be optional.)

The explanations by you and the CAs all make sense, but this feels like a non-solution. Kind of like the proposal to replace the leap second with a leap minute or hour: instead of minor software breakage every year or two, we'll get major breakage in a hundred or a few thousand years—but, realistically, it's a way to kill the idea while pretending to respect certain obligations.

By r bridges at 2024-08-03 18:04:07:

Since this was posted, I've been wondering whether a zk-SNARK could be used to prove something like "I've processed all relevant revocation data up to DATE, and serial number X did not appear." If so, it would provide basically all the client-side advantages of OCSP stapling, without the CA having to do any extra work.

It seems similar to the kind of guarantees wanted for zero-knowledge coin protocols. I don't see any obvious reason why it would require the certificate's private key, or any private key, which means that the work could be delegated—like to an external certificate-monitoring server that pushes the generated proof to each web server.

Written on 01 August 2024.

Last modified: Thu Aug 1 23:41:59 2024