A file permissions and general deployment annoyance with Certbot
The more we use Certbot, the more I become convinced that it isn't written by people who actually operate it in anything like the kind of environment that we do (and perhaps not at all, although I hope that the EFF uses it for their own web serving). I say this because while Certbot works, there are all sorts of little awkward bits around the edges in practical operation. Today's particular issue is a two-part one concerning file permissions on TLS certificates and keys (and it can turn into a general deployment issue).
Certbot stores all of your TLS certificate information under
/etc/letsencrypt/live, which is normally owned by root and is
root-only (Unix mode 0700). Well, actually, that's false, because
normally the contents of that directory hierarchy are only symlinks to
/etc/letsencrypt/archive, which is also owned by root and
root-only. This works fine for daemons that read TLS certificate
material as root, but not all daemons do; in particular, Exim reads
them as the Exim user and group.
The first issue is that Certbot adds an extra level of permissions
to TLS private keys. As covered by Certbot's documentation, from
Certbot version 0.29.0, private keys for certificates are specifically
root-only. This means that you can't give Exim access to the TLS
keys it needs just by chgrp'ing /etc/letsencrypt/live and
/etc/letsencrypt/archive to the Exim group and then making them
mode 0750; you must also specifically chgrp and chmod the private
key files. This can be automated with a deploy hook script, which
will be run when certificates are renewed.
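As an illustration, here is a minimal sketch of what such a deploy hook might look like in Python. It is not our actual script. It relies on the RENEWED_LINEAGE environment variable that Certbot sets for deploy hooks; the group name and the 0640 mode are assumptions that you would adjust for your own system.

```python
#!/usr/bin/env python3
"""Sketch of a Certbot deploy hook that opens a renewed private key to one group."""
import grp
import os

KEY_GROUP = "Debian-exim"  # assumption: whatever group your Exim runs as
KEY_MODE = 0o640           # assumption: owner read/write, group read, nothing for others


def grant_group_access(lineage_dir, gid, mode=KEY_MODE):
    """chgrp and chmod the real key file behind <lineage_dir>/privkey.pem."""
    # live/<name>/privkey.pem is a symlink into archive/, so resolve it first.
    key_path = os.path.realpath(os.path.join(lineage_dir, "privkey.pem"))
    os.chown(key_path, -1, gid)  # -1 leaves the file's owner unchanged
    os.chmod(key_path, mode)
    return key_path


# Only act when Certbot has told us which certificate was renewed.
if __name__ == "__main__" and "RENEWED_LINEAGE" in os.environ:
    exim_gid = grp.getgrnam(KEY_GROUP).gr_gid
    grant_group_access(os.environ["RENEWED_LINEAGE"], exim_gid)
```

Each renewal writes a new key file into the archive directory, which is why the hook resolves the symlink and fixes the file's permissions every time instead of doing it once.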
(Documentation for deploy hooks is hidden away in the discussion of renewing certificates.)
The second issue is that deploy hooks do exactly and only what they're documented to do, which means that deploy hooks do not run the first time you get a certificate. After all, the first time is not a renewal, and Certbot's documentation says specifically that deploy hooks run on renewal, not 'any time a certificate is issued'. This means that none of your deployment automation, including changing TLS private key permissions so that your daemons can access the keys, will happen when you get your initial certificate. You get to do it all by hand.
(You can't easily do it by running your deployment script by hand, because your deployment script is probably counting on various environment variables that Certbot sets.)
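If your script only depends on the two variables that Certbot documents for deploy hooks, RENEWED_LINEAGE and RENEWED_DOMAINS, one possible workaround is to fake that environment yourself. The following is a rough sketch under that assumption. The function names and the example hook path are hypothetical, and a script that expects anything else from Certbot would need more than this.

```python
import os
import subprocess


def hook_environment(cert_name, domains, live_root="/etc/letsencrypt/live"):
    """Return the current environment plus the variables Certbot sets for deploy hooks."""
    env = dict(os.environ)
    # RENEWED_LINEAGE is the live/ subdirectory for the certificate.
    env["RENEWED_LINEAGE"] = os.path.join(live_root, cert_name)
    # RENEWED_DOMAINS is a space-separated list of the certificate's names.
    env["RENEWED_DOMAINS"] = " ".join(domains)
    return env


def run_hook_by_hand(hook_path, cert_name, domains):
    """Run one deploy hook script with a faked Certbot environment."""
    env = hook_environment(cert_name, domains)
    return subprocess.run([hook_path], env=env, check=True)


# Hypothetical usage, run as root after the initial certificate is issued:
# run_hook_by_hand("/etc/letsencrypt/renewal-hooks/deploy/fix-key-perms",
#                  "example.org", ["example.org", "www.example.org"])
```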
We currently get out of this by doing the chgrp and chmod by hand when we get our initial TLS certificates; this adds an extra manual step to initial host setup and conversions to Certbot, which is annoying. If we had a more intricate deployment, I think we would have to force an immediate renewal after the TLS certificate had been issued, and to avoid potentially running into rate limits we might want to make our first TLS certificate a test certificate. Conveniently, there are already other reasons to do this.