## The information theory reason for assuming non-secret cryptography algorithms

June 8, 2022

One of the cornerstones of modern cryptography algorithms is Kerckhoffs's principle, "a cryptosystem should be secure even if everything about the system, except the key, is public knowledge". The reasons for having your cryptography algorithms be public knowledge are usually explained in pragmatic terms, and they're all very good reasons, but for a while it's been rattling around in my head that there's also a mathematical, information-theoretic basis for this.

In a sense, the fundamental problem in cryptosystems is distributing, managing, and correctly using secrets ('keys'). If we can assume perfect key distribution and key management (including use) for arbitrary sized keys, we can use one-time pads and have totally secure communications. All other cryptosystems exist because we can't make this assumption in practice; they're all attempts to get secure enough cryptography with more tractable key distribution, management, and use.
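(To make the one-time pad idea concrete, here's a minimal sketch in Python. The function name is mine, and using `secrets.token_bytes` locally stands in for the "perfect key distribution" assumption; a real OTP requires the key to be shared securely in advance and never reused.)

```python
import secrets

def otp_encrypt(plaintext: bytes, key: bytes) -> bytes:
    # A one-time pad XORs each message byte with a key byte.
    # The key must be truly random, as long as the message, used once.
    assert len(key) == len(plaintext)
    return bytes(p ^ k for p, k in zip(plaintext, key))

message = b"attack at dawn"
key = secrets.token_bytes(len(message))  # stands in for perfect key distribution
ciphertext = otp_encrypt(message, key)

# Decryption is the same XOR with the same key.
assert otp_encrypt(ciphertext, key) == message
```

The catch, as noted above, is that the key is as large as all the traffic you'll ever send, which is exactly the key distribution burden other cryptosystems try to avoid.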

As it happens, we know something about what helps and hurts in keeping secrets. The shorter the secret is (without being guessable in too short a time), the more random it is, and the fewer people who know it, the better. Ideally it will be known by only one person, as in asymmetric cryptography, or only two people, as in symmetric cryptography with keys created on the fly.

(Information theory may even have provided us with mathematical proofs of all of these, under appropriate assumptions.)

If a cryptosystem's algorithm is part of its 'secret key', all three of these things are violated. The algorithm or the code implementing it is definitely not random (it will have a lot of structure, making 'guessing' it easier), it will be comparatively large, and it must be distributed to a lot of people who 'know' it at least in the sense of having access to it (and some of them will literally know it).

We also know that such a large secret key is mathematically unnecessary (to the best of our current understanding of various cryptosystems). We have strongly secure cryptosystems that require only modest and more or less random keys that can be narrowly distributed. From an information theory perspective, using large, not so random, widely distributed keys (or parts of keys) is inefficient; at best you're using more (secret) bits for the same end result.
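(For a sense of scale, here's what a "modest and more or less random" key looks like in practice; the 32-byte size is my illustrative choice, typical of modern symmetric keys.)

```python
import secrets

# A modern symmetric key: 32 bytes (256 bits) of uniform randomness.
key = secrets.token_bytes(32)
print(len(key))  # 32

# By contrast, a 'secret algorithm' means keeping kilobytes of highly
# structured code secret, distributed to everyone who runs it.
```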

(This is one of the entries I write to get something out of my head because it's been living there for too long. I suspect this information theory take on cryptosystem secrecy is already well known in the field, partly because it feels so obvious once you start thinking about it.)

If a cryptosystem's algorithm is part of its 'secret key', all three of these things are violated. The algorithm or the code implementing it is definitely not random (it will have a lot of structure, making 'guessing' it easier), it will be comparatively large, and it must be distributed to a lot of people who 'know' it at least in the sense of having access to it (and some of them will literally know it).

You say "cryptosystem's algorithm", singular. There are, however, systems where the encryption and decryption algorithms are different. See NSA's JOSEKI algorithm, which seems to have been designed for protecting sensitive firmware: deleting the entire firmware may not be possible in case of device compromise, but wiping just the key could be easier/faster.

So getting at the (firmware) decryption algorithm wouldn't help you in figuring out the encryption algorithm.

JOSEKI seems to have first been specified around 1992, with an enhancement in 1998. Recent COMSEC product spec sheets and procurement requirements also now often mention something called "WATARI", which may be a new split-algorithm system.

By J. Perry at 2022-06-09 09:15:01:

Thank you for this. This explanation definitely seems more rigorous, yet it isn't any harder to understand, so it makes me wonder why this isn't the standard explanation given in cryptography textbooks.
There is just one sentence that's going over my head: "The shorter the secret is (without being guessable in too short a time), the more random it is." A longer secret can potentially have more entropy than a short one, right? I thought we preferred the shorter secrets simply because random bits are costly to generate.

By cks at 2022-06-09 14:43:27:

Broadly speaking, the shorter a secret is the easier and more efficient it is to (securely) distribute it and hold it without having even a partial leak. The ideal secret is no longer than it needs to be for the degree of security you want. It can't be too small because then it's too fast to guess (in the sense of generating suitable keys at random and trying them).
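(As a rough sketch of the "too fast to guess" arithmetic; the rate of a trillion guesses per second is an arbitrary assumption for illustration, not a claim about any real attacker.)

```python
# Rough brute-force arithmetic: time to exhaust an n-bit keyspace,
# assuming a hypothetical attacker testing 1e12 keys per second.
def years_to_search(bits: int, guesses_per_second: float = 1e12) -> float:
    seconds = (2 ** bits) / guesses_per_second
    return seconds / (365.25 * 24 * 3600)

print(years_to_search(40))   # a 40-bit keyspace falls in about a second
print(years_to_search(128))  # a 128-bit keyspace is far beyond reach
```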

(The degree of secrecy a given key length gives you depends on the cryptosystem and also on your assumptions about future partial breaks, quantum computing, and so on.)

J. Perry:

There is just one sentence that's going over my head: "The shorter the secret is (without being guessable in too short a time), the more random it is."

I think that you misread and that this is not a complete sentence. The full quote was this:

The shorter the secret is (without being guessable in too short a time), the more random it is, and the fewer people who know it, the better.

This can be read two ways. The way you quoted it makes me think you read it like this:

The shorter the secret is (without being guessable in too short a time), the more random it is.

And the fewer people who know it, the better.

But in the context of the preceding sentence I believe Chris intended it like this:

As it happens, we know something about what helps and hurts in keeping secrets:

1. the shorter the secret is (without being guessable in too short a time)
2. the more random it is
3. the fewer people who know it

… the better.

Chris: correct? J.: assuming I am – does this help?

By cks at 2022-06-13 11:56:33:

Belatedly: Aristotle Pagaltzis, your version is correct (ie, what I intended).
