The information theory reason for assuming non-secret cryptography algorithms
One of the cornerstones of modern cryptography algorithms is Kerckhoffs's principle, "a cryptosystem should be secure even if everything about the system, except the key, is public knowledge". The reasons for having your cryptography algorithms be public knowledge are usually explained in pragmatic terms (eg), and they're all very good reasons, but for a while it's been rattling around in my head that there's also a mathematical information theory basis to say this.
In a sense, the fundamental problem in cryptosystems is distributing, managing, and correctly using secrets ('keys'). If we can assume perfect key distribution and key management (including use) for arbitrary sized keys, we can use one-time pads and have totally secure communications. All other cryptosystems exist because we can't make this assumption in practice; they're all attempts to get secure enough cryptography with more tractable key distribution, management, and use.
As it happens, we know something about what helps and hurts in keeping secrets. The shorter the secret is (without being guessable in too short a time), the more random it is, and the fewer people who know it, the better. Ideally it will be known by only one person, as in asymmetric cryptography, or only two people, as in symmetric cryptography with keys created on the fly.
(Information theory may even have provided us with mathematical proofs of all of these, under appropriate assumptions.)
If a cryptosystem's algorithm is part of its 'secret key', all three of these things are violated. The algorithm or the code implementing it is definitely not random (it will have a lot of structure, making 'guessing' it easier), it will be comparatively large, and it must be distributed to a lot of people who 'know' it at least in the sense of having access to it (and some of them will literally know it).
We also know that such a large secret key is mathematically unnecessary (to the best of our current understanding of various cryptosystems). We have strongly secure cryptosystems that require only modest and more or less random keys that can be narrowly distributed. From an information theory perspective, using large, not so random, widely distributed keys (or parts of keys) is inefficient; at best you're using more (secret) bits for the same end result.
(This is one of the entries I write to get something out of my head because it's been living there for too long. I suspect this information theory take on cryptosystem secrecy is already well known in the field, partly because it feels so obvious once you start thinking about it.)