Bad patch management at Sun
Today, I tried to use ssh from my Solaris test machine to another machine for the first time in a while (I usually push stuff the other way):
$ scp pca-out.html linux:www/
dlopen(/usr/lib/gss/gl/mech_krb5.so): ld.so.1: ssh: fatal: /usr/lib/gss/gl/mech_krb5.so: open failed: No such file or directory
xmalloc: zero size
Uh oh, thought I. Had one of the patches I'd rolled on recently done something? There had been a Solaris SSH patch in the collection. So I nervously went off to one of the production machines:
$ ssh sol-serv
unable to initialize mechanism library [/usr/lib/gss/gl/mech_krb5.so]
Houston, we have a problem.
It turns out that this is a known issue, present even in recent Solaris 9 Recommended patch sets. According to one report, the culprit is patch 113273-11, which does not properly check whether various Kerberos bits are installed before adding things that depend on them. This may be why Sun played peculiar games with some krb5-related patches, as noted on the pca notes page. If so, it's a pity that Sun didn't say anything specific and direct about it.
The whole set of bugs irritates me, because it shows that Solaris's package and patch management infrastructure is dangerously weak. In a real package management system, you cannot accidentally drop dependencies like this. No Debian upgrade and no set of Red Hat packages would have wound up like this; at a minimum they would have aborted at update time, rather than damaging my systems.
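To make 'abort at update time' concrete, here is a minimal sketch of the kind of check I mean, in plain Bourne shell. This is purely illustrative: check_deps is a name I made up, it is not how Sun's patch tools or any real package manager work internally, and the echo stands in for whatever would actually apply the patch. The only real detail is the library path, which is the one from the error message above.

```shell
#!/bin/sh
# Hypothetical pre-install check (not part of any Sun tool): succeed
# only if every file the new bits depend on is already present.
# Usage: check_deps FILE...
check_deps() {
    missing=0
    for dep in "$@"; do
        if [ ! -e "$dep" ]; then
            echo "missing dependency: $dep" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Illustrative use. The path is taken from the ssh error message;
# 'echo' is a placeholder for the real patch-application step.
if check_deps /usr/lib/gss/gl/mech_krb5.so; then
    echo "dependencies present, applying"
else
    echo "aborting: system left untouched" >&2
fi
```

The point is the ordering: the check runs before anything is changed, so a missing dependency means a refused update instead of a broken ssh.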
I also have to fault Sun's testing. It's one thing to have this happen in a patch, another thing to have it happen in a recommended patch, and an entirely unpleasant thing to have the patch make it into a recommended patch cluster. It's not like our Solaris configurations are exotic, either, regardless of what Sun thinks. (Not fixing it for more than a month doesn't help, either.)
Oh well. Time to become rather more familiar with how to load packages into Solaris systems, and what the effects of adding Kerberos packages to an existing system are.
Update, June 10th 2006: see FixingSolarisSsh for what I've figured out about how to work around this problem. (I wound up not adding the Kerberos packages, partly because it turns out not to help all that much.)