Wandering Thoughts archives

2005-12-31

Notes on getting a Solaris hardware inventory

Being able to find out what hardware is in a random machine is one of those things you don't think about very much until you inherit responsibility for a bunch of machines that you didn't build yourself.

The best hardware inventory program I've used is SGI's hinv (although it doesn't have enough disk information). Linux has decent hardware inventory support, but not bundled into a single command; you have to look through a bunch of /proc files and know a few commands like lspci. Unfortunately, Solaris is less friendly.

The old-fashioned way to get hardware information is to look at the kernel's boot messages; on Solaris this is in syslog or via dmesg. However, these logs get aged away if the system has been up for a while. (I've been known to arrange for kernel syslog messages to never expire, but I haven't set that up on my Solaris systems yet.)

The best program seems to be prtdiag, which gives CPU, memory, and some hardware slot information (and works for non-root users, always a bonus). There's also prtconf and a number of others, but they don't seem to give much additional useful information about hardware.
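As a rough sketch of pulling these sources together, the function below saves the output of each command mentioned above into one directory, skipping any that are missing. The name collect_inventory and the output file names are made up for illustration. The extra PATH entries reflect my understanding that prtdiag on Solaris 9 usually lives under /usr/platform/`uname -i`/sbin; treat that location as an assumption to check on your own machines.

```sh
# Hypothetical helper: save the output of each inventory command found on
# this machine into its own file under the given directory.
# The body runs in a subshell so the PATH change does not leak to the caller.
collect_inventory() (
    outdir=$1
    mkdir -p "$outdir" || exit 1
    # Assumption: prtdiag is platform-specific and may not be on the default PATH.
    plat=$(uname -i 2>/dev/null) || plat=unknown
    PATH=$PATH:/usr/platform/$plat/sbin:/usr/sbin
    for cmd in prtdiag prtconf dmesg; do
        if command -v "$cmd" >/dev/null 2>&1; then
            # Keep going even if one command fails or needs more privileges.
            "$cmd" >"$outdir/$cmd.txt" 2>&1 || true
        else
            echo "$cmd: not found" >"$outdir/$cmd.txt"
        fi
    done
)
```

Running something like `collect_inventory /var/tmp/inventory` soon after a boot also captures the kernel's boot messages before they age out of the logs.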

The names of things in /devices carry some information, but I suspect you need a good familiarity with Solaris device driver names to get the most out of them. (Solaris /proc is for processes only, so there is nothing like Linux's collection of informative files.)

(People seem to use Magnicomp's sysinfo a fair bit, but it's commercial software (with a 30 day free trial), and binary packages on systems without real package managers make me twitchy. And its installer has glitches that don't inspire confidence.)

SolarisHinvNotes written at 18:11:12

2005-12-09

A surprising hazard of running as root all the time

We have some machines that are 'no user-operable parts inside' setups; as part of that, they have no user logins, just root. (Yes, yes, running as root all the time is bad, but on these boxes almost all we'd ever do with a plain login is su to root.)

I'm attuned to all of the regular hazards of this, but today I stumbled over a new one: how long it takes to notice that /var had accidentally wound up mode 0750 (and owned by a group that had hardly anyone in it) on a Solaris 2.4 machine.

Of course, root doesn't get permission denied messages, and most of the obvious things were running as root and kept on working. About the only sign was a large collection of files called things like 'mailAAAa00087' scribbled in /var/tmp. It turned out that these files were complaints from cron about being unable to run lp cron jobs because it couldn't change to lp's home directory, and bounce messages talking about 'lp... Can't create output'.

So I looked at lp's home directory, /usr/spool/lp, which looked perfectly fine; I could even cd into it as root. Only when I did 'su lp' and tried it did I get a 'permission denied' error and start backtracking to discover the /var permissions problem.
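A quicker way to spot this kind of problem is to list the permissions of every directory on the way to the target, not just the target itself. The sketch below does that after resolving symlinks, which matters here on the assumption that /usr/spool is a symlink into /var, as it usually is on Solaris. The name show_path_perms is hypothetical, and it relies on the POSIX `cd -P` and `pwd -P`, which an old /bin/sh may not have.

```sh
# Hypothetical helper: print 'ls -ld' for a directory and each of its parents,
# after resolving symlinks, so a restrictive ancestor stands out.
show_path_perms() {
    # The cd happens inside a command substitution, so the caller's working
    # directory is left alone.
    dir=$(cd -P "$1" 2>/dev/null && pwd -P) || return 1
    while :; do
        ls -ld "$dir"
        [ "$dir" = "/" ] && break
        dir=$(dirname "$dir")
    done
}
```

Run as root, `show_path_perms /usr/spool/lp` should list lp's directory and every ancestor up to /, which is where a mode 0750 /var would show up.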

Sidebar: so how did it happen?

What I think happened is that someone built a tar file of a /var/named directory they wanted to move around, but instead of tarring up the directory, they cd'd into the directory and tarred up '.'. Then they moved it to this machine and accidentally untarred it in /var instead of making a /var/named directory and untarring it there. As part of unpacking, tar dutifully set the permissions on all of the files and directories in the tarball, including '.'.

So the moral is: tarfiles that include . are annoying and dangerous in more than one way.
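The effect is easy to reproduce safely in a scratch directory. This is only a sketch: demo_dot_tar and the file names are invented, and the behaviour shown is that of tar implementations that reset the metadata of existing directories on extraction (GNU tar does this by default, as far as I know). As an ordinary user the restored mode may also be filtered through your umask.

```sh
# Reproduce the hazard in a scratch area: an archive of '.' carries the mode
# of the directory it was made in, and extraction applies it to the target.
demo_dot_tar() {
    scratch=$(mktemp -d) || return 1
    mkdir "$scratch/named" "$scratch/var"
    chmod 750 "$scratch/named"      # stands in for the original /var/named
    chmod 755 "$scratch/var"        # stands in for /var on the victim machine
    touch "$scratch/named/db.example"
    # Archive '.' from inside the directory, then unpack it in the wrong place.
    (cd "$scratch/named" && tar cf "$scratch/named.tar" .)
    (cd "$scratch/var" && tar xf "$scratch/named.tar")
    # Print the mode string of the target directory after extraction.
    ls -ld "$scratch/var" | cut -c1-10
    rm -rf "$scratch"
}
```

Listing an unfamiliar archive with `tar tf` before unpacking, and looking for a bare '.' or './' entry, is a cheap way to avoid the surprise.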

AHazardOfRoot written at 22:50:20

2005-11-24

Solaris 9's slow patch installs

Yesterday was my first time installing the Solaris 9 recommended patch set on a production machine; we rolled it onto a basically unpatched server. Because it was a server, I did it in single-user mode (the patch set recommends this, and some of its patches explicitly say to apply them in single-user mode).

I already knew that installing the patch set was achingly slow on my test machine, but my test machine is an Ultra 10, so I wasn't surprised. The machine from yesterday was a Sun Fire V210, which has modern CPUs and, more importantly, modern amounts of memory and fast SCSI disks.

It still took an hour.

There are 134 patches in the patch set, so Solaris was only able to average a patch every 26 seconds. Considering how much work a modern machine can do in 26 seconds, I believe I can safely say that the Solaris patch install system is hideously inefficient.
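The arithmetic behind that figure, as a small sketch. It assumes the run took exactly 60 minutes (3600 seconds), and seconds_per_patch is just an illustrative name.

```sh
# Average seconds per patch, given total elapsed seconds and a patch count.
# Shell arithmetic is integer-only, so the result is rounded down.
seconds_per_patch() {
    echo $(( $1 / $2 ))
}

seconds_per_patch 3600 134    # prints 26
```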

(And, as previously noted, it spews incomprehensible and alarming messages on the screen.)

Fortunately it doesn't demand I answer any questions during its run, so next time around I'll know to just go back to my office for a while. Still, an hour is an irritatingly long time to have a production server down in single-user mode.

SlowPatchInstalls written at 22:54:58

2005-11-19

Solaris 9 sendmail irritations

Here's how to give a system administrator a heart attack: the default Solaris 9 sendmail configuration apparently allows other machines that your Solaris machine thinks are in your local domain to relay through you. I say 'apparently' because there's nothing in the sendmail.mc about this, and nothing clear in the generated /etc/mail/sendmail.cf either.

In other fun discoveries, the default sendmail configuration is also set up to relay all your mail through a machine called 'mailhost' in your domain. We don't have such a machine in our subdomain here, so god knows where any administrative mail my test machine tried to send over the past month or so has wound up.

Solaris 9 was shipped in 2002, and Sun actually started to care about security by that point; for example, it ships with tcpwrappers. In 2002, I would have thought that Sun would know that any open relaying is a bad idea.

In fact it turns out that Solaris sendmail's default configuration has other dubious features, even for 2002: for example, it will happily accept MAIL FROM addresses without domains or with unresolvable domains. None of this is set visibly and explicitly in their supplied .mc files; it is hiding away in the 'solaris-generic' set of settings that those use.

The light at the end of the tunnel is that Solaris 9 actually includes another set of settings, 'solaris-antispam'; changing from 'solaris-generic' to these will give you much stronger settings. (These are in fact the default Sendmail settings, so Solaris deliberately shipped with a less secure, more open to spam and abuse sendmail configuration.)
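A minimal sketch of making that switch, assuming your .mc file selects its settings with a line that starts with DOMAIN( and names solaris-generic (check your own file first). The function name use_antispam_domain is hypothetical. It only prints the edited text, so the original file stays untouched until you decide to replace it.

```sh
# Hypothetical helper: print a copy of an .mc file with the domain settings
# switched from solaris-generic to solaris-antispam. The input file itself
# is not modified.
use_antispam_domain() {
    # Only touch lines that begin with DOMAIN( so comments and other
    # mentions of the name are left alone.
    sed '/^DOMAIN(/s/solaris-generic/solaris-antispam/' "$1"
}
```

You would still need to regenerate sendmail.cf from the edited .mc file and restart sendmail afterwards; the exact steps for that are not covered here.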

SolarisRelayingSendmail written at 00:34:18

2005-10-08

Solaris 9 'Power management'

I had another Solaris 9 learning experience today: I came in to find my ssh sessions to my Ultra 10 test machine dead, because the machine was powered off. This was more than a little bit disconcerting, since the last thing I had left it doing was installing the current Solaris 9 patch set. (It took sufficiently long that I'd had to go home before it finished.)

Powering the machine on showed not a normal boot sequence, but a message about restoring the system. This caused me to remember that when I had installed the system, I'd said yes to an offer to have 'power management' software installed. (Unfortunately the installer does not have very many clear explanations of what the software packages all do.)

In the PC world I usually operate in, 'power management' is things like spinning down disks and dropping into low-power CPU states when the machine is idle. In the SPARC world, it turns out that 'power management' is turning the machine off entirely.

Fortunately I was able to find the dtpower program after some quick Googling. Unfortunately dtpower doesn't run over a ssh X connection for some reason, so I had to fire up dtlogin, log in, and run it to shut this feature off. (There is probably a way to fire up the X server and the environment from a console login, but starting dtlogin was faster than trying to figure it out.)

(This whole episode is my fault, not Solaris 9's. I should have read the documentation before firing up the installer, and certainly before answering installer questions I didn't fully understand. But at least I've stubbed my toe on this now, in case I ever have to deal with SPARCs that mysteriously power themselves off every so often.)

SolarisPowerManagement written at 03:59:40

2005-10-06

First irritations with Solaris 9

As with Fedora Core 4, I haven't been using Solaris 9 long enough to have given it a fair shake. So instead of any sort of review, this is just a collection of things that have irritated me about it on first exposure.

I'll start with a nice simple one:

# useradd -m -c 'Chris Siebenmann' cks
UX: useradd: ERROR: Unable to create the home directory: Operation not applicable.

This is on a default configured Solaris 9 machine, straight out of the 'take more or less the defaults' install. Is it too much to ask that the apparent best way to add users from the command line actually works?

(The reason this fails is that /home, the location of nominal user home directories, is actually an automounter setup. But useradd doesn't know about this. Whoops. For extra bonus fun, you actually have to make an entry in the /etc/auto_home automounter map to get things to work.)
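One common way around this, sketched below, is to create the real directory somewhere outside /home and then point the automounter at it. The /export/home location, the localhost default, and the function name plan_useradd are all assumptions for illustration. The function only prints the commands so they can be reviewed first; it runs nothing.

```sh
# Hypothetical helper: print the steps for adding a user whose real home
# directory lives under /export/home (an assumed location) while /home stays
# under automounter control. Nothing is executed.
plan_useradd() {
    user=$1
    fullname=$2
    host=${3:-localhost}
    # Create the account with a home directory the automounter does not manage.
    echo "useradd -d /export/home/$user -m -c '$fullname' $user"
    # Map /home/<user> to the real directory in the auto_home map.
    echo "echo '$user $host:/export/home/$user' >> /etc/auto_home"
    # Point the account's home directory at the automounted path.
    echo "usermod -d /home/$user $user"
}
```

For example, `plan_useradd cks 'Chris Siebenmann'` prints three lines, the second of which appends `cks localhost:/export/home/cks` to /etc/auto_home.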

Installer irritations:

  • the installer asked me to reinsert a CD-ROM it had already asked for (Solaris 9 Software disk 2, after the documentation). This is just sloppy; you should be able to order your entire install series so it asks for each CD-ROM only once.

  • practically every time it had me swap CD-ROMs, it stopped to prompt me if I really wanted things from the CD-ROM installed. This was despite walking me through an entire earlier 'what stuff do you want installed' step that led it to wanting those CD-ROMs.

  • periodically it would pop up a dialog about continuing in 30 seconds if I did nothing, or I could continue right away, or I could pause. The first time I rolled my eyes and clicked 'Continue'. The next time I realized that this dialog was obscuring a lower dialog with informative options that I might wish to inspect and perhaps change.

  • having previously wanted the sort of interaction normally seen in needy young children, the installer decided to automatically reboot at the end.
    Update: mea culpa; this one is my fault. Right near the start, the Solaris 9 installer asks you if you want to automatically reboot at the end. (Then you are asked sixty zillion other questions so you forget this.)

I'd criticize the installer for not looking very pretty, but it was running on an 8 bit deep framebuffer. (Probably not a very fast one, either. Ultra 10s are not where you go if you want even 1998-era PC graphics basics, like 32-bit colour.)

Then there's the small issue of patch installer error messages, which are lovely things like:

Installation of 117067-01 failed. Return code 2.
[...]
Installation of 112233-12 failed. Return code 8.

Neither is a helpful error message. Does either of them mean that the patch isn't applicable to this system? Does either of them mean that something important has gone wrong during patch installation?

(It appears that return code 2 means 'update already installed' and return code 8 means 'this update isn't applicable to your system'. But to find this out I had to read the detailed error log. It would not have killed Sun to print an actually useful error message instead of 'Return code N'.)
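Until the messages improve, a small filter can add the meanings as I've inferred them above. This is a sketch built only on those two observations: other return codes pass through without explanation, and explain_patch_failures is an invented name.

```sh
# Annotate 'Installation of <patch> failed. Return code <n>.' lines with the
# apparent meanings of codes 2 and 8; every other line passes through as-is.
explain_patch_failures() {
    while IFS= read -r line; do
        case $line in
            *"Return code 2.") echo "$line (already installed)" ;;
            *"Return code 8.") echo "$line (not applicable to this system)" ;;
            *) echo "$line" ;;
        esac
    done
}
```

It reads standard input, so you can pipe the patch installer's output through it or feed it a saved copy of the messages.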

Solaris9FirstIrritations written at 00:16:55
