Wandering Thoughts archives

2008-03-20

Why you should ratelimit messages that outside things can cause

Modern versions of NFS have a variety of authentication methods, and so one of the errors that an NFS server can give a client is 'your authentication method is too weak'; for example, the client could be sending plain old Unix UIDs and GIDs to a server that requires Kerberos for strong distributed filesystem authentication. When this happens, the Linux kernel helpfully prints an error message about it:

call_verify: server somehost.cs requires stronger authentication.

In fact, it prints this message every time it gets an RPC reply with this error. (Some of you are wincing already.)

Our current NFS servers are creaky old Solaris 8 machines. One part of that creakiness is that every so often the kernel loses its mind and decides that some or all clients aren't using a strong enough authentication method to talk to some or all filesystems. When this happens, all NFS IO from the affected clients to the affected filesystems suddenly gets 'authentication too weak' errors.

If we are unlucky, this IO is being done by something active that doesn't notice IO errors. When this happens the machine is basically dead, almost entirely consumed by dumping this message to the console over and over again as fast as the console can print, and it is time for a magic SysRq reboot because nothing else works.

(We've lost more than one major server to this. It's not fun.)

I expect that with a properly behaving NFS server, you'd get this error once at mount time and the mount would fail. But as my example illustrates, you can't count on the outside world to work properly all the time, and that is exactly why you should rate-limit error messages that can be produced by the outside world.

Note that this doesn't just apply to the kernel, and it applies even if you are dumping messages to syslog. While syslogd will do rate-limiting of a sort, you and it will burn a bunch of CPU in the process.
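To illustrate the idea (this is not the kernel's actual mechanism, just the same pattern as its printk_ratelimit(): allow a burst of messages per time window and count the rest), here is a userspace sketch. All the names and limits here are invented for illustration:

```shell
#!/bin/sh
# Sketch of message rate-limiting: print at most RL_BURST messages per
# RL_INTERVAL-second window, and report how many were suppressed when a
# new window starts. Names (ratelimited_log, RL_*) are invented.

RL_INTERVAL=60   # seconds per window
RL_BURST=5       # messages allowed per window
rl_start=0
rl_count=0
rl_missed=0

ratelimited_log() {
    now=$(date +%s)
    if [ $((now - rl_start)) -ge "$RL_INTERVAL" ]; then
        # New window: report anything we swallowed, then reset.
        if [ "$rl_missed" -gt 0 ]; then
            echo "ratelimit: $rl_missed messages suppressed" >&2
        fi
        rl_start=$now
        rl_count=0
        rl_missed=0
    fi
    if [ "$rl_count" -lt "$RL_BURST" ]; then
        rl_count=$((rl_count + 1))
        echo "$*" >&2
    else
        rl_missed=$((rl_missed + 1))
    fi
}

# Ten identical errors in quick succession: only the first five appear.
i=0
while [ $i -lt 10 ]; do
    ratelimited_log "call_verify: server requires stronger authentication"
    i=$((i + 1))
done
```

The point is that the cost of a repeated error becomes one counter increment instead of a console write, which is the difference between an annoyed machine and a dead one.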

(Yes, I'm going to try to report this to the Linux NFS people; if I can, I'll even try to create a patch. Unfortunately it probably won't help us, because we're running Ubuntu 6.06 and the Ubuntu people will probably not accept or backport such a specialized fix.)

RatelimitMessages written at 23:12:29

2008-03-18

Some things I dislike about the ASUS Eee

I should note that I actually like my Eee (for what it is). But every time I try to write some sort of review of it, these things bubble up to the top of my mind, so I am going to write them down to get rid of them.

In no particular order:

  • my one major issue is that I really want more vertical space. 480 pixels is pretty cramped and limiting; while I can fit in two overlapping, readable 80x24 terminal windows, I do feel like I'm peering in through a porthole, and web browsing works less well.

    (And every so often some of the program dialogs just plain run off the bottom of the screen.)

  • mine keeps terrible time when powered off, routinely drifting by multiple seconds in a day.

    (And inexplicably, no NTP client support seems to be included in the default software load.)

  • despite having no hard drive and a slow processor, the Eee is not a cool or silent machine. As a first-time laptop user I was somewhat surprised at how hot it gets (apparently it is on the hot side for a modern laptop), and it has an audible fan if you let it sit powered on.

  • if I let it sit powered off and unplugged from power, it slowly drains the battery. Well, I guess I now know why it draws two watts from the wall connector even when powered off (cf EeePowerConsumption), although I still have no idea what it's doing with the power.

  • the Eee doesn't even pretend to have security. A basic level of Unix security would be nice, and an option for encrypted disk space would be better, especially since Linux has both.

    (And oh yeah, it would have been nice not to be rootable out of the box.)

EeeDislikes written at 23:46:42

2008-03-08

What controls Red Hat Enterprise's ethN device names

Since I just went digging for this the other day, here's what I know about what controls how Ethernet devices get named on Red Hat Enterprise (and probably also on Fedora, but I haven't looked at my Fedora systems in this level of detail).

  • if kudzu is enabled, it uses /etc/sysconfig/hwconf to name everything. If there is no such file or the data in it doesn't match current reality, various bad things happen.

    (You can probably hand-edit the file if necessary.)

  • otherwise, interface naming is controlled by the HWADDR setting in the ifcfg-* files in /etc/sysconfig/network-scripts. If there is no Ethernet address specified, you get no renaming.

The ifcfg files are used by both udev and the ifup scripts that actually bring interfaces up and so on. When udev detects a new network device (including at boot, I believe), it runs /lib/udev/rename_device, which searches the ifcfg-* files for a HWADDR that matches the new device and uses the DEVICE setting from that file to give a name to the new interface.
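For concreteness, a minimal ifcfg file of the sort rename_device matches against might look like this; the MAC address below is invented for illustration and would have to be the actual hardware address of the port you want named eth0:

```shell
# Hypothetical /etc/sysconfig/network-scripts/ifcfg-eth0 (a sketch).
DEVICE=eth0              # the name udev gives the matching interface
HWADDR=00:14:4f:01:02:03 # invented; must match 'ip link' / 'ifconfig -a'
ONBOOT=yes
BOOTPROTO=none
```

rename_device scans all of the ifcfg-* files for a HWADDR matching the new device and hands back that file's DEVICE value as the interface name.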

(A network device that is hotplugged after system boot also winds up running /etc/sysconfig/network-scripts/net.hotplug.)

During boot, the order of operations is udev first, then kudzu, and finally the network init script winds up ifup'ing all of the interfaces that are supposed to be running, potentially undoing any damage kudzu did (if kudzu left the ifcfg configuration files alone, which is unlikely).

(You may gather that I have a pretty low opinion of kudzu; in fact, I have been turning it off on most of my systems for years. It was left enabled on this RHEL system mostly because I hadn't taken the time to audit what init scripts were getting run.)

RHELEthernetNaming written at 23:12:00

My problem with Ethernet naming on Red Hat Enterprise 5

Here's my problem: I have a bunch of identical 1U servers (SunFire X2100 M2s) with four onboard Ethernet ports, driven by two different chipsets (two nVidia ones, two Broadcom ones). I want to configure our RHEL installs so that no matter which physical unit I stuff the system disks into, the Ethernet ports come up with consistent names that match the ports on the back of the server; eth0 should always be the port labeled 'port 0' and so on.

(Since they have hotswap drive bays, we want to be able to easily swap drives between units in case of hardware failure or the like. It also simplifies general administration a bunch if the Ethernet naming matches the hardware naming.)

In the good old days, this was simple; you just set up /etc/modprobe.conf to alias eth0 and eth1 to the tg3 driver and eth2 and eth3 to the forcedeth driver, and everything usually worked.
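That old-style setup was only a few lines; a sketch of the /etc/modprobe.conf fragment (a config file, not a script) for this four-port layout:

```shell
# Pre-udev /etc/modprobe.conf fragment: tie interface names to drivers.
# Which port within a driver gets which name still depends on probe order.
alias eth0 tg3
alias eth1 tg3
alias eth2 forcedeth
alias eth3 forcedeth
```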

In the new world of udev, not so much; much like with Ubuntu, everything really wants to name things based on known Ethernet addresses, and there seems to be no way to control what order modules are loaded in. The furthest I've gotten is a configuration that does nothing with any 'new' Ethernet ports, so you have to log in on the console and change all of the HWADDR values in the ifcfg files to have the correct Ethernet addresses.

(To do this, you have to turn off kudzu with 'chkconfig --del kudzu'. If you leave it enabled, it will helpfully configure any 'new' Ethernet ports to do DHCP on boot, and in the process it will replace your working ifcfg files with new ones. Yes, it leaves the old files around with .bak extensions, but I am pretty sure that if you swap hardware twice you will lose them entirely.)
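To make that console fixup a bit less error-prone, something like the following helper (a sketch with invented names, assuming Linux sysfs) prints each port's kernel name, driver, and MAC so that the HWADDR lines can be corrected by hand; the optional argument exists so it can be pointed at a directory other than /sys/class/net:

```shell
#!/bin/sh
# List kernel name, driver, and MAC for each ethN interface, to help
# rewrite the HWADDR= lines in the ifcfg files after a disk swap.
# (Function name and the '?' placeholder for an unknown driver are
# invented for this sketch.)
list_eth_macs() {
    root=${1:-/sys/class/net}
    for d in "$root"/eth*; do
        [ -e "$d" ] || continue
        i=${d##*/}
        # device/driver is a symlink to the driver's sysfs directory.
        drv=$(basename "$(readlink -f "$d/device/driver" 2>/dev/null)" 2>/dev/null)
        printf '%s %s %s\n' "$i" "${drv:-?}" "$(cat "$d/address")"
    done
}
```

With the driver column you can tell the tg3 ports from the forcedeth ports at a glance, which is half the battle on these four-port machines.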

RHELEthernetNamingProblem written at 00:13:38

2008-03-06

Software RAID, udev, and failed disks

Suppose that you have a software RAID array. Suppose further that you have a disk or two fail spectacularly; they don't just have errors, they go offline completely.

Naturally, software RAID fails the disks out; you wind up with something in /proc/mdstat that looks like this:

md10 : active raid6 sdbd1[12] sdbc1[11] sdbb1[10] sdba1[9] sdaz1[13](F) sday1[7] sdax1[6] sdaw1[5] sdav1[14](F) sdau1[3] sdat1[2] sdas1[1] sdar1[0]

(Yes, this system does have a lot of disks. Part of it is that multipathed FibreChannel makes disks multiply like rabbits.)

So we want to remove the failed disks from the array (perhaps because we have pulled out their hot-swap drive sleds in order to swap new disks in):

# mdadm /dev/md10 -r /dev/sdav1
mdadm: cannot find /dev/sdav1: No such file or directory

This would be because udev removed the /dev nodes for the disks when they went offline, which is perfectly sensible behavior except that it presents us with a bit of a chicken-and-egg problem.

(If this was a Fedora system with mdadm 2.6.2 I might be able to use the '-r failed' option, but this is a Red Hat Enterprise 5 system with mdadm 2.5.4, and I am out of luck. And if I wanted to remove just one of the two failed drives, I would still be out of luck even on Fedora.)

Reinserting the drives doesn't help, at least in this case, as the system sees them as entirely new drives and assigns them a different sd-something name. (It does this even if they are literally the same disk, because you artificially induced this failure by pulling the drive sleds in the first place.)
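One workaround sketch is to recreate the vanished device node by hand so mdadm has a name it can resolve; mdadm only needs the node's major/minor numbers to match what the failed member had. The numbers below are invented placeholders (you would have to reconstruct the real ones, for example from kernel logs or 'ls -l /dev/sd*' on an identical system), and all of this needs root:

```shell
MAJOR=66 MINOR=241                   # invented example values for sdav1
mknod /dev/sdav1 b "$MAJOR" "$MINOR" # recreate the missing block node
mdadm /dev/md10 -r /dev/sdav1        # now mdadm can remove the member
rm -f /dev/sdav1                     # clean up; let udev own /dev again
```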

UdevWithFailedDisks written at 23:53:38

