Wandering Thoughts archives

2017-01-30

How you can abruptly lose your filesystem on a software RAID mirror

We almost certainly just completely lost a software RAID mirror with no advance warning (we'll know for sure when we get a chance tomorrow to power-cycle the machine in the hopes that this revives a drive). This comes as very much a surprise to us, as we thought that this was not supposed to be possible short of a simultaneous two-drive failure out of the blue, which should be an extremely rare event. So here is what happened, as best we can reconstruct right now.

In December, both sides of the software RAID mirror were operating normally (at least as far as we know; unfortunately the filesystem we've lost here is /var). Starting around January 4th, one of the two disks began sporadically returning read errors to the software RAID code, which caused the software RAID to redirect reads to the other side of the mirror but not otherwise complain to us about the read errors beyond logging some kernel messages. Since nothing showed up about these read errors in /proc/mdstat, mdadm's monitoring never sent us email about it.

(It's possible that SMART errors were also reported on the drive, but we don't know; smartd monitoring turns out not to be installed by default on CentOS 7 and we never noticed that it was missing until it was too late.)

On the morning of January 27th, the other disk failed outright in a way that caused Linux to mark it as dead. The kernel software RAID code noticed this, of course, and duly marked it as failed. This transferred all IO load to the first disk, the one that had been seeing periodic errors since January 4th. It immediately fell over too; although the kernel has not marked it as explicitly dead, it now fails all IO. Our mirrored filesystem is dead unless we can somehow get one or the other of the drives to talk to us.

The fatal failure here is that nothing told us about the software RAID code having to redirect reads from one side of the mirror to the other due to IO errors. Sure, this information shows up in kernel messages, but so does a ton of other unstructured crap; the kernel message log is the unstructured dumping ground for all sorts of things and as a result, almost nothing attempts to parse it for information (at least not in a standard, regular installation).

Well, let me amend that. It appears that this information is actually available through sysfs, but nothing actually monitors it (in particular mdadm doesn't). There is an errors file in /sys/block/mdNN/md/dev-sdXX/ that contains a persistent counter of corrected read errors (this information is apparently stored in the device's software RAID superblock), so things like mdadm's monitoring could track it and tell you when there were problems. It just doesn't.

(So if you have software RAID arrays, I suggest that you put together something that monitors all of your errors files for increases and alerts you prominently.)
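A minimal sketch of such a monitor in Python, assuming the errors files live at the sysfs paths described above and that each contains a single integer. The function names and the idea of keeping the previous counts in a small JSON state file are my own illustration, not anything mdadm provides, and how you alert on the result is left open.

```python
import glob
import json
import os


def read_error_counts(sys_root="/sys"):
    """Return {errors-file-path: count} for every md component device."""
    counts = {}
    pattern = os.path.join(sys_root, "block", "md*", "md", "dev-*", "errors")
    for path in glob.glob(pattern):
        try:
            with open(path) as f:
                counts[path] = int(f.read().strip())
        except (OSError, ValueError):
            # Device vanished or unexpected contents; this sketch just skips it.
            continue
    return counts


def find_increases(old, new):
    """Return {path: (old_count, new_count)} for counters that went up."""
    return {
        path: (old.get(path, 0), count)
        for path, count in new.items()
        if count > old.get(path, 0)
    }


def check_once(state_path, sys_root="/sys"):
    """Compare current counters with the saved ones, then save the new ones."""
    try:
        with open(state_path) as f:
            old = json.load(f)
    except (OSError, ValueError):
        old = {}  # first run, or unreadable state file
    new = read_error_counts(sys_root)
    with open(state_path, "w") as f:
        json.dump(new, f)
    return find_increases(old, new)
```

Something like this could be run periodically from cron, with any non-empty result from check_once() turned into email. Note that on the first run a device that already has a non-zero count is reported as an increase from zero, which is probably what you want.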

LosingMirroredRAIDViaDiskErrors written at 00:48:53

2017-01-23

Linux desktops and pre-packaged machines from big vendors

Today, I tweeted:

I wonder if Dell or any of the big vendors makes reasonably-priced (mid-)tower systems with 4 3.5" drive bays and 32 GB of RAM.

There is a story behind this tweet, and it's in part a story of how far Linux has come over the time I've been using it.

For a long time, the only way to get a decent Linux (desktop) machine was to specify it from parts yourself. If you were so foolish as to buy a pre-packaged desktop from a major desktop vendor like Dell, HP, or IBM, you might well wind up with hardware that Linux didn't support or only supported badly. So this is what we routinely did at work and what I did for my home machine. My previous generation work machine here back in 2006 was a specified-from-scratch machine, and so is my current office workstation. But, well, Linux has come a long way since 2006, or even 2011. These days it's pretty mainstream and widely supported, at least on most of the kind of plain vanilla hardware that you find on ordinary desktops. Certainly my co-workers have gotten Dell desktops as their new desktop machines at least once and had no problems running modern Linuxes on them.

Which gets me around to the subject of my current office workstation, which has the same hardware as my current home machine and is roughly as old as that machine, which means that it is a bit over five years old. It still runs perfectly fine and performs well enough for the work that I do, but I need to face reality; just as hard disks wear out eventually, so do things like CPU and case fans, power supplies, and eventually motherboards and CPUs and so on. My office machine is going to die from hardware failure at some point; the only question is when. I would like to replace it with fresh hardware before then, and after five years of life it seems like a good time to at least start thinking about it (and to get the gears turning, because around here they move slowly for this kind of thing).

When I started thinking about this, my first instinct was to once again specify a machine from scratch (even though I'm not sure who we'd get to build it for us any more). But, well, do I still have to do that these days? In fact, does it even make sense to do that? I don't necessarily have demanding needs, Linux is likely to run on everyone's pre-packaged desktops, and it really should be the case that Dell et al can build machines cheaper than we can, since Dell is doing it in bulk (although with less volume than in the past, given the general decline of the PC market). And buying a Dell is probably much easier to get through the university purchasing process than a custom-built machine from some small place.

(Or we could buy parts and I could have the fun adventure of assembling my new work machine myself. I'm sure it would be educational and people assure me it's not too hard, but it's probably at least an inefficient use of my work time. Not that universities necessarily care about that.)

Having looked at this a bit, I suspect that my needs are sufficiently esoteric that they push me into the area where Dell and company start selling us excessively expensive 'workstation' machines. This may well make 'specify from parts' the least expensive option, but this time around, unlike in the past, it seems worthwhile to at least check. And I can imagine being perfectly happy with a Dell or the like assuming that it has the basic features I need.

It makes me quietly happy for Linux that what once was an esoteric option that required careful hardware curation has moved to being something where I can generally assume it just works, on both servers and even desktops.

(I'm sure there's some hardware that doesn't quite work great, especially if you're right at the edge of newly released stuff, and of course graphics cards are their own sad story of closed source drivers. But my impression is that running into such hardware is now either uncommon or outright rare.)

Sidebar: What I need in an office workstation

My needs almost fit in a tweet: four or more 3.5" drive bays (five would be great), 32 GB of RAM, a processor at least as modern as the i5-2500 I currently have, an onboard Ethernet port and onboard sound (almost sure to be on anything), and either onboard graphics that can drive two displays at 1920x1200 at 60 Hz or a slot for a graphics card so I can drop my current card in (I'd prefer to use onboard graphics). It would be ideal if there was either a second Ethernet port or a PCI(E) card slot I could put an additional network card into.

(And of course a bunch of USB ports, including at least a few USB 3.0 ports. But everything has those.)

I wouldn't mind an optical drive, but I'm not going to turn down a vendor pre-packaged desktop if it lacks one. I simply don't burn or read discs at work very much these days, and we're looking to move even further away from them if possible. USB memory sticks are (or would be) just so much more convenient for installing machines and so on.

(This isn't what I'd like in a theoretical new machine, but work is unlikely to buy me things like the latest top-end i7 CPU even if I try to make rational arguments about how it has an expected lifetime of at least five years so it totally makes sense.)

MainlineLinuxDesktopHardware written at 23:15:25

2017-01-22

An Ubuntu default Bash setup that irritates me, especially for root

Bash itself has a number of option settings to limit what it puts into the (interactive) command history list, such as $HISTCONTROL and $HISTIGNORE. A stock, no-startup-file Bash shell does not set any of them and so saves everything into the history list. However, many systems give you a default .bashrc file that sets some options here. In particular, Ubuntu has a very irritating default: it enables the ignorespace option in both /etc/skel/.bashrc and /root/.bashrc.

For people who have never encountered it, what ignorespace does is keep any command line that starts with a space character (and perhaps any whitespace) out of the history list. No access through cursor up, reverse search, anything; it's as if you didn't run the command. Now, I'm sure that there are people and situations where this makes sense, but I believe strongly that it's a bad default and in particular it's very annoying to have as a default for root.

You might ask why this is the case. Well, suppose you have a recipe with steps that look like this:

  • install Exim4 and related packages:
    apt-get install gnutls-bin swaks unrar
    apt-get install exim4-daemon-heavy exim4-doc-info exim4-doc-html exim4
    

When you're going through this recipe, the natural way to execute things is to cut and paste the entire line from the recipe into a (root) window on the machine you're installing or doing something to. This means that the lines you're pasting in will start with whitespace and will not be saved in history. Did you get interrupted and want to quickly cursor-up to see what the last command you ran was? You can't. Do you want to look in .bash_history later to see which version of the install instructions you used, as reflected in the commands? You can't.

(This is the format our build instructions are in, so this particularly gets my goat.)

There are other situations where you can be cutting and pasting things into sessions and include even a single space at the start, which will have the same effects. For that matter, you can be editing previous commands in a way that leaves a space at the start and again, the same thing happens. In my opinion, it ought to be a lot harder than this to exclude things from history.

(Having written this entry, I should go change our standard install stuff so it sets up a /root/.bashrc that has this removed. I don't think any of us will miss it; rather the reverse, probably.)
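As a rough sketch of what "has this removed" could mean in practice, here is a small Python helper that rewrites a HISTCONTROL value so that leading-space commands are saved again. It relies on Bash's documented behaviour that HISTCONTROL is a colon-separated list and that ignoreboth is shorthand for ignorespace plus ignoredups. The function names are hypothetical, and the regular expression only handles simple one-line assignments, so treat it as an illustration rather than a robust .bashrc editor.

```python
import re


def drop_ignorespace(histcontrol):
    """Return a HISTCONTROL value with the ignorespace behaviour removed."""
    kept = []
    for option in histcontrol.split(":"):
        if option == "ignorespace":
            continue
        if option == "ignoreboth":
            # ignoreboth means ignorespace plus ignoredups; keep only the latter.
            option = "ignoredups"
        if option and option not in kept:
            kept.append(option)
    return ":".join(kept)


# Matches simple lines such as 'HISTCONTROL=ignoreboth' or
# 'export HISTCONTROL="ignorespace:erasedups"'.
_ASSIGN = re.compile(r'^(\s*(?:export\s+)?HISTCONTROL=)(["\']?)([A-Za-z:]*)\2(\s*)$')


def fix_bashrc_text(text):
    """Rewrite HISTCONTROL assignments in the text of a .bashrc file."""
    out = []
    for line in text.splitlines(keepends=True):
        m = _ASSIGN.match(line)
        if m:
            value = drop_ignorespace(m.group(3))
            line = m.group(1) + m.group(2) + value + m.group(2) + m.group(4)
        out.append(line)
    return "".join(out)
```

Duplicate suppression is left alone here on purpose; the complaint in this entry is only about commands silently vanishing from history.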

PS: I don't have a Debian system handy to check, but it's possible that this is a default that Ubuntu inherited from Debian instead of something that they decided on their own.

(Some searching turned up this bug-bash thread on why HISTCONTROL=ignorespace exists as an option (via). Debian or Ubuntu may have decided that this is an important enough usage case to make it a default. If so, I disagree.)

UbuntuBashIgnorespaceIrritation written at 02:24:10

2017-01-19

Thinking about how to add some SSDs on my home machine

It all started when I upgraded from Fedora 24 to Fedora 25 on my office workstation and then my home machine in close succession, and the work upgrade went much faster because my root filesystem was on SSDs. This finally pushed me over the edge to get a pair of SSDs for my home machine, as I've known I should do for a while. I now actually have the SSDs, but, well, I haven't put them into my home machine yet. You might wonder why, so let me put it this way: the next case I get will have at least six drive bays.

My current case has four drive bays (well, four conveniently usable 3.5" drive bays), and all four drive bays are used; two for the mirrored pair of system HDs, and two for the mirrored pair of data HDs. The SSDs will be replacing the system HDs (and pulling in things like my home directory filesystem from the data HDs), but I can't exactly unplug the HDs and put in the SSDs; I need to shift over, and to do that I need to temporarily have the SSDs in the system too. So I've been mulling over how best to do that, and in the meantime my SSDs have just been sitting there.

(If I had six drive bays it would be easy and I would have shoved in the SSDs almost immediately. And the delay is not just because I've been thinking; it's also because shuffling everything around is going to be kind of a hassle however I do it, and so I keep putting it off in favor of more interesting and pleasant things.)

I have a 3.5" to 2.5" dual SSD adaptor for the SSDs (I'm also using one at work), so a single open 3.5" drive bay will allow me to put both into the machine. A number of potential approaches have occurred to me:

  • My case has some 5.25" drive bays, which I'm not using. Maybe I could just temporarily rest the dual adaptor on the bottom of that area, run cables to it, and have that work. (The deluxe version would be to put the 3.5" to 2.5" adaptor in a 5.25" to 3.5" adaptor, but I don't have one of the latter sitting around and that feels like a lot of work.)

  • I could just temporarily run with the side of the case open and cables running to the SSDs. Don't laugh, one co-worker has been running with his machine opened up like this for years. It'd be awkward for me, though, because of where everything is physically (my co-worker has his open machine on his desk).

  • I could deliberately break the mirror of my system disks, remove one, and put the two SSDs in the drive slot freed up by that. It's not very likely that the remaining system disk will fail while I'm shifting over, and if it does I have the other system disk to swap back in.

Breaking the system disk mirror and removing one of the disks strikes me as the least crazy plan. However, it means I get to find out if my Fedora system is set up so that it will actually boot when one of the system disks goes away, or if it will throw up its hands because the shape of the RAID array is not exactly what it wants (this has been known to happen under some circumstances, although that wasn't a disk going missing). Certainly I'd hope that my Fedora 25 system will boot without problems there, but between general issues and systemd I don't have complete confidence here, and I can imagine scenarios that end up with me having to boot a rescue environment and try to glue my system back together again by hand.

(My system disk mirror doesn't just have the root filesystem; it also has /boot and swap, each as mirrored things. So systemd needs to be willing to bring up several RAID arrays in degraded mode in order to be able to get everything in /etc/fstab up.)

I expect that the easiest way to test this is to open the case up, shut the system down, pull the power connector for one of my system disks, and then try to boot the system. If it fails, I can shut everything down, plug the power connector back in, and hopefully everything will be back to being happy with the world. It would probably be more proper to take the disk offline in mdadm, but that may be less easily reversed if things then explode.
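If the test boot does come up, one way to confirm which arrays are actually running degraded is to look at /proc/mdstat. The following is a sketch that assumes the usual mdstat layout, where each array's status line carries something like "[2/1] [U_]" (devices wanted, devices working, and one flag per slot); the function name is made up for illustration.

```python
import re

# An array header line, e.g. "md0 : active raid1 sdb1[1] sda1[0]".
_ARRAY_LINE = re.compile(r"^(md\S+)\s*:")
# The per-array status, e.g. "[2/1] [U_]".
_STATUS = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")


def degraded_arrays(mdstat_text):
    """Return {array_name: slot_flags} for arrays missing a device."""
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        header = _ARRAY_LINE.match(line)
        if header:
            current = header.group(1)
            continue
        status = _STATUS.search(line)
        if status and current is not None:
            wanted, working = int(status.group(1)), int(status.group(2))
            flags = status.group(3)
            if working < wanted or "_" in flags:
                degraded[current] = flags
            current = None  # ignore later lines until the next array header
    return degraded
```

With one system disk unplugged, the root, /boot, and swap arrays should all show up in the result; an empty result after plugging the disk back in (and any resync finishing) suggests things are back to normal.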

(My plan for the SSDs is a roughly 100 GB ext4 root filesystem (which will also get /boot), a bit of swap space, and then the rest of the space in a ZFS pool. The pool will get my home directory and various other things that fit where I care either about speed or about having ZFS's checksums for the data.)

PlanningHomeSSDShuffle written at 00:18:35

2017-01-17

Making my machine stay responsive when writing to USB drives

Yesterday I talked about how writing things to USB drives made my machine not very responsive, and in a comment Nolan pointed me to LWN's The pernicious USB-stick stall problem. According to LWN's article, the core problem is an excess accumulation of dirty write buffers, and they give some VM system sysctls that you can use to control this.

I was dubious that this was my problem, for two reasons. First, I have a 16 GB machine and I barely use all that memory, so I thought that allowing a process to grab a bit over 3 GB of it for dirty buffers wouldn't make much of a difference. Second, I had actually been running sync frequently (in a shell loop) during the entire process, because I have sometimes had it make a difference in these situations; I figured frequent syncs should limit the amount of dirty buffers accumulating in general. But I figured it couldn't hurt to try, so I used the dirty_background_bytes and dirty_bytes settings to limit this to 256 MB and 512 MB respectively and tested things again.

It turns out that I was wrong. With these sysctls turned down, my machine stayed quite responsive for once, despite me doing various things to the USB flash drive (including things that had had a terrible effect just yesterday). I don't entirely understand why, though, which makes me feel as if I'm doing fragile magic instead of system tuning. I also don't know if setting these down is going to have a performance impact on other things that I do with my machine; intuitively I'd generally expect not, but clearly my intuition is suspect here.
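For concreteness, here is a sketch of applying those two limits from Python by writing to the corresponding files under /proc/sys/vm. It assumes that "256 MB" and "512 MB" mean binary megabytes (the entry doesn't say which), the function names are hypothetical, and writing the real files requires root. Settings made this way don't survive a reboot; making them permanent would normally go through your system's sysctl configuration.

```python
import os

MIB = 1024 * 1024
VM_SYSCTL_DIR = "/proc/sys/vm"


def dirty_limit_settings(background_mib=256, limit_mib=512):
    """Return the byte values for the two dirty-memory sysctls."""
    if background_mib >= limit_mib:
        # Background writeback should start before the hard limit is reached.
        raise ValueError("background limit must be below the hard limit")
    return {
        "dirty_background_bytes": background_mib * MIB,
        "dirty_bytes": limit_mib * MIB,
    }


def apply_settings(settings, sysctl_dir=VM_SYSCTL_DIR):
    """Write each value to its sysctl file (needs root for the real directory)."""
    for name, value in settings.items():
        with open(os.path.join(sysctl_dir, name), "w") as f:
            f.write(f"{value}\n")
```

Calling apply_settings(dirty_limit_settings()) as root would set the limits used here.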

(Per this Bob Plankers article, you can monitor the live state of your system with egrep 'dirty|writeback' /proc/vmstat. This will tell you the number of currently dirty pages and the thresholds (in pages, not bytes). I believe that nr_writeback is the number of pages actively being flushed out at the moment, so you can also monitor that.)
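The same information can be pulled out in Python, which is handy if you want to log it over time while a copy runs. This is a sketch that assumes /proc/vmstat consists of "name value" lines, which is what the egrep above relies on too; the function names are illustrative.

```python
def dirty_writeback_counters(vmstat_text):
    """Pick out the counters whose names mention dirty or writeback."""
    counters = {}
    for line in vmstat_text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue
        name, value = fields
        if "dirty" in name or "writeback" in name:
            counters[name] = int(value)  # counts are in pages, not bytes
    return counters


def read_dirty_writeback(path="/proc/vmstat"):
    """Read the live counters from the running kernel."""
    with open(path) as f:
        return dirty_writeback_counters(f.read())
```

To turn page counts into bytes you would multiply by the system page size, which Python exposes as os.sysconf("SC_PAGE_SIZE").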

PS: In a system with drives (and filesystems) of vastly different speeds, a global dirty limit or ratio is a crude tool. But it's the best we seem to have on Linux today, as far as I know.

(In theory, modern cgroups support the ability to have per-cgroup dirty_bytes settings, which would let you add extra limits to processes that you knew were going to do IO to slow devices. In practice this is only supported on a few filesystems and isn't exposed (as far as I know) through systemd's cgroups mechanisms.)

FixingUSBDriveResponsiveness written at 00:36:09

2017-01-16

Linux is terrible at handling IO to USB drives on my machine

Normally I don't do much with USB disks on my machine, either flash drives or regular hard drives. When I do, it's mostly to do bulk read or write things such as blanking a disk or writing an installer image to a flash drive, and I've learned the hard way to force direct IO through dd when I'm doing this kind of thing. Today, for reasons beyond the scope of this entry, I was copying a directory of files to a USB flash drive, using USB 3.0 for once.

This simple operation absolutely murdered the responsiveness of my machine. Even things as simple as moving windows around could stutter (and fvwm doesn't exactly do elaborate things for that), never mind doing anything like navigating somewhere in a browser or scrolling the window of my Twitter client. It wasn't CPU load, because ssh sessions to remote machines were perfectly responsive; instead it seemed that anything that might vaguely come near doing filesystem IO was extensively delayed.

(As usual, ionice was ineffective. I'm not really surprised, since the last time I looked it didn't do anything for software RAID arrays.)

While hitting my local filesystems with a heavy IO load will slow other things down, it doesn't do it to this extent, and I wasn't doing anything particularly IO-heavy in the first place (especially since the USB flash drive was not going particularly fast). I also tried out copying a few (big) files by hand with dd so I could force oflag=direct, and that was significantly better, so I'm pretty confident that it was the USB IO specifically that was the problem.

I don't know what the Linux kernel is doing here to gum up its works so much, and I don't know if it's general or specific to my hardware, but it's been like this for years and I wish it would get better. Right now I'm not feeling very optimistic about the prospects of a USB 3.0 external drive helping solve things like my home backup headaches.

(I took a look with vmstat to see if I could spot something like a high amount of CPU time in interrupt handlers, but as far as I could see the kernel was just sitting around waiting for IO all the time.)

PS: We have more modern Linux machines with USB 3.0 ports at work, so I suppose I should do some tests with one just to see. If this Linux failure is specific to my hardware, it adds some more momentum for a hardware upgrade (cf).

(This elaborates on some tweets of mine.)

USBDrivesKillMyPerformance written at 01:32:16

2017-01-10

Picking FreeType CJK fonts for xterm on a modern Linux system

Once I worked out how to make xterm show Chinese, Japanese, and Korean characters, I had to figure out what font to use. I discussed the general details of using FontConfig to hunt for CJK fonts in that entry, so now let's get down to details.

The Arch Linux xterm example uses 'WenQuanYi Bitmap Song' as its example CJK font. This is from the Wen Quan Yi font collection, and they're available for Fedora in a collection of wqy-*-fonts packages. So I started out with 'WenQuanYi Zen Hei Mono' as the closest thing that I already had installed on my system.

(Descriptions of Chinese fonts often talk about them being an 'X style' font. It turns out that Chinese has different styles of typography, analogous to how Latin fonts have serif and sans-serif styles; see here or here or here for three somewhat random links that talk about eg Heiti vs Mingti. Japanese apparently has a similar but simpler split, per here, with the major divisions being called 'gothic' and 'Mincho'. Learning this has suddenly made some Japanese font names make a lot more sense.)

Fedora itself has a Localization fonts requirements wiki page. The important and useful bit of this page is a matrix of language and the default and additional fonts Fedora apparently prefers for it. Note that each of Chinese, Japanese, and Korean picks different fonts here; there isn't one CJK font that's the first or even second preference for all of them. Since you have to pick only one font for xterm's CJK font, you may want to think about which language you care most about.

(This is probably where Han unification sticks its head up, too. Fedora talks about maybe influencing font rendering choices here on its Identifying fonts page.)

In Ubuntu, apparently some CJK default fonts have changed to Google's Noto CJK family. A discussion in that bug suggests that Fedora may also have changed its defaults to the Noto CJK fonts, contrary to what its wiki sort of implies. The Arch Wiki has its usual comprehensive list of CJK font options and there's also Wikipedia's general list. Neither particularly mentions monospaced fonts, though, assuming that this is even something that one has to consider in CJK fonts for xterm.

All of this led me to peer into the depths of /etc/fonts/conf.d on my Fedora machines to look for mentions of monospace. Here I found interesting configuration file snippets that said things like:

   <match>
       <test name="lang">
           <string>ja</string>
       </test>
       <test name="family">
           <string>monospace</string>
       </test>
       <edit name="family" mode="prepend">
           <string>Noto Sans Mono CJK JP</string>
       </edit>
   </match>

   <alias>
       <family>Noto Sans Mono CJK JP</family>
       <default>
           <family>monospace</family>
       </default>
   </alias>

I'm not really up on FontConfig magic, but this sure looked like it was setting up a 'Noto Sans Mono CJK JP' font as a monospace font if you wanted things in Japanese. There are also KR, SC (Simplified Chinese), and TC (Traditional Chinese) variants of Noto Sans Mono CJK lurking in the depths of my Fedora system.

After looking at an xterm using WenQuanYi Zen Hei Mono side by side with one using Noto Sans Mono CJK JP, I decided that the Noto version was probably better looking (on my very limited sample of CJK text, mostly in file names and font names). I also felt slightly more confident in picking it, since it seemed likely to be closer both to how eg gnome-terminal was operating and to the general trend of CJK font choices in various Linuxes. I wish I could find out what CJK font(s) gnome-terminal was using, but the design of current versions makes that difficult.

(Some experimentation suggests that in my setup, gnome-terminal may be using VL Gothic here. I guess I can live with all of this, however it comes out; mostly I just want CJK characters to show up as something other than boxes or especially spaces.)

LinuxXTermFreeTypeCJKFonts written at 00:48:32


This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.