2015-03-31
Btrfs's mistake in limiting itself to two-way mirroring
Recently, I tweeted:
That btrfs still will not do more than two-way mirroring immediately disqualifies it for many serious uses as far as I'm concerned.
On the surface this may sound like a silly limitation to be annoyed at btrfs over, something that only a small number of people playing in the enterprisey, (over-)cautious, cost-is-no-object world will ever use. Two-way mirrors are pretty reliable, after all, and almost no one actually uses more than two-way mirroring (and the people who do may not be entirely sensible).
This is too small a view of the situation. The problem with having a maximum of two-way mirroring is not steady state operation, it's when you're migrating storage from one disk to another (or from one set of disks to another). Supporting three (or more) way mirroring makes it simple to do this while preserving full redundancy; you attach the new disk as a third mirror, wait for things to resynchronize, and then detach the old disk. If things go wrong with the new disk during this process, no sweat, your old disks are still there and working away as normal.
At this point some people may suggest 'rebalancing' operations, where you attach the third disk and then tell your sophisticated filesystem to change the system by moving all the data from the old disk to the new disk; I believe that btrfs supports this by adding the new disk then deleting the old disk. The problem is that this is not good enough because if things go wrong it will generally leave part of your data non-redundant (whatever data has been migrated to the new disk). It's strictly better to run the new disk in parallel with the old disks and then decide that you trust it enough to drop the old disk out, and that requires real multi-way mirroring.
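The difference between the two migration approaches can be sketched with a toy model. This is only an illustration under simple assumptions, not a description of how btrfs actually lays out data: each chunk is represented as the set of disks holding a copy of it, and the disk names (`old1`, `old2`, `new`) and function names are hypothetical.

```python
# Toy model (hypothetical names): each chunk is the set of disks holding a copy.

def surviving_copies(chunks, failed_disk):
    """Smallest number of copies any chunk still has once failed_disk is gone."""
    return min(len(copies - {failed_disk}) for copies in chunks)

def mirror_attach(n_chunks, synced):
    """Three-way mirror: the first `synced` chunks have also been copied to 'new'."""
    return [{"old1", "old2", "new"} if i < synced else {"old1", "old2"}
            for i in range(n_chunks)]

def add_then_delete(n_chunks, migrated):
    """Two-copy rebalance: the first `migrated` chunks moved from 'old1' to 'new'."""
    return [{"old2", "new"} if i < migrated else {"old1", "old2"}
            for i in range(n_chunks)]

# The new disk dies halfway through the migration.
halfway = 50
print(surviving_copies(mirror_attach(100, halfway), "new"))    # 2
print(surviving_copies(add_then_delete(100, halfway), "new"))  # 1
```

In the three-way case every chunk still has two copies on the old disks; in the add-then-delete case, everything that had already been migrated is down to a single copy.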
What btrfs does if you give it more than two disks in a raid-1 setup is actually potentially useful behavior (it mirrors each piece of data on two out of three drives, giving you more disk space). But the right solution here would be to support both this and a way to tell btrfs that you want N-way mirroring instead of just 2-way mirroring. As it is, only having two-way mirroring is yet another reason why I may never use btrfs on my own machines.
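As a rough sketch of the space tradeoff between the two behaviors, here is a simplified capacity calculation. It assumes an idealized allocator and ignores metadata and chunk granularity, so real numbers will differ somewhat; the function names are made up for illustration.

```python
def usable_two_copy(disk_sizes):
    """Usable space when every chunk is stored on exactly two different disks.

    The largest disk can never hold more than all the other disks can
    mirror, hence the second term.
    """
    total = sum(disk_sizes)
    return min(total / 2, total - max(disk_sizes))

def usable_n_way(disk_sizes):
    """Usable space when every disk holds a full copy of everything."""
    return min(disk_sizes)

disks = [1000, 1000, 1000]  # three equal disks, sizes in GB
print(usable_two_copy(disks))  # 1500.0
print(usable_n_way(disks))     # 1000
```

With three equal disks, the two-copy layout gives half again as much space as a true three-way mirror, which is why both modes are worth having.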
(I think that this is an important feature for home machines, which are both the machines most likely to see drive replacements over time and the place where overall drive systems may be the flakiest. You just know that someday someone is going to attach a dubious USB 3.0 external drive to their home system temporarily in order to swap internal drives, with predictable results partway through.)
(Of course, this sort of artificial limitation in btrfs's RAID support is partly fallout from what I feel is btrfs's core mistake.)
2015-03-26
Why systemd should have ignored SysV init script LSB dependencies
In his (first) comment on my recent entry on program behavior and bugs, Ben Cotton asked:
Is it better [for systemd] to ignore the additional [LSB dependency] information for SysV init scripts even if that means scripts that have complete information can't take advantage of it?
My answer is that yes, systemd should have ignored the LSB dependency information for System V init scripts. By doing so it would have had (or maintained) the full System V init compatibility that it doesn't currently have.
Systemd has System V init compatibility at all because it is and was absolutely necessary for systemd to be adopted. Systemd very much wants you to do everything with native systemd unit files, but the systemd authors understood that if systemd only supported its own files, there would be a massive problem; any distribution and any person that wanted to switch to systemd would have to rewrite every SysV init script they had all at once. To take over from System V init at all, it was necessary for systemd to give people a gradual transition instead of a massive flag day exercise. However, the important thing is that this was always intended as a transition; the long run goal of systemd is to see all System V init scripts replaced by unit files. This is the expected path for distributions and systems that move to systemd (and has generally come to pass).
It was entirely foreseeable that some System V init scripts would have inaccurate LSB dependency information, especially in distributions that have previously made no use of it. Supporting LSB dependencies in existing SysV init scripts is not particularly important to systemd's long term goals because all of those scripts are supposed to turn into unit files (with real and always-used dependency information). In the short term, this support allows systemd to boot a system that uses a lot of correctly written LSB init scripts somewhat faster than it would otherwise have, at the cost of adding a certain amount of extra code to systemd (to parse the LSB comments et al) and foreseeably causing a certain number of existing init scripts (and services) with inaccurate LSB comments to malfunction in various ways.
(Worse, the init scripts that are likely to stick around the longest are exactly the least well maintained, least attended, most crufty, and least likely to be correct init scripts. Well maintained packages will migrate to native systemd units relatively rapidly; it's the neglected ones or third-party ones that won't get updated.)
So, in short: by using LSB dependencies in SysV init script comments, systemd got no long term benefit and slightly faster booting in the short term on some systems, at the cost of extra code and breaking some systems. It's my view that this was (and is) a bad tradeoff. Had systemd ignored LSB dependencies, it would have less code and fewer broken setups at what I strongly believe is a small or trivial cost.
2015-03-23
Systemd is not fully backwards compatible with System V init scripts
One of systemd's selling points is that it's backwards compatible
with your existing System V init scripts, so that you can do a
gradual transition instead of having to immediately convert all of
your existing SysV init scripts to systemd .service files. For
the most part this works as advertised.
However, there are areas where systemd has chosen to be deliberately
incompatible with SysV init scripts.
If you look at some System V init scripts, you will find comment blocks at the start that look something like this:
    ### BEGIN INIT INFO
    # Provides:        something
    # Required-Start:  $syslog otherthing
    # Required-Stop:   $syslog
    [....]
    ### END INIT INFO
These are an LSB standard for declaring various things about your init scripts, including start and stop dependencies; you can read about them here or here, no doubt among other places.
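To make the format concrete, here is a minimal sketch of pulling the declared keywords out of such a comment block. It is an illustration only, not how systemd or any real tool parses these headers; the function name is hypothetical and it ignores details such as continuation lines.

```python
def parse_lsb_header(script_text):
    """Collect 'Keyword: values' pairs between the BEGIN/END INIT INFO markers."""
    info = {}
    inside = False
    for line in script_text.splitlines():
        line = line.strip()
        if line == "### BEGIN INIT INFO":
            inside = True
        elif line == "### END INIT INFO":
            break
        elif inside and line.startswith("#") and ":" in line:
            # "# Required-Start: $syslog otherthing" -> key and a list of values
            key, _, value = line.lstrip("#").partition(":")
            info[key.strip()] = value.split()
    return info

example = """#!/bin/sh
### BEGIN INIT INFO
# Provides:          something
# Required-Start:    $syslog otherthing
# Required-Stop:     $syslog
### END INIT INFO
"""
print(parse_lsb_header(example)["Required-Start"])  # ['$syslog', 'otherthing']
```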
Real System V init ignores all of these because all it does is run init scripts in strictly sequential ordering based on their numbering (and names, if you have two scripts at the same numerical ordering). By contrast, systemd explicitly uses this declared dependency information to run some SysV init scripts in parallel instead of in sequential order. If your init script has this LSB comment block and declares dependencies at all, at least some versions of systemd will start it immediately once those dependencies are met even if it has not yet come up in numerical order.
(CentOS 7 has such a version of systemd, which it labels as 'systemd 208' (undoubtedly plus patches).)
Based on one of my sysadmin aphorisms,
you can probably guess what happened next: some System V init scripts
have this LSB comment block but declare incomplete dependencies.
On a real System V init system this does nothing and thus is easily
missed; in fact these scripts may have worked perfectly for a decade
or more. On a systemd system such as CentOS 7, systemd will start
these init scripts out of order and they will start failing, even
if what they depend on is other System V init scripts instead of
things now provided directly by systemd .service files.
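The failure mode can be sketched with a toy model. This is not systemd's actual scheduling algorithm, just an approximation that starts scripts in "waves" as soon as their declared dependencies are satisfied; the script names, order numbers, and the `declared`/`needs` fields are all hypothetical. It uses `graphlib`, so it assumes Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# 'declared' is what the LSB header says; 'needs' is what the script really requires.
scripts = {
    "syslog":   {"order": 10, "declared": [],         "needs": []},
    "database": {"order": 50, "declared": ["syslog"], "needs": ["syslog"]},
    "app":      {"order": 90, "declared": ["syslog"], "needs": ["syslog", "database"]},
}

def sysv_order(scripts):
    """Strictly sequential: by number, then by name."""
    return sorted(scripts, key=lambda name: (scripts[name]["order"], name))

def dependency_waves(scripts):
    """Group scripts into waves that may start as soon as declared deps are met."""
    sorter = TopologicalSorter({name: s["declared"] for name, s in scripts.items()})
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves

def started_too_early(scripts, waves):
    """Scripts that are not strictly after everything they really need."""
    wave_of = {name: i for i, wave in enumerate(waves) for name in wave}
    return [name for name, s in scripts.items()
            if any(wave_of[need] >= wave_of[name] for need in s["needs"])]

waves = dependency_waves(scripts)
print(sysv_order(scripts))               # ['syslog', 'database', 'app']
print(waves)                             # [['syslog'], ['app', 'database']]
print(started_too_early(scripts, waves)) # ['app']
```

Under sequential ordering the under-declared `app` script works by accident because `database` always comes first; under dependency-based ordering it is free to start alongside `database` and may fail.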
This is a deliberate and annoying choice on systemd's part, and I maintain that it is the wrong choice. Yes, sure, in an ideal world the LSB dependencies would be completely correct and could be used to parallelize System V init scripts. But this is not an ideal world, it is the real world, and given that there's been something like a decade of the LSB dependencies being essentially irrelevant it was completely guaranteed that there would be init scripts out there that mis-declared things and thus that would malfunction under systemd's dependency based reordering.
(I'd say that the systemd people should have known better, but I
rather suspect that they considered the issue and decided that it
was perfectly okay with them if such 'incorrect' scripts broke.
'We don't support that' is a time-honored systemd tradition, per,
say, separate /var filesystems.)
2015-03-22
I now feel that Red Hat Enterprise 6 is okay (although not great)
Somewhat over a year ago I wrote about why I wasn't enthused about RHEL 6. Well, it's a year later and I've now installed and run a CentOS 6 machine for an important service that requires it, and as a result of that I have to take back some of my bad opinions from that entry. My new view is that overall RHEL 6 makes an okay Linux.
I haven't changed the details of my views from the first entry. The installer is still somewhat awkward and it remains an old-fashioned transitional system (although that has its benefits). But the whole thing is perfectly usable; both installing the machine and running it haven't run into any particular roadblocks and there's a decent amount to like.
I think that part of my shift is that all of our work on our CentOS 7 machines has left me a lot more familiar with both NetworkManager and how to get rid of it (and why you want to do that). These days I know to do things like tick the 'connect automatically' button when configuring the system's network connections during install, for example (even though it should be the default).
Apart from that, well, I don't have much to say. I do think that we made the right decision for our new fileserver backends when we delayed them in order to use CentOS 7, even if this was part of a substantial delay. CentOS 6 is merely okay; CentOS 7 is decently nice. And yes, I prefer systemd to upstart.
(I could write a medium sized rant about all of the annoyances in the installer, but there's no point given that CentOS 7 is out and the CentOS 7 one is much better. The state of the art in Linux installers is moving forward, even if it's moving slowly. And anyways I'm spoiled by our customized Ubuntu install images, which preseed all of the unimportant or constant answers. Probably there is some way to do this with CentOS 6/7, but we don't install enough CentOS machines for me to spend the time to work out the answers and build customized install images and so on.)
2015-03-12
My feelings about GRUB 1 versus GRUB 2
I have been dealing with CentOS 6 recently (since I didn't give in to temptation), which has been an interesting experience. In many ways CentOS 6 is a real blast from the past, with all sorts of packages that I just don't use any more. One of those is that CentOS 6 still uses GRUB 1 (which is really just plain 'GRUB') instead of GRUB 2, which basically all of our other Linux systems use.
Boy was fiddling with the machine's boot configuration an eye-opening experience, in a good way. I've become so used to GRUB 2's insane level of complications and opacity that I'd forgotten how pleasant and simple GRUB 1 is by comparison. You have menu entries. They say things. Normally it boots the first menu entry. Your entire GRUB file probably fits on a screen (certainly if you have only one or two kernels and a 50-line xterm window). There is not a shell language in sight.
GRUB 2? Well, I was going to quote a bit of the start of one of our
grub.cfgs, but it's way too long. You don't edit GRUB 2 config
files, or even look at them; they are simultaneously verbose and
opaque, generated by scripts (often scripts that leave you lies in
comments). GRUB 2 has an entire Bourne shell like programming
language (and the Bourne shell is not a good programming language),
for what I'm sure are reasons that make sense to the GRUB 2
maintainers. The result is the traditional new Linux pile of mud
where you make any changes (very) indirectly, magic happens,
everything is supposed to work, and if it doesn't you are up the
creek.
In case it's not obvious, I don't particularly like GRUB 2. No doubt it helps someone, but on all of our machines it just complicates my life (and this includes my desktops and my laptop). Using the original GRUB again was a breath of fresh air, one that I'll now be sad to give up when I work on our other machines.
(I was going to say that some of the complexity of GRUB 2 grub.cfg
files was partly the fault of distributions but no, this appears
to be from a standard config builder that's part of GRUB 2 itself.)
PS: Even if GRUB 1 is still available and supported on our Linux distributions and our hardware, it is not worth fighting city hall on this issue (and then finding out all of the things that are undoubtedly broken despite GRUB 1 theoretically being supported). This nicely illustrates how you lose by an inch at a time and then wind up with an entire collection of sprawling mudpiles.