Wandering Thoughts archives

2020-12-12

My views on the suitability of CentOS Stream

In a comment on my most recent entry on CentOS Stream, Ben Cotton said:

I honestly believe that CentOS Stream will be suitable for the majority of CentOS Linux users, and a huge improvement for some. [...]

At one level, I agree with Ben Cotton on this. There's every indication that CentOS Stream won't be worse than plain CentOS 8 as far as bugs and security issues go; while it will now be getting (some) package versions before RHEL does instead of afterward, Red Hat has also apparently drastically increased its pre-release testing of packages. The move from CentOS 8 to CentOS Stream does cost you an extra five years of package updates, but I also feel that you shouldn't run ancient Linux distribution versions, so you probably shouldn't be running most CentOS installs for longer than five years anyway.

(I measure these five years from the release of RHEL 8, since what matters is increasingly ancient software versions. And since RHEL freezes package versions well in advance of the actual release, that means that by the end of five years after release the packages are often six or more years out of date. A lot changes in six years.)
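(As a rough illustration of that arithmetic, here is a minimal Python sketch. The RHEL 8 release date of May 7th 2019 is real; the one-year freeze lead time is purely an assumption picked to stand in for 'well in advance', and the function names are made up for this example.)

```python
from datetime import date


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def package_age_years(freeze: date, when: date) -> float:
    """Approximate age in years of packages frozen on `freeze`, as of `when`."""
    return (when - freeze).days / 365.25


RHEL8_RELEASE = date(2019, 5, 7)
# Assumption for illustration only: package versions frozen a year before release.
ASSUMED_FREEZE = date(2018, 5, 7)

five_years_on = add_years(RHEL8_RELEASE, 5)
age = package_age_years(ASSUMED_FREEZE, five_years_on)
print(f"five years after release: {five_years_on}")
print(f"approximate package age by then: {age:.1f} years")
```

(Move the assumed freeze date earlier and the packages are correspondingly older by the time the five years are up.)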

So at that level, if you're already running CentOS 8 as a general OS I believe that CentOS Stream will be a perfectly fine replacement for it for you and I don't see a strong reason to, say, migrate your existing systems to Ubuntu LTS. There's good indication that CentOS Stream will not create more bugs and instability, while migrating to Ubuntu LTS is both a bunch of work and won't get you much longer of a support period (20.04 LTS support will run out in early 2025, while I believe that CentOS Stream for 8 support will end in mid 2024).

Unfortunately, that's only at one level, the level that ignores the risks that are now visible in the future. The blunt fact of the matter is that the IBM-ized Red Hat has now shown us that they are willing to drastically change the support period for an existing CentOS product with basically no notice. We have only Red Hat's word that CentOS Stream for 8 support will continue through the end of full maintenance for RHEL 8 in mid 2024, or actually we don't even have that; Red Hat has made no promises to not change things around again, for example when RHEL 9 is released. Red Hat has made it clear that they decide how this goes and what the CentOS board feels doesn't really matter; the board can at best mitigate the damage (as they apparently did this time around, including getting Red Hat to allow CentOS Stream for 8 to continue longer than Red Hat wanted).

(Red Hat has also made it relatively clear that their only interest in CentOS today is as a way to give people a free preview of what will be in the current RHEL in the future. This neither requires nor rewards supporting and funding CentOS Stream for RHEL 8 after RHEL 9 comes out. It also implicitly encourages things that get in the way of using CentOS Stream as a substitute for RHEL.)

Any commercial company can change direction at the drop of a hat, so Canonical (or SUSE) could also decide to make similar abrupt changes with their Linux distributions (yes, Ubuntu is Canonical's thing, not a community thing, but that's another entry). However, Canonical has not done this so far (instead they've delivered a very consistent experience for over a decade), while Red Hat just has. There's a bigger difference in practice between 'never' and 'once' than there is between 'once' and 'several'.

If I had a CentOS based environment that I had to plan the next iteration of (for example, one on CentOS 7 where I was considering what comes next), I'm not sure I would build the next iteration on CentOS Stream. It might well be time to start considering alternatives, ones with a longer record of stability in what had been promised and delivered to people. Certainly at this point Ubuntu LTS has a record of more than a decade of basically running like clockwork; there are LTS releases every other April, and they get supported for five years from release. There are real limits on the 'support' you get (see also), but at least you know what you're getting and it seems very likely that there won't be abrupt changes in the future.

(Debian doesn't have Canonical's clockwork precision but may give you more or less the same support period and release frequency, but see also. I don't know enough about SUSE to say anything there, but it does use RPM instead of .debs and I like RPMs better. The Debian community is probably the most stable and predictable one; Debian is extremely unlikely to change its fundamental nature after all this time.)

linux/CentOSStreamSuitability written at 23:50:23;

Sometimes a problem really is just a coincidence

This past Wednesday, we did some maintenance, including precautionary reboots of some of our OpenBSD firewalls. All of our firewalls are actually pairs of machines, one of which is the active firewall and the other of which is the inactive spare (which is at most running pfsync, and is not on the live networks or doing anything). One of the first firewalls we rebooted was what was supposed to be the inactive spare of the bridging firewall that sits between our networks and the rest of the university. Less than a minute after the reboot was initiated, our monitoring system was screaming that we had basically lost all connectivity to the outside world.

Naturally people went digging to try to understand what had happened. We had not accidentally rebooted the live firewall instead of the inactive spare (an easier mistake to make with a bridging firewall than with a routing one), the reboot didn't seem to have somehow influenced the live firewall, our core router had not seen the interface status change, and so on and so forth. Later, we examined our reachability metrics in more detail (including data from an outside perspective) and became even more confused, especially since the reachability data from outside showed that we'd had problems accessing some things not even behind our bridging firewall.

I'll jump to the punchline: it was a coincidence. The overall university network had had some problems that happened to start only very shortly before the reboot of the inactive spare firewall (and by 'only very shortly' I mean less than 60 seconds before the reboot started). There may also have been a small power fluctuation in the building at around the same time. If the overall networking problems had dragged on, the coincidence would have been more obvious, but instead they faded out within about six minutes of the inactive spare firewall being back up, which was well within the time period where the co-worker actually in the office was poking around at things and trying to figure out what was going on.

It wasn't necessarily wrong of us to immediately assume that the reboot of a firewall was the cause and to look into things around it; the sysadmin's version of Occam's Razor is that if you just did something and a problem shows up, your action is the most likely cause. Often it really is the cause. But not always, as we saw this time, so if things don't seem to make sense maybe we should also start thinking about possible alternate explanations (and where we'd find evidence for or against them).
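(One cheap piece of evidence for or against 'our action did it' is whether the first failed check came before or after the action started. Here is a minimal Python sketch of that comparison. The probe data, the times, and the function names are all hypothetical, not taken from our monitoring system.)

```python
from datetime import datetime
from typing import List, Optional, Tuple

# One reachability check: (when it ran, whether it succeeded).
Probe = Tuple[datetime, bool]


def first_failure(probes: List[Probe]) -> Optional[datetime]:
    """Return the time of the earliest failed probe, or None if all succeeded."""
    for when, ok in sorted(probes):
        if not ok:
            return when
    return None


def action_could_be_cause(probes: List[Probe], action_time: datetime) -> bool:
    """False if failures started before the action, which rules it out as the trigger.

    True only means the timing is consistent with the action, not that it is proven.
    """
    onset = first_failure(probes)
    return onset is not None and onset >= action_time


# Hypothetical data: checks every 15 seconds around a reboot started at 10:00:00.
reboot_started = datetime(2020, 12, 9, 10, 0, 0)
probes = [
    (datetime(2020, 12, 9, 9, 59, 0), True),
    (datetime(2020, 12, 9, 9, 59, 15), False),  # trouble begins before the reboot
    (datetime(2020, 12, 9, 9, 59, 30), False),
    (datetime(2020, 12, 9, 10, 0, 15), False),
]

print(first_failure(probes))
print(action_could_be_cause(probes, reboot_started))
```

(This is only as good as your check interval. If you only probe once a minute, a problem that started 45 seconds before your action can look like it started afterward.)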

(In this case, there was nothing we could do to fix the problem since it was outside of our network, so the time spent poking around didn't delay resolving the issue.)

(This elaborates on a tweet of mine.)

sysadmin/SometimesCoincidence written at 00:17:09;


