Wandering Thoughts archives

2009-02-28

What it would take for me to use Fedora Rawhide

Sparked by Pete Zaitcev, here's a thought: why don't I use Fedora Rawhide? That's a serious question, since I'm sort of in the natural target audience; I've got the skills and the general interest and willingness, and I run and have run relatively bleeding edge versions of various things (at one point including kernels, although I've gotten lazy since then).

In thinking about it, what it comes down to is that while I can stand irritations, I can't stand show-stoppers. I don't have spare machines, so I need both my office and my home computer to work reasonably well all the time; I can't afford to lose a day or three because, say, the web browser has stopped working or the X server gets unusably slow after running for a couple of hours.

(Here I am discounting the possibility of true catastrophes such as a broken libc package or the like, partly because I understand that they basically don't happen these days.)

Now, I don't demand that all of the other bleeding edge things that I use now never break. Instead I deal with any breakage by immediately reverting to the previous and working version, waiting a bit, and trying the current version again. Sometimes I have to wait a while before the new version works, but that's all right; I'm not losing productivity except when I feel like it.

The equivalent for Rawhide would be package downgrades, so that if a Rawhide update broke things, I could and would revert back to the last working set of packages. Unfortunately, as far as I know Rawhide has no general support for this (probably partly because it's difficult in general), and I believe that Rawhide removes old versions of packages from the repositories in order to save disk space, just like Fedora updates.

You could construct something like this by hand by keeping a copy of all RPMs that yum downloads and installs, along with either yum's records of installs and upgrades or just a list of what package versions were on the system at various points in time. My impression is that yum is a bit too willing to remove RPM files for me to trust it to keep the copies; possibly the solution there is some sort of proxy that's configured to always keep copies (a plain squid instance or the like might work).
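A partial version of this already exists in yum itself: it can be told not to delete downloaded RPMs, and rpm can install an older package over a newer one. A rough sketch of the pieces (the paths and the rollback step are illustrative, not a tested procedure):

```shell
# In /etc/yum.conf, under [main], keep downloaded RPMs around:
#   keepcache=1
# so they stay in yum's cache directory after installation.

# Periodically record the installed package set, so you know what a
# 'last working' state looked like:
rpm -qa --qf '%{NAME}-%{VERSION}-%{RELEASE}.%{ARCH}\n' | sort \
    > /var/log/rpm-list-$(date +%Y%m%d)

# Reverting a single broken package to a kept older copy; rpm supports
# this directly, although any dependency fallout is your problem:
#   rpm -Uvh --oldpackage /var/cache/yum/.../somepackage-1.2-3.i386.rpm
```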

(I'm prepared to run the risk of having to do a certain amount of surgery if there are problems such as bad RPM postinstall scripts, and also the general risk of config file format changes and so on.)

WhyNotRawhide written at 22:59:53

The potential problems with distribution downgrades

One of the things I wish for every so often is the ability to downgrade Linux distributions, not just upgrade them (which is common and generally well supported; I can think of only one large Linux distribution without upgrade support, and you can argue that Red Hat Enterprise Linux is a special case). This wouldn't be as good as various forms of 'live upgrade', but it would still be useful and it would certainly make me a lot less nervous about doing upgrades.

Conceptually, a downgrade is easy; you run the normal package upgrade selection process in reverse, replacing all of the current packages with the most recent versions from what you're downgrading to. In theory this should be relatively simple to implement.

(I think you'd want to do it from a standalone environment, because otherwise the ordering constraints might get what programmers call 'interesting'.)
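The package-selection half really is conceptually simple. A minimal sketch of it, with naive version comparison standing in for real RPM version comparison (which uses rpmvercmp and is considerably hairier):

```python
# Hypothetical sketch of the downgrade selection step: for each installed
# package, pick the newest version that the target (older) release carries.

def version_key(v):
    # Naive numeric comparison, e.g. "1.2.10" > "1.2.9"; real RPM
    # comparison also handles epochs, letters, and release strings.
    return tuple(int(p) for p in v.split("."))

def pick_downgrade(installed, target_repo):
    """Map each installed package to the version it would downgrade to.

    installed:   {name: current_version}
    target_repo: {name: [available_versions]} for the older release
    Packages absent from the target release would have to be removed
    instead; they simply don't appear in the returned plan.
    """
    plan = {}
    for name in installed:
        candidates = target_repo.get(name, [])
        if candidates:
            # newest version the older release has for this package
            plan[name] = max(candidates, key=version_key)
    return plan
```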

In practice I think that the big problem is the issue of reverting various sorts of configuration and data files to older versions of their formats. For example, consider a syndication feed reader where the format for storing your feed information changed between major versions. When you upgrade your distribution and get the new major version of the feed reader, it will migrate your old feed data to the new format and then carry on as normal. But when you downgrade you need to run this in reverse, converting new format files to old format files that the older version of the feed reader can read. And there are generally no tools for this.
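To make the asymmetry concrete, here is an entirely hypothetical feed-store migration: the new format adds a field the old format has no place for, so the forward migration is trivial while the reverse one is unavoidably lossy.

```python
# Hypothetical illustration of why downgrading data formats is hard.
# Suppose the old feed format stored just (url, title) and the new major
# version added a per-feed 'etag' for conditional fetching.

def migrate_forward(old_feed):
    # old format: {"url": ..., "title": ...}
    new = dict(old_feed)
    new["etag"] = None      # new field, filled in on the next fetch
    return new

def migrate_backward(new_feed):
    # Downgrade: anything the old format has no slot for is simply
    # discarded, which is why nobody writes these tools.
    return {"url": new_feed["url"], "title": new_feed["title"]}
```

Here the lost 'etag' is harmless, but when the new field is something like per-item read/unread state, the downgrade silently loses information the user cares about.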

The result is that while a downgrade is technically possible, the result might not be very useful without a lot of fixup work, and thus without a lot of work writing ad-hoc fixup tools (assuming it's even possible without losing significant amounts of information, which it may not always be). Thus I don't think it's any surprise that no Linux distribution has attempted to support downgrades, however nice they would be to have; what you could give people for a reasonable amount of work would be so fenced in with caveats and cautions that it would be useless to most users.

DowngradeDifficulty written at 04:19:34

2009-02-21

How to turn off gnome-terminal's cursor blinking

For the purposes of turning off the irritating blinking cursor, there are three generations of gnome-terminal (so far; the Gnome people may yet change their minds again), each of them with a different way to do this. So it goes like this:

  • in the original version of gnome-terminal, there is a direct preference for this; in the General tab of each profile there is a 'Cursor blinks' option.

  • in the intermediate version of gnome-terminal, there is no option in gnome-terminal to control this; you have to turn it off by turning off a general Gnome setting. In the General tab of the Keyboard preferences (System → Preferences → Keyboard), turn off 'Cursor blinks in text fields'.

    (Yes, this is obscure. Apparently terminals are 'text fields' nowadays, at least to Gnome.)

    This has the side effect of disabling blinking cursors everywhere in Gnome, including in your web browser. Since I spend a lot more time in gnome-terminal than filling in forms in my browser and I absolutely hate blinking cursors in terminal windows, this is a tradeoff that I reluctantly live with.

  • in the modern version of gnome-terminal, there is a hidden gconf setting that controls this if you don't want to turn off the Gnome-wide setting. Fire up gconf-editor, navigate to apps/gnome-terminal/profiles/Default, and set the cursor_blink_mode key to 'off'. If you have any other profiles, you will have to create and set this key in each of them as well.

    (It probably will be inherited by any new profile that you create after making this change, but I haven't tested this; all of my profiles are old.)
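If you'd rather skip clicking through gconf-editor, the same key can be set from the command line with gconftool-2 (substitute your own profile name for 'Default' as needed):

```shell
# Set the hidden cursor-blink key for the Default profile; repeat for
# each additional profile you have.
gconftool-2 --type string \
    --set /apps/gnome-terminal/profiles/Default/cursor_blink_mode off
```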

In terms of what distribution has what version of gnome-terminal: Fedora 8 and Ubuntu 6.06 LTS have the original version, Ubuntu 8.04 LTS has the intermediate version, and Fedora 10 has the modern version. Only the original version pays attention to the per-profile 'Cursor blinks' preference; the other two versions silently ignore it.

(See this Gnome bug and its links for the history. This Ubuntu bug has more than enough comments that summarize my view on the whole situation.)

BlinkingGnomeTerminal written at 00:55:36

2009-02-09

I hate flaky systems, Fedora 10 and/or hardware edition

I am generally an even, calm person. But there are some things that just absolutely get under my skin and drive me to screaming frustration, and one of them is flaky, unreliable systems. This is especially the case when I can't tell if the problem is hardware, software, or both.

Since I'm writing about this, you might guess that I am dealing with such a thing right now. You would be correct; my office workstation has become really flaky ever since I upgraded it from Fedora 8 to Fedora 10 (about a month ago), with frequent kernel panics and other problems. On a sysadmin's workstation this is more than frustrating, it's actively dangerous; I've already had the system crash once while I was in the middle of something moderately delicate.

According to the Fedora reply to my bug report, this is almost certainly a hardware problem. I wish I could entirely believe and trust that, but it's hard since the system was completely stable on Fedora 8, with not a crash or problem in sight. And without a good idea of where the problem is, it's hard to go to management with a request for, say, new hardware, since we might be spending money and getting nothing for it.

(This is where it is frustrating that universities are not businesses. In many companies, the cost of my time means that it would long since have made economic sense to just get me a new machine and put the old one in the 'possibly junk' spares pool. But this is not something that universities do, especially in the current budget climate; staff are a sunk cost, while hardware costs real money.)

Unfortunately falling back to Fedora 8 is not a real option at this point; I don't have an alternate OS install on this machine, and while Fedora lets you upgrade a machine, there's no downgrade procedure. I will probably try to wedge the current Fedora 8 kernel into the system and see if I can boot it without everything breaking.

(From a quick look, the only thing that seems to specifically depend on a recent kernel version is the ATI graphics driver. Unfortunately having working X is somewhat important to me, so I am hoping that it will work in 'nomodeset' mode.)

PS: yes, I have a pre-upgrade backup of my machine, since I'm not crazy. But falling back to it at this stage would take a bunch of work for various reasons, including trying to hunt down various things that have changed since then.

Fedora10UnreliabilityHate written at 01:46:19

2009-02-06

Our SunFire X2100 nVidia Ethernet experiences

I mentioned last entry that we had seen problems with the onboard nVidia Ethernet ports on our SunFire X2100-based Linux iSCSI backends. Here are the details.

The SunFire X2100s have nVidia motherboards and four onboard Ethernet ports: two nVidia-based ones and two Broadcom-based ones. In our configuration, one Broadcom port and one nVidia port are used for iSCSI networking, the other nVidia port is used for general system access, and the second Broadcom port is used only by the integrated service processor. Only the ports used for iSCSI see any significant traffic volume.

What I ran into was that under heavy streaming iSCSI IO (in other words, more or less continuous TCP traffic at close to wire rates), the nVidia iSCSI port would start reporting:

kernel: eth2: too many iterations (6) in nv_nic_irq.

When this happened, network activity on that port either dropped significantly or stopped entirely, with bad overall effects on iSCSI data rates. The Broadcom iSCSI port had no problems, despite seeing the same level of traffic.

My solution was to take a club to the situation by setting a module parameter that raises the driver's per-interrupt work limit; in /etc/modprobe.conf I set:

options forcedeth max_interrupt_work=100

This seems to have made the problem go away; certainly we don't see either the kernel message or network slowdowns any more, including under sustained IO loads.

(Note that we are using the default forcedeth kernel driver, specifically whatever version is included in the kernel.org 2.6.25.3 kernel; it appears that this is version 0.61.)
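If you want to verify that the option actually took effect after a reboot, module parameters are visible through sysfs on reasonably modern kernels (paths assumed, as a sketch):

```shell
# What the loaded forcedeth driver is actually using:
cat /sys/module/forcedeth/parameters/max_interrupt_work

# And the driver version it reports:
modinfo forcedeth | grep -i '^version'
```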

Sidebar: some references

I haven't found anything that really explains what's going on, assuming that there's even a common cause across all of the cases. Given that this is various versions of potentially buggy hardware combined with a reverse engineered driver (because nVidia has been less than helpful), there are a lot of potential problems and causes.

NvidiaEthernetIssue written at 00:20:08

2009-02-04

Why btrfs was inevitable: a corollary to (not) getting ZFS in Linux

One of the things I said before about ZFS in Linux boils down to this: it's a lot of hard work to get outside code into the Linux kernel. This has an important corollary.

This sort of hard work, of modifying foreign code so that it fits into the Linux kernel, takes people who are pretty skilled Linux kernel programmers. These people are skilled enough to have a choice of what to do, so they can either work on the tedious grinding job of adopting other people's code into the kernel or they can write something new of their own (and get it into the kernel if it's kernel code).

It should not surprise anyone that most programmers, facing a choice between grinding maintenance work and writing something new and interesting, will pick writing something new and interesting. My impression is that Linux kernel hackers are not an exception to this, and that pretty much all of the Linux kernel hackers that are competent to get outside code into the Linux kernel would rather write something new, and generally they do.

Thus, even without other issues we would almost certainly get btrfs instead of an adoption of ZFS; it's simply more interesting for the people doing the work. Arguments that Linux kernel programmers should choose the boring work anyways are missing the point in several ways, including that people simply don't behave that way no matter what you would like.

(The one exception to this is, of course, when someone is paying kernel hackers to do the grinding work. I don't think it's a coincidence that the integration of things like IBM's JFS, SGI's XFS, and even Reiserfs has mostly been driven by employees of their respective companies.)

WhatGetsDeveloped written at 01:03:55


