Wandering Thoughts archives

2007-11-24

Gotchas with dual-headed X with RandR on ATI cards

Xinerama is the old general way to set up multi-headed X so that programs actually know that you have more than one display. In Fedora 8, many card drivers (including radeon, the driver for ATI cards) have switched to using the XRandR extension instead.

The Fedora 8 GUI configuration tools don't know anything about RandR-based dual-head configuration; if you try to use them anyways, nothing helpful happens. In theory you can do all of it on the fly with xrandr; in practice this didn't work for me and I needed to modify my /etc/X11/xorg.conf by hand, following the general directions in the Intel documentation.

(To simplify, xrandr operates by placing your physical screens within an overall virtual display. My specific problem was that the default virtual display was too small to fit two 1280x1024 screens side by side, and you can't change its size after the server has started.)
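For illustration, the fix amounts to giving the virtual display an explicit size in the Display subsection of your Screen section. A minimal sketch, assuming two 1280x1024 screens side by side (the Identifier, Device, and Depth values here are placeholders; yours will differ):

    Section "Screen"
        Identifier "Default Screen"
        Device     "Videocard0"
        SubSection "Display"
            Depth   24
            # big enough for two 1280x1024 screens side by side
            Virtual 2560 1024
        EndSubSection
    EndSection

With the server restarted on a big enough virtual display, something like 'xrandr --output DVI-0 --auto --right-of VGA-0' should then arrange the two outputs ('xrandr -q' lists the output names your driver uses).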

Now, Xinerama is two things: how the X server operates internally, and an X extension that programs can use to find out if you have multiple displays and how they're organized. Roughly speaking, when a program queries the extension, it gets back a list of screen numbers, their dimensions, and the coordinates of the top left of what they're displaying. While XRandR replaces the internal bits, it emulates the extension so that lots of programs don't break. Unfortunately, its Xinerama emulation has some issues:

  • you get one Xinerama screen per lit output, even if the outputs are operating in 'clone' mode (where they display the same thing). In the same situation, real Xinerama will either tell you it's not active or that you have only one screen.

    (What the emulation should do is adjust the Xinerama results on the fly to merge multiple cloned outputs into a single Xinerama screen.)

  • real Xinerama always assigns screen numbers in ascending order of their overall position, with the result that screen 0 is always the top left screen.

    XRandR instead assigns screen numbers statically based on which lit output they are being displayed by. For example, if I am using both VGA and DVI out on my ATI X300, the VGA output is always screen 0 and the DVI output is always screen 1, regardless of whether the DVI screen is left or right of the VGA screen. (The top left coordinates are correct, of course.)

The problem with the former is that anything that just counts how many screens a Xinerama query returns will believe that you have multiple displays when you really only have one.

The problem with the latter is that there are some programs which place their windows on specific screens by screen number, counting on the Xinerama behavior (especially for the meaning of screen 0). Since one of these programs is my window manager, I wound up physically rewiring my displays to match XRandR's hardcoded numbering, which is less convenient for various things.

(It turns out I could fiddle with fvwm's configuration to fix things, but I don't want to have to hunt down and fix the next program that does this.)
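If you're curious about what programs actually see, it's easy to query the Xinerama extension yourself. Here's a minimal sketch using libXinerama (compile with -lX11 -lXinerama); running it under RandR's emulation and under real Xinerama makes the numbering difference obvious:

    #include <stdio.h>
    #include <X11/Xlib.h>
    #include <X11/extensions/Xinerama.h>

    int main(void)
    {
        Display *dpy = XOpenDisplay(NULL);
        XineramaScreenInfo *info;
        int i, nscreens;

        if (!dpy) {
            fprintf(stderr, "cannot open display\n");
            return 1;
        }
        if (!XineramaIsActive(dpy)) {
            printf("Xinerama is not active\n");
            return 0;
        }
        /* each entry has a screen number, its size, and the
           coordinates of its top left corner */
        info = XineramaQueryScreens(dpy, &nscreens);
        for (i = 0; i < nscreens; i++)
            printf("screen %d: %dx%d at +%d+%d\n",
                   info[i].screen_number, info[i].width,
                   info[i].height, info[i].x_org, info[i].y_org);
        XFree(info);
        return 0;
    }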

DualHeadedATIRandR written at 23:34:14

2007-11-21

Linux virtual terminals and the X server

Linux virtual terminals have a simple conceptual model; you hit Alt-Fn and the kernel switches you to virtual terminal n. It even works with X, except that you have to remember to hit Ctrl-Alt-Fn when you're in X to switch out of it. What I suspect many people don't know is that in the case of X this simple conceptual model is a bit of a lie.
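As an illustration of the kernel's side of this, a VT switch can be requested programmatically; this is roughly what chvt(1) does. A minimal sketch (it assumes you can open /dev/tty0, which generally requires root):

    #include <stdio.h>
    #include <stdlib.h>
    #include <fcntl.h>
    #include <sys/ioctl.h>
    #include <linux/vt.h>

    int main(int argc, char **argv)
    {
        int vt, fd;

        if (argc != 2) {
            fprintf(stderr, "usage: %s vtnumber\n", argv[0]);
            return 1;
        }
        vt = atoi(argv[1]);
        fd = open("/dev/tty0", O_RDONLY);
        if (fd < 0) {
            perror("/dev/tty0");
            return 1;
        }
        /* ask the kernel to switch VTs; if the X server owns the
           current VT, it gets a chance to release the hardware first */
        if (ioctl(fd, VT_ACTIVATE, vt) < 0 ||
            ioctl(fd, VT_WAITACTIVE, vt) < 0) {
            perror("VT_ACTIVATE");
            return 1;
        }
        return 0;
    }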

The issue is that text VTs and the X server VT are in different graphics modes, so when you switch back and forth something has to reprogram the graphics hardware into the right mode for the target VT. While you might think that this is done by the kernel, it is not; it is actually done by the X server, which is the only thing that knows how to do it.

(In the jargon, this is called 'modesetting'.)

There are both historical and engineering reasons that things are this way. Setting anything beyond text modes and some very basic graphics modes is chipset-specific and often quite complex; since the X server had to deal with the complexity of the graphics card anyways, it was the natural place to put yet more card-specific knowledge. In addition, many graphics cards can't be safely reprogrammed while something else is using them, so to switch away from its VT the X server has to stop anything it might be doing to the card anyways.

(In general, PC class graphics hardware is not very good about letting multiple processes cleanly share the graphics hardware; it pretty much assumes that it is only going to be used by one thing at once.)

If your X server dies, it generally does not helpfully reprogram the graphics card back to text mode on the way down; instead the card is left in some unknown, random state. At least three things can happen if another X server is started:

  • if you are really lucky, the new X server can reprogram the graphics card to a good mode for X, and then when it exits will program the card back to a good text mode.
  • if you are less lucky, X works but when the X server exits it restores more or less the old broken state; your system works fine in X, but you can't switch to a readable text VT.
  • if you are unlucky, the new X server can't even reprogram the card so that X works correctly. If you are really unlucky, the crash left the card in a state where trying to reprogram it will lock up your system.

(I've alluded to the situation before in passing.)

Technically, not all of this is always true; there are some ways of running X in the same mode as the text virtual terminals, so that no modesetting needs to be done and the kernel can just flip back and forth. However, these are rarely used on PC hardware because they're generally achingly slow.

XServerAndVTs written at 22:46:15

2007-11-20

A lesson learned: Always upgrade Fedora with a respin CD

I spent a large part of today upgrading my office workstation from Fedora Core 6 to Fedora 7 (I wanted to go straight to Fedora 8, but either I am too impatient or the Fedora 8 installer stalled during dependency resolution). As usual, I did this by sticking in the Fedora 7 DVD and telling it to do an upgrade install.

The problem with this is simple: an up-to-date machine running an old Fedora release can have more recent versions of some packages than the ones on the install media for the newer Fedora.

(For example, Fedora 7 shipped with RPM version 4.4.2; a current Fedora Core 6 machine has RPM version 4.4.2.1.)

Because such RPMs are 'more recent' than the packages on Fedora 7's install DVD, the Fedora installer did not update them, despite the fact that they were packages for Fedora Core 6. The result left my system in a peculiar and awkward half-updated state, with some packages built for Fedora Core 6 and others built for Fedora 7, and any number of things broke (including yum).

(I eventually managed to dig myself out of this. It was enlivened by the fact that any number of Fedora 7 RPMs haven't been rebuilt since Fedora Core 6, and so still have .fc6 in their version numbers.)
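If you land in the same situation, finding the leftovers is at least simple, since the Fedora release tag normally shows up in package release numbers:

    rpm -qa | grep '\.fc6'

(This relies on packages actually carrying the .fc6 tag; as noted above, some genuine Fedora 7 packages still do, so the list needs a bit of human interpretation.)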

Ultimately this happens because Fedora freezes their DVD images when the release is made, but feels free to update previous releases with current package versions (instead of freezing the version and backporting fixes). As time goes by, this allows an old release to leapfrog the versions of packages burned into the DVD image. I believe that the way around this is to upgrade using a Fedora 7 re-spin from the Fedora Unity people; these are updated DVD images that include the current updates, so most everything on them should be seen as 'more recent' than your current packages.

(Unfortunately my machine is an x86_64 one, and x86_64 re-spins are 'currently provided for testing only'. A very large download over DSL may be in my home machine's future.)

The overall moral I draw from this is that it is now necessary to upgrade to each new Fedora release more or less when it comes out, instead of sitting on the upgrade until I feel like it.

(Alternately, the moral is that I should stop being afraid to do upgrades with yum. A yum based upgrade avoids this problem because it draws packages from both the baseline release repository and the updates repository.)

FedoraUpgradeRespin written at 23:24:47

2007-11-18

I love Linux's serial console support

I just rebooted and recovered one of our Linux machines that had hit a panic and then locked up in an endless cycle of 'soft lockup detected on CPU #0' reports. I did all of this from home, through the magic of a serial console, a console server, and Linux's magic SysRq key support. And as far as I'm concerned, one of the best things about this is that it took hardly any bandwidth and no Java.

(While I'm currently on DSL at home, it wasn't too long ago that I was stuck at 28.8 Kbps dialup PPP, and even now my DSL flakes out periodically. So I still value being able to do things without saturating my link or fighting through overly graphics-laden and occasionally flaky web interfaces.)

Linux's serial console support does have limitations, including that there are multiple levels of console involved. One of the big ones is that you cannot reboot the machine unless the kernel is at least somewhat responsive, enough to handle magic SysRq keystrokes. But its advantage is that it is general: it works, and works the same, on every Linux machine with a serial port.

(If you don't have a serial port, there's the kernel's netconsole support, but it only sends kernel messages somewhere else; you don't get magic SysRq key support.)
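For reference, setting this up is mostly a matter of kernel arguments plus making sure magic SysRq is turned on. A sketch of the Fedora-era pieces (the serial port and speed here are assumptions; adjust to your hardware):

    # in grub.conf, appended to the kernel line; the last console=
    # listed becomes /dev/console and gets kernel messages
    console=tty0 console=ttyS0,9600n8

    # make sure magic SysRq is enabled (in /etc/sysctl.conf)
    kernel.sysrq = 1

On a serial console, you send a magic SysRq keystroke as a serial BREAK followed by the command letter; for example, BREAK and then 'b' reboots the machine.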

Sidebar: serial consoles and lights out management hardware

To be fair, a serial console isn't the only way to get this sort of thing; it's just the easiest and most universal. On a machine with lights out management hardware, you can power cycle it remotely and even get console access (so-called 'KVM over IP'). But this involves working inside whatever web interface the vendor has come up with, takes quite a bit more bandwidth (especially for the console), and doesn't do anything to capture messages that scroll off the console.

(And a serial console doesn't give me the ability to remotely power cycle a machine. If the machine had been so far gone that it didn't respond to the magic SysRq keys, I would have been cursing the fact that it didn't have an ELOM and that we still haven't gotten around to setting up our smart power units.)

SerialConsoleLove written at 23:04:48

2007-11-08

How Linux handles virtual memory overcommit

Following up yesterday's entry on the general background of system virtual memory limits, here's how Linux deals with this issue. In its traditional way, Linux gives you three options for what happens when a process tries to allocate some more memory, controlled by the value of the vm.overcommit_memory sysctl:

  • the kernel gives you the memory unless it thinks you would clearly overcommit the system (mode 0, the default, 'heuristic overcommit').

  • the kernel always gives you the memory (mode 1, 'always overcommit').

  • the kernel refuses to give you more memory if it would take the committed address space over the commit limit (mode 2, what I call 'strict overcommit').

(Disclaimer: all of this assumes a relatively recent 2.6 kernel.)
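You can inspect and change the mode on the fly with sysctl. For example, to see the current mode and then switch to strict overcommit:

    sysctl vm.overcommit_memory
    sysctl -w vm.overcommit_memory=2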

The kernel's commit limit is your swap space plus some percentage of real memory. You set the percentage with the vm.overcommit_ratio sysctl, which lets you deal with both of the complications of a simple 'swap plus real memory' commit limit. (The percentage can be more than 100, for situations where you have lots of programs that don't use much of their allocated space.)
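As a worked example, assuming the default vm.overcommit_ratio of 50: a machine with 64 GB of RAM and 2 GB of swap winds up with a commit limit of 2 GB + (50% of 64 GB), or 34 GB. Raising the ratio is another one-line sysctl:

    sysctl -w vm.overcommit_ratio=80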

Whether or not it is enforcing it, the kernel always tracks the amount of committed address space and reports it as Committed_AS in /proc/meminfo, along with CommitLimit, the current commit limit.
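So checking how close you are to the limit is a quick grep:

    grep Commit /proc/meminfo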

For both heuristic overcommit and strict overcommit, the kernel reserves a certain amount of memory for root. In heuristic mode this is 1/32nd of free RAM; in strict overcommit mode it is 1/32nd of your configured percentage of real memory. This is hard-coded and not tunable, and I can't say I was entirely pleased to discover that our 64 GB compute server is reserving around 2 GB for root.

If you want the gory details, see the __vm_enough_memory function in mm/mmap.c in the kernel source, and also Documentation/vm/overcommit-accounting, which sort of documents the sysctl settings.

Sidebar: How heuristic overcommit works

Heuristic overcommit attempts to work out how much memory the system could give you if it reclaimed all the memory it could and no other process used more RAM than it currently does; if you are asking for more than this, your allocation is refused. Specifically, the theoretical 'free memory' number is calculated by adding up free swap space, free RAM (less 1/32nd if you are not root), and all space used by the unified buffer cache and reclaimable kernel data (less some reserved pages).
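In code form, the check looks something like the following simplified sketch. This is not the kernel's actual code; the page counts are stand-ins for the internal counters that __vm_enough_memory consults:

    /* simplified sketch of heuristic overcommit (mode 0); all
       quantities are in pages */
    int heuristic_enough_memory(long req, long free_ram,
                                long reclaimable, long free_swap,
                                long reserved, int is_root)
    {
        long allowed = free_ram;

        if (!is_root)
            allowed -= allowed / 32;  /* hold back 1/32nd for root */
        allowed += reclaimable;       /* buffer cache plus reclaimable
                                         kernel data */
        allowed -= reserved;          /* some pages are never handed out */
        allowed += free_swap;

        return req <= allowed;        /* nonzero means 'allow it' */
    }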

LinuxVMOvercommit written at 22:32:24

2007-11-06

A thought about competition between Red Hat Enterprise and CentOS

Roughly speaking, I can think of two different market segments that might be attracted to Red Hat Enterprise Linux:

  • the people who need official support and certification, whether for the base operating system or for some commercial package like Oracle or SAP.
  • the people who merely want to put a box in a corner and forget about it for three to five years.

Red Hat's pricing for RHEL is clearly pitched at the former market, not the latter, quite possibly because Red Hat has determined that they cannot charge enough to make the latter market worth it at its current size. (Note that Red Hat started out with much lower pricing for RHEL in prior versions; I assume that they changed it for good reason.)

CentOS is not even in the running for the former market; it lacks the all-important 'certified for <whatever>' stamp from the vendors of packages. Given what CentOS is, the packages will almost certainly work anyways, but that's irrelevant for the people who need the official cover of the certification.

(I am relatively convinced that it would be pretty difficult for CentOS to get such certifications. If you are a would-be certifying company, where is your official cover that the CentOS project won't go off the rails or just disappear?)

So my conclusion is that there isn't actual competition between CentOS and RHEL. If you are in RHEL's target market, CentOS won't do, and if you are not, Red Hat is probably happy to see you using something that is basically RHEL.

(This thought probably isn't novel, and I should note that it was sparked by exposure to the first paragraph of this recent article in one of my syndication feeds.)

RHELvsCentOS written at 23:08:49

