Wandering Thoughts

2021-10-24

Your SMART drive database of attribute meanings needs regular updates

A lot of people use the smartmontools tools to monitor their drives, either automatically through smartd (which is often set up by default) or directly through 'smartctl'. If you've ever used 'smartctl -A' to get the SMART attributes of your drives, you know that these attributes have names that often give you important information on how to interpret their values. Sadly, these names (and measurement units) mostly aren't standardized; instead, drive vendors come up with whatever they want, including varying them from model to model. Even when they use the same SMART attributes for the same purpose as other drives, they may change the units.

Smartmontools deals with this by having a text file that contains a bunch of mappings for what various SMART attribute IDs mean on various different drives. On many systems, you can see your current version of this database in /usr/share/smartmontools/drivedb.h or /var/lib/smartmontools/drivedb/drivedb.h. Unsurprisingly, this drive database is updated on a regular basis to add and correct entries.

A SMART database that's missing necessary entries for your disks can have two effects. The obvious effect is that some SMART attributes get reported as just 'Unknown Attribute'; this at least lets you know that you don't really know what's going on. More dangerously, some attributes might be mis-labeled, so that for example smartctl thinks that the attribute is 'Total LBAs Written' when it's actually 'Host Writes GiB' (which explains why the values we see are so low on that disk model).

If you're using a Linux distribution that makes frequent releases, your smartmontools probably has a pretty up to date version of the drive database. If you're using a less frequently updated distribution, for example Ubuntu LTS, your SMART drive database is going to be out of date almost as soon as the distribution is released. This generally means that you want to update your drive database on a regular basis, unless you only ever use drives that are sufficiently old that they're in a SMART database from, say, 2018 (spoiler, they're not).

Smartmontools ships with an update-smart-drivedb script that can be used to do this. By default, it updates the location that was built into the distribution package, which you may not want to do. Fortunately you can tell it to fetch to some other location, which you can then tell smartctl to use with 'smartctl -B ....'. You can also just read the script to determine the URLs and then fetch one of them yourself. I've generally used the smartmontools Github repo since I find the web interface much easier than SVN. You want drivedb.h in raw format, possibly from a branch instead of trunk.

(We've used the trunk version of drivedb.h without problems so far. Well, without obvious and visible problems at least.)
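As a concrete sketch of the update-smart-drivedb route (the destination path here is just something I picked, and you should check your version's manual page for the exact invocation it supports):

# fetch the latest drive database to a location of our choice
update-smart-drivedb /var/local/drivedb.h

# then use it explicitly
smartctl -B /var/local/drivedb.h -A /dev/sda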

SMARTUpdateDriveDatabase written at 00:22:43

2021-10-21

The easy way to see underneath NFS mount points on Linux

One of the little and generally obscure problems of running NFS clients is that every so often you could wind up in a situation where you wanted to peek at what was on the underlying filesystem beneath an NFS mount, and you wanted to do this without the disruption of actually unmounting the NFS filesystem to expose what was below it. Sometimes this was because you thought a program might have written things to the directory hierarchy before the NFS mount had been made (and those writes were now invisible). Sometimes this was because you wanted to put copies of some vital things from the NFS mount into the underlying filesystem so they'd still be available if the NFS mount was delayed for some reason.

Traditionally people used various reasonably clever but somewhat awkward approaches to do this. Today I realized that for some time, Linux has provided a straightforward option for this in the form of bind mounts. Bind mounts make (part of) one filesystem appear at another, additional mount point and so can be used as a stronger form of symlinks (which is what I use them for). However, bind mounts also have the property that they don't 'see' things mounted on top of the original filesystem; instead they see only the filesystem that you're bind mounting.

Suppose that you put all your NFS mounts in subdirectories and have a /cs/site NFS mount that you want to be able to access underneath. You can bind mount /cs (which is most likely on the root filesystem) to, say, /var/local/cs, and then /var/local/cs/site is the part of the root filesystem that's underneath the /cs/site NFS mount. You can read and write it exactly as you would any other part of the root filesystem (and so can copy things from /cs/site to it). If the /cs/site NFS mount isn't there for some reason, your local versions can be used instead.

(You can't automatically use the local versions if the NFS mount is there but the NFS server isn't responding, only if you haven't been able to make the NFS mount. There may be complex Linux filesystem arrangements that would let that work, but I haven't looked into them.)
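To make this concrete, here is a minimal sketch using the hypothetical paths from above (the file name being copied is made up):

mkdir -p /var/local/cs
mount --bind /cs /var/local/cs
# a plain (non-recursive) bind mount doesn't bring along the NFS mounts
# on top of /cs, so this shows the root filesystem's directory:
ls /var/local/cs/site
cp /cs/site/some-vital-file /var/local/cs/site/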

Whether it is a good idea to have local copies of critical bits of, say, your central administrative filesystem, is another question (even if you arrange to keep them up to date). On the one hand, maybe it's crucial to be able to do some things from that filesystem during boot or if the NFS mount isn't available yet. On the other hand, it adds a potential point of confusion to diagnosing problems. Rather than files that are either there or not there, now you have a filesystem that magically looks different (or potentially different) depending on when you look at it.

(I've seen extra copies of files and filesystems cause problems before.)

NFSSeeingBeneathMounts written at 23:30:32

2021-10-16

My wish for a way to activate one systemd unit when another one started

I recently wrote (again) about bind mounts under systemd for ZFS filesystems, covering how you don't want to put them in fstab and how you might not want them to say they're required by local-fs.target. ZFS filesystems are special here because they're not represented in /etc/fstab or in explicit .mount units; instead they unpredictably appear out of the blue when some other program is run. ZFS mounts aren't the only thing that can appear this way; our local NFS mounts are managed through a program, and for that matter removable media is not always there. These other cases can expose similar issues (especially NFS mounts). All of them could be solved through a feature that, as far as I know, systemd doesn't have, which is activating one unit when another unit asynchronously appears and becomes active.

(With such a feature, I would set bind mounts to activate when the underlying filesystem had appeared and been successfully mounted.)

Effectively what I want is something like a hypothetical 'TriggeredBy=' property. If any unit in the list appeared and became active, the unit with a TriggeredBy would automatically try to activate. Generally you'd also want an After= on the TriggeredBy units. Listed units would not have to exist at the beginning of systemd's operation, and TriggeredBy couldn't be implemented by creating /etc/systemd/system/<what>.d/ directories with drop-in files, because that doesn't work for all units (eg, the last time I looked, automatically created .mount units).
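To make the wish concrete, here is a sketch of how a ZFS bind mount might use such a property (to be clear, TriggeredBy= does not exist in systemd, and the unit names here are made up):

[Unit]
# Hypothetical: activate this bind mount when the ZFS filesystem's
# .mount unit appears and becomes active.
TriggeredBy=cs-mail.mount
After=cs-mail.mount

[Mount]
What=/cs/mail
Where=/var/mail
Type=none
Options=bind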

As far as I can see you can't implement this with any existing set of properties in systemd.unit(5), although maybe I'm missing something. It's possible that PartOf= and After= will do this, but if so it's not entirely clear to me, and on systemd v237 this doesn't appear to work if you declare a 'PartOf=' to a ZFS filesystem's .mount unit (which will only appear to systemd when the pool is imported). Even if PartOf worked, it could be too strong for some uses (although it's probably right for a bind mount to automatically unmount if you unmount the underlying filesystem).

I understand why systemd doesn't do this. Unlike Upstart, systemd is not necessarily fundamentally built around events, so having events trigger units is not as natural a thing to it. Systemd has some limited event triggering in the form of its various activation schemes (socket, path, automounting, and timers), but all of this is specific and seems relatively hard coded. Asynchronous event handling is the domain of udev.

SystemdAlasNoTriggering written at 23:27:07

2021-10-15

You may not want to require all of your bind mounts (in systemd)

Today, as part of my recent testing of bind mounts of ZFS filesystems, I discovered a new way to shoot my foot off with systemd. After yesterday's discoveries about fstab entries for bind mounts, I wanted to be sure that systemd wouldn't do anything peculiar with a .mount unit for a bind mount if the underlying filesystem was unavailable. So I arranged for the ZFS filesystem in question to not be mountable and rebooted my test system. On boot, it promptly stopped in single user rescue mode (asking for the root password as usual). It took me a little bit of time to work out what I'd done to myself.

My normal setup for my ZFS bind mounts includes the following standard bit at the end:

[Install]
RequiredBy=local-fs.target

As the documentation for Requires will sort of tell you, this is a pretty strong statement. A RequiredBy here is equivalent to local-fs.target Requiring this bind mount, which means:

[...] If one of the other units fails to activate, and an ordering dependency After= on the failing unit is set, this unit will not be started. [...]

Mounts of all sorts generally have a Before of local-fs.target, which is equivalent to local-fs.target having an After of them, which means that this takes effect. If you say your mount is RequiredBy local-fs.target and it fails to start for some reason, local-fs.target itself is not considered to be started and systemd goes to its rescue environment. A RequiredBy of local-fs.target is implicitly declaring that this mount is essential for the system to boot at all.

In some situations this is what you really want, even for bind mounts. In other situations you're better off having your system boot up far enough to appear on the network so that you can log in remotely to try to fix things. In those situations what you want here is WantedBy, not that reflexive RequiredBy. I have various bind mounts on my office desktop; especially in these times, I'm probably better off making those 'WantedBy'.

(This may lead you to setting Requires= and After= on some service units that would otherwise malfunction without the bind mount. Or you can set appropriate RequiredBy and Before lists on your mount units, which may be easier to manage.)
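To put the difference in concrete terms, here's a sketch (the unit and service names are just examples):

# in the bind mount's .mount unit, instead of RequiredBy:
[Install]
WantedBy=local-fs.target

# and then in a drop-in for a service that really needs the bind mount:
[Unit]
Requires=var-mail.mount
After=var-mail.mount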

PS: As covered in systemd.mount(5), you can get this for /etc/fstab lines with the 'nofail' mount pseudo-option.
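In /etc/fstab form that's just an extra option on the line (with hypothetical paths):

/cs/mail  /var/mail  none  bind,nofail  0  0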

SystemdNotRequireBindMounts written at 23:08:07

2021-10-14

Systemd (v237) can do quite odd things with /etc/fstab bind mounts

Suppose, not entirely hypothetically, that you have an Ubuntu 18.04 system (which has systemd v237) and you want to add a bind mount for a ZFS filesystem; for example, you want to bind mount /cs/mail on to /var/mail. You're managing ZFS filesystems themselves through ZFS's usual boot time mechanisms instead of /etc/fstab, but putting the bind mount itself in /etc/fstab is the simplest and shortest way. So you put the following in /etc/fstab:

/cs/mail /var/mail none bind

If you boot the system and do a 'df /var/mail', everything will look good. But this is probably an illusion that will be exposed if you do 'grep /var/mail /proc/mounts', because you most likely really have two bind mounts, one on top of the other:

/dev/sda1 /var/mail ext4 rw,relatime,errors=remount-ro,data=ordered 0 0
tank/cs/mail /var/mail zfs rw,nosuid,nodev,noatime,xattr,noacl 0 0

The ultimate cause of this is the same as it was in 2014, when I wrote an entry on bind mounts with systemd and non-fstab filesystems. Since there is no /etc/fstab line or .mount unit for /cs/mail, systemd doesn't know that it's a separate filesystem until ZFS's commands run and the filesystem magically appears. So apparently systemd does the bind mount twice, once when just the root filesystem is mounted (which looks like everything needed) and then a second time when a /cs/mail filesystem appears and it knows more. If another program (such as your IMAP server) looks at /var/mail before the second mount appears, it will see (and possibly take a reference to) the bad, empty version on the root filesystem.

The automatically created systemd mount unit for the /var/mail bind mount will include a 'RequiresMountsFor=/var /cs/mail', but this doesn't mean that systemd will require a /cs/mail mount. When systemd starts, as far as it knows the path /cs/mail only requires the root filesystem.

Suppose that you then decide to get clever and use brute force. If systemd is bind mounting the root filesystem's empty /cs/mail directory, let's give it a mount source that doesn't exist on the root filesystem by changing fstab to be:

/cs/mail/mail /var/mail none bind

Your ZFS filesystem remains /cs/mail; you will just put everything in a directory in it, /cs/mail/mail. So you make this change, reboot your system, and now you find that /cs/mail is not mounted as a ZFS filesystem, and you still have a bind mount from the root filesystem. As far as I can tell, systemd v237 will automatically mkdir the source of a bind mount if it doesn't exist. So systemd created a /cs/mail/mail on the root filesystem and bind mounted it to /var/mail, and then ZFS found that its /cs/mail mount point wasn't empty and refused to mount the ZFS filesystem. My clever idea made me worse off than before.

(This behavior doesn't seem to be documented in the current systemd.mount(5) manual page, but systemd v237 does date from 2018. On the other hand, it's not documented in the v237 version of the manual page either.)

These days, systemd allows you to express some but not all systemd unit conditions in /etc/fstab (see systemd.mount(5)). However, I still think that a real .mount unit file is your most reliable bet. This also lets you use 'ConditionPathExists' to check for the presence of some known file or subdirectory in your original filesystem in a way that hopefully won't cause systemd to create it on the root filesystem if it's missing.

(In a modern systemd you can also use ConditionPathIsMountPoint, but that's not in v237.)
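A sketch of what such a .mount unit might look like for the /cs/mail example (the sentinel file name is something I made up; you would create it in the real ZFS filesystem):

# /etc/systemd/system/var-mail.mount
[Unit]
# Only do the bind mount if the real ZFS filesystem is present on /cs/mail;
# .this-is-the-zfs-filesystem is a hypothetical marker file inside it.
ConditionPathExists=/cs/mail/.this-is-the-zfs-filesystem

[Mount]
What=/cs/mail
Where=/var/mail
Type=none
Options=bind

[Install]
WantedBy=local-fs.target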

PS: I haven't tested this on anything more recent than the Ubuntu 18.04 systemd, since Ubuntu 18.04 is what we're going to be using for the particular server where this is an issue (because it's what all our other ZFS fileservers use). That is also part of why this is not currently any sort of bug report to anyone. For that matter, I don't know if this is a bug, other than perhaps a documentation bug.

SystemdFstabBindMountOddities written at 23:38:41

2021-10-11

Unknown NMIs and counting hardware CPU events in eBPF programs

I mentioned in a recent entry that my office workstation had started producing alarming kernel messages about non-maskable interrupts (NMIs) happening for an unknown reason:

Uhhuh. NMI received for unknown reason 31 on CPU 10.
Do you have a strange power saving mode enabled?
Dazed and confused, but trying to continue

I've now been able to identify what triggers these NMI messages. On my office machine they can reliably be produced by running the Cloudflare eBPF Prometheus exporter with the ipcstat example exporter, which uses perf events to count CPU instructions and CPU cycles, processes them through an eBPF program, and lets you query the result as Prometheus metrics. They don't happen all of the time (only every so often) and they don't seem to be particularly correlated with anything (they don't happen every time I scrape metrics from the Cloudflare eBPF exporter, for example). They may require actually obtaining metrics from the Cloudflare exporter so that it gets them from the kernel eBPF program; I'm not sure yet.

(This isn't triggered just by the Cloudflare eBPF exporter in general, because I've been running it for a long time to get disk IO latency histograms. Taking the ipc eBPF program out of my eBPF exporter configuration stops the messages; running a separate eBPF exporter instance with just that program causes them to start again.)

My office machine is running Fedora 34 with Fedora's 64-bit '5.14.9-200.fc34' kernel, on a machine with an AMD Ryzen 7 1800X. My home machine is running the same Fedora kernel and the same Cloudflare eBPF exporter (with the same eBPF programs), but has an Intel i7-8700K CPU and doesn't get these unknown reason NMIs. Nor have I been able to produce these NMIs so far by running 'perf stat -a' on my office machine. My leading theory is that there's some combination of obtaining CPU performance counters, in an eBPF program, and possibly pulling data from it on a regular basis from user level that is triggering this on (some) Ryzen CPUs.

(I've experimented with a bpftrace command line that I think is doing much the same as the eBPF exporter's program, but haven't seen anything yet. The problem can go hours without triggering, though.)
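For illustration, a bpftrace command line along these lines counts the same hardware events through eBPF; this is a rough sketch rather than exactly what the exporter's program does, and the sampling period of one million events is an arbitrary choice:

bpftrace -e '
  hardware:instructions:1000000 { @instructions[cpu] = count(); }
  hardware:cycles:1000000       { @cycles[cpu] = count(); }
  interval:s:60 { print(@instructions); print(@cycles);
                  clear(@instructions); clear(@cycles); }'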

BPF programs apparently do run from NMIs for handling perf events such as counting CPU cycles (source), so this seems not completely implausible. I don't know if perf events normally trigger NMIs or if there's a different mechanism.

The large scale moral I take from this is that eBPF programs aren't necessarily as non-invasive as they're often presented to be. In a perfect world this obviously wouldn't happen, but in this world we deal with the hardware and kernel bugs that we have, like it or not. I'll have to take care with any future eBPF usage and pay attention to potential correlations with, for example, new kernel messages.

(For my own future reference when doing Internet searches, most sources seem to just talk about 'BPF' instead of 'eBPF'.)

PS: I don't have test results for kernels before this one because I only recently started running this eBPF program on my office workstation. On my home desktop I've been running it for some time without problems in previous kernel versions.

NMIFromEBPFCPUPerfCounters written at 23:17:51

2021-10-08

What Linux kernel "unknown reason" NMI messages mean

Today, my office workstation logged a kernel message (well, a set of them) that I've seen versions of before, and perhaps you have too:

Uhhuh. NMI received for unknown reason 31 on CPU 13.
Do you have a strange power saving mode enabled?
Dazed and confused, but trying to continue

While I (still) don't know what caused this and what to do about it (other than reboot the machine in the hopes that it stops happening), this time I looked into the kernel source to at least figure out what the 'reason 31' means and what is generally going on here. I will put the summary up front: the specific reason number is probably meaningless and at least somewhat random. I don't think it tells you anything about the potential causes.

The 'NMI' here is short for Non-maskable interrupt; the OSDev wiki has an x86-focused page on them. In the Linux kernel, NMIs can be generated for various reasons, some of which are specific to a single CPU and some of which are general and may be handled by any CPU. When a kernel driver enables something that may generate NMIs (of either type), it registers an NMI handler for it. Typical sources of and handlers for non CPU specific NMIs include watchdog timers and the kernel debugger. NMI handlers are called on every NMI and each is expected to check its NMI source and tell the kernel if the NMI came from it (well, more or less). If no handler speaks up to say it handled the NMI and certain other conditions are true, the kernel will generate this particular 'unknown reason' message.

(Actually, the 'local' NMI handlers are called first. If any of them say they handled an NMI, the kernel assumes the entire NMI was for a per-CPU reason and stops there.)

On normal x86 hardware, the reason number in the message comes from reading a specific x86 I/O port, what the OSDev wiki calls 'System Control Port B (0x61)'. This port is actually 8 separate status bits together, and the Linux kernel's reason is reported in hex, not decimal, so the reason here should be decoded from hex to binary, where we will find out that it's 0b110001, with bits 6, 5, and 1 set.
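If you don't want to do the hex to binary conversion in your head, a quick one-liner does it:

python3 -c 'print(bin(0x31))'
0b110001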

When the Linux kernel handles a non CPU specific NMI in default_do_nmi(), it starts out by seeing if either or both of bit 8, NMI_REASON_SERR, or bit 7, NMI_REASON_IOCHK, are set. If bit 8 is set and no SERR handler takes the NMI, the kernel will report:

NMI: PCI system error (SERR) for reason ... on CPU ...

If bit 8 is not set and bit 7 is set (and no IOCHK handler takes the NMI), the kernel will report:

NMI: IOCK error (debug interrupt?) for reason ... on CPU ...

(The bit is called IOCHK but the message really does say 'IOCK' instead.)

If either bit is set, the "unknown reason" kernel message is skipped for this NMI; it's considered handled by the PCI or IOCK handler. So as far as I can tell, the largest "unknown reason" number you'll ever see is 3f (remember, this is hex), because anything larger than that sets at least one of the high two bits and will take the SERR or IOCK path.

(All of this is in nmi.c.)

In theory the OSDev wiki page has a nice table of what the low five bits in System Control Port B tell you about your unknown NMI. In practice the information seems relatively inscrutable and meaningless. For instance, in the original IBM PC designs, bit 5 toggled back and forth on every DRAM refresh, bit 6 was system timer 2's output pin state, and bits 3 and 4 seemed to reflect whether or not you had enabled parity checks (bit 8) and channel checks (bit 7). What these mean on modern x86 hardware is anyone's guess; they may mean very little. Linux only cares about bits 8 and 7.

Based on all of this, I think that the 'unknown reason' likely says nothing about what caused the NMI to be generated or about what the (interesting) state of the hardware is. An 'unknown reason' NMI came from some source that was not recognized by any handler, which means that either there is no handler registered for its source (for example hardware is generating unexpected NMIs) or the handler didn't recognize that its hardware caused the NMI. Based on the kernel message about power savings mode, these seem to have at one point been a fruitful source of surprise NMIs.

(That kernel message seems to go back quite a way, although it's hard to trace it because code has moved around a lot between files. I think there's a way to do this in git, but I lack the energy to work it out right now.)

NMIUnknownReasonMeaning written at 00:02:51

2021-10-03

Desktops don't always use NetworkManager's programs

Three things are definitely true on modern Linux and modern Linux desktops. Pretty much everyone is using NetworkManager, NetworkManager includes GUI frontends, and desktops have GUI interfaces for controlling your active networks and configuring them. In a nice regular universe, the desktop's GUI frontends would be using the NetworkManager GUI frontends like nm-applet and nm-connection-editor, and would thus usually automatically support everything that NetworkManager itself supports (since the NetworkManager developers tend to update their GUI frontends when they add features like support for WireGuard or for "metered" connections).

Unfortunately this isn't a nice regular universe, so several major desktops do not use the NetworkManager programs for their GUI and as a result can be missing support for NetworkManager features. GNOME and Cinnamon definitely use their own code for both controlling active network connections (what nm-applet is used for) and for configuring network connections (what nm-connection-editor is used for). Cinnamon's version doesn't support WireGuard VPNs or setting connections to "metered" status, as I've found out over time; I don't know about the state of GNOME's. I believe that KDE has its own applet; I don't know if it uses the NetworkManager connection editor.

(The GNOME and Cinnamon desktop shells both implement applets as Javascript code that runs in the context of the desktop shell, instead of as separate programs. However, Cinnamon can use nm-applet instead of its own thing if you do the right magic things. I don't know about GNOME.)

Generally you can run nm-connection-editor directly if you want to (and remember what it's called); it appears to work when run directly even in a GNOME Wayland session. Some desktop environments (such as Cinnamon) may offer you a confusingly named additional "applet" menu option that runs nm-connection-editor instead of the desktop's own connection editor (on my Cinnamon desktop it's called "Network Connections", instead of the "Network Settings" that invokes Cinnamon's own, limited version). Unfortunately as far as I know there may be no way to run nm-applet instead of your desktop's less up to date version, and even if you can you may lose other features in the overall desktop environment.

You might reasonably ask why this matters. One of the reasons it's mattered for me in the past is that it can be rather confusing to read some online documentation and then not find what it's talking about in my Cinnamon desktop environment, because the online writing is talking about the official NetworkManager way (and may be written by someone on a desktop environment that does use the NM programs). It also complicates having full support for things like metered connections and WireGuard links, because it's not enough for them to be present in the official NetworkManager programs; they also have to make their way into all of the desktop reimplementations.

PS: The situation with nm-applet can be especially confusing in Cinnamon (at least); my desktop session actually has a nm-applet process running, despite not using it. A test GNOME session doesn't behave this way. Since my Cinnamon desktop environment has been around for a while (it looks like since 2013, since even my laptop environment has been long-lived), starting nm-applet may be inherited from old days when Cinnamon actually used it.

NMProgramsNotAlwaysUsed written at 00:26:38

2021-09-28

Avoiding flawed server ACPI information for power monitoring

Today I noticed that one of our servers was regularly logging a burst of kernel messages about problems with ACPI data. These messages looked like:

ACPI Error: No handler for Region [POWR] (000000000c9d7b92) [IPMI] (20190816/evregion-129)
ACPI Error: Region IPMI (ID=7) has no handler (20190816/exfldio-261)
No Local Variables are initialized for Method [_PMM]
No Arguments are initialized for method [_PMM]
ACPI Error: Aborting method \_SB.PMI0._PMM due to previous error (AE_NOT_EXIST) (20190816/psparse-529)
ACPI Error: AE_NOT_EXIST, Evaluating _PMM (20190816/power_meter-325)

This surprised me, because in this day and age I would expect servers like this (a current model from a name brand vendor) to not have ACPI problems, especially with Linux. But here we are. This particular set of ACPI error reports is happening because the Prometheus host agent was trying to read power usage information from /sys/class/hwmon that was theoretically available through ACPI.

In modern kernels, the acpi_power_meter kernel module is what extracts this information from the depths of ACPI (or tries to); it is, to quote it, "a hwmon driver for ACPI 4.0 power meters". As with all information from ACPI stuff, the driver does this by asking the kernel's general ACPI subsystem to perform ACPI magic, and it's this that is failing because Linux feels the BIOS's ACPI data has problems. Unfortunately there's no good way to fix bad ACPI data like this; all we can do is stop looking at it. In this case, the best way to do that is to unload the acpi_power_meter module and blacklist it so that it won't be reloaded on reboot. One set of directions for this is in this Prometheus host agent issue.

(Since the module seems to not be able to do anything due to the bad ACPI information, I don't feel too bad about blocking it entirely.)
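The mechanics of this are small (the blacklist file name is arbitrary):

# unload the module now
modprobe -r acpi_power_meter
# and keep it from being loaded again on reboot
echo 'blacklist acpi_power_meter' >/etc/modprobe.d/acpi_power_meter-blacklist.conf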

As a side note, this is another case of a set of kernel error messages that should be rate-limited but aren't. The BIOS's ACPI data is rather unlikely to change while the kernel is running, so this error is essentially permanent until reboot. Reporting it every time something peers at the /sys hwmon files is not particularly useful and is a great way to have your kernel messages spammed, driving out more important things.

ACPIFlawedPowerMonitoring written at 23:55:05

2021-09-24

Understanding EGL, GLX and friends in the Linux and X graphics stack

Earlier this month I wrote a blog entry about Firefox having jank with WebRender turned on, in which I mentioned that the problem had appeared when I stopped using an environment variable $MOZ_X11_EGL that forced Firefox to use "EGL". The blog post led to someone filing a quite productive Firefox bug, where I learned in passing that Firefox is switching to EGL by default in the future. This made me realize that I didn't actually know what EGL was, where it fit into the Linux and X graphics stack, and if it was old or new. So here is a somewhat simplified explanation of what I learned.

In the beginning of our story is OpenGL, which in the 1990s became the dominant API (and pretty much the only one) for 3D graphics on Unix systems, as well as spreading to other platforms. However, OpenGL is more or less just about drawing things on a "framebuffer". Generally people on Unix and X don't want to just draw 3D things over top of the entire screen; they want to draw 3D things in an X window (or several) and then have those mix seamlessly with other 3D things being done by other programs in other windows. So you need to somehow connect the OpenGL world and the X world so that you can have OpenGL draw in a way that will be properly displayed in a specific X window and so on.

(This involves the action of many parties, especially once hardware acceleration gets involved and you have partially obscured windows with OpenGL rendering happening in them.)

The first version of an interconnection layer was GLX. As you can see from its features, GLX is a very X way of approaching the problem, since its default is to send all your OpenGL operations to the X server so that the X server can do the actual OpenGL things. The result inherits the X protocol's advantages of (theoretical) network transparency, at the cost of various issues. The 'glx' in programs like 'glxinfo' (used to find out whether your X environment has decent OpenGL capabilities) and 'glxgears' (everyone's favorite basic OpenGL on X test program) comes from, well, GLX. As suggested by the 'X' in its name, GLX is specific to the X Window System.

(Other platforms had similar interface layers such as WGL and CGL.)

Eventually, various issues led to a second version of an interconnection layer. This time around the design was intended to be cross platform (instead of being tied to X) and it was done through the Khronos Group, the OpenGL standards organization. The result is EGL, and you can read (some of?) its manpages here. EGL will let you use more than just OpenGL, such as the simpler subset OpenGL ES, and I believe its API is platform and window system independent (although any particular implementation is likely to be specific to some window system). EGL apparently fixes various inefficiencies and design mistakes in GLX and so offers better performance, at least in theory. Also, pretty much everyone working on the Unix graphics stack likes EGL much more than GLX.

On Unix, EGL is implemented in Mesa, works with X, and has been present for a long time (back to 2012); current documentation is here. Wayland requires and uses EGL, which is unsurprising since GLX is specific to X (eg). I suspect that EGL on X is not in any way network transparent, but I don't know and haven't tested much (I did try some EGL programs from the Mesa demos and they mostly failed, although eglinfo printed stuff).

On X, programs can use either the older GLX or the newer EGL in order to use OpenGL; if they want to use OpenGL ES, I believe they have to use EGL. Which one of GLX and EGL works better, has fewer bugs, and performs better has varied over time and may depend on your hardware. Generally the view of people working on Unix graphics is that everyone should move to EGL (cf), but in practice, well, Firefox has had a bug about it for nine years now and in my searches I've seen people say that EGL used to perform much worse than GLX in some environments (eg, from 2018).

While I'm here, Vulkan is the next generation replacement for OpenGL and OpenGL ES, at least for things that want high performance, developed by the Khronos Group. As you'd expect for something developed by the same people who created EGL, it was designed with an interconnection layer, Window System Integration (WSI) (also [pdf]). I believe that a Vulkan WSI is already available for X, as well as for Wayland. Vulkan (and its WSI) is potentially relevant for the future partly because of Zink, a Mesa project to implement OpenGL on top of Vulkan. If people like Intel, AMD, and maybe someday NVIDIA increasingly provide Vulkan support (open source or otherwise) that's better than their OpenGL support, Zink and Vulkan may become an important thing in the overall stack. I don't know how an application using EGL and OpenGL on top of a Zink backend would interact with a Vulkan WSI, but I assume that Zink plumbs it all through somehow.

On Ubuntu, programs like eglinfo are available in the mesa-utils-extra package. On Fedora, the egl-utils package gives you eglinfo and es2_info, but for everything else you'll need the mesa-demos package.
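Once the packages are installed, a quick look at what your X session offers is something like this (the grep pattern is just what I find convenient):

glxinfo | grep -i 'opengl renderer'
eglinfo | head -20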

PS: For one high level view of the difference between OpenGL and OpenGL ES, see here.

EGLAndGLXAndOpenGL written at 22:21:21
