Understanding plain Linux NVMe device names (in /dev and kernel messages)
On Linux, plain disk names for most modern disk devices are in the form of
/dev/sda for the whole disk and
/dev/sda3 for the third
partition (regardless of whether the disk is partitioned through modern
GPT or old MBR). When I got NVMe SSDs for my office workstation, one of my many discoveries about them is
that Linux gives them different and more oddly formed names. Since I
had many other NVMe related issues on my mind at the time, I didn't look
into the odd names; I just accepted them and moved on. But now I want
to actually understand how Linux's NVMe device names are formed and
what they mean, and it turns out to be relatively simple.
Let's start with the actual names. On Linux, NVMe devices have three
levels of names. On my office workstation, there is
/dev/nvme0 for the first NVMe device, then
/dev/nvme0n1, and then a series of
/dev/nvme0n1p<X> devices for each partition. Unusually, nvme0
is a character device, not a block device. Kernel messages will talk
about both 'nvme0' and 'nvme0n1':

nvme nvme0: pci function 0000:01:00.0
nvme nvme0: 15/0/0 default/read/poll queues
nvme0n1: p1 p2 p3 p4 p5
(I don't know yet what names will appear in kernel messages about IO errors.)
If I want to partition the disk, install GRUB bootblocks, or the like,
I want to use the 'nvme0n1' name. Querying certain sorts of NVMe
information is done using 'nvme0'. I can apparently use either name
for querying SMART information.
Numbering NVMe SSDs instead of giving them letters and naming
partitions with '
p<X>' instead of plain numbers are both sensible
changes from the somewhat arcane
sd... naming scheme. The unusual
thing is the '
n1' in the middle. This is present because of a
NVMe feature called "namespaces",
which allows you (or someone) to divide up a NVMe SSD into multiple
separate ranges of logical block addresses that are isolated from
each other. Namespaces are numbered starting from one, and I think
that most NVMe drives have only one, hence '
nvme0n1' as the base
name for my first NVMe SSD's disk devices.
(This is also likely part of why '
nvme0' is a character device instead
of a block device. Although I haven't checked the NVMe specification,
I suspect that you can't read or write blocks from a NVMe SSD without
specifying the namespace.)
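The three levels mean that a full device name can be decomposed mechanically. As a small illustration (the regex and function here are my own sketch, not taken from any kernel or nvme-cli code):

```python
import re

# Decompose a Linux NVMe block device name into its three levels:
# controller number, namespace number, and optional partition number.
# This regex is my own sketch, not from any official tool.
NVME_NAME = re.compile(r"^nvme(\d+)n(\d+)(?:p(\d+))?$")

def parse_nvme_name(name):
    m = NVME_NAME.match(name)
    if m is None:
        raise ValueError("not an NVMe block device name: " + name)
    ctrl, ns, part = m.groups()
    return {
        "controller": int(ctrl),   # the 'nvme0' character device level
        "namespace": int(ns),      # namespaces are numbered from one
        "partition": int(part) if part else None,
    }
```

So 'nvme0n1p3' is partition 3 of namespace 1 on controller 0, and plain 'nvme0n1' has no partition component.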
The Arch wiki page on NVMe drives has a nice
overview of all sorts of things you can find out about your NVMe drives
with the 'nvme' command. Based on the Arch
nvme manpage, it has a lot of sub-commands.
For my expected uses, I suspect that I will never change or manipulate NVMe namespaces on my NVMe drives. I'll just leave them in their default state, as shipped by the company making them. Probably all consumer NVMe SSDs will come with only a single namespace by default, so that people can use the entire drive's official capacity as a single filesystem or partition without having to do strange things.
I should probably learn command-line NetworkManager usage
I'm generally not a fan of NetworkManager on the machines I deal with, but I do wind up dealing with it at the command line level every so often, most recently for setting up a WireGuard client on my work laptop. There was a time when it felt that NetworkManager was the inevitable future of networking on Linux even on servers, but fortunately systemd-networkd has mostly made that go away. Still, systemd-networkd has limitations and isn't as comprehensive as NetworkManager; NetworkManager is the face of networking on a lot of Linux configurations, and someday I may be forced to deal with NetworkManager on a regular basis.
(Fedora keeps threatening to remove the ifup scripts
that drive my DSL PPPoE link, and systemd-networkd doesn't currently
have support for PPPoE.)
All of this leaves me feeling that not really knowing even the basics of NetworkManager general concepts and command line usage is a gap in my practical Linux knowledge that matters, and that I should fix. Well, to put it bluntly, it feels like I'm burying my head in the sand. Even if I never really use it, learning the basics of NetworkManager command line usage would give me an informed opinion, instead of my current mostly uninformed one.
The low impact approach to learning NetworkManager command line usage would be to explore it on my work laptop, which already uses NetworkManager. I normally use the Cinnamon Network Manager GUI (which is not nm-applet, it turns out), but I could switch to doing my network manipulation through the command line, and also read and try to understand all of the configured connection parameters.
The high impact approach would be to try to set up a version of my
home desktop's DSL PPPoE connection in
NetworkManager. Many years ago I configured a version of my DSL
connection on my laptop, so in theory I could
cross-check my NetworkManager flailing against that version (although
I should first make sure it still works). As a side benefit, this would
leave me prepared for when Fedora carries through its threat to remove
ifup and my current DSL PPPoE setup immediately stops working.
(I've written this partly in the hopes of motivating myself into doing some NetworkManager learning, even if I don't manage much.)
It's nice when programs switch to being launched from systemd user units
I recently upgraded my home machine from Fedora 33 to Fedora 34. One of the changes in Fedora 34 is that the audio system switched from PulseAudio to PipeWire (the Fedora change proposal, an article on the switch). Part of this switch is that you need to run different daemons in your user session. For normal people, this is transparently handled by whichever standard desktop environment they're using. Unfortunately I use a completely custom desktop, so I have to sort this out myself (this is one way Fedora upgrades are complicated for me). Except this time I didn't need to do anything; PipeWire just worked after the switch.
One significant reason for this is that PipeWire arranges to be
started in your user session not through old mechanisms like
/etc/xdg/autostart but through a systemd
user unit (actually
two, one for the daemon and one for the socket). Systemd user units
are independent of your desktop and get started automatically, which
means that they just work even in non-standard desktop environments
(well, so far).
(As covered in the Arch Wiki, there are some things you need to do in an X session.)
One of the things that's quietly making my life easier in my custom desktop environment is that more things are switching to being started through systemd user units instead of the various other methods. It's probably a bit more work for some of the programs involved (since they can't assume direct access to your display any more and so on), but it's handy for me, so I'm glad that they're investing in the change.
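For illustration, a minimal systemd user unit looks like the following. The name and path here are hypothetical, not PipeWire's actual units (which ship with the distribution and are more involved, including a matching socket unit):

```ini
# ~/.config/systemd/user/mydaemon.service -- a hypothetical example.
[Unit]
Description=Example per-session daemon

[Service]
ExecStart=/usr/bin/mydaemon
Restart=on-failure

[Install]
WantedBy=default.target
```

Units like this get enabled with 'systemctl --user enable' and then start with your session regardless of which desktop (if any) you run.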
PS: It turns out that the basic PulseAudio daemon was also being
set up through systemd user units on Fedora 33. But PulseAudio
did want special setup under X, with an
/etc/xdg/autostart file that ran
/usr/bin/start-pulseaudio-x11. It's possible that
PipeWire is less integrated with the X server than PulseAudio is.
See the PulseAudio X11 modules.
PPS: Apparently I now need to find a replacement for running '
amixer -q set Master ...' to control my volume from the keyboard. This apparently still works for some people
but not for me; for now '
pactl' does, and it may be the more or
less official tool for doing this with PipeWire for the moment, even
though it's from PulseAudio.
Setting up a WireGuard client with NetworkManager (using nmcli)
For reasons beyond the scope of this entry, I've been building a VPN server that will support WireGuard (along with OpenVPN and L2TP). A server needs a client, so I spent part of today setting up my work laptop as a WireGuard client in a 'VPN' configuration, under NetworkManager because that's what my laptop uses. I was hoping to do this through the Cinnamon GUIs for NetworkManager, but unfortunately while NetworkManager itself has supported WireGuard for some time, this support hasn't propagated into GUIs such as the GNOME Control Center (cf) or the NetworkManager applet that Cinnamon uses.
I'm already quite familiar with WireGuard in general, so I found
that the easiest way to start was to set up a basic WireGuard
configuration file for the connection in /etc/wireguard/wg0.conf,
including both the main configuration (with the laptop's key and
my local port) and a
[Peer] section for the server. Since I'm
using WireGuard here in a VPN configuration, instead of to reach
just some internal IPs, I set
AllowedIPs to 0.0.0.0/0. After writing
wg0.conf, I then imported it into NetworkManager:

nmcli connection import type wireguard file /etc/wireguard/wg0.conf
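For illustration, the sort of minimal wg0.conf I mean looks like this; the keys, endpoint, and port here are all placeholders:

```ini
[Interface]
PrivateKey = <the laptop's private key>
ListenPort = 51820

[Peer]
PublicKey = <the server's public key>
Endpoint = <the server>:51820
AllowedIPs = 0.0.0.0/0
```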
(For what can go in the configuration file, start with
wg-quick(8). I suspect
that NetworkManager doesn't support some of the more advanced keys.
I stuck to the basics. The import process definitely ignores the
various script settings supported by wg-quick; see
nm_vpn_wireguard_import() in nm-vpn-helpers.c.)
Imported connections are apparently set to auto-connect, which isn't what I wanted, plus there were some other things to adjust (following the guide of Thomas Haller's WireGuard in NetworkManager):
nmcli con modify wg0 \
    autoconnect no \
    ipv4.method manual \
    ipv4.address 172.29.50.10/24 \
    ipv4.dns <...>
At this point you might be tempted to set
ipv4.gateway, and indeed
that's what I did the first time around. It turns out that this is
a mistake, because these days NetworkManager will do the right thing
based on the 'accept everything'
AllowedIPs I set, right down to
setting up policy based routing with a fwmark so that encrypted
traffic to the WireGuard VPN server doesn't try to go over WireGuard.
If you set
ipv4.gateway as well, you wind up with two default
routes and then your encrypted WireGuard traffic may try to go over
your WireGuard connection again, which doesn't work.
(See the description of 'ip4-auto-default-route' in the WireGuard
in NetworkManager entry.)
The full index of available NetworkManager settings in various
sections is currently here; the
ones most useful to me are probably the ipv4 and wireguard ones.
Getting DNS to work correctly requires a little extra step, or at
least did for me. While the
wg0 connection is active, I want all
of my DNS queries to go to our internal resolving DNS server and
also to have a search path of our university subdomain. This
apparently requires explicitly including '
~' in the NetworkManager
DNS search path:
nmcli con modify wg0 \
    ipv4.dns-search "cs.toronto.edu,~"
You (I) can see a lot of settings for the WireGuard setup with
'nmcli connection show wg0', including active ones, but this seems
to omit NetworkManager's view of the WireGuard peers. To see that,
I needed to look directly at the configuration file that NetworkManager
wrote for the connection.
I'm someday going to need to edit this directly to modify the
WireGuard VPN server's endpoint from my test machine to the production
one.
(The NetworkManager RFE for configuring WireGuard peers in nmcli
is issue #358.)
With no GUI support for WireGuard connections, I have to bring this
WireGuard VPN up and down with '
nmcli con up wg0' and 'nmcli con
down wg0'. Once I have the new VPN server in production, I'll be
writing little scripts to do this for me. Hopefully this will be
improved some day, so that the NetworkManager applet allows you to
activate and deactivate WireGuard connections and shows you that
one is active.
If I wanted a limited VPN that only sent traffic to our internal
networks over my WireGuard link, I would configure the server's
AllowedIPs to the list of networks and then I believe that
NetworkManager would automatically set up routes for them. However,
I don't know how to make this work (in NetworkManager) if the
WireGuard VPN server itself was on one of the subnets I wanted to
reach over WireGuard. For my laptop, routing all traffic to work
over WireGuard is no worse than using our OpenVPN or L2TP VPN
servers, which also do the same thing by default.
(On my home desktop, I use hand built fwmark-based policy rules to deal with my WireGuard endpoint being on a subnet I want to normally reach over WireGuard. NetworkManager will build the equivalents for me when I'm routing 0.0.0.0/0 over the WireGuard link, but I believe not in other situations.)
Some ways to get (or not get) information about system memory ranges on Linux
I recently learned about
lsmem, which is
described as "list[ing] the ranges of available memory [...]". The
source I learned it from was curious why
lsmem on a modern 64-bit
machine didn't list all of the low 4 GB as a single block (they
were exploring kernel memory zones, where the
low 4 GB of RAM are still a special 'DMA32' zone). To start with,
I'll show typical
lsmem default output from a machine with 32 GB of RAM:

; lsmem
RANGE                                  SIZE  STATE REMOVABLE  BLOCK
0x0000000000000000-0x00000000dfffffff  3.5G online       yes   0-27
0x0000000100000000-0x000000081fffffff 28.5G online       yes 32-259

Memory block size:       128M
Total online memory:      32G
Total offline memory:      0B
Lsmem is reporting information from /sys/devices/system/memory.
Both the sysfs hierarchy and lsmem itself apparently come originally
from the IBM S390x architecture. Today this sysfs hierarchy
apparently only exists for memory hotplug, and there
are some signs that kernel developers aren't fond of it.
(Update: I'm wrong about where the sysfs memory hierarchy comes from; see this tweet from Dave Hansen.)
On the machines I've looked at, the hole reported by lsmem is
authentic, in that
/sys/devices/system/memory also doesn't have
any nodes for that range (on the machine above, for blocks 28, 29,
30, and 31). The specific gap varies from machine to machine.
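With a 128 MiB block size, translating between lsmem's block numbers and physical addresses is simple arithmetic. A sketch (my own, not from lsmem itself):

```python
BLOCK_SIZE = 128 * 2**20  # 128 MiB, from 'Memory block size' in lsmem

def block_range(block):
    """Return the (start, end) physical addresses a memory block covers."""
    start = block * BLOCK_SIZE
    return (start, start + BLOCK_SIZE - 1)

def addr_to_block(addr):
    """Return the memory block number covering a physical address."""
    return addr // BLOCK_SIZE

# Blocks 0-27 end at 0xdfffffff, the top of the 3.5G 'low' range;
# the missing blocks 28-31 are exactly the 0xe0000000-0xffffffff hole,
# and block 32 starts at 0x100000000 (4 GiB).
```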
However, all of the information from
lsmem may well be a
simplification of a more complex reality.
The kernel also exposes physical memory range information through
/proc/iomem (on modern kernels you'll probably have to read this
as root to get real address ranges). This has a much more complicated
view of actual RAM, one with many more holes than what
/sys/devices/system/memory shows. This is especially the case in
the low 4G of memory, where for example the system above reports a
whole series of chunks of reserved memory, PCI bus address space,
ACPI tables and storage, and more. The high memory range is simpler,
but still not quite the same:
100000000-81f37ffff : System RAM
81f380000-81fffffff : RAM buffer
The information from
/proc/iomem has a lot of information about
PCI(e) windows and other things, so you may want to narrow down
what you look at. On the system above,
/proc/iomem has 107 lines
but only nine of them are for 'System RAM', and all but one of them
are in the physical memory address range that
lsmem lumps into
the 'low' 3.5 GB:
00001000-0009d3ff : System RAM
00100000-09e0ffff : System RAM
0a000000-0a1fffff : System RAM
0a20b000-0affffff : System RAM
0b020000-d17bafff : System RAM
d17da000-da66ffff : System RAM
da7e5000-da8eefff : System RAM
dbac7000-ddffffff : System RAM
(I don't have the energy to work out how much actual RAM this represents.)
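For anyone with the energy, totaling the 'System RAM' ranges is a small parsing exercise. A sketch (remember that on modern kernels /proc/iomem has to be read as root to show real addresses):

```python
def total_system_ram(iomem_text):
    """Sum the sizes of all 'System RAM' ranges in /proc/iomem style text."""
    total = 0
    for line in iomem_text.splitlines():
        rng, _, name = line.partition(" : ")
        if name.strip() != "System RAM":
            continue
        start, _, end = rng.strip().partition("-")
        # Ranges are inclusive on both ends, hence the +1.
        total += int(end, 16) - int(start, 16) + 1
    return total

# Usage (as root): total_system_ram(open("/proc/iomem").read())
```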
Another view of physical memory range information is the kernel's report of the BIOS 'e820' memory map, printed during boot. On the system above, this says that the top of memory is actually 0x81f37ffff:
BIOS-e820: [mem 0x0000000100000000-0x000000081f37ffff] usable
I don't know if the Linux kernel exposes this information anywhere else.
You can also find various other things about physical memory ranges
in the kernel's boot messages, but I don't know enough to analyze them.
What's clear is that in general, a modern x86 machine's physical memory ranges are quite complicated. There are historical bits and pieces, ACPI and other data that is in RAM but must be preserved, PCI(e) windows, and other things.
(I assume that there is low level chipset magic to direct reads and writes for RAM to the appropriate bits of RAM, including remapping parts of the DIMMs around so that they can be more or less fully used.)
Understanding something about udev's normal network device names on Linux
For a long time, systemd's version of udev has attempted to give
network interfaces what the systemd people call predictable or
stable names. The current naming scheme is more or less documented,
with an older version in their Predictable Network Interface Names
wiki page. To understand how the naming scheme is applied in
practice by default, you also need to read the description of
NamePolicy= in systemd.link(5), and
inspect the default .link file, '99-default.link', which might be
in either /lib/systemd/network or /usr/lib/systemd/network/. It
appears that the current network name policy is generally going to
be "kernel database onboard slot path", possibly with 'keep'
at the front in addition. In practice, on most servers and desktops,
most network devices will be named based on their PCI slot identifier,
using systemd's 'path' naming policy.
A PCI slot identifier is what ordinary '
lspci' will show you as
the PCIe bus address. As covered in the
lspci manpage, the fully
general form of a PCIe bus address is <domain>:<bus>:<device>.<function>,
and on many systems the domain is always 0000 and is omitted. Systemd
turns this into what it calls a "PCI geographical location", which is
(translated into lspci's terminology):
prefix [P<domain>] p<bus> s<device> [f<function>] [n<phys_port_name> | d<dev_port>]
The domain is omitted if it's 0 and the function is only present
if it's a multi-function device. All of the numbers are in decimal,
while lspci presents them in hex. For Ethernet devices, the prefix is 'en'.
(I can't say anything about the '
n' and '
d' suffixes because
I've never seen them in our hardware.)
The device portion of the PCIe bus address is very frequently 0, because many Ethernet devices are behind PCIe bridges in the PCIe bus topology. This is how my office workstation is arranged, and how almost all of our servers are. The exceptions are all on bus 0, the root bus, which I believe means that they're directly integrated into the core chipset. This means that in practice the network device name primarily comes from the PCI bus number, possibly with a function number added. This gives 'path' based names of, eg, enp6s0 (bus 6, device 0) or enp1s0f0 and enp1s0f1 (bus 1, device 0, function 0 or 1; this is a dual 10G-T card, with each port being one function).
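As an illustration of the hex-to-decimal wrinkle, here is a sketch of constructing 'path' style names from a PCIe bus address. This is my own simplification; real udev works from the full sysfs path and properly handles onboard devices, USB, and multi-function detection:

```python
def pci_path_name(bdf, prefix="en"):
    """Build a systemd 'path' style name from a PCIe bus address like
    '0000:01:00.1'. Simplified: udev knows when a device is
    multi-function and then adds f0 as well (e.g. enp1s0f0), which
    can't be told from a single address alone."""
    domain, bus, devfunc = bdf.split(":")
    device, func = devfunc.split(".")
    name = prefix
    if int(domain, 16) != 0:
        name += "P%d" % int(domain, 16)
    # The hex bus and device numbers become decimal in the name.
    name += "p%ds%d" % (int(bus, 16), int(device, 16))
    if int(func, 16) != 0:
        name += "f%d" % int(func, 16)
    return name
```

So a device at (hex) bus 0x0a becomes enp10s0, not 'enp0as0'.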
(Onboard devices on servers and even desktops are often not integrated
into the core chipset and thus not on PCIe bus 0. Udev may or may
not recognize them as onboard devices and assign them 'eno<N>'
names. Servers from good sources will hopefully have enough correct
DMI and other information so that udev can do this.)
As always, the PCIe bus ordering doesn't necessarily correspond to what you think of as the actual order of hardware. My office workstation has an onboard Ethernet port on its ASUS Prime X370-Pro motherboard and an Intel 1G PCIe card, but they are (or would be) enp8s0 and enp6s0 respectively. So my onboard port has a higher PCIe bus number than the PCIe card.
There is an important consequence of this, which is that systemd's default network device names are not stable if you change your hardware around, even if you didn't touch the network card itself. Changing your hardware around can change your PCIe bus numbers, and since the PCIe bus number is most of what determines the network interface name, it will change. You don't have to touch your actual network card for this to happen; adding, changing, or relocating other hardware between physical PCIe slots can trigger changes in bus addresses (primarily if PCIe bridges are added or removed).
(However, adding or removing hardware won't necessarily change existing PCIe bus addresses even if the hardware changed has a PCIe bridge. It all depends on your specific PCIe topology.)
Sidebar: obtaining udev and PCIe topology information
'udevadm info /sys/class/net/<something>' will give you
a dump of what udev thinks and knows about any given network
interface. The various
ID_NET_NAME_* properties give you the
various names that udev would assign based on that particular
naming policy. The 'enp...' names are ID_NET_NAME_PATH;
on server hardware you may also see ID_NET_NAME_ONBOARD. The '
database' naming scheme comes from information in udev's hardware
database (hwdb).
On modern systems, '
lspci -PP' can be used to show the full PCIe
path to a device (or all devices). On Ubuntu 18.04, you can also
use sysfs to work through your PCIe topology,
in addition to '
lspci -tv'. See also my entry on PCIe bus
addresses, lspci, and working out your PCIe bus topology.
The initramfs for old kernels can hide old versions of things
In a recent entry, I more or less blamed a new minor Linux kernel version for changing the naming of my network interface. I had reasonable reasons to say this beyond just rebooting into 5.12.12 and having the problem appear; I also rebooted back into 5.12.11 and the problem disappeared again (I ended up going back and forth repeatedly and this was consistent). When the only changing thing is the kernel version, you can reasonably suspect it, instead of (say) an upgrade to udev that you also installed between the two kernels. However, I'm not so sure of that any more.
I'm running Fedora on this desktop, and Fedora normally doesn't rebuild the initramfs for existing kernels when you upgrade packages and install new kernels. This means that when I boot my Fedora 5.12.11 kernel, I'm not merely running that kernel, I'm running an initramfs with programs and configuration files that were frozen when that kernel was installed. If there was a udev update that changed its early boot behavior, that update isn't in the 5.12.11 initramfs. Although I thought I only changed the kernel version by booting back and forth between 5.12.11 and 5.12.12, I was also changing the versions of what ran during early boot, along with possibly the configuration files they used. This may well have fooled me about what the cause of my problem was.
(I know, I once said Fedora rebuilt the initramfs for all of your kernels when you installed new DKMS modules. Apparently I was wrong about that, and was seeing something else.)
In short, what looks like an issue in the new kernel may actually be a change in the new initramfs that you get along with the new kernel. It's hard to tell for sure, although you can try rebuilding the initramfs for an older kernel if you can work out how to do this correctly. Of course, if you do rebuild an initramfs for an old kernel to see if it's really the kernel that's at fault, you definitely want to save a copy of your working old initramfs.
(I've seen this before for configuration files, for example when Fedora embedded my current sysctl settings in the initramfs.)
Despite potentially causing issues, not rebuilding is quite sensible. Generally you want to preserve old working initramfses the way they are just as you want to preserve old kernels (certainly I did in this case, since my 5.12.11 environment kept working). People also want to do less work on package upgrades, and not rebuilding four or five initramfses is much less work than doing so.
Giving your Linux network interfaces fixed names (under udevd and networkd)
Suppose, not entirely hypothetically, that you always want your
machine's primary network interface to be called 'em0', regardless
of what the combination of the kernel, networkd, and the systemd
udevd want to call them today (something that has been known to
change). Until recently, my (incorrect) setup for this was a .link file
that looked like this:

[Match]
MACAddress=2c:fd:a1:xx:xx:xx

[Link]
Description=Onboard motherboard port
MACAddressPolicy=persistent
Name=em0
# Stop VLAN renaming
NamePolicy=keep
I had this
NamePolicy because I had VLANs on top of
this interface, and this was how I made them work.
This .link file worked for about a year and a half, and then I
upgraded my Fedora 33 workstation from 5.12.11 to 5.12.12 and
rebooted. It promptly dropped off the network because my interface
had the wrong name
and nothing got configured on it.
What I was trying to do was rename the interface with that MAC
address to em0. What my addition of
NamePolicy=keep did was
create a situation where the interface would be renamed to em0
if and only if nothing else had renamed it before udevd processed
my .link file. In 5.12.12 (but not 5.12.11), something (either
the kernel or udevd) decided to rename my interface to enp8s0
before my .link file took effect, and then the interface didn't
get renamed again to em0.
(This is the implication of '[...] or all policies configured [in
NamePolicy] must fail' in the manpage's
description of '
Name='. If the device hasn't already been given
a name, the 'keep' policy would fail and it would be renamed to
em0 by my 'Name=' setting.)
If you (I) want to give your network interfaces fixed names but have
your .link files apply only to real Ethernet interfaces instead of
matching broadly, what I believe you want is:

[Match]
MACAddress=2c:fd:a1:xx:xx:xx
Type=ether
# Before systemd v245, use eg
# Property=ID_BUS=pci

[Link]
Description=Onboard motherboard port
MACAddressPolicy=persistent
Name=em0
Without a NamePolicy, this will unconditionally rename anything
matching that MAC to em0. With
Type=ether, this will only apply
to real Ethernet devices, not your VLANs or other things that inherit
the MAC from the underlying Ethernet interface.
PS: At this point one may want to read the systemd.net-naming-scheme manpage. I believe that names of the form 'emX' are safe from ever colliding with kernel-assigned interface names, but I'm not completely sure.
PPS: In 5.12.12, my kernel boot logs clearly show that there are two
renamings with this new .link file:

igb 0000:08:00.0 enp8s0: renamed from eth0
[...]
igb 0000:08:00.0 em0: renamed from enp8s0
So my new
.link file doesn't prevent the initial renaming in 5.12.12
to enp8s0; it just allows my
.link to rename the interface again
to the em0 that I want.
Be careful when matching on Ethernet addresses in systemd-networkd
A not uncommon pattern in networkd is to write a .link or .network
file that selects the hardware to work on by MAC address, because that's
often more stable than many of the other alternatives. For instance,
you might write a .link file for your motherboard like this:

[Match]
MACAddress=2c:fd:a1:xx:xx:xx

[Link]
Description=Onboard motherboard port
MACAddressPolicy=persistent
Name=em0
Unfortunately this is dangerous, because some virtual devices
inherit Ethernet addresses from their parent device and networkd
will allow virtual devices to match against just Ethernet addresses.
In particular VLANs inherit the Ethernet address from their underlying
network device, so if you have one or more VLANs on top of this
interface, they will all match this (and then they'll try to rename
themselves to em0). The same can happen
if you have a
.network file that matches with MACAddress in order
to deal with variable network names for the same underlying connection.
(If you have a real device that matches this way and creates VLANs on top of itself, networkd may be smart enough to recognize that it has a recursive situation, or it may blow up. I haven't tested.)
In other words, if you tell networkd that a
.link or a .network
file applies to anything with a specific Ethernet address, networkd
takes that to really mean anything. You may have meant this to apply
(only) to your actual Ethernet device, but the
.link file doesn't
say that and networkd won't infer it.
In systemd v245 or later, what you probably want is to restrict any
Ethernet hardware matches to real Ethernet devices with the additional
requirement of 'Type=ether':

[Match]
MACAddress=2c:fd:a1:xx:xx:xx
Type=ether
(Systemd v245 was released in February of 2020 and is in Ubuntu
20.04 and the current versions of Fedora, but isn't in Debian stable.
Support for the current meaning of
Type= that allows matching
'ether' was added in this commit
as a result of issue #14952. To my surprise,
this significant improvement doesn't seem to have been noted in the
systemd NEWS file.)

The 'ether' type applies to both PCI Ethernet ports and USB
Ethernet devices, but it doesn't apply to wireless devices; those
are type 'wlan'. As the manpage covers, '
networkctl list' can tell
you what your devices are. VLANs are type 'vlan'.
If you have a systemd (and thus a systemd-networkd) that's older
than v245, I think the only thing you can do is match on a property
of the device, obtained from '
udevadm info /sys/class/net/<what>'.
For a lot of physical hardware, the obvious property is that it's
on a PCI bus:
[Match]
MACAddress=2c:fd:a1:xx:xx:xx
Property=ID_BUS=pci
(I have to say that I haven't tested this, I'm just following the manpage.)
However, USB Ethernet devices are '
ID_BUS=usb', not PCI, while
a laptop's onboard wireless most likely is a PCI device, which is
the case on my Dell XPS 13. My laptop's
wireless device is also '
DEVTYPE=wlan', while even now real
Ethernet devices have no
DEVTYPE (as of systemd v248 on a
Fedora 34 virtual machine).
(This elaborates on a tweet of mine.)
PS: I'm not sure whether the matching here is being done by systemd-networkd, the systemd version of udevd, or both of them. It's quite possible that both programs and subsystems are doing it at different times and in different circumstances.
Some notes on what's in Linux's /sys/class/net for network interface status
Due to discovering that one of our servers had had a network
interface at 100 Mbits/sec for some time,
I've become interested in what information is exposed by the Linux
kernel about network interfaces in
/sys, specifically in
/sys/class/net/<interface>. I'm mostly interested in the information
there because it's the source of what the Prometheus host agent exposes as network
interface status metrics, and thus what's easy to monitor and alert
on in our metrics and monitoring setup.
The overall reference for this is the Linux kernel's sysfs-class-net,
which documents the
/sys fields directly. For the flags
file, you also need the kernel's include/uapi/linux/if.h,
and for the
type file, include/uapi/linux/if_arp.h.
Generally sysfs-class-net is pretty straightforward about what
things mean, although you may have to read several entries together.
Not all interfaces have all of the files; some
files aren't present on any servers we have.
The flags file has a number of common values you may see, which I'm
going to write down here for my own reference:
- 0x1003 or 4099 decimal
- This is the common value for active Ethernet
interfaces. It is MULTICAST (0x1000) plus UP (0x1) and BROADCAST (0x2).
ifconfig will report RUNNING as well, but that apparently doesn't appear in sysfs.
- 0x1002 or 4098 decimal
- This is the common value for an inactive
Ethernet interface, whether or not it has a cable plugged in. It
is MULTICAST plus BROADCAST, but without UP.
- 0x9 or 9 decimal
- This is the common value for the loopback interface,
made from UP (0x1) and LOOPBACK (0x8).
- 0x91 or 145 decimal
- This is an UP (0x1), POINTOPOINT (0x10) link that
is NOARP (0x80). This is the
flags value of my Wireguard endpoints.
- 0x1091 or 4241 decimal
- This is an UP (0x1), POINTOPOINT (0x10) link
that is MULTICAST (0x1000) in addition to being NOARP (0x80). This is
the flags value of my PPPoE DSL link's PPP connection.
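These combinations can be decoded mechanically; a small sketch using the flag bits from if.h (a subset of them, picked for the values above):

```python
# Interface flag bits from include/uapi/linux/if.h (a subset).
IFF = {
    "UP": 0x1, "BROADCAST": 0x2, "LOOPBACK": 0x8,
    "POINTOPOINT": 0x10, "NOARP": 0x80, "MULTICAST": 0x1000,
}

def decode_flags(value):
    """Return the set of flag names present in a sysfs 'flags' value."""
    return {name for name, bit in IFF.items() if value & bit}
```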
The 'addr_assign_type' file is about the (Ethernet) hardware
address, not any IP addresses that may be associated with the
interface. A physical interface will normally have a value of 0; a
value of 3 means that you specifically set the MAC address. VLAN
interfaces sitting on top of physical devices have a value of 2
(they take their MAC address from the underlying device's MAC).
The name_assign_type is somewhat random, as far as I can tell.
Our Ubuntu machines all have a name assignment type value of 4
('renamed'), while my Fedora machines mostly have a name assignment
type of 3 ('named by userspace'), with one Ethernet device being a
4. My Fedora home machine's
ppp0 device has a value of 1.
The most common
type values are 1 (Ethernet), 772 (the loopback
interface), 512 (PPP), and 65534 ('none', what my Wireguard tunnels
have). Possibly someday Wireguard will have its own type value
assigned in include/uapi/linux/if_arp.h.
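For reference, those type numbers translated into a small lookup; the names follow the ARPHRD_* constants in if_arp.h (65534 is ARPHRD_NONE):

```python
# A subset of the ARPHRD_* interface types from include/uapi/linux/if_arp.h.
ARPHRD = {
    1: "ether",       # ARPHRD_ETHER
    512: "ppp",       # ARPHRD_PPP
    772: "loopback",  # ARPHRD_LOOPBACK
    65534: "none",    # ARPHRD_NONE (what Wireguard tunnels report)
}

def type_name(type_value):
    """Translate a /sys/class/net/<if>/type value to a readable name."""
    return ARPHRD.get(type_value, "unknown (%d)" % type_value)
```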
The speed value is, as mentioned in sysfs-class-net, in
Mbits/sec. The values I've seen are 100 (100M), 1000 (1G), and 10000
(10G). What gets reported for interfaces without carrier seems to
depend. An UP interface with no carrier will report a speed of -1;
an interface that isn't up has no
speed value and attempts to
read the file will report '
Invalid argument'. The Prometheus host
agent turns all of these into its speed in bytes metric
node_network_speed_bytes by multiplying the speed value by
125000, which normally gives you a metric value of -125000 (UP but
no carrier), 12500000 (100M), 125000000 (1G), or 1250000000 (10G).
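The arithmetic behind those metric values, as a sketch:

```python
def speed_to_bytes(speed_mbits):
    """Convert a sysfs 'speed' value (Mbits/sec) into the Prometheus
    host agent's node_network_speed_bytes: 1 Mbit/sec is 1,000,000
    bits/sec, which is 125,000 bytes/sec."""
    return speed_mbits * 125000
```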
(Some Linux distributions in some situations will set additional interfaces to UP as part of trying to do DHCP on them. Otherwise they'll quietly stay down.)
The Prometheus host agent exposes what it calls 'non-numeric data'
from /sys/class/net in the
node_network_info metric. This
gives you the device's hardware address and broadcast address,
its name, its duplex (which may be blank for things that don't have
a duplex mode, such as Wireguard links or virtual Ethernets), and
its state (from the
operstate file). Somewhat to my surprise, the
operstate of the loopback interface is 'unknown', not 'up'.
Update: it turns out that the
carrier file is only available for
interfaces that are configured 'UP' (and then is either 0 or 1
depending on if carrier is detected). If the interface is not UP,
attempting to read
carrier fails with 'Invalid argument'.