Why Linux bootloaders wind up being complicated
I'm not really a fan of GRUB2 for a variety of reasons, but at the same time I have to acknowledge that a significant part of why it's so large and sprawling is simply that good, flexible bootloaders for Linux have a complicated job, especially if they're designed to work for traditional BIOS MBR booting as well as UEFI. However, if you haven't poked into this area it may not be obvious how a bootloader's job goes from looking simple to actually being complicated.
Here's what a modern bootloader like GRUB2 has to go through to boot Linux (which is only the start of the story of how modern Linux systems boot):
- In a traditional BIOS environment, a very small initial boot
stage must load the full bootloader, sometimes in multiple steps;
this requires a fixed and known place to put at least some of the
next stage of the bootloader. In a UEFI system, the UEFI firmware will
directly load and start the main bootloader code, sparing the
bootloader from having to deal with this.
- The bootloader reads a configuration file (possibly several
configuration files) from somewhere. At a minimum, this configuration
file will let the bootloader identify a bunch of kernel environments
that it can potentially boot. For modern Linux, booting a kernel
requires at least the kernel itself, its initial ramdisk, and the kernel
command line arguments. If the bootloader supports the Boot
Loader Specification (BLS), these kernel
environments are each in separate files in a directory and have
to all be inventoried and loaded one by one.
In a UEFI environment, this configuration information can live on the EFI system partition and be loaded through the UEFI services provided by the firmware. However, requiring the bootloader configuration to live in the EFI system partition causes problems. In a BIOS environment, a modern bootloader is expected to load the configuration from some regular filesystem (more on this later). So for both UEFI and BIOS MBR booting, a good bootloader will be able to read its main configuration from some Linux filesystem.
(A UEFI bootloader might sensibly start from an initial configuration file on the EFI system partition that tells it where to find the main configuration.)
- Optionally, the bootloader presents you with some menu that lets
you pick what you want to boot (either what kernel or what other
operating system on the machine in multi-boot setups). Often you
want to control how these boot entries are named and how many
of them are shown at once (with options to hide the rest).
Modern bootloaders are increasingly expected to not disturb whatever graphics and logo the system BIOS has left on the screen, or to disturb it as little as possible if they present a menu. The desire (although not necessarily the reality) is that you turn on the machine, see a logo that sits there with a progress spinner of some sort, and then the system login screen fades in, all without flashing or abrupt visual changes. This generally requires the bootloader to understand something about PC graphics modes and how to deal with them (although I think there may be some UEFI services that help you here if you're booting with UEFI).
As a quality of life issue, the bootloader should also let you (the person booting the system) temporarily modify things like kernel command line parameters because you can wind up with ones that cause your kernel to hang.
- The selected kernel and initial ramdisk are loaded into
memory from disk and the bootloader transfers control to them,
with its job done. Properly transferring control to a Linux
kernel requires some magic setup that's specific to Linux
(for example to tell the kernel where the initial ramdisk
is in memory).
As with the bootloader's configuration, a UEFI bootloader can in theory require that all the kernels and initial ramdisks live on the EFI system partition so that it can load them through standard UEFI system services. However, this is even more inconvenient for people, so in practice a well-regarded bootloader must support loading kernels and initial ramdisks from regular Linux filesystems, and in any case this is required if the bootloader wants to support BIOS MBR booting.
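To make the idea of a 'kernel environment' from the configuration step a bit more concrete, here is a minimal sketch in Python of parsing one BLS-style entry file into the three things a boot needs: the kernel, the initial ramdisk(s), and the command line. The `title`, `linux`, `initrd`, and `options` keys are from the Boot Loader Specification; the `BootEntry` class, the `parse_bls_entry` function, and the sample file contents are made up for illustration, and a real bootloader would do this in its own implementation language and handle many more keys.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BootEntry:
    """One bootable kernel environment (hypothetical structure)."""
    title: str = ""
    linux: str = ""                                   # path to the kernel image
    initrd: List[str] = field(default_factory=list)   # one or more ramdisks
    options: str = ""                                 # kernel command line


def parse_bls_entry(text: str) -> BootEntry:
    """Parse the text of one BLS-style entry file into a BootEntry."""
    entry = BootEntry()
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)  # key, then the rest of the line
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        if key == "title":
            entry.title = value
        elif key == "linux":
            entry.linux = value
        elif key == "initrd":
            entry.initrd.append(value)  # this key may appear more than once
        elif key == "options":
            # several options lines are combined, in order, into one command line
            entry.options = (entry.options + " " + value).strip()
        # any other key is ignored in this sketch
    return entry


# Made-up entry file contents, purely for illustration.
SAMPLE = """\
# example entry
title Example Linux
linux /vmlinuz-example
initrd /initramfs-example.img
options root=UUID=example-uuid ro
options quiet
"""

entry = parse_bls_entry(SAMPLE)
```

Even this toy version assumes the bootloader can already open and read the entry files, which is where the real complexity comes in.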
Loading kernels, initial ramdisks, perhaps configuration files, and
more from regular Linux filesystems opens up a world of complexity,
because in practice people want their bootloader to support the
entire storage stack, not just filesystems. If you have ext4
filesystems in LVM on
top of a mirrored or RAID-6 software RAID array, the bootloader has
to understand all levels of that in order to be able to load things.
If the bootloader doesn't have a fixed location for what it loads,
you have to be able to specify on what filesystem they're found and
what the storage stack looks like. Generally this goes in the
configuration file, which requires a whole language for specifying
the various levels of the storage stack (such as the UUIDs for the
/ filesystem, the RAID array, and so on).
(Sometimes this causes issues when there are new filesystems and storage stacks people would like you to support, such as the challenges with booting Linux from ZFS with GRUB.)
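As an illustration of why the storage stack matters, here is a minimal Python sketch that models the ext4-on-LVM-on-software-RAID example as a chain of layers and works out what the bootloader has to understand, from the bottom up, before it can read a single file. The `Layer` class, the `required_drivers` function, and all of the identifiers are hypothetical; this is not how GRUB2 or any other bootloader represents things internally.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Layer:
    """One level of a storage stack (hypothetical structure)."""
    kind: str                        # e.g. "ext4", "lvm", "mdraid"
    ident: str                       # UUID or name that identifies this layer
    below: Optional["Layer"] = None  # the layer this one sits on top of


def layers_top_down(top: Layer) -> List[Layer]:
    """Walk from the filesystem down to the lowest layer."""
    out = []
    cur: Optional[Layer] = top
    while cur is not None:
        out.append(cur)
        cur = cur.below
    return out


def required_drivers(top: Layer) -> List[str]:
    """Kinds of layer the bootloader must understand, lowest layer first."""
    return [layer.kind for layer in reversed(layers_top_down(top))]


# The example from the text; every identifier here is a placeholder.
raid = Layer("mdraid", "raid-uuid-placeholder")
lvm = Layer("lvm", "vg-placeholder/root", below=raid)
rootfs = Layer("ext4", "fs-uuid-placeholder", below=lvm)

drivers = required_drivers(rootfs)
```

Here `drivers` comes out as the RAID layer first, then LVM, then ext4, which is the order the bootloader has to assemble things in. Each of those names stands in for a substantial chunk of code.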
Often the bootloader requires too much code for all of the filesystems, storage stacks, graphical options, and so on to build it all into the core bootloader (especially if you want the core bootloader to work with relatively little RAM). In this situation GRUB2 turns to loadable modules, which means that you also need a way to find and load those modules. A bootloader that works only on UEFI can once again put everything on the EFI system partition and load its modules with UEFI services, but otherwise the core bootloader needs enough built-in code to find and load modules from at least some filesystems and storage stacks.
(The bootloader may also want to load fonts, graphics, and so on in order to support showing menus and progress indicators in the graphics mode that the BIOS has left things in. And a bootloader may want a fancier menu than just a bunch of text.)
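The chicken-and-egg problem with modules can be sketched as a simple split: whatever code is needed to reach the place the modules are stored has to be built into the core, and everything else can be loaded later. The following Python sketch is only an illustration of that reasoning; `split_modules` and the module names are made up and are not GRUB2's actual module names or install logic.

```python
from typing import Iterable, List, Tuple


def split_modules(available: Iterable[str],
                  needed_to_reach_modules: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split modules into (built into the core, loadable later).

    needed_to_reach_modules lists what must work before any module file
    can be read, such as the drivers for the storage stack holding them.
    """
    available_set = set(available)
    needed_set = set(needed_to_reach_modules)
    missing = needed_set - available_set
    if missing:
        # The bootloader has no code at all for part of the stack.
        raise ValueError("unsupported: " + ", ".join(sorted(missing)))
    core = sorted(needed_set)
    loadable = sorted(available_set - needed_set)
    return core, loadable


# Hypothetical module names, purely for illustration.
all_modules = ["ext4", "lvm", "mdraid", "xfs", "menu_graphics", "fonts"]
core, loadable = split_modules(all_modules, ["mdraid", "lvm", "ext4"])
```

The more elaborate the storage stack under the modules, the larger the core has to be, which works against the original reason for having modules.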
Bootloaders often want to support additional quality of life features like a flexible default for what kernel (or boot entry) is selected, one-time booting of a particular kernel, configurable timeouts before the boot continues (either with or without the boot menu displayed), and especially some sort of recovery mode if things are broken. You don't really want your bootloader to fall over if the default kernel can't be loaded for some reason, although a simple bootloader can leave this out and tell you to boot from a recovery stick instead.
(Such a simple bootloader won't be popular with people running servers or anyone who wants unattended boots and reboots to be reliable even if things go a little wrong.)
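The default, one-time, and fallback behaviour can be sketched in a few lines of Python. This is a minimal illustration under assumptions of my own, not any real bootloader's policy: entries are just names, `boot_order` and `boot_first_loadable` are hypothetical functions, and `try_load` stands in for the actual work of loading a kernel and initial ramdisk.

```python
from typing import Callable, List, Optional


def boot_order(entries: List[str], default: str,
               one_time: Optional[str] = None) -> List[str]:
    """Order to try entries in: one-time choice, then default, then the rest."""
    order: List[str] = []
    for name in (one_time, default, *entries):
        # Ignore unset or unknown names, and never list an entry twice.
        if name is not None and name in entries and name not in order:
            order.append(name)
    return order


def boot_first_loadable(order: List[str],
                        try_load: Callable[[str], bool]) -> Optional[str]:
    """Return the first entry that loads, or None if nothing does."""
    for name in order:
        if try_load(name):
            return name
    return None  # time for the recovery stick


# Hypothetical entries; pretend the newest kernel is unreadable.
entries = ["new-kernel", "old-kernel", "rescue"]


def fake_load(name: str) -> bool:
    return name != "new-kernel"


chosen = boot_first_loadable(boot_order(entries, default="new-kernel"), fake_load)
once = boot_order(entries, default="new-kernel", one_time="rescue")
```

In this example the broken default is skipped and `chosen` ends up as the older kernel. A real implementation also has to remember to clear the one-time choice after using it, which means the bootloader needs to write to storage as well as read it.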
A bootloader that wants to properly support UEFI Secure Boot needs to use UEFI services to verify the signatures on the kernel, any modules it loads, and so on. I believe that this is theoretically straightforward but can be complicated in practice.