Wandering Thoughts archives

2020-05-04

Notes on the autoinstall configuration file format for Ubuntu 20.04

Up through Ubuntu 18.04 LTS, Ubuntu machines (usually servers) could be partially or completely automatically installed through the Debian 'debian-installer' system, which the Internet has copious documentation on. It was not always perfect, but it worked pretty well to handle the very initial phase of server installation in our Ubuntu install system. In Ubuntu 20.04, Canonical has replaced all of that with an all new system for automated server installs (you will very much want to also read at least this forum thread). The new system is strongly opinionated, rather limited in several ways, not entirely well documented, at least somewhat buggy, clearly not widely tested, and appears to be primarily focused on cloud and virtual machine installs to the detriment of bare metal server installs. I am not a fan, but we have to use it anyway, so here are some notes on the file format and data that the autoinstaller uses, to supplement the official documentation on its format.

(How to use this data file to install a server from an ISO image is a topic for another entry.)

If you install a server by hand, the install writes a data file, /var/log/installer/autoinstall-user-data, that in theory can be used to automatically reproduce your install. If you're testing how to do auto-installs, one obvious first step is to install a system by hand, take the file, and use it to attempt to spin up an automated install. Unfortunately this will not work. The file the installer writes has multiple errors, so it won't be accepted by the auto-install system if you feed it back in.

There are three minimum changes you need to make. First, set a global version:

#cloud-config
autoinstall:
  version: 1
  [...]

Then, in the 'keyboard:' section, change 'toggle: null' to use '' instead of null:

  keyboard: {layout: us, toggle: '', variant: ''}

Finally, change the 'network:' section to have an extra level of 'network:' in it. This changes from:

 network:
   ethernets:
     [...]

To:

 network:
   network:
     ethernets:
       [...]

Given that this is YAML, spaces count and you cannot use tabs.
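
Put together, a minimally repaired file would look something like the following sketch. The interface name 'enp1s0' and the use of DHCP are placeholders, not something taken from a real install, and the other sections that the installer writes (identity, storage, and so on) are left out for brevity.

```yaml
#cloud-config
autoinstall:
  # Change 1: set the global version
  version: 1
  # Change 2: empty strings, not null, in the keyboard section
  keyboard: {layout: us, toggle: '', variant: ''}
  # Change 3: an extra level of 'network:' around the real settings
  network:
    network:
      version: 2
      ethernets:
        enp1s0:          # hypothetical interface name
          dhcp4: true    # assumed; keep whatever your install recorded
```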

If you want to interact with some portions of the installer but not all of it, these are specified in the 'interactive-sections' YAML section. For example:

 interactive-sections:
    - network
    - storage
    - identity

In theory you can supply default answers for various things in your configuration file for these sections, which show up when you get prompted interactively. In practice this does not entirely work; some default answers in your configuration file are ignored.

In network configuration, there currently appears to be no way to either completely automatically configure a static IP address setup or to supply default answers for configuring that. If you supply a complete set of static IP information and do not set the network section to be interactive, your configuration will be used during the install, but after the system boots, your configuration will be lost and the system will be trying to do DHCP. If you provide a straightforward configuration and set 'network' to interactive, the system will attempt to do DHCP during the install, probably fail, and when you set things manually your defaults will be gone (for example, for your DNS servers). The best you can do is skip having the system try to do DHCP entirely, with a valid configuration that the installer throws up its hands on:

    network:
      version: 2
      renderer: networkd
      ethernets:
        mainif:
          match:
            name: en*
          [...]

Then you get to set up everything by hand (in a setup that's a regression from what debian-installer could do in 18.04).
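
(For reference, a 'complete set of static IP information' would presumably be ordinary netplan-style configuration along the lines of this sketch. The interface name and all of the addresses are made-up documentation values. As covered above, a configuration like this gets used during the install but does not survive into the booted system.)

```yaml
network:
  network:
    version: 2
    renderer: networkd
    ethernets:
      enp1s0:                        # hypothetical interface name
        addresses: [192.0.2.10/24]   # example address, documentation range
        gateway4: 192.0.2.1          # example gateway
        nameservers:
          addresses: [192.0.2.53]    # example DNS server
```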

One of the opinionated aspects of the new Ubuntu installer is that you absolutely must create a regular user and give it a password (even if you're going to immediately wipe out local users to drop in your own general authentication system), and you cannot give a password to root; your only access to root is through 'sudo' from this regular user. The installer will give this user a home directory in /home; you will likely need to remove this afterward. You could skip making this 'identity' section an interactive section, except for the problem that the system hostname is specified in the 'identity' section and has no useful default if unset (unlike in debian-installer, where it defaults to the results of a reverse DNS lookup). Unfortunately once you make 'identity' an interactive section, the installer throws away your preset encrypted password and makes you re-enter it.

So you want something like this:

  identity: {hostname: '', password: [...],
    realname: Dummy Account, username: cs-dummy}

With the initial hostname forced to be blank (and 'identity' included in the interactive sections), the installer won't let people proceed until they enter some value, hopefully an appropriate one.
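
Combined with the list of interactive sections, the relevant pieces would look roughly like this sketch. The password value is a stand-in, not a usable hash, and as noted above the installer will make you re-enter the password anyway once 'identity' is interactive.

```yaml
autoinstall:
  version: 1
  interactive-sections:
    - identity
  identity:
    hostname: ''                         # blank, so someone must enter one
    password: 'CRYPTED-HASH-GOES-HERE'   # placeholder, not a real hash
    realname: Dummy Account
    username: cs-dummy
```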

As sort of covered in the documentation, you can run post-install commands by specifying them in a 'late-commands:' section; they're run in order. When they're run, the installed system is mounted at /target and the ISO image you're installing from is at /cdrom (if you're installing from an ISO image or a real CD/DVD). If you want to run commands inside the installed system, you can use 'chroot' or 'curtin', but the latter requires special usage:

  late-commands:
    - curtin in-target --target=/target -- usermod -p [...] root

(The --target is the special underdocumented bit.)

There is no curtin program in the current server install CD; the installer handles running 'curtin' magically. This means that you can't interactively test things during the install on an alternate video console (you can get one with Alt-F2).
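
For comparison, using plain 'chroot' and working directly with /target and /cdrom would look something like this sketch. Both file paths are hypothetical, and this assumes that each entry is run as an ordinary shell command line.

```yaml
late-commands:
  # Copy something from the install media into the new system
  # (both paths are made up for illustration)
  - cp /cdrom/local/motd /target/etc/motd
  # Run a command inside the installed system with plain chroot
  - chroot /target touch /etc/autoinstalled
```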

Initially I was going to say that the installer has no way to set the timezone. This is technically correct but not true in practice, because the installer assumes you're using cloud-init, so you set the timezone by passing a 'timezone' key to cloud-init for its 'timezone' module through the 'user-data:' section:

  user-data:
    timezone: America/Toronto

If you don't set this data, you get UTC. This includes if you do a manual installation with no configuration file, as you might be if you're just starting with Ubuntu 20.04. In that case, you want to set it with 'timedatectl set-timezone America/Toronto' after the system is up.

I haven't yet attempted to play around with the 'storage' section, although I have observed that it now wants to always use GPT partitioning. We always want disk partitioning to require our approval and allow intervention, but it would be handy if I could set it up so that the default partitioning that you can just select is our standard two disk mirrored configuration. As an important safety tip, when doing mirrored partitioning you need to explicitly make your second disk bootable (this applies both interactively and if you configure this in the 'storage' section). If you don't make a second disk bootable, the installer doesn't create an EFI boot partition on it. In the configuration file, this is done by setting 'grub_device: true' in the disk's configuration (which is different from partition configurations) and also including a 'bios_grub' partition:

storage:
  config:
  - {ptable: gpt, path: /dev/sda, wipe: superblock-recursive, preserve: false, name: '',
    grub_device: true, type: disk, id: disk-sda}
  - {device: disk-sda, size: 1048576, flag: bios_grub, number: 1, preserve: false,
    type: partition, id: partition-0}
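
For a two disk mirror, the second disk would presumably get the same treatment. This sketch assumes that the second disk is /dev/sdb and makes up the ids 'disk-sdb' and 'partition-1'. It shows only the boot-related entries for the second disk, not the first disk's entries or the rest of the partitioning and mirroring setup.

```yaml
storage:
  config:
  # (the first disk's entries go here, as in the example above)
  # Second disk: also marked as a grub device ...
  - {ptable: gpt, path: /dev/sdb, wipe: superblock-recursive, preserve: false, name: '',
    grub_device: true, type: disk, id: disk-sdb}
  # ... with its own bios_grub partition
  - {device: disk-sdb, size: 1048576, flag: bios_grub, number: 1, preserve: false,
    type: partition, id: partition-1}
```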

Reading the documentation, it unfortunately appears that you can't specify the size of partitions as a percentage or 'all the remaining space'. This probably makes any sort of 'storage:' section in a generic autoinstall configuration not very useful, unless your systems all have the same size disks. I now think you might as well leave it out (and set 'storage' as an interactive section).

PS: It's possible that there are better ways to deal with several of these issues. If so, they are not documented in a way that can be readily discovered by people arriving from Ubuntu 18.04 who just want to autoinstall their bare metal servers, and who have no experience with Canonical's new cloud systems because they don't use cloud stuff.

PPS: It's possible that an Ubuntu 20.04 server ISO image will some day be made available that uses the debian-installer or doesn't behave in all of these ways. Unfortunately, the only currently available 20.04 server ISO image is the 'live' image, which is apparently cloud-focused or at least includes and uses cloud-focused tools by default.

linux/Ubuntu2004AutoinstFormat written at 22:52:41

The Go compiler has real improvements in new versions (and why)

When I wrote that I think you should generally be using the latest version of Go, I said that one reason was that new versions of Go usually include improvements that speed up your code (implicitly in meaningful ways), not just better things in the standard library. This might raise a few eyebrows, because while it's routine for new releases of C compilers and so on to tout better performance and more optimizations, these rarely result in clearly visible improvements. As it happens, Go is not like that. New major versions of Go (eg 1.13 and 1.14) often provide real and clearly visible improvements for Go programs, so that they run faster, use less memory, and soon will take up somewhat less space on disk.

My impression is that there are two big reasons that this happens in Go but doesn't usually happen in many other languages; they are that Go is still a relatively young language and it has a complex runtime (one that does both concurrency and garbage collection). Generally, Go started out with straightforward implementations of pretty much everything (both in the runtime and in the compiler), and it has been steadily improving them since. Sometimes this is simply in small improvements (especially in code generation, which sees a steady stream of small optimizations) and sometimes this is in much larger rewrites, such as the one that added asynchronous preemption of goroutines in Go 1.14 or the currently ongoing work on a better linker. Go's handling of memory allocation and garbage collection has especially seen a steady stream of improvements, sometimes major ones, such as the set covered in Getting to Go: The Journey of Go's Garbage Collector.

(And back in 2015, there was the rewrite of the compiler to have a new SSA backend, which has unlocked significant opportunities for additional optimizations since then.)

Generally, other languages have had some combination of having a lot longer to mature and extract all of the straightforward optimizations from their compiler, having a simpler runtime environment that doesn't need as much development effort, or having a lot of very smart people working on them. Java, Javascript, and the Microsoft .NET languages all have complex runtimes, but they also have a lot of resources poured into their implementations, which means that they often improve at a faster rate than Go does (and they all pretty much started earlier). C and C++ compilers generally have simpler runtime environments that need less work and have also had a lot longer to optimize their code generation. What C compilers can already do is pretty spooky, so it's not terribly surprising that the improvements now are mostly small incremental ones. It will likely be a long time before Go gets to that level, if it ever does (since there is a tradeoff between how fast you can compile and how much optimization you do, and Go values fast compile times).

programming/GoRealImprovementsWhy written at 00:19:24


