Wandering Thoughts

2018-06-16

The 'on premise' versus 'off premise' approach to environments

As a result of thinking about why some people run their own servers and other people don't, it struck me today that on the modern Internet, things have evolved to the point where we can draw a division between two approaches or patterns for operating your systems and services. I will call these the on premise and off premise patterns.

In the on premise approach, you do most everything within a self contained and closed environment of your own systems (a 'premise'). One obvious version of this is when you have a physical premise and everything you work with is located in it. This describes my department, for example, and many similar sysadmin setups; since we operate physical networks, have printers, and so on, we have no real choice but to do things on premise with physical hardware, firewall servers, and so on. However, the on premise approach doesn't require you to be doing internally focused work or for you to have physical servers. You can take the on premise approach in a cloud environment where you're running a web business.

(You can have a rousing debate over whether you can truly have a single on premise environment if you're split across multiple physical locations, or a physical office plus a cloud.)

In the off premise approach, you don't try to have a closed and self contained environment of your own systems and services, a 'premise' that stands alone by itself. Instead, you have a more permeable boundary that you reach across to use and even depend on outside things, up to and including things from entirely separate companies (where all you can really do if there's a problem is wait and hope). The stereotypical modern Silicon Valley startup follows an off premise and outsourced approach for as many things as it can, and as a result works with and relies on a whole host of Software as a Service companies, including for important functions such as holding its source code repositories and coordinating development (often on GitHub).

An off premise approach doesn't necessarily require outsourcing to other companies. Instead I see it as fundamentally an issue of how self contained (and complete) your service environments are. If you're trying to do most everything yourself within an environment, or within a closely connected cluster of them, you're on premise. If you have loosely connected services that you group into different security domains and talk across the Internet to, you're probably off premise. I would say that running your own DNS servers completely outside and independently of the rest of your infrastructure is an off premise kind of thing (having someone else run them for you is definitely off premise).

While there's clearly a spectrum in practice, my impression is that on premise and off premise are also mindsets and these mindsets are generally sticky. If you're in the on premise mindset, you're reflexively inclined to keep things on premise, under your control; 'letting go' to an outside service is a stretch and you can think of all sorts of reasons that it'd be a problem. I suspect that people in the off premise mindset experience similar things in the other direction.

(As you might guess, I'm mostly an on premise mindset person, although I've been irradiated by the off premise mindset to a certain extent. For example, even though I'm in no hurry to run my own infrastructure for email, I'm even less likely to outsource it to a provider, whether GMail or anyone else.)

OnPremiseVsOffPremiseApproach written at 01:06:32

2018-05-15

I have a boring desktop and I think I'm okay with that

Every so often I wind up either reading about or looking at pictures of beautifully customized Unix desktops. These are from people who have carefully picked colors and themes, often set up a highly customized window manager environment, set up all sorts of panels and information widgets, and so on (one endless source of these is on reddit). Sometimes this involves using a tiling window manager with various extension programs. I look at these things partly because there's something in me that feels drawn to them and that envies those setups.

My desktop is unconventional, but within that it's boring. It has colours that are either mismatched or at best vaguely matched, various font choices picked somewhat at random, and assorted decorations and so on that are there mostly because they're what's easy to do in the plain old standby of fvwm. There's very little design and very little use of interesting programs; I mean, I still use xclock and xload, and I don't think fvwm is considered an interesting window manager these days.

(Fvwm certainly has limitations in terms of what you can easily do in it. A dedicated person could expand fvwm's capabilities by use of modern X programs like wmutils and wmctrl, but I'm not such a person.)

I wound up thinking about this when I was recently reading another article on the subject (via, itself via), and this time around I came to a straightforward realization, one that I could have arrived at years ago: I'm not that dedicated and I don't care that much. My mismatched, assorted desktop is boring, but it works okay for me, and I've become the kind of pragmatic person who is okay with that.

I wouldn't mind a nicer desktop and every so often I make a little stab in that direction (I recently added some fvwm key bindings that were inspired by Cinnamon), but I'm never going to do the kind of work that's required to build a coherent custom desktop or make the kind of sacrifices required. Tiling window managers, programmable window managers, highly custom status bars, all of that stuff is neat to read about and look at, but it's not something for me. The best I'm likely to ever do is minor changes around the edges (at least until Wayland forces me to start over from scratch). And so my desktop is always going to be boring. I think I'm finally okay with that.

(There's a freedom in giving up in this sense. One way to put it is that I can stop feeling slightly guilty about not having a nicer, more coherent desktop environment, or in having something that's the kind of advanced environment you might expect a serious Unix person to have. I know this is an irrational feeling, but no one said feelings are rational.)

PS: This also means that I can give up thinking about switching to another window manager. It's quite unlikely I could find one that I want to use other than fvwm (fvwm's icon manager is extremely important to me), but I've long had the thought that there might be a better one out there somewhere. Maybe there is, but even if there is, it's probably way too much effort to switch.

MyBoringDesktop written at 01:20:46

2018-05-03

Using grep to hunt around for null bytes in text files

Suppose, not entirely hypothetically, that you've developed a suspicion that some mailbox files have zero (null) bytes in them. Null bytes are not traditionally found in mail messages, or indeed in any text format, and their presence is not infrequently a sign that something has gone wrong (in email related things, an obvious suspect is locking issues). So if you suspect null bytes might be lurking you want to at least check for them and perhaps count them, and once you've found some null bytes you may want to know more about the context they occur in.

On most modern Unixes, searching for null bytes is most easily done with GNU grep's 'grep -P' option, which lets you supply a Perl-style regular expression that can include a direct byte value:

grep -l -P '\x00' ....

If your version of grep doesn't support -P (FreeBSD's doesn't), you'll have to investigate more elaborate approaches. The general problem is that a lot of things will interpret a literal null byte as a C end-of-string; you need to find a way to supply one to your grep that doesn't fall victim to this, and then hope your (e)grep does the right thing.

(Really, it might be easier to get and compile the latest GNU grep just for this.)
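If you do need a fallback, here's one minimal sketch (my own improvisation rather than a standard recipe): instead of trying to get a null byte onto grep's command line at all, let tr delete nulls and compare byte counts, reporting any file that shrinks.

  # a small script: list the files (given as arguments) that contain at least one null byte
  for f in "$@"; do
      if [ "$(tr -d '\0' <"$f" | wc -c)" -ne "$(wc -c <"$f")" ]; then
          printf '%s\n' "$f"
      fi
  done

This is slower than grep because it reads each file twice, but it only needs POSIX tr and wc.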

Once you've found files with null bytes, you might want to do things like count how many null bytes you have and in how many different places. As I found out recently, modern versions of awk are perfectly happy about null bytes in their input, which makes life reasonably easy when combined with 'grep -o':

grep -ao -P '\x00+' FILE |
 awk '{cnt += 1; tlen += length($0)}
      END {print cnt, tlen}'

(More elaborate analysis is up to you. I looked at shortest, longest, and average size.)
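For example, the shortest, longest, and average run lengths can come from a slightly longer version of the same awk (a sketch of one way to do it, relying on the same null-tolerant behavior):

  grep -ao -P '\x00+' FILE |
   awk '{ cnt += 1; tlen += length($0)
          if (min == "" || length($0) < min) min = length($0)
          if (length($0) > max) max = length($0) }
        END { if (cnt) print cnt, tlen, min, max, tlen/cnt }'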

To really dig into what's going on, you need to see the context that these null bytes occur in. In an ideal world, less would let you search for null bytes, so you could just do 'less FILE', find the first null byte, and go look around as usual. Unfortunately less has no such feature as far as I know, and neither does any other pager. I will save you the effort and say that the easiest way to do this is to use grep, telling it to provide some context:

grep -a -n -C 1 -P '\x00' FILE | less

It's worth breaking down why we're doing all of this. First, we're feeding the grep's output to less because less will actually show us the null byte or bytes, instead of silently not printing it. Grep's -P argument we're already familiar with. -a forces grep to consider the file printable, despite the null bytes. -C N is how many lines of context we want before and after the line with null bytes. Finally, -n prints the line numbers involved. We want the line numbers because with them, we can do 'less FILE' and then jump to the spot with the null bytes using the line number.
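As a side note, you can also hand that line number straight to less instead of searching for it by hand; for instance, if grep reported nulls on line 2345 (a made-up number here):

  less +2345g FILE

The '+' prefix makes less run '2345g' as an initial command, which jumps straight to that line.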

When looking at the output from less here, remember that null bytes will be printed as two characters (as '^@') even though they're only a single character. Where this came up for me was that one of our null bytes was in some base64 lines, and initially I was going to say that the null byte was clearly an addition because the line containing it appeared to be a character longer than the other base64 lines around it. Then I remembered this two-character expansion, and realized that our null byte had replaced a base64 character instead of being inserted between two.

(Unfortunately all of this looking brought us not much closer to having some idea of why null bytes are showing up in people's mailboxes, although there are indications that some of them may have been inserted by the original sender and then passed intact through Exim. What I haven't done yet and should do is actually test how various elements of our overall mail system behave when fed SMTP messages with null bytes, although that would only tell me the system's current behavior, not what it did several or many years ago, which is apparently when some of the nulls date from.)

PS: I've tried loading the mailbox file into an editor in order to search for nulls there. For a sufficiently large mailbox this didn't go all that well, and I had to worry about inadvertently modifying the file. Perhaps there is an 'editor' that is efficient for this, but if so I don't have it lying around, while grep is right there.

(I believe I got the grep -P '\x00' trick from Stack Overflow, where it's shown up in a number of answers.)

GreppingForNullBytes written at 00:38:58

2018-04-26

The shifting goals of our custom NFS mount authorization system

We've been doing custom authorization for NFS mounts in our overall environment for a very long time. Our most 'recent' system for this is the NSS-based custom NFS mount authorization scheme that we introduced on our original Solaris fileservers and now run on our OmniOS-based fileservers; this system has now been running for on the order of a decade. In one sense how this system operates has remained the same over that time (in that it still uses the same basic mechanics); in another sense, things have changed significantly because our goals and priorities for NFS mount authorization have changed in a decade.

In our system, NFS mount permission is based on what netgroup a machine is in but we then authenticate that the machine hasn't been replaced with an impostor before we allow the NFS mount. We have two sorts of machines that do NFS mounts from our fileservers: our own (Linux) servers, which are only on a couple of our networks, and then a number of additional machines run by other people on various of our sandbox networks. Our custom authorization systems have historically verified the identity of all NFS clients, both our machines and other people's machines, and the initial decade-ago version of the current system was no different.
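For a concrete flavour of the netgroup side of this, NFS exports restricted to a netgroup look something like the following (the netgroup and filesystem names here are invented for illustration, not our real ones):

  # Solaris/illumos style share, where 'nfs-clients' is a netgroup name
  share -F nfs -o rw=nfs-clients /export/homes

  # the rough Linux /etc/exports equivalent, with '@' marking a netgroup
  /export/homes  @nfs-clients(rw)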

However, over time we ran into issues with verifying our own servers. There were a whole collection of failure modes where some or many of our servers could get verification failures, and then the entire world exploded because NFS mounts are absolutely critical to having a working machine. At one point we made a quick pragmatic decision to temporarily disable the host verification for our own servers, and then as time went on we became more and more convinced that this wasn't just an expedient hack, it was the correct approach. These days our servers live on a machine room network where no outside machines are allowed, so if you can swap your own impostor machine in you have physical access to our machine room and we have major problems.

(Well, there are other options, but they're all about equally bad for us.)

As a result of this, we've now explicitly shifted to viewing our custom NFS mount authorization system as being just for verifying not-us machines (or more exactly, machines on networks we don't trust). This matters because those machines shouldn't be as crucially dependent on our NFS filesystems as our own servers are, and so we can afford to design a system that works somewhat differently, for example by requiring some active step by the NFS client to get a machine authenticated.

(We have a central administrative filesystem that's so crucial to our machines that most of them won't finish booting until they can mount it. No non-us machine should be so dependent on our NFS infrastructure (hopefully we aren't going to find out someday that one of them is anyway).)

Especially with security-related systems, it's probably a good idea to sit down periodically and re-validate all of your assumptions about how they need to work. It's very easy for your threat model to shift (as ours did), as well as your goals and needs. There's also the question of how much security the system has to provide, and at what cost (in potential misfires, complexity, and so on). You may find that the passage of time has changed your views on this for various reasons.

NFSMountAuthShiftingGoals written at 01:11:45

2018-04-25

An implementation difference in NSS netgroups between Linux and Solaris

NSS is the Name Service Switch, or as we normally know it, /etc/nsswitch.conf. The purpose of NSS is to provide a flexible way for sysadmins to control how various things are looked up, instead of hard-coding it. For flexibility and simplicity, the traditional libc approach is to use loadable shared objects to implement the various lookup methods that nsswitch.conf supports. The core C library itself has no particular knowledge of the 'files' or 'dns' nsswitch.conf lookup types; instead they're implemented in shared libraries such as libnss_files.
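As a concrete illustration, a typical nsswitch.conf has lines like the following (a generic example, not any particular system's file); the 'files' and 'dns' words are what get mapped to shared objects such as libnss_files and libnss_dns:

  hosts:     files dns
  passwd:    files
  netgroup:  files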

(This is a traditional source of inconvenience when building software, because it usually makes it impossible to create a truly static binary that uses NSS-based functions. Those functions intrinsically want to parse nsswitch.conf and then load appropriate shared objects at runtime. Unfortunately this covers a number of important functions, such as looking up the IP addresses for hostnames.)

The general idea of NSS and the broad syntax of nsswitch.conf are portable between any number of Unixes, fundamentally because it's a good idea. The shared object implementation technique is reasonably common; it's used in at least Solaris and Linux, although I'm not sure about elsewhere. However, the actual API between the C library and the NSS lookup modules is not necessarily the same, not just in things like the names of functions and the parameters they get passed, but even in how operations are structured. As it happens, we've seen an interesting and fundamental example of this divergence.

Because it comes from Sun, one of the traditional things that NSS supports looking up is netgroup membership, via getnetgrent() and friends. In the Solaris implementation of NSS's API for NSS lookup types, all of these netgroup calls are basically passed directly through to your library. When a program calls innetgr(), there is a whole chain of NSS API things that will wind up calling your specific handler function for this if you've set one. This handler function can do unusual things if you want, which we use for our custom NFS mount authorization.

We've looked at creating a similar NSS netgroup module for Linux (more than once), but in the end we determined it's fundamentally impossible because Linux implements NSS netgroup lookups differently. Specifically, Linux NSS does not make a direct call to your NSS module to do an innetgr() lookup. On Linux, NSS netgroup modules only implement the functions used for getting the entire membership of a netgroup, and glibc implements innetgr() internally by looping through all the entries of a given netgroup and checking each one. This reduces the API that NSS netgroup modules have to implement but unfortunately makes our hack impossible, because it relies on knowing which specific host you're checking for netgroup membership.
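You can get a rough feel for the Linux-side shape of this from the command line (this is an analogy, not what glibc literally does internally, and the netgroup and host names are invented):

  # enumerate the whole netgroup, then scan the result for the host, which is
  # essentially what glibc's innetgr() has to do through the NSS module
  getent netgroup nfs-clients | grep -F 'ahost.example.com'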

At one level this is just an implementation choice (and a defensible one in both directions). At another level, this says something about how Solaris and Linux see netgroups and how they expect them to be used. Solaris's implementation permits efficient network-based innetgr() checks, where you only have to transmit the host and netgroup names to your <whatever> server and it may have pre-built indexes for these lookups. The Linux version requires you to implement a smaller API, but it relies on getting a list of all hosts in a netgroup being a cheap operation. That's probably true today in most environments, but it wasn't in the world where netgroups were first created, which is why Solaris does things the way it does.

(Like NSS, netgroups come from Solaris. Well, they come from Sun; netgroups predate Solaris, as they're part of YP/NIS.)

NSSNetgroupsDifference written at 01:33:20

2018-04-20

The increasingly surprising limits to the speed of our Amanda backups

When I started dealing with backups, the slowest part of the process was generally writing things out to tape, which is why Amanda was much happier when you gave it a 'holding disk' that it could stage all of the backups to before it had to write them out to tape. Once you had that in place, the speed limit was generally some mix of the network bandwidth to the Amanda server and how fast the machines being backed up could grind through their filesystems to create the backups. When networks moved to 1G, you (and we) usually wound up being limited by the speed of reading through the filesystems to be backed up.

(If you were backing up a lot of separate machines, you might initially be limited by the Amanda server's 1G of incoming bandwidth, but once most machines started finishing their backups you usually wound up with one or two remaining machines that had larger, slower filesystems. This slow tail wound up determining your total backup times. This was certainly our pattern, especially because only our fileservers have much disk space to back up. The same has typically been true of backing up multiple filesystems in parallel from the same machine; sooner or later we wind up stuck with a few big, slow filesystems, usually ones we're doing full dumps of.)

Then we moved our Amanda servers to 10G-T networking and, from my perspective, things started to get weird. When you have 1G networking, the network is generally slower than even a single holding disk; unless something's broken, modern HDs will generally do at least 100 Mbytes/sec of streaming writes, which is enough to keep up with a full speed 1G network. However, that 100 Mbytes/sec or so is only just over the 1G data rate, which means that a single HD is vastly outpaced by a 10G network. As long as we had a number of machines backing up at once, the Amanda holding disk was suddenly the limiting factor. However, for a lot of the run time of backups we're only backing up our fileservers, because they're where all the data is, and for that we're currently still limited by how fast the fileservers can do disk IO.
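The rough arithmetic here, rounding off and ignoring protocol overhead, goes like this:

  1G Ethernet:   1,000 Mbits/sec / 8 = ~125 Mbytes/sec    (one HD at 100+ Mbytes/sec keeps up)
  10G Ethernet: 10,000 Mbits/sec / 8 = ~1,250 Mbytes/sec  (roughly ten times a single HD)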

(The fileservers only have 1G network connections for reasons. However, usually it's disk IO that's the limiting factor, likely because scanning through filesystems is seek-limited. Also, I'm ignoring a special case where compression performance is our limit.)

All of this is going to change in our next generation of fileservers, which will have both 10G-T networking and SSDs. Assuming that the software doesn't have its own IO rate limits (which is not always a safe assumption), both the aggregate SSDs and all the networking from the fileservers to Amanda will be capable of anywhere from several hundred Mbytes/sec up to as much 10G bandwidth as Linux can deliver. At this point the limit on how fast we can do backups will be down to the disk speeds on the Amanda backup servers themselves. These will probably be significantly slower than the rest of the system, since even striping two HDs together would only get us up to around 300 Mbytes/sec at most.

(It's not really feasible to use a SSD for the Amanda holding disk, because it would cost too much to get the capacities we need. We currently dump over a TB a day per Amanda server, and things can only be moved off the holding disk at the now-paltry HD speed of 100 to 150 Mbytes/sec.)

This whole shift feels more than a bit weird to me; it's upended my perception of what I expect to be slow and what I think of as 'sufficiently fast that I can ignore it'. The progress of hardware over time has made it so the one part that I thought of as fast (and that was designed to be fast) is now probably going to be the slowest.

(This sort of upset in my world view of performance happens every so often, for example with IO transfer times. Sometimes it even sticks. It sort of did this time, since I was thinking about this back in 2014. As it turned out, back then our new fileservers did not stick at 10G, so we got to sleep on this issue until now.)

AmandaWhereSpeedLimits written at 23:28:38

2018-04-13

A learning experience about the performance of our IMAP server

Our IMAP server has never been entirely fast, and over the years it has slowly gotten slower and more loaded down. Why this was so seemed reasonably obvious to us; handling mail over IMAP required a fair amount of network bandwidth and a bunch of IO (often random IO) to our NFS fileservers, and there was only so much of that to go around. Things were getting slowly worse over time because more people were reading and storing more mail, while the hardware wasn't changing.

We have a long-standing backwards compatibility issue with our IMAP server, where people's IMAP clients have full access to their $HOME and would periodically go searching through all of it. Recently this started causing us serious problems, like running out of inodes on the IMAP server, and it became clear that we needed to do something about it. After a number of false starts (eg), we wound up doing two important things over the past two months. First we blocked Dovecot from searching through a lot of directories, and then we started manually migrating users one by one to a setup where their IMAP sessions could only see their $HOME/IMAP instead of all of their $HOME. The two changes together significantly reduced the number of files and directories that Dovecot scans through (and sometimes opens to count messages).
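For illustration, confining IMAP sessions to $HOME/IMAP is the sort of thing you get from a Dovecot mail_location setting along these lines (a hedged sketch, not our exact configuration):

  # mbox folders live under ~/IMAP; the inbox itself stays in /var/mail
  mail_location = mbox:~/IMAP:INBOX=/var/mail/%u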

Well, guess what. Starting immediately with our first change and increasing as we migrated more and more high-impact users, the load on our IMAP server has been dropping dramatically. This is most clearly visible in the load average itself, where it's now entirely typical for the daytime load average to be under one (a level that was previously only achieved in the dead of night). The performance of my test Thunderbird setup has clearly improved, too, rising almost up to the level that I get on a completely unloaded test IMAP server. The change has basically been night and day; it's the most dramatic performance shift I can remember us managing (larger than finding our iSCSI problem in 2012). While the IMAP server's performance is not perfect and it can still bog down at some times, it's become clear that all of the extra scanning that Dovecot was doing was behind a great deal of the performance problems we were experiencing and that getting rid of it has had a major impact.

Technically, we weren't actually wrong about the causes of our IMAP server being slow; it definitely was due to network bandwidth and IO load issues. It's just that a great deal of that IO was completely unproductive and entirely avoidable, and if we had really investigated the situation we might have been able to improve the IMAP server long ago.

(And I think it got worse over time partly because more and more people started using clients, such as the iOS client, that seem to routinely use expensive scanning operations.)

The short and pungent version of what we learned is that IMAP servers go much faster if you don't let them do stupid things, like scan all through people's home directories. The corollary to this is that we shouldn't just assume that our servers aren't doing stupid things.

(You could say that another lesson is that if you know that your servers are occasionally doing stupid things, as we did, perhaps you should try to measure the impact of those things. But that's starting to smell a lot like hindsight bias.)

IMAPPerformanceLesson written at 02:06:21

2018-04-07

Some numbers for how well various compressors do with our /var/mail backup

Recently I discussed how gzip --best wasn't very fast when compressing our Amanda (tar) backup of /var/mail, and mentioned that we were trying out zstd for this. As it happens, as part of our research on this issue I ran one particular night's backup of our /var/mail through all of the various compressors to see how large they'd come out, and I think the numbers are usefully illustrative.

The initial uncompressed tar archive is roughly 538 GB and is probably almost completely ASCII text (since we use traditional mbox format inboxes and most email is encoded to 7-bit ASCII). The compression ratios are relative to the uncompressed file, while the times are relative to the fastest compression algorithm. Byte sizes were counted with 'wc -c', instead of writing the results to disk, and I can be confident that the compression programs were the speed limit on this system, not reading the initial tar archive off SSDs.

                 Compression ratio   Time ratio
  uncompressed         1.0              0.47
  lz4                  1.4              1.0
  gzip --fast          1.77            11.9
  gzip --best          1.87            17.5
  zstd -1              1.92             1.7
  zstd -3              1.99             2.4

(The 'uncompressed' time is for 'cat <file> | wc -c'.)

On this very real-world test for us, zstd is clearly a winner over gzip; it achieves better compression in far less time. gzip --fast takes about 32% less time than gzip --best at only a moderate cost in compression ratio, but it's not competitive with zstd in either time or compression. Zstd is not as fast as lz4, but it's fast enough while providing clearly better compression.

We're currently using the default zstd compression level, which is 'zstd -3' (we're just invoking plain '/usr/bin/zstd'). These numbers suggest that we'd lose very little compression from switching to 'zstd -1' but get a significant speed increase. At the moment we're going to leave things as they are because our backups are now fast enough (backing up /var/mail is now not the limiting factor on their overall speed) and we do get something for that extra time. Also, it's simpler; because of how Amanda works, we'd need to add a script to switch to 'zstd -1'.

(Amanda requires you to specify a program as your compressor, not a program plus arguments, so if you want to invoke the real compressor with some non-default options you need a cover script.)
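The cover script itself would be tiny, something along these lines (a sketch that assumes Amanda runs the compressor as a stdin-to-stdout filter and passes -d when it needs to uncompress):

  #!/bin/sh
  # run zstd at level 1, passing along whatever arguments Amanda supplies
  # (such as -d for decompression)
  exec /usr/bin/zstd -1 "$@"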

Since someone is going to ask, pigz --fast got a compression ratio of 1.78 and a time ratio of 1.27. This is extremely unrepresentative of what we could achieve in production on our Amanda backup servers, since my test machine is a 16-CPU Xeon Silver 4108. The parallelism speed increase for pigz is not perfect, since it was only about 9.4 times faster than gzip --fast (which runs on a single CPU).

(Since I wanted to see the absolute best case for pigz in terms of speed, I let it use all of the CPUs. I'm not interested in doing more tests to establish how it scales when run with fewer CPUs, since we're not going to use it; zstd is better for our case.)

PS: I'm not giving absolute speeds because these speeds vary tremendously across our systems and also depend on what's being compressed, even with just ASCII text.

BackupCompressionNumbers written at 01:13:23

2018-04-04

Today's learning experience is that gzip is not fast

For reasons beyond the scope of this entry, we have a quite large /var/mail and we take a full backup of it every night. In order to save space in our disk-based backup system, for years we've been having Amanda compress these backups on the Amanda server; since we're backing up ASCII text (even if it represents encoded and compressed binary things), they generally compress very well. We did this in the straightforward way; as part of our special Amanda dump type that forces only full backups for /var/mail, we said 'compress server best'. This worked okay for years, which enticed us into not looking at it too hard until we recently noticed that our backups of /var/mail were taking almost ten hours.

(They should not take ten hours. /var/mail is only about 540 GB and it's on SSDs.)
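For the record, the relevant bits of such a dump type look something like this (a sketch with a made-up name, not our exact configuration):

  define dumptype comp-best-full {
      program "GNUTAR"
      strategy noinc            # always do full backups
      compress server best      # ie 'gzip --best' on the Amanda server
  }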

It turns out that Amanda's default compression uses gzip, and when you tell Amanda to use the best compression it uses 'gzip --best', aka 'gzip -9'. Now, I was vaguely aware that gzip is not the fastest compression method in the world (if only because ZFS uses lz4 compression by default and recommends you avoid gzip), but I also had the vague impression that it was reasonably decently okay as far as speed went (and I knew that bzip2 and xz were slower, although they compress better). Unfortunately my impression turns out to be very wrong. Gzip is a depressingly slow compression system, especially if you tell it to go wild and try to get the best compression it can. Specifically, on our current Amanda server hardware 'gzip --best' appears to manage a rate of only about 16 MBytes a second. As a result, our backups of /var/mail are almost entirely constrained by how slowly gzip runs.
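That rate lines up with the backup times we were seeing; the arithmetic is roughly:

  540 GB * 1024 MB/GB / 16 MB/sec = 34,560 seconds = about 9.6 hours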

(See lz4's handy benchmark chart for one source of speed numbers. Gzip is 'zlib deflate', and zlib at the 'compress at all costs' -9 level isn't even on the benchmark chart.)

The good news is that there are faster compression programs out there, and at least some of them are available pre-packaged for Ubuntu. We're currently trying out zstd as probably having a good balance between running fast enough for us and having a good compression ratio. Compressing with lz4 would be significantly faster, but it also appears that it would get noticeably less compression.

It's worth noting that not even lz4 can keep up with full 10G Ethernet speeds (on most machines). If you have a disk system that can run fast enough (which is not difficult with modern SSDs) and you want to saturate your 10G network during backups, you can't do compression in-stream; you're going to have to capture the backup stream to disk and then compress it later.

PS: There's also parallel gzip, but that has various limitations in practice; you might have multiple backup streams to compress, and you might need that CPU for other things too.

GzipNotFast written at 02:14:06

2018-03-31

Using a local database to get consistent device names is a bad idea

People like consistent device names, and one of the ways that Unixes have historically tried to get them is to keep a local database of known devices and their names, based on some sort of fingerprint of the device (the MAC address is a popular fingerprint for Ethernet interfaces, for example). Over the years various Unixes have implemented this in different ways; for example, some versions of Linux auto-created udev rules for some devices, and Solaris and derivatives have /etc/path_to_inst. Unfortunately, I have to tell you that trying to get consistent device names this way turns out to be a bad idea.
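For instance, the auto-created udev rules were lines in the style of the old 70-persistent-net.rules file, something like this (with a made-up MAC address):

  SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:16:3e:12:34:56", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

The rule fingerprints the interface by its MAC address and pins it to 'eth0'; a different MAC gets a new rule and a new name.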

The fundamental problem is that if you keep a database of local device names, your device names depend on the history of the system. This has two immediate bad results. First, if you have two systems with identical hardware running identical software they won't necessarily use the same device names, because one system could have previously had a different hardware configuration. Second, if you reinstall an existing system from scratch you won't necessarily wind up with the same device names, because your new install won't necessarily have the same history as the current system does.

(Depending on the scheme, you may also have the additional bad result that moving system disks from one machine to an identical second machine will change the device names because things like MAC addresses changed.)

Both of these problems are bad once you start dealing with multiple systems. They make your systems inconsistent, which increases the work required to manage them, and they make it potentially dangerous to reinstall systems. You wind up either having to memorize the differences from system to system or needing to assemble your own layer of indirection on top of the system's device names so you can specify things like 'the primary network interface, no matter what this system calls it'.

Now, you can have these machine to machine variation problems even with schemes that derive names from the hardware configuration. But with such schemes, at least you only have these problems on hardware that's different, not on hardware that's identical. If you have truly identical hardware, you know that the device names are identical. By extension, you know that the device names will be identical after a reinstall (because the hardware is the same before and after).

I do understand the urge to have device names that stay consistent even if you change the hardware around a bit, and I sometimes quite like them myself. But I've come to think that such names should be added as an extra optional layer on top of a system that creates device names that are 'stateless' (ie don't care about the past history of the system). It's also best if these device aliases can be based on general properties (or set up by hand in configuration files), because often what I really want is an abstraction like 'the network interface that's on network X' or 'the device of the root filesystem'.

NoConsistentNamesDB written at 20:18:02
