Modern email is hard to search, which encourages locked up silos
Today, I tweeted:
One of the things I don't like about modern email is how you can't really grep it in raw as-received form, because too many emails are encoded in eg base-64. Before you can thoroughly search you must parse and de-MIME everything (and store it that way).
Once upon a time, email was plain text (or at least mostly). This had the useful consequence that you could dump it in one or more files (even one file per email message) and then do basic searches through it with any tool that could search through plain text. There are a lot of tools that search through plain text, especially on Unix.
(Unless you only had one message per file, it was never quite true that you could do good searching without parsing the email structure in any way. If you wanted to search for two things being mentioned in the same email message, you needed something that could understand message boundaries. But this was not that much work, and you could construct it with brute force if you had to.)
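As an illustration of that brute force approach, here is a small Python sketch that splits an mbox-style file on its classic 'From ' separator lines and then looks for two terms appearing in the same message. The sample mbox text is invented for the example:

```python
import re

def messages(mbox_text):
    # Split mbox-format text into individual messages on the classic
    # "From " separator lines (the brute force message-boundary parse).
    return re.split(r"(?m)^From ", mbox_text)[1:]

def grep_both(mbox_text, term1, term2):
    # Find messages mentioning both terms, which a plain line-oriented
    # grep over the whole file can't express.
    return [m for m in messages(mbox_text) if term1 in m and term2 in m]

mbox = ("From a@b Mon\nSubject: one\n\nfoo bar\n\n"
        "From c@d Tue\nSubject: two\n\nfoo baz\n")
print(len(grep_both(mbox, "foo", "bar")))  # 1
```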
Those days are of course long gone. A lot of email today is encoded, for example because it contains some UTF-8 characters and email is still theoretically seven-bit ASCII only. This means that in order to do a good job of searching email, you must be able to decode all of this, which requires being able to parse the MIME structure of email messages.
Parsing MIME and decoding email messages is not the fastest thing in the world (and it's also not the easiest). If you want to do fast searching or use general text searching tools, you can no longer store email in its raw, as-received state; you need to come up with some partially or completely decoded format. There's no standard storage format for this, so everyone makes up their own, then doesn't document it or commit to preserving backwards compatibility with it in the future. This restricts what tools can be used to do even basic text searches on your archived email, and is part of what encourages custom archiving formats.
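As a sketch of the decoding step, Python's standard email module can parse the MIME structure and hand back decoded text parts, turning a base64-encoded body back into something grep-able. The sample message here is made up:

```python
from email import message_from_bytes, policy

# A minimal invented message whose body is base64-encoded, so a raw
# grep for "world" would find nothing.
RAW = b"""\
From: someone@example.org
Subject: Test
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: base64

aGVsbG8gd29ybGQ=
"""

def searchable_text(raw):
    # Parse the MIME structure and decode every text/plain part, so
    # base64 and quoted-printable bodies become plain searchable text.
    msg = message_from_bytes(raw, policy=policy.default)
    parts = []
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            parts.append(part.get_content())
    return "\n".join(parts)

print("world" in searchable_text(RAW))  # True; the raw bytes don't contain it
```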
The result is that to do a good job of searching modern email, you need to use a relatively narrow range of tools instead of having your pick of anything that can search or index text. Often your tools will be restricted to whatever's built into your mail client.
(The extreme case of this is web-based mail systems where you don't normally get the text form of the mail at all, just a rendered version, and all of the searching happens on the server with whatever features, tools, and decoding that they choose to support.)
Problems in the way of straightforward device naming in operating systems
I've recently been writing (again) about Linux network interface names, this time what goes into udev's normal device names. This is a perennial topic in many operating systems; people are forever wanting straightforward and simple names for devices (networks, disk drives, and so on) and forever irritated that operating systems don't seem to be able to deliver this. Unix network device naming makes an illustrative example of everything that adds complexity, even without hot-plugged devices.
Once upon a time the name of Unix Ethernet devices was simple; they were eth0, eth1, eth2, and so on. This is an appealing naming scheme for networks, disk drives, and so on; you number the devices, starting with eth0 and going upward from there. The big problem with this naming scheme is the question of how you decide what is the first Ethernet, and in general how you create an order for them and then keep it stable.
The minimum requirement for any naming scheme is that if the system is rebooted with no other changes, you get the same device names. In any operating system that probes and registers physical devices in parallel, this means you don't want to make the order be the order in which hardware is detected, because that might vary from reboot to reboot due to races. If one piece of hardware or one device driver is a little faster to respond this time around, you don't want what eth0 is to change. Operating systems could probe and register hardware one at a time, but this is unpopular because it can take a while and slow down reboots. Generally this means that you have to order devices based on either how they're connected to the system or some specific characteristics they have.
(The hardware changing how fast it responds may be unlikely with network interfaces, but consider disk drives.)
The next thing you want for a naming scheme is that existing devices don't have their names changed if you add or remove hardware. If you already have eth0 and eth1 and you add two new network cards (each with one interface), you want those two new interfaces to be eth2 and eth3 (in some order). If you later take out the card for eth2, most people want eth3 to stay eth3. To make this case more tricky, if the card for eth2 fails and you replace it with an identical new card, most people want the new card's network interface to also be eth2, although it will of course have a different unique Ethernet hardware address.
Historically, some operating systems have attempted to implement this sort of long term stable device naming scheme by maintaining a registry of associations between device names and specific hardware. This creates its own set of problems, because now your replacement eth2 is most likely going to be eth4, but if you reinstall the machine from scratch it will be eth2 again. This leads to the third thing you want in a naming scheme, which is that if two machines have exactly the same hardware, they should have the same device names. Well, you may not want this, but system administrators generally do.
Most modern general purpose computers use PCIe, even if they're not based on x86 CPUs. PCIe has short identifiers for devices that are stable over simple reboots in the form of PCIe bus addresses, but unfortunately it doesn't have short identifiers that are stable over hardware changes. Adding, removing, or changing PCIe hardware can insert or remove PCIe busses in the system's PCIe topology, which will renumber some other PCIe busses (busses that are enumerated after the new PCIe bus you've added). PCIe can have fully stable identifiers for devices, but they aren't short since you have to embed the entire PCIe path to the device.
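As a rough illustration of location-based naming, here is a simplified Python sketch of udev's 'enpXsY' style of deriving a network interface name from a PCIe bus address. This is a toy version; real udev also considers PCI domains, multi-function flags, firmware-provided names, and more:

```python
def udev_style_name(pci_address):
    # pci_address is "domain:bus:slot.function" in hex, eg "0000:00:1f.6".
    # Simplified sketch of udev's "enpXsY[fZ]" naming: the name encodes
    # where the card is plugged in, not the order it was detected in.
    _domain, bus, rest = pci_address.split(":")  # domain ignored in this sketch
    slot, func = rest.split(".")
    name = f"enp{int(bus, 16)}s{int(slot, 16)}"
    if int(func, 16) != 0:
        name += f"f{int(func, 16)}"
    return name

print(udev_style_name("0000:00:1f.6"))  # enp0s31f6
```

Because the name is derived from the PCIe bus address, it survives reboots, but (as described above) it can still change if adding hardware renumbers PCIe busses.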
It's possible to reduce and postpone these problems by expanding the namespace of device names. For instance, you can make the device names depend on the hardware driver being used, so instead of eth0 and eth1, you have things like bge0 and em0. However, this doesn't eliminate the problem, since you can have multiple pieces of hardware that use the same driver (so you have ix0, ix1, and so on). It also makes life inconvenient for people, because now they have to remember what sort of hardware a given system has. A long time ago, Unix disk device names were often dependent on the specific hardware controller being used, but people found that they rather preferred only having to remember and use one generic set of disk names.
(The third problem is that your device names can change if you replace a network card with a different type of network card; you might go from a bge0 to an em0. It might even be from the same vendor. Intel has had several generations of networking chipsets, which are supported by different drivers on Linux. My office machine has two Intel 1G interfaces, each using a different driver.)
University computer accounts are often surprisingly complicated
In many organizations, the life cycle of computer accounts is theoretically relatively straightforward. Your employees have an official HR start date, at which point their account comes into being, and eventually they will have a departure date, at which point their account goes away. There are wrinkles, but you can mostly get away with driving your account system from HR data. Often it's explicitly a good idea to do this, to make sure that people who are no longer employed also no longer have computer accounts (or at least access to them).
This is not how it goes in a university, at least on the research side of a university department (the teaching side is more straightforward). To start with, people have all sorts of relationships with the department that creates computer accounts; there are professors, graduate students, staff, postdocs, research assistants and research staff hired by professors, undergraduates doing research work with a professor or a group, research collaborators from inside or outside the university, visiting researchers, and more. Many of these relationships are not recorded in the university-wide HR and student information systems, and the university doesn't necessarily want them to be (especially for things like who is collaborating with who for research). Some of these people don't have any formal association with the university that would be reflected in university-wide HR systems, as they aren't being paid by the university, attending as a student, or otherwise officially recorded as being eg a 'visiting professor (status only)'.
When people have relationships that are officially recorded, such as being professors or graduate students, the start date of their official relationship may be well after we want to give them a computer account. For example, when new faculty are hired into the department, they have an official start date that may be some time in the future, but the department usually wants to start integrating them right away, so they can both feel welcomed and hit the ground running. The same is true for end dates in many cases. For example, just because a student has graduated doesn't mean that they've stopped interacting with people in the department and should be cut off by having their computer account closed. Some sorts of official relationships can go on hold for a while, such as a graduate student taking a leave of absence, and when this happens we definitely don't want to remove their computer account.
(In addition, the end dates of even the formal associations like graduate students graduating can be uncertain and changeable. People usually don't finish their thesis and thesis defense on an exact schedule that's known well in advance and never changes or slips.)
Complicating the life cycle is that people frequently move back and forth between different relationships with the department. In an extreme example, an undergraduate doing work with a professor can become a graduate student, then a postdoc, then a remote collaborator, and then come back to be faculty themselves. Graduate students can be hired as (part-time) staff, and then sometimes they become full time staff and no longer graduate students.
Visitors are their own collection of complications. Visiting researchers may be here for an extended period or just a flying visit where they're around for a week. They aren't necessarily from another university, since plenty of research is done in industry. They may have already collaborated with professors here and been given computer accounts, or they may not. Currently, they're often not entitled (by the university's standards) to have a university-wide identifier. Even if they are entitled to one, they may not be here long enough to go through the process for getting it (or be able to wait that long). And once they leave, we generally don't want to just delete their computer accounts, because they might either come back later or at least become collaborators.
A general theme of all of this is that the research side of an academic department runs on a broad network of relationships with all sorts of people that are developed and cultivated over time. Generally one of the last things the department wants to do is reduce those relationships through things like denying computer accounts or removing them. When graduate students or postdocs go off into the world, when visitors go home, and so on, the department wants them all to feel still connected to the department as much as is reasonably possible.
Unused hardware in computers can now be distinctly inactive
When I wrote about monitoring the state of Linux network interfaces with Prometheus, I wished for a way to tell if there was link carrier on an unused network interface (one that had not been configured as 'up' in the (Linux) kernel). Ben Hutchings commented on the entry to say (in part):
This is more than just a restriction of the kernel's user-space APIs. Network drivers are not expected to update link state while an interface is down, and they may not be able to. When an interface is down, it's often in a low power state where the PHY may not negotiate a link, or where the driver is unable to query the link status or receive link interrupts.
I'm from the era where both computer hardware and Unix device drivers were simpler things than they are today. Back in those days, you could safely assume that hardware became active once its device driver had initialized it, and it was to some degree running from then onward. You (the hardware driver) might tell the hardware to turn off interrupts if you weren't using it (for example, if no one had a serial port open or a network interface configured), but the hardware itself was generally operating and could at least have its status checked if you wanted to. If you wanted hardware to be genuinely inactive, you wanted to leave it untouched even by the driver, which often meant not configuring the driver into the kernel at all.
In this world, expecting that an unused network interface would still have link carrier information available was a natural thing. Once the kernel driver initialized the hardware (and perhaps even before then, from when power was applied), the hardware was awake and responding to the state of the outside world. Being unused was a higher level kernel issue, one that was irrelevant to the hardware itself.
Those days are long over. Today, hardware is much more complicated and can be controlled at a much finer level, so there are many more things that unused hardware may not be doing. As Hutchings noted, the kernel driver will probably opt to set unused hardware into some sort of low power state, and in this state it won't be doing all sorts of things that I expect. General kernel subsystems may go further, perhaps with driver support, doing things like turning down PCIe links or turning on larger scale power savings modes. I've already seen that PCIe slot bandwidth can change dynamically and that this can happen over surprisingly short time periods.
(One corollary and sort of inverse of this is that merely looking at the current status of some piece of hardware may push it into a more active, higher power state, because otherwise it can't answer your questions about how it is. We've seen this before, where if you told some HDs to power down and then asked them about their SMART status, they powered back up.)
Some of this capability probably comes as a free ride on what was necessary to support low power operation in laptops. With hardware standards like PCIe, driver and kernel support, and probably even some degree of PCIe interface chipsets (and chipset functional block IP) shared between laptops and other platforms, servers (and desktops) might as well benefit from the power savings advantages.
(People with large, high density server installs are probably also using servers that have exactly and only the hardware that they're actually going to use, so I wouldn't expect them to get much power savings from these features. But perhaps I'm wrong and they can routinely turn down network interfaces and so on.)
PS: This has also created the situation where at least some servers can genuinely suspend themselves to RAM (entering the ACPI S3 state) as if they were laptops, although getting them to un-suspend themselves is usually nowhere near as simple as opening your laptop again.
I like WireGuard partly because it doesn't have 'sessions'
I like WireGuard for any number of reasons, both on a technical level and on the pragmatic level of it working well. One of the pragmatic things I like about it is that at the level of the user experience, it doesn't have the idea of 'sessions' the way at least OpenVPN and L2TP do.
The problem with sessions is that an established session can be dropped or broken, and this can happen for relatively mysterious reasons (at least as the person using the VPN sees it). When a session is severed this way, software must establish a new connection. Often this process is user visible and causes issues like long-running TCP connections being broken (ie, your SSH logins). Sometimes one end thinks the session is dead and the other end doesn't realize this, so your activity just disappears into the void until either your software notices (and establishes a new connection) or you turn your VPN off and on again to do it by hand.
With WireGuard, you can have logical connections that you turn on or off in your configuration system of choice, but this is purely a user interface issue. The underlying protocol is connection-less and there's no session to break. If the underlying network path is interrupted for a while, neither end of the WireGuard connection will get upset. Packets will get lost for a while, then start getting delivered again, and any long-running TCP connections that break will break for the natural reason that the connection itself timed out. How WireGuard works even lets you move one end between networks without having things explode (I've gone through this).
The practical result is that a WireGuard connection is likely to be quite resilient, and if it isn't the problem will be pretty clearly in the network path between the two endpoints. Even if the network path is flaky, a WireGuard connection may simply be slow. Meanwhile, OpenVPN and L2TP sessions seem to break if you sneeze on them.
Another thing that helps this out is that WireGuard's connection-less nature pushes strongly toward static IP address assignment for the clients. I'm not sure it's completely required, but it's certainly what the tools want you to do. With static IP address assignment as the default, a given client will always have the same IP even if it has to re-handshake, and ongoing traffic to it will naturally continue. With more session oriented VPN protocols, a client may well get a different IP when it establishes a new session, and then generally all of your old TCP (and UDP) connections are intrinsically dead.
PS: The flipside of this session-less nature is that you need a separate WireGuard identity (ie a client configuration) for each device someone wants to use, since two devices can't share a WireGuard identity at once. Other VPN protocols can potentially allow this.
A strong commitment to backwards compatibility means keeping your mistakes
Plenty of people like backwards compatibility, especially strong backwards compatibility. But it has a sometimes unpleasant consequence, which is that a strong commitment to backwards compatibility requires keeping your mistakes. Or at least many of them. To put it one way, you need to keep mistakes that work, and of course you have to keep them giving the same result as they currently do. For example, if you provide an API that people can use to express potentially conflicting things and you don't reject the attempt but instead give some deterministic result, you're stuck with it.
You don't have to have a strong commitment to backwards compatibility, of course, and many people think it's better not to. Microsoft Windows is a big example of such a strong commitment (what has sometimes been called 'bug for bug compatibility'), and while it's given Microsoft a lot of commercial success it's also left them with a lot of technical challenges that have required heroic (and expensive) work to deal with. Backwards compatibility in general is definitely something that helps some groups at the potential expense of others, and it has a cost regardless of whatever benefits it gives to people.
But if you're going to make a strong commitment to backwards compatibility, it comes with the warts, including that your mistakes have to be preserved. If you don't want to have to preserve your mistakes, you should be honest about the limits of your commitment. You don't have to do this (you don't have to do anything), but if you don't it can surprise people and make them unhappy.
(These days, one limit you might want to write in is that you'll break backwards compatibility if it's the only way to fix a sufficiently serious security issue.)
Another way to not have to preserve your mistakes is to do your best to make sure you aren't making any before you commit to something. This can mean not shipping something at all until you're confident in it, or only shipping something as explicitly not covered. The latter is dangerous, though, because regardless of what you say some people will come to rely on your 'experimental' feature and then have problems when it changes.
(Among other reasons, people will rely on experimental features when it's the only way to get their work done. If your choice is 'don't do this at all' or 'rely on an experimental feature', a lot of people will be strongly pushed to the latter. These people are not making a mistake; they're doing something you can predict in advance if you want to.)
TLS certificate durations turn out to be complicated and subtle
The surprising TLS news of the time interval is that Let's Encrypt made a systematic mistake in issuing all of their TLS certificates for years. Instead of being valid for exactly 90 days, Let's Encrypt certificates were valid for 90 days plus one second. This isn't a violation of the general requirements for Certificate Authorities on how long TLS certificates can be, but it was a violation of Let's Encrypt's own certificate policy.
TLS certificates have 'not before' and 'not after' times. For ages, Let's Encrypt (and almost everyone else) has been generating these times by taking a start time and adding whatever duration to it. You can see an example of this in some completely unrelated code in my entry on how TLS certificates have two internal representations of time, where the certificate starts and ends on the same hour, minute, and second (19:40:26 in the entry). However, it turns out that the TLS certificate time range includes both the start and the end times; it's not 'from the start time up to but not including the end time'. Since this includes both the second at the start and the second at the end, a simple 'start time plus duration' is one second too long.
(A properly issued literal 90 day certificate from Let's Encrypt now has an ending seconds value that's one second lower than it starts, for example having a not before of 2021-06-10 15:31:37 UTC and a not after of 2021-09-08 15:31:36.)
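The corrected arithmetic is simple once you know the rule; here's a small Python sketch of it, using the example times above:

```python
from datetime import datetime, timedelta

def not_after(not_before, days):
    # TLS validity includes both endpoint seconds, so a certificate
    # valid for exactly `days` days must end one second *before*
    # the naive start-plus-duration time.
    return not_before + timedelta(days=days) - timedelta(seconds=1)

nb = datetime(2021, 6, 10, 15, 31, 37)
print(not_after(nb, 90))  # 2021-09-08 15:31:36
```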
This is already a tricky issue but the Mozilla bug gets into an even more tricky one, which is fractional seconds. If a certificate has a 'not after' of 15:31:36, is it valid right up until 15:31:37.000, or does it stop being valid at some time after 15:31:36.000 but before 15:31:37.000? The current answer is that it's valid all the way up to but not including 15:31:37.000, per Ryan Sleevi's comment, but there's some discussion of that view in general and it's possible there will be a revision to consider these times to be instants.
(People are by and large ignoring leap seconds, because everyone ignores them.)
All of this careful definition of not before and not after lives in the abstract world of RFCs and requirements for Certificate Authorities, but not necessarily in what actual software does. Some versions of OpenSSL apparently treat both the not before and not after times as exclusive when validating TLS certificates (cf); the time must be after the not before time and before the not after time. Other software may have similar issues, especially treating the not after time as the point where the certificate becomes invalid. I would like to say that it also doesn't matter in actual practice, but with TLS's luck someone is eventually going to find an attack that exploits this. Weird things happen in the TLS world.
PS: Let's Encrypt's just updated CPS deals with the whole issue by simply saying they will issue certificates for less than 100 days.
PPS: Some certificate reporting software may not even print the seconds for the not before and not after fields. I can't entirely blame it, even though that's currently a bit inconvenient.
TLS certificates have at least two internal representations of time
TLS certificates famously have a validity period, expressed as 'not before' and 'not after' times. These times can have a broad range, and there are some TLS Certificate Authority root certificates that already have 'not after' times relatively far in the future (as I mentioned here). All TLS certificates, including CA root certificates, are encoded in ASN.1. Recently I was both generating long-lived certificates and peering very closely into them in an attempt to figure out why my new certificates weren't working, and in the process of doing so I discovered that ASN.1 has at least two representations of time and what representation a TLS certificate uses depends on the specific time.
Most TLS certificates you will encounter today encode time in what 'openssl asn1parse' calls a UTCTIME. If you have a TLS certificate with a sufficiently far in the future time, it will instead be represented as what OpenSSL calls a GENERALIZEDTIME. Somewhat to my surprise, both of these turn out to be strings under the covers and the reason that TLS switches from one to the other isn't what I thought it was. I'll start by showing the encoding for a not before and a not after date (and time) for a certificate I generated:

  UTCTIME         :210531194026Z
  GENERALIZEDTIME :20610521194026Z
This certificate is valid from 2021-05-31 19:40 UTC to 2061-05-21 19:40 UTC. The Z says this is in UTC, the '194026' is 19:40:26, and the '0531' and '0521' are the month and day. The difference between the two time formats is at the front; the UTCTIME starts with '21' while the other starts with '2061'.
When I started looking into the details of this, I assumed that the choice between one or the other form was because of the year 2038 problem. This is not the case, since UTCTIME is not represented as any sort of Unix epoch timestamp and has no such limits. Instead, UTCTIME's limitation is that it only uses a two-digit year. As covered in RFC 5280, if the two year digits are 00 to 49, the year is 20yy, and for 50 to 99, it is 19yy. This means that a UTCTIME can only represent times up to the end of 2049. The certificate I generated is valid past that, so it must use the more general version.
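A small Python sketch of the two encodings and the 2049 cutoff, following RFC 5280's YYMMDDHHMMSSZ and YYYYMMDDHHMMSSZ string forms:

```python
from datetime import datetime

def asn1_time(dt):
    # RFC 5280: times through the end of 2049 must use two-digit-year
    # UTCTime; 2050 and later must use four-digit-year GeneralizedTime.
    if dt.year < 2050:
        return dt.strftime("%y%m%d%H%M%SZ")   # UTCTIME
    return dt.strftime("%Y%m%d%H%M%SZ")       # GENERALIZEDTIME

print(asn1_time(datetime(2021, 5, 31, 19, 40, 26)))  # 210531194026Z
print(asn1_time(datetime(2061, 5, 21, 19, 40, 26)))  # 20610521194026Z
```

These match the two encoded values shown earlier for my generated certificate.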
In theory, all code that deals with TLS certificates should be able to deal with both forms of time. This is a low level concern that the ASN.1 parsing library should normally hide from programs, and both forms have been valid since RFC 2459 from 1999. In practice, I suspect that there's almost no use of the second time format in certificates today, so I suspect that there's at least some software that mishandles them. For general use, we have years to go before this starts to be an issue (starting with CA root certificates that push their expiry date into 2050 and beyond).
For our own use, I think I'm going to limit certificate validity to no later than 2049. The more cautious approach is to assume that there's a Unix timestamp somewhere in the chain of processing things and stick to times that don't go beyond the year 2038 boundary.
(I think that these are the only two ASN.1 time representations that are considered valid in TLS certificates on the Internet, but I haven't carefully gone through the RFCs and other sources of information to be sure. So I'm being cautious and saying that TLS certificates have 'at least' two representations of time.)
Understanding OpenSSH's future deprecation of the 'ssh-rsa' signature scheme
OpenSSH 8.6 was recently released, and its release notes have a 'future deprecation notice' as has every release since OpenSSH 8.2:
Future deprecation notice
It is now possible to perform chosen-prefix attacks against the SHA-1 algorithm for less than USD$50K.
In the SSH protocol, the "ssh-rsa" signature scheme uses the SHA-1 hash algorithm in conjunction with the RSA public key algorithm. OpenSSH will disable this signature scheme by default in the near future.
More or less a year ago I flailed around about what this meant. Now I think that I understand more about what is going on, enough so to talk about what is really affected and why. Helping this out is that since the OpenSSH 8.5 release notes, OpenSSH has had the current, more explicit wording above about the situation.
When we use public key cryptography to sign or encrypt something, we generally don't directly sign or encrypt the object itself. As covered in Soatok's Please Stop Encrypting with RSA Directly, for encryption we normally use public key encryption on a symmetric key that the message itself is encrypted with. For signing, we normally hash the message and then sign the hash (see, for instance, where cryptographic hashes come into TLS certificates). OpenSSH is no exception to this; it has both key types and key signature schemes (or algorithms), the latter of which specify the hash type to be used.
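As a toy illustration of this hash-then-sign structure, here is textbook RSA with tiny, completely insecure numbers; it is nothing like what OpenSSH actually does on the wire, but it shows the shape of signing a hash rather than the message itself:

```python
import hashlib

# Textbook RSA with tiny, insecure parameters, purely for illustration.
p, q, e = 61, 53, 17
n = p * q                          # public modulus (3233)
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent (2753)

def sign(message):
    # Hash the message first, then sign the hash (reduced mod n so it
    # fits our toy modulus) instead of signing the message directly.
    h = int.from_bytes(hashlib.sha256(message).digest(), "big") % n
    return pow(h, d, n)

def verify(message, signature):
    h = int.from_bytes(hashlib.sha256(message).digest(), "big") % n
    return pow(signature, e, n) == h

print(verify(b"a message", sign(b"a message")))  # True
```

In the real protocol, the key signature scheme (eg rsa-sha2-256 vs ssh-rsa) is what picks the hash function used in the first step.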
(OpenSSH's underlying key types are documented best in the ssh-keygen manpage, under the -t option. The -sk keytypes are FIDO/U2F keys, as mentioned in the OpenSSH 8.2 release notes. The supported key signature algorithms can be seen with 'ssh -Q key-sig'.)
What OpenSSH is working to deprecate is the (sole) key signature algorithm that hashes messages to be signed with SHA-1, on the grounds that SHA-1 hashing is looking increasingly weak. For historical reasons, this key signature algorithm has the same name ('ssh-rsa') as a key type, which creates exciting grounds for misunderstandings, such as I had last year. Even after this deprecation, OpenSSH RSA keys will be usable as user and host keys, because OpenSSH has provided other key signature algorithms using RSA keys and stronger hashes (specifically SHA2-256 and SHA2-512, which are also known as just 'SHA-256' and 'SHA-512', see Wikipedia on SHA-2).
Most relatively modern systems support RSA-based key signature schemes other than just ssh-rsa. Older systems may not, especially if they're small or embedded systems using more minimal SSH implementations. Even if things like routers from big companies support key signature schemes beyond ssh-rsa, you may have to update their firmware, which is something that not everyone does and which may require support contracts and the like. Unfortunately, anything you want to connect to has to have a key signature scheme that you support, because otherwise you can't authenticate their host key.
(OpenSSH Ed25519 keys also have a single key signature scheme associated with them, if you ignore SSH certificates; they are both 'ssh-ed25519'. Hopefully we will never run into a similar hash weakness issue with them. Since I just looked it up in RFC 8709 and RFC 8032, ed25519 signatures use SHA2-512.)
Realizing one general way to construct symmetric ciphers
One of the areas of cryptography that's always seemed magical to me is symmetric ciphers. I believed that they worked, but it felt amazing that people were able to construct functions that produced random-looking output but that could be inverted if and only if you had the key (and perhaps some other information, like a nonce or IV). I recently read Soatok's Understanding Extended-Nonce Constructions, which set off a sudden understanding of a general, straightforward way to construct symmetric ciphers (although not all ciphers are built this way).
A provably secure general encryption technique is the one-time pad. One way to do one-time pad encryption on computers is to have your OTP be a big collection of random bytes (known by both sides) and then use the fact that 'A xor B xor A' is just B. The sender XORs their message with the next section of their OTP, and the receiver just XORs it again with the same section, recovering the original message (this is a form of XOR cipher). However, one-time pads are too big for practical use. What we would like is for each side to generate the one-time pad from a smaller, easier to handle seed.
What we need is a keystream, or more exactly a way to generate a keystream from an encryption key and probably some other values like a nonce (a one-time pad is a keystream that requires no generation). The keystream we generate needs to have a number of security properties like randomness and unpredictability, but the important thing is that our keystream generation function doesn't have to be invertible; in fact, it shouldn't be invertible. There are a lot of ways to do this, especially since it's sort of what cryptographic hashes do, and it's easy for me to see how you could possibly create keystream generation functions.
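Here is a toy Python sketch of exactly that construction: a keystream generated from a key and a nonce by hashing a counter, XORed with the message. This is for illustration only; real stream ciphers are designed far more carefully:

```python
import hashlib
from itertools import count

def keystream(key, nonce):
    # Expand (key, nonce) into an endless pseudo-random byte stream by
    # hashing a counter. A toy stand-in for a real keystream generator;
    # note that nothing here needs to be invertible.
    for counter in count():
        block = hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        yield from block

def xor_cipher(key, nonce, data):
    # XOR is its own inverse, so the same function encrypts and decrypts.
    return bytes(b ^ k for b, k in zip(data, keystream(key, nonce)))

ct = xor_cipher(b"secret key", b"nonce-01", b"attack at dawn")
print(xor_cipher(b"secret key", b"nonce-01", ct))  # b'attack at dawn'
```

Both sides only need to share the small key and nonce, instead of a message-sized one-time pad.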
What I've realized and described is a stream cipher, as opposed to a block cipher. While I'd heard the two terms before, I hadn't understood the cryptographic nature of this distinction, vaguely thinking it was only about whether you had to feed in a fixed size input block or could use more flexible variable-sized inputs. Now I've learned better in the process of writing this entry, and learned something more about cryptography.
(I could probably learn and understand more about how it's possible to construct block ciphers if I read more about them, but there's only so far I'm willing to go into cryptography.)