2025-03-14
Some notes on configuring Dovecot to authenticate via OIDC/OAuth2
Suppose, not hypothetically, that you have a relatively modern Dovecot server and a shiny new OIDC identity provider server ('OP' in OIDC jargon, 'IdP' in common usage), and you would like to get Dovecot to authenticate people's logins via OIDC. Ignoring certain practical problems, the way this is done is for your mail clients to obtain an OIDC token from your IdP, provide it to Dovecot via SASL OAUTHBEARER, and then for Dovecot to do the critical step of actually validating that the token it received is good, still active, and contains all the information you need. Dovecot supports this through OAuth v2.0 authentication as a passdb (password database), but in the usual Dovecot fashion, the documentation on how to configure the parameters for validating tokens with your IdP is a little bit lacking in explanations. So here are some notes.
If you have a modern OIDC IdP, it will support OpenID Connect Discovery, including the provider configuration request on the path /.well-known/openid-configuration. Once you know this, if you're not that familiar with OIDC things you can request this URL from your OIDC IdP, feed the result through 'jq .', and then use it to pick out the specific IdP URLs you want to set up in things like the Dovecot file with all of the OAuth2 settings you need. If you do this, the only URL you want for Dovecot is the userinfo_endpoint URL. You will put this into Dovecot's introspection_url, and you'll leave introspection_mode set to the default of 'auth'.
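(If it helps, here's a sketch of doing this with curl and jq; 'idp.example.org' is a stand-in for your IdP's hostname:

# pretty-print the whole discovery document
curl -s https://idp.example.org/.well-known/openid-configuration | jq .
# or just extract the URL you'll use as Dovecot's introspection_url
curl -s https://idp.example.org/.well-known/openid-configuration | jq -r .userinfo_endpoint

)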
You don't want to set tokeninfo_url to anything. This setting is (or was) used for validating tokens with OAuth2 servers before the introduction of RFC 7662. Back then, the de facto standard approach was to make a HTTP GET request to some URL with the token pasted on the end (cf), and it's this URL that is being specified. This approach was replaced with RFC 7662 token introspection, and then replaced again with OpenID Connect UserInfo. If both tokeninfo_url and introspection_url are set, as in Dovecot's example for Google, the former takes priority.
(Since I've just peered deep into the Dovecot source code, it appears that setting 'introspection_mode = post' actually performs an (unauthenticated) token introspection request. The 'get' mode seems to be the same as setting tokeninfo_url. I think that if you set the 'post' mode, you also want to set active_attribute and perhaps active_value, but I don't know what to set them to, because otherwise you aren't necessarily fully validating that the token is still active. Does my head hurt? Yes. The moral here is that you should use an OIDC IdP that supports OpenID Connect UserInfo.)
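(If you're stuck doing RFC 7662 introspection anyway, my guess, and it is only a guess, is that the settings would look roughly like this, since RFC 7662 responses report token liveness in an 'active' field:

introspection_mode = post
introspection_url = <introspection_endpoint from the discovery document>
active_attribute = active
active_value = true

But as said above, I haven't verified this.)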
If your IdP serves different groups and provides different 'issuer' ('iss') values to them, you may want to set the Dovecot 'issuers =' to the specific issuer that applies to you. You'll also want to set 'username_attribute' to whatever OIDC claim is where your IdP puts what you consider the Dovecot username, which might be the email address or something else.
It would be nice if Dovecot could discover all of this for itself when you set openid_configuration_url, but in the current Dovecot, all this does is put that URL in the JSON of the error response that's sent to IMAP clients when they fail OAUTHBEARER authentication. IMAP clients may or may not do anything useful with it.
As far as I can tell from the Dovecot source code, setting 'scope =' primarily requires that the token contains those scopes. I believe that this is almost entirely a guard against the IMAP client requesting a token without OIDC scopes that contain claims you need elsewhere in Dovecot. However, this only verifies OIDC scopes; it doesn't verify the presence of specific OIDC claims.
So what you want to do is check your OIDC IdP's /.well-known/openid-configuration URL to find out its collection of endpoints, then set:
# Modern OIDC IdP/OP settings
introspection_url = <userinfo_endpoint>
username_attribute = <some claim, eg 'email'>
# not sure but seems common in Dovecot configs?
pass_attrs = pass=%{oauth2:access_token}
# optionally:
openid_configuration_url = <stick in the URL>
# you may need:
tls_ca_cert_file = /etc/ssl/certs/ca-certificates.crt
The OIDC scopes that IMAP clients should request when getting tokens should include a scope that gives the username_attribute claim, which is 'email' if the claim is 'email', and also apparently the requested scopes should include the offline_access scope.
If you want a test client to see if you've set up Dovecot correctly, one option is to appropriately modify a contributed Python program for Mutt (also the README), which has the useful property that it has an option to check all of IMAP, POP3, and authenticated SMTP once you've obtained a token. If you're just using it for testing purposes, you can change the 'gpg' stuff to 'cat' to just store the token with no fuss (and no security). Another option, which can be used for real IMAP clients too if you really want to, is an IMAP/etc OAuth2 proxy.
(If you want to use Mutt with OAuth2 with your IMAP server, see this article on it also, also, also. These days I would try quite hard to use age instead of GPG.)
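(Another minimal test option is a few lines of Python with imaplib, since Dovecot's OAUTHBEARER support is just SASL. The hostname and username here are placeholders and you're assumed to have already obtained an access token somehow; this is a sketch, not a polished tool:

import imaplib

HOST = "imap.example.org"    # placeholder
USER = "user@example.org"    # placeholder
TOKEN = open("token.txt").read().strip()    # however you got your token

def oauthbearer(challenge):
    if challenge:
        # on failure the server sends its JSON error as a challenge;
        # print it and send the single %x01 that winds up the exchange
        print(challenge.decode(errors="replace"))
        return b"\x01"
    # RFC 7628 OAUTHBEARER initial client response (imaplib base64-encodes it)
    return (f"n,a={USER},\x01host={HOST}\x01port=993"
            f"\x01auth=Bearer {TOKEN}\x01\x01").encode()

imap = imaplib.IMAP4_SSL(HOST)
try:
    print(imap.authenticate("OAUTHBEARER", oauthbearer))
    print(imap.select("INBOX"))
except imaplib.IMAP4.error as e:
    print("authentication failed:", e)
finally:
    imap.logout()

)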
2025-03-09
How I got my nose rubbed in my screens having 'bad' areas for me
I wrote a while back about how my desktop screens now had areas that were 'good' and 'bad' for me, and mentioned that I had recently noticed this, calling it a story for another time. That time is now. What made me really notice this issue with my screens and where I had put some things on them was our central mail server (temporarily) stopping handling email because its load was absurdly high.
In theory I should have noticed this issue before a co-worker rebooted the mail server, because for a long time I've had an xload window from the mail server (among other machines, I have four xloads). Partly I did this so I could keep an eye on these machines and partly it's to help keep alive the shared SSH connection I also use for keeping an xrun on the mail server.
(In the past I had problems with my xrun SSH connections seeming to spontaneously close if they just sat there idle because, for example, my screen was locked. Keeping an xload running seemed to work around that; I assumed it was because xload keeps updating things even with the screen locked and so forced a certain amount of X-level traffic over the shared SSH connection.)
When the mail server's load went through the roof, I should have noticed that the xload for it had turned solid green (which is how xload looks under high load). However, I had placed the mail server's xload way off on the right side of my office dual screens, which put it outside my normal field of attention. As a result, I never noticed the solid green xload that would have warned me of the problem.
(This isn't where the xload was back on my 2011 era desktop, but at some point since then I moved it and some other xloads over to the right.)
In the aftermath of the incident, I relocated all of those xloads to a more central location, and also made my new Prometheus alert status monitor appear more or less centrally, where I'll definitely notice it.
(Some day I may do a major rethink about my entire screen layout, but most of the time that feels like yak shaving that I'd rather not touch until I have to, for example because I've been forced to switch to Wayland and an entirely different window manager.)
Sidebar: Why xload turns green under high load
Xload draws a horizontal tick line for every integer load average step it needs in order to display the maximum load average that appears in its moving histogram. If the highest load average is 1.5, there will be one tick; if the highest load average is 10.2, there will be ten. Ticks are normally drawn in green. This means that as the load average climbs, xload draws more and more ticks, and after a certain point the entire xload display is just solid green from all of the tick lines.
This has the drawback that you don't know the shape of the load average (all you know is that at some point it got quite high), but the advantage that it's quite visually distinctive and you know you have a problem.
2025-03-05
A Prometheus gotcha with alerts based on counting things
Suppose, not entirely hypothetically, that you have some backup servers that use swappable HDDs as their backup media and expose that 'media' as mounted filesystems. Because you keep swapping media around, you don't automatically mount these filesystems and when you do manually try to mount them, it's possible to have some missing (if, for example, a HDD didn't get fully inserted and engaged with the hot-swap bay). To deal with this, you'd like to write a Prometheus alert for 'not all of our backup disks are mounted'. At first this looks simple:
count( node_filesystem_size_bytes{ host = "backupserv", mountpoint =~ "/dumps/tapes/slot.*" } ) != <some number>
This will work fine most of the time and then one day it will fail to alert you to the fact that none of the expected filesystems are mounted. The problem is the usual one of PromQL's core nature as a set-based query language (we've seen this before). As long as there's at least one HDD 'tape' filesystem mounted, you can count them, but once there are none, the result of counting them is not 0 but nothing. As a result this alert rule won't produce any results when there are no 'tape' filesystems on your backup server.
Unfortunately there's no particularly good fix, especially if you have multiple identical backup servers and so the real version uses 'host =~ "bserv1|bserv2|..."'. In the single-host case, you can use either absent() or vector() to provide a default value. There's no good solution in the multi-host case, because there's no version of vector() that lets you set labels. If there was, you could at least write:
count( ... ) by (host) or vector(0, "host", "bserv1") or vector(0, "host", "bserv2") ....
(Technically you can set labels via label_replace(). Let's not go there; it's a giant pain for simply adding labels, especially if you want to add more than one.)
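(For the single-host case, the vector() version of the alert rule looks roughly like this, with '<some number>' being however many mounted 'tape' filesystems you expect:

( count( node_filesystem_size_bytes{ host = "backupserv", mountpoint =~ "/dumps/tapes/slot.*" } ) or vector(0) ) != <some number>

The 'or vector(0)' gives the count a value of 0 when nothing matches, instead of no value at all.)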
In my particular case, our backup servers always have some additional filesystems (like their root filesystem), so I can write a different version of the count() based alert rule:
count( node_filesystem_size_bytes{ host =~ "bserv1|bserv2|...", fstype =~ "ext.*" } ) by (host) != <other number>
In theory this is less elegant because I'm not counting exactly what I care about (the number of 'tape' filesystems that are mounted) but instead something more general and potentially more variable (the number of extN filesystems that are mounted) that contains various assumptions about the systems. In practice the number is just as fixed as the number of 'tape' filesystems, and the broader set of labels will always match something, producing a count of at least one for each host.
(This would change if the standard root filesystem type changed in a future version of Ubuntu, but if that happened, we'd notice.)
PS: This might sound all theoretical and not something a reasonably experienced Prometheus person would actually do. But I'm writing this entry partly because I almost wrote a version of my first example as our alert rule, until I realized what would happen when there were no 'tape' filesystems mounted at all, which is something that happens from time to time for reasons outside the scope of this entry.
2025-03-04
What SimpleSAMLphp's core:AttributeAlter does with creating new attributes
SimpleSAMLphp is a SAML identity provider (and other stuff). It's of deep interest to us because it's about the only SAML or OIDC IdP I can find that will authenticate users and passwords against LDAP and has a plugin that will do additional full MFA authentication against the university's chosen MFA provider (although you need to use a feature branch). In the process of doing this MFA authentication, we need to extract the university identifier to use for MFA authentication from our local LDAP data. Conveniently, SimpleSAMLphp has a module called core:AttributeAlter (a part of authentication processing filters) that is intended to do this sort of thing. You can give it a source, a pattern, a replacement that includes regular expression group matches, and a target attribute. In the syntax of its examples, this looks like the following:
// the 65 is where this is ordered
65 => [
    'class' => 'core:AttributeAlter',
    'subject' => 'gecos',
    'pattern' => '/^[^,]*,[^,]*,[^,]*,[^,]*,([^,]+)(?:,.*)?$/',
    'target' => 'mfaid',
    'replacement' => '\\1',
],
If you're an innocent person, you expect that your new 'mfaid' attribute will be undefined (or untouched) if the pattern does not match because the required GECOS field isn't set. This is not in fact what happens, and interested parties can follow along the rest of this in the source.
(All of this is as of SimpleSAMLphp version 2.3.6, the current release as I write this.)
The short version of what happens is that when the target is a different attribute and the pattern doesn't match, the target will wind up set but empty. Any previous value is lost. How this happens (and what happens) starts with the fact that 'attributes' here are actually arrays of values under the covers (this is '$attributes'). When core:AttributeAlter has a different target attribute than the source attribute, it takes all of the source attribute's values, passes each of them through a regular expression search and replace (using your replacement), and then gathers up anything that changed and sets the target attribute to this gathered collection. If the pattern doesn't match any values of the attribute (in the normal case, a single value), the array of changed things is empty and your target attribute is set to an empty PHP array.
(This is implemented with an array_diff() between the results of preg_replace() and the original attribute value array.)
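(In simplified form, the logic is roughly:

// simplified sketch of what core:AttributeAlter does with a separate target
$values  = $attributes['gecos'];                          // attribute values are arrays
$new     = preg_replace($pattern, $replacement, $values); // unmatched values come back unchanged
$changed = array_diff($new, $values);                     // keep only the values that were rewritten
$attributes['mfaid'] = array_values($changed);            // an empty array if nothing matched

This isn't the literal code, but it's the shape of it.)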
My personal view is that this is somewhere around a bug; if the pattern doesn't match, I expect nothing to happen. However, the existing documentation is ambiguous (and incomplete, as the use of capture groups isn't particularly documented), so it might not be considered a bug by SimpleSAMLphp. Even if it is considered a bug I suspect it's not going to be particularly urgent to fix, since this particular case is unusual (or people would have found it already).
For my situation, perhaps what I want to do is to write some PHP code to do this extraction operation by hand, through core:PHP. It would be straightforward to extract the necessary GECOS field (or otherwise obtain the ID we need) in PHP, without fooling around with weird pattern matching and module behavior.
(Since I just looked it up, I believe that in the PHP code that core:PHP runs for you, you can use a PHP 'return' to stop without errors but without changing anything. This is relevant in my case since not all GECOS entries have the necessary information.)
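(A sketch of what this could look like, assuming the value we want really is the fifth comma-separated GECOS field; this is illustrative and untested:

60 => [
    'class' => 'core:PHP',
    'code' => '
        if (empty($attributes["gecos"])) {
            return;   // no GECOS data; leave everything untouched
        }
        $fields = explode(",", $attributes["gecos"][0]);
        if (count($fields) >= 5 && $fields[4] !== "") {
            $attributes["mfaid"] = [$fields[4]];
        }
    ',
],

The '60' ordering number is arbitrary.)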
2025-03-03
If you get the chance, always run more extra network fiber cabling
Some day, you may be in an organization that's about to add some more fiber cabling between two rooms in the same building, or maybe two close by buildings, and someone may ask you for your opinion about how many fiber pairs should be run. My personal advice is simple: run more fiber than you think you need, ideally a bunch more (this generalizes to network cabling in general, but copper cabling is a lot more bulky and so harder to run (much) more of). There is such a thing as an unreasonable amount of fiber to run, but mostly it comes up when you'd have to put in giant fiber patch panels.
The obvious reason to run more fiber is that you may well expand your need for fiber in the future. Someone will want to run a dedicated, private network connection between two locations; someone will want to trunk things to get more bandwidth; someone will want to run a weird protocol that requires its own network segment (did you know you can run HDMI over Ethernet?); and so on. It's relatively inexpensive to add some more fiber pairs when you're already running fiber but much more expensive to have to run additional fiber later, so you might as well give yourself room for growth.
The less obvious reason to run extra fiber is that every so often fiber pairs stop working, just like network cables go bad, and when this happens you'll need to replace them with spare fiber pairs, which means you need those spare fiber pairs. Some of the time this fiber failure is (probably) because a raccoon got into your machine room, but some of the time it just happens for reasons that no one is likely to ever explain to you. And when this happens, you don't necessarily lose only a single pair. Today, for example, we lost three fiber pairs that ran between two adjacent buildings and evidence suggests that other people at the university lost at least one more pair.
(There are a variety of possible causes for sudden loss of multiple pairs, probably all running through a common path, which I will leave to your imagination. These fiber runs are probably not important enough to cause anyone to do a detailed investigation of where the fault is and what happened.)
Fiber comes in two varieties, single mode and multi-mode. I don't know enough to know if you should make a point of running both (over distances where either can be used) as part of the whole 'run more fiber' thing. Locally we have both SM and MM fiber and have switched back and forth between them at times (and may have to do so as a result of the current failures).
PS: Possibly you work in an organization where broken inside-building fiber runs are regularly fixed or replaced. That is not our local experience; someone has to pay for fixing or replacing, and when you have spare fiber pairs left it's easier to switch over to them rather than try to come up with the money and so on.
(Repairing or replacing broken fiber pairs will reduce your long term need for additional fiber, but obviously not the short term need. If you lose N pairs of fiber, you need N spare pairs to get back into operation.)
2025-02-25
MFA's "push notification" authentication method can be easier to integrate
For reasons outside the scope of this entry, I'm looking for an OIDC or SAML identity provider that supports primary user and password authentication against our own data and then MFA authentication through the university's SaaS vendor. As you'd expect, the university's MFA SaaS vendor supports all of the common MFA approaches today, covering push notifications through phones, one time codes from hardware tokens, and some other stuff. However, pretty much all of the MFA integrations I've been able to find only support MFA push notifications (eg, also). When I thought about it, this made a lot of sense, because it's often going to be much easier to add push notification MFA than any other form of it.
A while back I wrote about exploiting password fields for multi-factor authentication, where various bits of software hijacked password fields to let people enter things like MFA one time codes into systems (like OpenVPN) that were never set up for MFA in the first place. With most provider APIs, authentication through push notification can usually be inserted in a similar way, because from the perspective of the overall system it can be a synchronous operation. The overall system calls a 'check' function of some sort, the check function calls out to the provider's API and then possibly polls for a result for a while, and then it returns a success or a failure. There's no need to change the user interface of authentication or add additional high level steps.
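(A sketch of the shape of this, with an entirely made-up provider API; the URLs, fields, and responses are assumptions for illustration, not any real vendor's interface:

import time
import requests

def mfa_push_check(base_url, api_key, username, timeout=60):
    headers = {"Authorization": f"Bearer {api_key}"}
    # ask the (hypothetical) provider to push a notification to the user's device
    r = requests.post(f"{base_url}/push", json={"user": username}, headers=headers)
    r.raise_for_status()
    txid = r.json()["txid"]
    # poll until the user approves, denies, or we give up
    deadline = time.time() + timeout
    while time.time() < deadline:
        state = requests.get(f"{base_url}/push/{txid}", headers=headers).json()["state"]
        if state == "approved":
            return True
        if state == "denied":
            return False
        time.sleep(2)
    return False

The important thing is that the caller just gets back a yes or no answer, the same as it would from checking a password.)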
(The exception is if the MFA provider's push authentication API only returns results to you by making a HTTP query to you. But I think that this would be a relatively weird API; a synchronous reply or at least a polled endpoint is generally much easier to deal with and is more or less required to integrate push authentication with non-web applications.)
By contrast, if you need to get a one time code from the person, you have to do things at a higher level and it may not fit well in the overall system's design (or at least the easily exposed points for plugins and similar things). Instead of immediately returning a successful or failed authentication, you now need to display an additional prompt (in many cases, a HTML page), collect the data, and only then can you say yes or no. In a web context (such as a SAML or OIDC IdP), the provider may want you to redirect the user to their website and then somehow call you back with a reply, which you'll have to re-associate with context and validate. All of this assumes that you can even interpose an additional prompt and reply, which isn't the case in some contexts unless you do extreme things.
(Sadly this means that if you have a system that only supports MFA push authentication and you need to also accept codes and so on, you may be in for some work with your chainsaw.)
2025-02-23
JSON has become today's machine-readable output format (on Unix)
Recently, I needed to delete about 1,200 email messages to a particular destination from the mail queue on one of our systems. This turned out to be trivial, because this system was using Postfix and modern versions of Postfix can output mail queue status information in JSON format. So I could dump the mail queue status, select the relevant messages and print the queue IDs with jq, and feed this to Postfix to delete the messages. This experience has left me with the definite view that everything should have the option to output JSON for 'machine-readable' output, rather than some bespoke format. For new programs, I think that you should only bother producing JSON as your machine readable output format.
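(The actual pipeline was roughly this shape; the destination domain here is a stand-in for the real one:

postqueue -j |
  jq -r 'select(any(.recipients[]; .address | endswith("@example.com"))) | .queue_id' |
  postsuper -d -

No bespoke parsing of 'mailq' output required.)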
(If you strongly object to JSON, sure, create another machine readable output format too. But if you don't care one way or another, outputting only JSON is probably the easiest approach for programs that don't already have such a format of their own.)
This isn't because JSON is the world's best format (JSON is at best the least bad format). Instead it's because JSON has a bunch of pragmatic virtues on a modern Unix system. In general, JSON provides a clear and basically unambiguous way to represent text data and much numeric data, even if it has relatively strange characters in it (ie, JSON has escaping rules that everyone knows and all tools can deal with); it's also generally extensible to add additional data without causing heartburn in tools that are dealing with older versions of a program's output. And on Unix there's an increasingly rich collection of tools to deal with and process JSON, starting with jq itself (and hopefully soon GNU Awk in common configurations). Plus, JSON can generally be transformed to various other formats if you need them.
(JSON can also be presented and consumed in either multi-line or single line formats. Multi-line output is often much more awkward to process in other possible formats.)
There's nothing unique about JSON in all of this; it could have been any other format with similar virtues where everything lined up this way for the format. It just happens to be JSON at the moment (and probably well into the future), instead of (say) XML. For individual programs there are simpler 'machine readable' output formats, but they either have restrictions on what data they can represent (for example, no spaces or tabs in text), or require custom processing that goes well beyond basic grep and awk and other widely available Unix tools, or both. But JSON has become a "narrow waist" for Unix programs talking to each other, a common coordination point that means people don't have to invent another format.
(JSON is also partially self-documenting; you can probably look at a program's JSON output and figure out what various parts of it mean and how it's structured.)
PS: Using JSON also means that people writing programs don't have to design their own machine-readable output format. Designing a machine readable output format is somewhat more complicated than it looks, so I feel that the less of it people need to do, the better.
(I say this as a system administrator who's had to deal with a certain amount of output formats that have warts that make them unnecessarily hard to deal with.)
2025-02-20
It's good to have offline contact information for your upstream networking
So I said something on the Fediverse:
Current status: it's all fun and games until the building's backbone router disappears.
A modest suggestion: obtain problem reporting/emergency contact numbers for your upstream in advance and post them on the wall somewhere. But you're on your own if you use VOIP desk phones.
(It's back now or I wouldn't be posting this, I'm in the office today. But it was an exciting 20 minutes.)
(I was somewhat modeling the modest suggestion after nuintari's Fediverse series of "rules of networking", eg, also.)
The disappearance of the building's backbone router took out all local networking in the particular building that this happened in (which is the building with our machine room), including the university wireless in the building. The disappearance of the wireless was especially surprising, because the wireless SSID disappeared entirely.
(My assumption is that the university's enterprise wireless access points stopped advertising the SSID when they lost some sort of management connection to their control plane.)
In a lot of organizations you might have been able to relatively easily find the necessary information even with this happening. For example, people might have smartphones with data plans and laptops that they could tether to the smartphones, and then use this to get access to things like the university directory, the university's problem reporting system, and so on. For various reasons, we didn't really have any of this available, which left us somewhat at a loss when the external networking evaporated. Ironically we'd just managed to finally find some phone numbers and get in touch with people when things came back.
(One bit of good news is that our large scale alert system worked great to avoid flooding us with internal alert emails. My personal alert monitoring (also) did get rather noisy, but that also let me see right away how bad it was.)
Of course there's always things you could do to prepare, much like there are often too many obvious problems to keep track of them all. But in the spirit of not stubbing our toes on the same problem a second time, I suspect we'll do something to keep some problem reporting and contact numbers around and available.
2025-02-19
Shared (Unix) hosting and the problem of managing resource limits
Yesterday I wrote about how one problem with shared Unix hosting was the lack of good support for resource limits in the Unixes of the time. But even once you have decent resource limits, you still have an interlinked set of what we could call 'business' problems. These are the twin problems of what resource limits you set on people and how you sell different levels of these resource limits to your customers.
(You may have the first problem even for purely internal resource allocation on shared hosts within your organization, and it's never a purely technical decision.)
The first problem is whether you overcommit what you sell and in general how you decide on the resource limits. Back in the big days of the shared hosting business, I believe that overcommitting was extremely common; servers were expensive and most people didn't use many resources on average. If you didn't overcommit your servers, you had to charge more and most people weren't interested in paying that. Some resources, such as CPU time, are 'flow' resources that can be rebalanced on the fly, restricting everyone to a fair share when the system is busy (even if that share is below what they're nominally entitled to), but it's quite difficult to take memory back (or disk space). If you overcommit memory, your systems might blow up under enough load. If you don't overcommit memory, either everyone has to pay more or everyone gets unpopularly low limits.
(You can also do fancy accounting for 'flow' resources, such as allowing bursts of high CPU but not sustained high CPU. This is harder to do gracefully for things like memory, although you can always do it ungracefully by terminating things.)
The other problem entwined with setting resource limits is how (and if) you sell different levels of resource limits to your customers. A single resource limit is simple but probably not what all of your customers want; some will want more and some will only need less. But if you sell different limits, you have to tell customers what they're getting, let them assess their needs (which isn't always clear in a shared hosting situation), deal with them being potentially unhappy if they think they're not getting what they paid for, and so on. Shared hosting is always likely to have complicated resource limits, which raises the complexity of selling them (and of understanding them, for the customers who have to pick one to buy).
Viewed from the right angle, virtual private servers (VPSes) are a great abstraction to sell different sets of resource limits to people in a way that's straightforward for them to understand (and which at least somewhat hides whether or not you're overcommitting resources). You get 'a computer' with these characteristics, and most of the time it's straightforward to figure out whether things fit (the usual exception is IO rates). So are more abstracted, 'cloud-y' ways of selling computation, database access, and so on (at least in areas where you can quantify what you're doing into some useful unit of work, like 'simultaneous HTTP requests').
It's my personal suspicion that even if the resource limitation problems had been fully solved much earlier, shared hosting would have still fallen out of fashion in favour of simpler to understand VPS-like solutions, where what you were getting and what you were using (and probably what you needed) were a lot clearer.
2025-02-18
One problem with "shared Unix hosting" was the lack of resource limits
I recently read Comments on Shared Unix Hosting vs. the Cloud (via), which I will summarize as being sad about how old fashioned shared hosting on a (shared) Unix system has basically died out, and along with it web server technology like CGI. As it happens, I have a system administrator's view of why shared Unix hosting always had problems and was a down-market thing with various limitations, and why even today people aren't very happy with providing it. In my view, a big part of the issue was the lack of resource limits.
The problem with sharing a Unix machine with other people is that by default, those other people can starve you out. They can take up all of the available CPU time, memory, process slots, disk IO, and so on. On an unprotected shared web server, all you need is one person's runaway 'CGI' code (which might be PHP code or something else) or even an unusually popular dynamic site and all of the other people wind up having a bad time. Life gets worse if you allow people to log in, run things in the background, run things from cron, and so on, because all of these can add extra load. In order to make shared hosting be reliable and good, you need some way of forcing a fair sharing of resources and limiting how many resources a given customer can use.
Unfortunately, for much of the practical life of shared Unix hosting, Unixes did not have that. Some Unixes could create various sorts of security boundaries, but generally not resource usage limits that applied to an entire group of processes. Even once this became possible to some degree in Linux through cgroup(s), the kernel features took some time to mature and then it took even longer for common software to support running things in isolated and resource controlled cgroups. Even today it's still not necessarily entirely there for things like running CGIs from your web server, never mind a potential shared database server to support everyone's database backed blog.
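(For what it's worth, here's a rough modern sketch of the sort of per-customer limits I mean, using systemd's cgroup support; the numbers and names are arbitrary examples, not recommendations:

systemd-run --uid=customer1 --slice=customer1.slice \
    -p MemoryMax=512M -p CPUQuota=50% -p TasksMax=100 \
    /usr/bin/php /home/customer1/cgi-bin/handler.php

Getting a web server or cron to reliably do something like this for every customer is the part that took a long time to become practical.)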
(A shared database server needs to implement its own internal resource limits for each customer, otherwise you have to worry about a customer gumming it up with expensive queries, a flood of queries, and so on. If they need separate database servers for isolation and resource control, now they need more server resources.)
My impression is that the lack of kernel supported resource limits forced shared hosting providers to roll their own ad-hoc ways of limiting how much resources their customers could use. In turn this created the array of restrictions that you used to see on such providers, with things like 'no background processes', 'your CGI can only run for so long before being terminated', 'your shell session is closed after N minutes', and so on. If shared hosting had been able to put real limits on each of their customers, this wouldn't have been as necessary; you could go more toward letting each customer blow itself up if it over-used resources.
(How much resources to give each customer is also a problem, but that's another entry.)