Finding metrics that are missing labels in Prometheus (for alert metrics)
One of the things you can abuse metrics for in Prometheus is to
configure different alert levels, alert destinations, and so on for
different labels within the same metric, as I wrote about back in
my entry on using group_* vector matching for database lookups. The example in that entry used two metrics, our_zfs_avail_gb and our_zfs_minfree_gb,
the former showing the current available space and the latter
describing the alert levels and so on that we want. Once we're using
metrics this way, one of the interesting questions we could ask is
what filesystems don't have a space alert set. As it turns out, we
can answer this relatively easily.
The first step is to be precise about what we want. Here, we want
to know what 'fs' labels are missing from our_zfs_minfree_gb; an
fs label is missing if it's not present in our_zfs_minfree_gb
but is present in our_zfs_avail_gb. Since we're talking about
sets of labels, answering this requires some sort of set operation.
If our_zfs_minfree_gb only has unique values for the fs label
(ie, we only ever set one alert per filesystem), then this is
straightforward:
our_zfs_avail_gb UNLESS ON(fs) our_zfs_minfree_gb
The our_zfs_avail_gb metric generates our initial set of known
fs labels. Then we use UNLESS to subtract the set of all fs
labels that are present in our_zfs_minfree_gb. We have to use
'ON(fs)' because the only label we want to match on between the
two metrics is the fs label itself.
However, this only works if our_zfs_minfree_gb has no duplicate
fs labels. If it does (eg if different people can set their own
alerts for the same filesystem), we'd get a 'duplicate series' error
from this expression. The usual fix is to use a one to many match,
but those can't be combined with set operators such as 'unless'.
Instead we must get creative. Since all we care about is the labels
and not the values, we can use an aggregation to give us a single
series for each label on the right side of the expression:
our_zfs_avail_gb UNLESS ON(fs) count(our_zfs_minfree_gb) by (fs)
As a side effect of what they do, all aggregation operators condense
multiple instances of a label value this way. It's very convenient
if you just want one instance of it; if you care about the resulting
value being one that exists in your underlying metrics, you can use
min() or max() instead of count().
You can obviously invert this operation to determine 'phantom' alerts,
alerts that have
fs labels that don't exist in your underlying metric.
That expression is:
count(our_zfs_minfree_gb) by (fs) UNLESS ON(fs) our_zfs_avail_gb
(Here I'm assuming our_zfs_minfree_gb has duplicate fs labels;
if it doesn't, you get a simpler expression.)
Such phantom alerts might come about from typos, filesystems that haven't been created yet but you've pre-set alert levels for, or filesystems that have been removed since alert levels were set for them.
This general approach can be applied to any two metrics where some
label ought to be paired up across both. For instance, you could
cross-check that every node_uname_info metric is matched by one
or more custom per-host informational metrics that your own software
is supposed to generate and expose through the node exporter's
textfile collector.
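Following the pattern above, such a cross-check might look like the following sketch, where our_host_info is a hypothetical name for your custom per-host metric and I'm assuming instance is the label the two metrics share:

```promql
node_uname_info UNLESS ON(instance) count(our_host_info) by (instance)
```

This gives you the node_uname_info series for hosts that are missing your custom metric entirely.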
(This entry was sparked by a prometheus-users mailing list thread that caused me to work out the specifics of how to do this.)
The problem of 'triangular' Network Address Translation
In my entry on our use of bidirectional NAT and split horizon DNS, I mentioned that we couldn't apply our bidirectional NAT translation to all of our internal traffic in the way that we can for external traffic for two reasons, an obvious one and a subtle one. The obvious reason is our current network topology, which I'm going to discuss in a sidebar below. The more interesting subtle reason is the general problem of what I'm going to call triangular NAT.
Normally when you NAT something in a firewall or a gateway, you're in a situation where the traffic in both directions passes through you. This allows you to do a straightforward NAT implementation where you only rewrite one of the pair of IP addresses involved; either you rewrite the destination address from you to the internal IP and then send the traffic to the internal IP, or you rewrite the source address from the internal IP to you and then send the traffic to the external IP.
However, this straightforward implementation breaks down if the return traffic will not flow through you when it has its original source IP. The obvious case of this is if a client machine is trying to contact a NAT'd server that is actually on its own network. It will send its initial packet to the public IP of the NAT'd machine and this packet will hit your firewall, get its destination address rewritten, and then passed to the server. However, when it replies to the packet, the server will see a destination IP on its local network and just send it directly to the client machine. The client machine will then go 'who are you?', because it's expecting the reply to come from the server's nominal public IP, not its internal one.
(Asymmetric routing can also create this situation, for instance if the machine you're talking to has multiple interfaces and a route to you that doesn't go out the firewall-traversing one.)
In general the only way to handle triangular NAT situations is to force the return traffic to flow through your firewall by always rewriting both IP addresses. Unfortunately this has side effects, the most obvious one being that the server no longer gets the IP address of who it's really talking to; as far as it's concerned, all of the connections are coming from your firewall. This is often less than desirable.
(As an additional practical issue, not all NAT implementations are very enthusiastic about doing such two-sided rewriting.)
Sidebar: Our obvious problem is network topology
At the moment, our network topology basically has three layers; there is the outside world, our perimeter firewall, our public IP subnets with various servers and firewalls, and then our internal RFC 1918 'sandbox' subnets (behind those firewalls). Our mostly virtual BINAT subnet with the public IPs of BINAT machines basically hangs off the side of our public subnets. This creates two topology problems. The first topology problem is that there's no firewall to do NAT translation between our public subnets and the BINAT subnet. The larger topology problem is that if we just put a firewall in, we'd be creating a version of the triangular NAT problem because the firewall would have to basically be a virtual one that rewrote incoming traffic out the same interface it came in on.
To make internal BINAT work, we would have to actually add a network layer. The sandbox subnet firewalls would have to live on a separate subnet from all of our other servers, and there would have to be an additional firewall between that subnet and our other public subnets that did the NAT translation for most incoming traffic. This would impose additional network hops and bottlenecks on all internal traffic that wasn't BINAT'd (right now our firewalls deliberately live on the same subnet as our main servers).
Some notes on the structure of Go binaries (primarily for ELF)
I'll start with the background. I keep around a bunch of third party
programs written in Go, and one of the things that I do periodically
is rebuild them, possibly because I've updated some of them to
their latest versions. When doing this,
it's useful to have a way to report the package that a Go binary was
built from, ideally a fast way. I have traditionally used
binstale for this, but it's not
fast. Recently I tried out gobin,
which is fast and looked like it had great promise, except that I
discovered it didn't report about all of my binaries. My attempts to
fix that resulted in various adventures but only partial success.
All of the following is mostly for ELF
format binaries, which is the binary format used on most Unixes
(except MacOS). Much of the general information applies to other
binary formats that Go supports, but the specifics will be different.
For a general introduction to ELF, you can see eg here.
Also, all of the following assumes that you haven't stripped the
Go binaries, for example by building with the '-w' or '-s' linker
flags.
All Go programs have a .note.go.buildid ELF section that contains
the Go build ID.
If you read the ELF sections of a binary and it doesn't have that,
you can give up; either this isn't a Go binary or something deeply
weird is going on.
Programs built as Go modules contain an
embedded chunk of information about the modules used in building
them, including the main program; this can be printed with 'go
version -m <program>'. There is no official interface to extract
this information from other binaries (inside a program you can use
runtime/debug.ReadBuildInfo()), but it's
currently stored in the binary's data section as a chunk of plain
text. See version.go
for how Go itself finds and extracts this information, which is
probably going to be reasonably stable (so that newer versions of
Go can still run '
go version -m <program>' against programs built
with older versions of Go). If you can extract this information
from a binary, it's authoritative, and it should always be present
even if the binary has been stripped.
If you don't have module information (or don't want to copy
version.go's code in order to extract it), the only approach I know
to determine the package a binary was built from is to determine
the full file path of the source code where
main() is, and then
reverse engineer that to create a package name (and possibly a
module version). The general approach is:
- extract Go debug data from the binary and use debug/gosym to create a gosym.Table.
- look up the main.main function in the table to get its starting address, and then use Table.PCToLine() to get the file name for that starting address.
- convert the file name into a package name.
Binaries built from
$GOPATH will have file names of the form
$GOPATH/src/example.org/fred/cmd/barney/main.go. If you take the
directory name of this and take off the
$GOPATH/src part, you
have the package name this was built from. This includes module-aware
builds done in
$GOPATH. Binaries built directly from modules with
'go get example.org/fred/cmd/barney@latest' will have a file path
of the form $GOPATH/pkg/mod/example.org/fred@v1.2.3/cmd/barney/main.go
(with whatever the actual version is). To convert this to a module
name, you have to take off '$GOPATH/pkg/mod/'
and move the version to the end if it's not already there. For
binaries built outside some
$GOPATH, with either module-aware
builds or plain builds, you are unfortunately on your own; there
is no general way to turn their file names into package names.
(There are a number of hacks if the source is present on your local
system; for example, you can try to find out what module or VCS
repository it's part of if there's a
go.mod or VCS control directory
somewhere in its directory tree.)
However, to do this you must first extract the Go debug data from
your ELF binary. For ordinary unstripped Go binaries, this debugging
information is in the .gosymtab and .gopclntab ELF sections of
the binary, and can be read out with the debug/elf package.
Go binaries that use cgo do not have these Go ELF sections. As
mentioned in Building a better Go linker:
For “cgo” binaries, which may make arbitrary use of C libraries, the Go linker links all of the Go code into a single native object file and then invokes the system linker to produce the final binary.
This linkage obliterates .gosymtab and .gopclntab as separate
ELF sections. I believe that their data is still there in the final
binary, but I don't know how to extract them. The Go debugger Delve doesn't even try; instead, it
uses the general DWARF
.debug_line section (or its compressed version), which seems
to be more complicated to deal with. Delve has its DWARF code as
sub-packages, so perhaps you could reuse them to read and process the
DWARF debug line information to do the same thing (as far as I know
the file name information is present there too).
Since I have and use several third party cgo-based programs, this
is where I gave up. My hacked branch of the
which package can deal
with most things short of "cgo" binaries, but unfortunately that's
not enough to make it useful for me.
(Since I spent some time working through all of this, I want to write it down before I forget it.)
PS: I suspect that this situation will never improve for non-module builds, since the Go developers want everyone to move away from them. For Go module builds, there may someday be a relatively official and supported API for extracting module information from existing binaries, either in the official Go packages or in one of the golang.org/x/ additional packages.
Bidirectional NAT and split horizon DNS in our networking setup
Like many other places, we have far too many machines to give them all public IPs (or at least public IPv4 IPs), especially since they're spread across multiple groups and each group should get its own isolated subnet. Our solution is the traditional one; we use RFC 1918 IPv4 address space behind firewalls, give groups subnets within it (these days generally /16s), and put each group in what we call a sandbox. Outgoing traffic from each sandbox subnet is NAT'd so that it comes out from a gateway IP for that sandbox, or sometimes a small range of them.
However, sometimes people quite reasonably want to have some of their sandbox machines reachable from the outside world for various reasons, and also sometimes they need their machines to have unique and stable public IPs for outgoing traffic. To handle both of these cases, we use OpenBSD's support for bidirectional NAT. We have a 'BINAT subnet' in our public IP address space and each BINAT'd machine gets assigned an IP on it; as external traffic goes through our perimeter firewall, it does the necessary translation between internal addresses and external ones. Although all public BINAT IPs are on a single subnet, the internal IPs are scattered all over all of our sandbox subnets. All of this is pretty standard.
(The public BINAT subnet is mostly virtual, although not entirely so; for various peculiar reasons there are a few real machines on it.)
However, this leaves us with a DNS problem for internal machines (machines behind our perimeter firewall) and internal traffic to these BINAT'd machines. People and machines on our networks want to be able to talk to these machines using their public DNS names, but the way our networks are set up, they must use the internal IP addresses to do so; the public BINAT IP addresses don't work. Fortunately we already have a split-horizon DNS setup, because we long ago made the decision to have a private top level domain for all of our sandbox networks, so we use our existing DNS infrastructure to give BINAT'd machines different IP addresses in the internal and external views. The external view gives you the public IP, which works (only) if you come in through our perimeter firewall; the internal view gives you the internal RFC 1918 IP address, which works only inside our networks.
(In a world where new gTLDs are created like popcorn, having our own top level domain isn't necessarily a great idea, but we set this up many years before the profusion of gTLDs started. And I can hope that it will stop before someone decides to grab the one we use. Even if they do grab it, the available evidence suggests that we may not care if we can't resolve public names in it.)
Using split-horizon DNS this way does leave people (including us) with some additional problems. The first one is cached DNS answers, or in general not talking to the right DNS servers. If your machine moves between internal and external networks, it needs to somehow flush and re-resolve these names. Also, if you're on one of our internal networks and you do DNS queries to someone else's DNS server, you'll wind up with the public IPs and things won't work. This is a periodic source of problems for users, especially since one of the ways to move on or off our internal networks is to connect to our VPN or disconnect from it.
The other problem is that we need to have internal DNS for any public name that your BINAT'd machine has. This is no problem if you give your BINAT machine a name inside our subdomain, since we already run DNS for that, but if you go off to register your own domain for it (for instance, for a web site), things can get sticky, especially if you want your public DNS to be handled by someone else. We don't have any particularly great solutions for this, although there are decent ones that work in some situations.
(Also, you have to tell us what names your BINAT'd machine has. People don't always do this, probably partly because the need for it isn't necessarily obvious to them. We understand the implications of our BINAT system, but we can't expect that our users do.)
(There's both an obvious reason and a subtle reason why we can't apply BINAT translation to all internal traffic, but that's for another entry because the subtle reason is somewhat complicated.)
The mystery of why my Fedora 30 office workstation was booting fine
So the latest Fedora 30 updates (including a kernel update) build an initramfs that refuses to bring up software RAID devices, including the one that my root filesystem is on. Things do not go well afterwards.
Then I said:
Fedora's systemd, Dracut and kernel parameters setup have now silently changed to require either rd.md.uuid for your root filesystem or rd.auto. The same kernel command line booted previous kernels with previous initramfs's.
The first part of this is wrong, and that leads to the mystery.
In Fedora 29, my kernel command line was specifying both the root
filesystem device by name ('
root=/dev/md20') and the software
RAID arrays for the initramfs to bring up (as '
rd.md.uuid=...'). When I upgraded to Fedora 30
in mid-August, various things happened
and I wound up removing both of those from the kernel command line,
specifying the root filesystem device only by UUID ('root=UUID=...').
This kernel command line booted a series of Fedora 30 kernels, most
recently 5.2.11 on September 4th, right up until yesterday.
However, it shouldn't have. As the
dracut.cmdline manpage says,
the default since Dracut 024 has been to not auto-assemble software
RAID arrays in the absence of either rd.md.uuid or rd.auto.
And the initramfs for older kernels (at least 5.2.11) was theoretically
enforcing that; the journal for that September 4th boot contains a
line saying:
dracut-pre-trigger: rd.md=0: removing MD RAID activation
But then a few lines later, md/raid1:md20 is activated:
kernel: md/raid1:md20: active with 2 out of 2 mirrors
(The boot log for the new kernel for a failed boot also had the dracut-pre-trigger line, but obviously no mention of the RAID being activated.)
I unpacked the initramfs for both kernels and as far as I can tell
they're identical in terms of the kernel modules included and the
configuration files and scripts (there are differences in some
binaries, which is expected since systemd and some other things got
upgraded between September 4th and now). Nor has the kernel
configuration changed between the two kernels according to the
config-* files in /boot.
So by all evidence, the old kernel and initramfs should not auto-assemble my root filesystem's software RAID and thus shouldn't boot. But, they do. In fact they did yesterday, because when the new kernel failed to boot the first thing I did was boot with the old one. I just don't know why, and that's the mystery.
My fix for my boot issue is straightforward; I've updated my kernel
command line to have the '
rd.md.uuid=...' that it should have had
all along. This works fine.
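For illustration, the relevant parts of my kernel command line now look something like this, with made-up placeholder UUIDs standing in for my real ones:

```
root=UUID=de305d54-75b4-431b-adb2-eb6b9e546014 ro rd.md.uuid=1a2b3c4d:5e6f7a8b:9c0d1e2f:3a4b5c6d
```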
(My initial recovery from the boot failure was to use 'rd.auto',
but I've decided that I don't want to auto-assemble anything and
everything that the initramfs needs. I'll have the initramfs only
assemble the bare minimum, just in case. While my swap is also on
software RAID, I specifically decided to not assemble it in the
initramfs; I don't really need it until later.)
Making your own changes to things that use Go modules
Suppose, not hypothetically, that you have found a useful Go program but when you test it you discover that it has a bug that's a problem for you, and that after you dig into the bug you discover that the problem is actually in a separate package that the program uses. You would like to try to diagnose and fix the bug, at least for your own uses, which requires hacking around in that second package.
In a non-module environment, how you do this is relatively
straightforward, although not necessarily elegant. Since building
programs just uses what's found in $GOPATH/src, you can cd
directly into your local clone of the second package and start
hacking away. If you need to make a pull request, you can create a
branch, fork the repo on Github or whatever, add your new fork as
an additional remote, and then push your branch to it. If you didn't
want to contaminate your main
$GOPATH with your changes to the
upstream (since they'd be visible to everything you built that used
that package), you could work in a separate directory hierarchy and
use it as your $GOPATH when you were working on it.
If the program has been migrated to Go modules, things are not
quite as straightforward. You probably don't have a clone of the
second package in your
$GOPATH, and even if you do, any changes
to it will be ignored when you rebuild the program (if you do it
in a module-aware way). Instead, you make
local changes by using the '
replace' directive of the program's
go.mod, and in some ways it's better than the non-module approach.
First you need local clones of both packages. These clones can be
a direct clone of the upstream or they can be clones of Github (or
Gitlab or etc) forks that you've made. Then, in the program's module,
you want to change
go.mod to point the second package to your
local copy of its repo:
replace github.com/rjeczalik/which => /u/cks/src/scratch/which
You can edit this in directly (as I did when I was working on this)
or you can use '
go mod edit'.
If the second package has
not been migrated to Go modules, you need to create a go.mod in
your local clone (the Go documentation will tell you this if you
read all of it).
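As a sketch, a minimal go.mod for the local clone that claims the upstream's name might look like this (hypothetical contents; adjust the go directive to your Go version):

```
module github.com/rjeczalik/which

go 1.13
```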
Contrary to what I initially thought, this new
go.mod does not
need to have the module name of the package you're replacing, but
it will probably be most convenient if it does claim to be, eg,
github.com/rjeczalik/which, because this means that any commands
or tests it has that import the module will use your hacks, instead
of quietly building against the unchanged official version (again,
assuming that you build them in a module-aware way).
(You don't need a replace line in the second package's go.mod;
Go's module handling is smart enough to get this right.)
As an important note, as of Go 1.13 you must do 'go get' to
build and install commands from inside this source tree, even if
the tree is under $GOPATH. If it's under $GOPATH and you do 'go
get <blah>/cmd/gobin', Go does a non-module 'go get' even though the
directory tree has a go.mod file, and this will use the official
version of the second package, not your replacement. This is
documented but perhaps surprising.
When you're replacing with a local directory this way, you don't need to commit your changes in the VCS before building the program; in fact, I don't think you even need the directory tree to be a VCS repository. For better or worse, building the program will use the current state of your directory tree (well, both trees), whatever that is.
If you want to see what your module-based binaries were actually
built with in order to verify that they're actually using your
modified local version, the best tool for this is '
go version -m'.
This will show you something like:
go/bin/gobin go1.13
	path	github.com/rjeczalik/bin/cmd/gobin
	mod	github.com/rjeczalik/bin	(devel)
	dep	github.com/rjeczalik/which	v0.0.0-2014[...] => /u/cks/go/src/github.com/siebenmann/which
I believe that the '(devel)' appears if the binary was built directly
from inside a source tree, and the '=>' is showing a 'replace'
in action. If you build one of the second package's commands (from
inside its source tree), '
go version -m' doesn't report the
replacement, just that it's a '(devel)' of the module.
(Note that this output doesn't tell us anything about the version
of the second package that was actually used to build the binary,
except that it was the current state of the filesystem as of the
build. The 'v0.0.0-2014[...]' version stamp is for the original
version, not our replacement, and comes from the first package's
go.mod.)
PS: If '
go version -m' merely reports the 'go1.13' bit, you managed
to build the program in a non module-aware way.
Sidebar: Replacing with another repo instead of a directory tree
The syntax for this uses your alternate repository, and I believe it
must have some form of version identifier. This version identifier
can be a branch, or at least it can start out as a branch in your
go.mod, so it looks like this:
replace github.com/rjeczalik/which => github.com/siebenmann/which reliable-find
After you run '
go build' or the like, the
go command will quietly
rewrite this to refer to the specific current commit on that branch.
If you push up a new version of your changes, you need to re-edit
go.mod to say 'reliable-find' or 'master' or the like again.
Your upstream repository doesn't have to have a go.mod file,
unlike the case with a local directory tree. If it does have a
go.mod, I think that the claimed package name can be relatively
liberal (for instance, I think it can be the module that you're
replacing). However, some experimentation with sticking in random
upstreams suggests that you want the final component of the module
name to match (eg, '<something>/which' in my case).
Catching Control-C and a gotcha with shell scripts
Suppose, not entirely hypothetically, that you have some sort of
spiffy program that wants to use Control-C
as a key binding to get it to take some action. In Unix, there are
two ways of catching Control-C for this sort of thing. First, you
can put the terminal into raw mode, where Control-C
becomes just another character that you read from the terminal and
you can react to it in any way you like. This is very general but
it has various drawbacks, like you have to manage the terminal state
and you have to be actively reading from the terminal so you can
notice when the key is typed. The simpler alternative way of catching
Control-C is to set a signal handler for
SIGINT and then react
when it's invoked. With a signal handler, the kernel's standard
tty input handling does all of that hard
work for you and you just get the end result in the form of a
SIGINT signal. It's quite convenient and leaves you
with a lot less code and complexity in your spiffy Control-C catching
program.
Then some day you run your spiffy program from inside a shell script
(perhaps you wanted to add some locking), hit Control-C to signal your
program, and suddenly you have a mess (what sort of a mess depends
on whether or not your shell does job control). The problem is that
when you let the kernel handle Control-C by delivering a
signal, it doesn't just deliver it to your program; it delivers it
to the shell script and in fact any other programs that the shell
script is also running (such as a
flock command used to add
locking). The shell script and these other programs are not expecting
SIGINT signals and haven't set up anything special to
handle them, so they will get killed.
(Specifically, the kernel will send the
SIGINT to all processes
in the foreground process group.)
Since your shell was running the shell script as your command and the shell script exited, many shells will decide that your command has finished. This means they'll show you the shell prompt and start interacting with you again. This can leave your spiffy program and your shell fighting over terminal output and perhaps terminal input as well. Even if your shell and your spiffy program don't fight for input and write their output and shell prompt all over each other, generally things don't go well; for example, the rest of your shell script isn't getting run, because the shell script died.
Unfortunately there isn't a good general way around this problem.
If you can arrange it, the ideal is for the wrapper shell script
to wind up directly
exec'ing your spiffy program so there's nothing
SIGINT will be sent to (and kill). Failing that, you might
have to make the wrapper script trap and ignore
SIGINT while it's
running your program (and to make your program unconditionally
set a SIGINT signal handler, even if
SIGINT is ignored
when the program starts).
Speaking from painful personal experience, this is an easy issue
to overlook (and a mysterious one to diagnose). And of course
everything works when you test your spiffy program by running it
directly, because then the only process getting a
SIGINT is the
one that's prepared for it.
A safety note about using (or having) $GOPATH in Go 1.13
One of the things in the Go 1.13 release notes is a little note
about improved support for Go modules. This is worth quoting in
more or less full:
The GO111MODULE environment variable continues to default to auto,
but the auto setting now activates the module-aware mode of the go
command whenever the current working directory contains, or is below
a directory containing, a go.mod file — even if the current directory
is within GOPATH/src.
The important safety note is that this potentially creates a confusing situation, and also it may be easy for other people to misunderstand what this actually says in the same way that I did.
Suppose that there is a Go program that is part of a module,
example.org/fred/cmd/bar (with the module being example.org/fred).
If you do '
go get example.org/fred/cmd/bar', you're fetching and
building things in non-module mode, and you will wind up with a
$GOPATH/src/example.org/fred VCS clone, which will have a go.mod
file at its root, ie $GOPATH/src/example.org/fred/go.mod. Despite
the fact that there is a
go.mod file right there on disk, re-running
'go get example.org/fred/cmd/bar' while you're in (say) your home
directory will not do a module-aware build. This is because, as the
note says, module-aware builds only happen if your current directory
or its parents contain a
go.mod file, not just if there happens
to be a
go.mod file in the package (and module) tree being built.
So the only way to do a proper module aware build is to actually
be in the command's subdirectory:
cd $GOPATH/src/example.org/fred/cmd/bar
go get
(You can get very odd results if you cd to $GOPATH/src/example.org/fred
and then attempt to '
go get example.org/fred/cmd/bar'. The result
is sort of module-aware but weird.)
This makes it rather more awkward to build or rebuild Go programs
through scripts, especially if they involve various programs that introspect your existing
Go binaries. It's also easy to slip up and de-modularize a Go binary;
one absent-minded '
go get example.org/...' will do it.
In a way, Go modules don't exist on disk unless you're in their
directory tree. If that tree is inside
$GOPATH and you're not in
it, you have a plain Go package, not a module.
(If the directory tree is outside
$GOPATH, well, you're not doing
much with it without
cd'ing into it, at which point you have a module.)
The easiest way to see whether a binary was built module-aware or
not is '
goversion -m PROGRAM'. If
the program was built module-aware, you will get a list of all of
the modules involved. If it wasn't, you'll just get a report of
what Go version it was built with. Also, it turns out that you can
build a program with modules without it having a go.mod file:
GO111MODULE=on go get rsc.io/goversion@latest
The repository has tags but no
go.mod. This also works on
repositories with no tags at all. If the program uses outside
packages, they too can be non-modular, and '
goversion -m PROGRAM'
will (still) produce a report of what tags, dates, and hashes they
correspond to.
Update: in Go 1.13, '
go version -m PROGRAM' also reports the
module build information, with module hashes included as well.
This does mean that in theory you could switch over to building all
third party Go programs you use this way. If the program hasn't
converted to modules you get more or less the same results as today,
and if the program has converted, you get their hopefully stable
go.mod settings. You'd lose having a local copy of everything in
$GOPATH, though, which opens up some issues.
Jumping backward and forward in GNU Emacs
In my recent entry on writing Go with Emacs's lsp-mode, I noted that lsp-mode or more accurately lsp-ui has a 'peek' feature that winds up letting you jump to a definition or a reference of a thing, but I didn't know how to jump back to where you were before. The straightforward but limited answer to my question is that jumping back from a LSP peek is done with the M-, keybinding (which is surprisingly awkward to write about in text). This is not a special LSP key binding and function; instead it is a standard binding that runs xref-pop-marker-stack, which is part of GNU Emacs' standard xref package. This M-, binding is right next to the standard M-. and M-? xref bindings for jumping to definitions and references. It also works with go-mode's godef-jump function and its C-c C-j key binding.
(Lsp-ui doesn't set up any bindings for its 'peek' functions, but if you like what the 'peek' feature does in general you probably want to bind them to M-. and M-? in the lsp-ui-mode-map keybindings so that they take over from the xref versions. The xref versions still work in lsp-mode, it's just that they aren't as spiffy. This is convenient because it means that the standard xref binding 'C-x 4 .' can be used to immediately jump to a definition in another Emacs-level 'window'.)
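If you do want the peek versions on M-. and M-? this way, a minimal sketch of the bindings (the lsp-ui function names should be checked against your installed version; binding the keys directly, rather than remapping the xref commands, is what leaves 'C-x 4 .' running the plain xref version):

```elisp
;; In lsp-ui buffers, put the 'peek' commands on the standard xref
;; keys; 'C-x 4 .' still runs the ordinary xref jump.
(define-key lsp-ui-mode-map (kbd "M-.") #'lsp-ui-peek-find-definitions)
(define-key lsp-ui-mode-map (kbd "M-?") #'lsp-ui-peek-find-references)
```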
I call this the limited answer for a couple of reasons. First, this only works in one direction; once you've jumped back, there is no general way to go forward again. You get to remember yourself what you did to jump forward and then do it again, which is easy if you jumped to a definition but not so straightforward if you jumped to a reference. Second, this isn't a general feature; it's specific to the xref package and to things that deliberately go out of their way to hook into it, which includes lsp-ui and go-mode. Because Emacs is ultimately a big ball of mud, any particular 'jump to thing' operation from any particular package may or may not hook into the xref marker stack.
(A core Emacs concept is the mark, but core mark(s) are not directly tied to the xref marker stack. It's usually the case that things that use the xref marker stack will also push an entry onto the plain mark ring, but this is up to the whims of the package author. The plain mark ring is also context dependent on just what happened, with no universal 'jump back to where I was' operation. If you moved within a file you can return with C-u C-space, but if you moved to a different file you need to use C-x C-space instead. Using the wrong one gets bad results. M-, is universal in that it doesn't matter whether you moved within your current file or moved to another one, you always jump backward with the same key.)
The closest thing I've found in GNU Emacs to a browser style backwards and forwards navigation is a third party package called backward-forward (also gitlab). This specifically attempts to implement universal jumping in both directions, and it seems to work pretty well. Unfortunately its ring of navigation is global, not per (Emacs) window, but for my use this isn't fatal; I'm generally using Emacs within a single context anyway, rather than having several things at once the way I do in browsers.
Because I want browser style navigation, I've changed from the default backward-forward key bindings by removing its C-left and C-right bindings in favor of M-left and M-right (ie Alt-left and Alt-right, the standard browser key bindings for Back and Forward), and also added bindings for my mouse rocker buttons. How I have it set up so that it works on Fedora and Ubuntu 18.04 is as follows (using use-package, as everyone seems to these days):
(use-package backward-forward
  :demand
  :config
  (backward-forward-mode t)
  :bind (:map backward-forward-mode-map
              ("<C-left>" . nil)
              ("<C-right>" . nil)
              ("<M-left>" . backward-forward-previous-location)
              ("<M-right>" . backward-forward-next-location)
              ("<mouse-8>" . backward-forward-previous-location)
              ("<mouse-9>" . backward-forward-next-location)
         )
  )
:demand is necessary on Ubuntu 18.04 to get the key bindings to work. I don't know enough about Emacs to understand why.
PS: Normal Emacs and Lisp people would probably stack those stray )'s at the end of the last real line. One of my peculiarities in ELisp is that I don't; I would rather see a clear signal of where blocks end, rather than lose track of them in a stack of ')))'. Perhaps I will change this in time.
CentOS 7 and Python 3
Over on Twitter, I said:
Today I was unpleasantly reminded that CentOS 7 (still) doesn't ship with any version of Python 3 available. You have to add the EPEL repositories to get Python 3.6.
This came up because of a combination of two things. The first is
that we need to set up CentOS 7 to host a piece of commercial
software, because CentOS 7 is the most recent Linux release it
supports. The second is that an increasing number of our local
management tools are now in Python 3, and for various reasons this particular CentOS 7 machine needs to run them (or at least wants to) when our existing CentOS 7 machines haven't. The result was that when I set up various pieces of our standard environment on a newly installed CentOS 7 virtual machine, they failed to run because there was no Python 3.
At one level this is easily fixed. Adding the EPEL repositories is a straightforward 'yum install epel-release', and after that installing Python 3.6 is 'yum install python36'. You don't get a pip3 with this and I'm not sure how to change that, but for our purposes pip3 isn't necessary; we don't install packages system-wide through PIP under anything except exceptional circumstances.
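Since the failure mode here was our tools dying on a machine with no Python 3 at all, a minimal sketch of how a wrapper script could fail early and clearly instead (the check itself is portable shell; the yum commands in the message are the two from above):

```sh
# Fail early and clearly when there is no python3 on $PATH, the
# situation on a stock CentOS 7 machine.
have_python3() {
    command -v python3 >/dev/null 2>&1
}

if have_python3; then
    echo "python3: $(command -v python3)"
else
    echo "no python3; on CentOS 7: yum install epel-release && yum install python36" >&2
    exit 1
fi
```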
(The current exceptional circumstances is for Tensorflow on our GPU compute servers. These run Ubuntu 18.04, where pip3 is available more or less standard. If we had general-use CentOS 7 machines it would be an issue, because pip3 is necessary for personal installs of things like the Python LSP server.)
Even having Python 3.6 instead of 3.7 isn't particularly bad right now; our Ubuntu 16.04 machines have Python 3.5.2 and even our 18.04 ones only have 3.6.8. Even not considering CentOS 7, it will be years before we can safely move any of our code past 3.6.8, since some of our 18.04 machines will not be upgraded to 20.04 next year and will probably stay on 18.04 until early 2023 when support starts to run out. This is surprisingly close to the CentOS 7 likely end of life in mid 2024 (which is much closer than I thought before I started writing this entry), so it seems like CentOS 7 only having Python 3.6 is not going to hold our code back very much, if at all.
(Hopefully by 2023 either EPEL will have a more recent version of Python 3 available on CentOS 7 or this commercial software will finally support CentOS 8. I can't blame them for not supporting RHEL 8 just yet, since it's only been out for a relatively short length of time.)
PS: I don't know what the difference is between the repositories you get by doing it this way and the repositories you get from following the instructions in the EPEL wiki. The latter repos still don't seem to have Python 3.7, so I'm not worrying about it; I'm not very picky about the specific version of Python 3.6 I get, especially since our code has to run on 3.5 anyway.