Wandering Thoughts archives

2020-09-29

Implementing 'and' conditions in Exim SMTP ACLs the easy way (and in Exim routers too)

One of the things that makes Exim a powerful mailer construction kit is that it has a robust set of string expansions, which can be used to implement conditional logic among other things (examples include 1, 2, 3, and 4). However, this power comes with an obscure, compact syntax that's more or less like a Lisp, but not as nice, and that in practice is surprisingly easy to get lost in. String expansions have an 'if', so you can implement conditional things, and since they have an 'if' they also have the usual boolean operators, including 'and'. I've written my share of these complicated conditions, but I've never really been happy with how the results came out.

Today, I was writing another Exim ACL statement that was conditional, with two parts that needed to be ANDed together, and I realized that there was a simpler approach than using the power and complexity of '${if and{....}'. Namely, multiple 'condition =' requirements are ANDed together (just as all requirements are, to be clear). In my case it was clearly simpler to write my two parts separately as:

deny
   condition = ....
   condition = ....
   [... etc ...]

I actually did the multiple 'condition =' version as a quick first version for testing, then rewrote it as a nominally proper single 'condition =' using '${if and{...}', then went back and reversed the change because the revised version was both longer and less clear.

This works in Exim routers as well as ACLs, since routers also use 'condition =' as a general precondition.

(This isn't going to be universally more readable, but nothing ever is. Also, I carefully put a comment in my new ACL statement to explain why I was doing such an odd looking thing, although in retrospect the comment's not quite as clear and explanatory as I'd like.)

PS: I'm not sure why this hasn't occurred to me before, since I've always known that multiple requirements in ACLs and Exim routers must all be true (ie are ANDed together). Possibly I just never thought to use 'condition =' twice and fell into the trap of always using the obvious hammer of '${if and{...}'.

sysadmin/EximMultiConditionsForAnd written at 20:56:15

Why the Unix newgrp command exists (sort of)

Recently in the Fediverse, I read this toot:

Did you know that #Unix groups have passwords? Apparently if you set one, you then have to use newgrp to log in to that group.

I have never seen anyone use unix group passwords.

(Via @mhoye.)

There are some things to say about this, but the first thing you might wonder is why the newgrp command exists at all. The best answer is that it's mostly a Unix historical relic (or, to put it another way, a fossil).

In basically all current Unixes, processes can be in multiple groups at once, often a lot of them. However this is a feature added in BSD; it wasn't the case in the original Research Unixes, including V7, and for a long time it wasn't the case in System V either. In those Unixes, you could be listed as a member of various groups in /etc/group, but a given process was only in one group at a time. The newgrp command was how you switched back and forth between groups.

In general, newgrp worked in the way you'd expect, given Unix. It was a setuid root program that switched itself into the new group and then exec'd into your shell (after carefully dropping its setuid powers).

(The actual behavior of newgrp in V7 is an interesting topic, but that's for another entry.)

As far as I can tell from tuhs.org, a newgrp command appears in Research Unix V6, but it doesn't seem to be in V5. You could have written one, though, as there was a setgid() system call at least as far back as V4 (and V4 may be where the idea of groups was invented). Somewhat to my surprise, the existence of group passwords also dates back to V6 Unix.

(Before I started looking into this, I would have guessed that group passwords were added somewhere in the System III/System V line of AT&T Unix, as AT&T adopted it to 'production' usage.)

PS: I'm pleased to see that OpenBSD seems to have dropped the newgrp command at some point. Linux and FreeBSD both continue to have it, and I can't imagine that Illumos, Solaris, or any other surviving commercial Unixes have gotten rid of it either.

unix/NewgrpCommandWhy written at 19:39:23

Where (and how) you limit your concurrency in Go can matter

At the start of September, I wrote about how concurrency is still not easy even in Go, using a section of real code with a deadlock as the example. In that entry, I proposed three fixes to remove the deadlock. Since Hillel Wayne's Finding Goroutine Bugs with TLA+ has now formally demonstrated that all three of my proposed fixes work, I can talk about the practical differences between them.

For convenience, here's the original code from the first entry:

func FindAll() []P {
   pss, err := ps.Processes()
   [...]
   found := make(chan P)
   limitCh := make(chan struct{}, concurrencyProcesses)

   for _, pr := range pss {
      // deadlocks here:
      limitCh <- struct{}{}
      pr := pr
      go func() {
         defer func() { <-limitCh }()
         [... get a P with some error checking ...]
         // and deadlocks here:
         found <- P
      }()
   }
   [...]

   var results []P
   for p := range found {
      results = append(results, p)
   }
   return results
}

The buffered limitCh channel is used to implement a limited supply of tokens, to hold down the number of goroutines that are getting P's at once. The bug in this code is that the goroutines only receive from limitCh to release their token after sending their result to the unbuffered found channel, while the main code only starts receiving from found after running through the entire for loop. Since the main code takes a token in the loop and blocks if no tokens are available, the loop can wind up waiting for tokens that the goroutines can never release. (For more, see the original entry.)

There are at least three fixes possible: the goroutines can send to limitCh instead of the main function doing it, the goroutines can receive from limitCh before sending to found, or the entire for loop can be in an additional goroutine so that it doesn't block the main function from starting to receive from found. All three of these fixes work, as proven by Hillel Wayne, but they have different effects on the number of goroutines that this code will run if pss is large and what the state of those goroutines is.

If our goal is to minimize resource usage, the worst fix is for goroutines to receive from limitCh before sending to found. This fix will cause almost all goroutines to stall in the send to found, because all but a few of them must be started and run almost to completion before the main code can finish the for loop and start receiving from found to unblock all of those sends and let the goroutines exit. These waiting-to-send goroutines keep their fully expanded goroutine stacks in use, and possibly other resources that won't be released until the goroutines exit and things become unused so the garbage collector can collect them (or until additional defer statements release things).
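
To make the stall visible, here is a minimal sketch of this fix under some loud assumptions: items stands in for pss, squaring an int stands in for getting a P, every goroutine sends exactly one result (the original's error checking is left out), and the function name and the waiting counter are hypothetical additions for illustration only. The counter is read right after the for loop finishes, before anything has been received from found.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// releaseBeforeSend sketches the fix where each goroutine gives back its
// token before sending its result. It returns the results plus how many
// goroutines had already given back their token, and so were stuck at (or
// just about to reach) the send to found, when the for loop finished.
func releaseBeforeSend(items []int, limit int) ([]int, int32) {
	found := make(chan int)
	limitCh := make(chan struct{}, limit)
	var waiting int32 // illustration only; not in the original code

	for _, it := range items {
		limitCh <- struct{}{} // the main code still takes the token
		go func(it int) {
			r := it * it // stand-in for getting a P
			atomic.AddInt32(&waiting, 1)
			<-limitCh  // release the token first ...
			found <- r // ... then block until the main code receives
		}(it)
	}

	// Nothing has been received from found yet, so no goroutine has exited.
	stalled := atomic.LoadInt32(&waiting)

	results := make([]int, 0, len(items))
	for range items {
		results = append(results, <-found)
	}
	return results, stalled
}

func main() {
	items := make([]int, 1000)
	for i := range items {
		items[i] = i
	}
	results, stalled := releaseBeforeSend(items, 4)
	fmt.Printf("%d results; %d goroutines waiting to send after the loop\n",
		len(results), stalled)
}
```

Because the loop can only take its last token after all but limit of the earlier tokens have been given back, the reported count is always at least len(items) minus limit. In other words, essentially every goroutine is alive and parked when the main code finally starts receiving.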

The middling fix is for goroutines to send to limitCh instead of the for loop doing it. We will probably immediately create and start almost all of the full pss worth of goroutines, which could be bad if pss is very large, but at least they all block immediately with almost no resources used and with very small goroutine stacks. Still, this is a bunch of memory and a bunch of (Go) scheduler churn to start all of those goroutines only to have most of them immediately block sending to limitCh. There's also going to be a lot of contention on internal runtime locks associated with limitCh, since a lot of goroutines are queueing up on it.
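
A minimal sketch of this variant, where each goroutine does the 'limitCh <- struct{}{}' send itself, might look like the following. As assumptions for illustration, items stands in for pss, squaring an int stands in for getting a P, every goroutine sends exactly one result, and the function name and the holding and peak counters are hypothetical additions that are not in the original code.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// tokenInGoroutine sketches the fix where each goroutine takes its own token.
// It returns the results plus the largest number of goroutines that held a
// token at the same moment.
func tokenInGoroutine(items []int, limit int) ([]int, int32) {
	found := make(chan int)
	limitCh := make(chan struct{}, limit)
	var holding, peak int32 // illustration only; not in the original code

	for _, it := range items {
		// The loop never blocks, so every goroutine is created right away.
		go func(it int) {
			limitCh <- struct{}{} // most goroutines park here at first
			n := atomic.AddInt32(&holding, 1)
			for { // record a new peak if we just set one
				old := atomic.LoadInt32(&peak)
				if n <= old || atomic.CompareAndSwapInt32(&peak, old, n) {
					break
				}
			}
			defer func() {
				atomic.AddInt32(&holding, -1)
				<-limitCh // release the token after the send, as in the original
			}()
			found <- it * it // squaring stands in for getting a P
		}(it)
	}

	results := make([]int, 0, len(items))
	for range items {
		results = append(results, <-found)
	}
	return results, atomic.LoadInt32(&peak)
}

func main() {
	items := make([]int, 1000)
	for i := range items {
		items[i] = i
	}
	results, peak := tokenInGoroutine(items, 4)
	fmt.Printf("%d results; at most %d goroutines held a token at once\n",
		len(results), peak)
}
```

The number of token holders never goes above the limit, but all len(items) goroutines exist from the start, which is exactly the memory and scheduler cost described above.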

The best fix for resource usage is to push the for loop into its own goroutine but to otherwise keep things the same. Because the for loop is still sending to limitCh before it creates a new goroutine, the number of simultaneous goroutines we ever have will generally be limited to around our desired concurrency level (there will be some extra that have received from limitCh but not yet finished completely exiting).
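
Here is a minimal sketch of this version, with several assumptions: items stands in for pss, squaring an int stands in for getting a P, and the function name and the live and peak counters are hypothetical additions for illustration. The original code elides how found gets closed, so the sync.WaitGroup and close(found) here are my guess at one reasonable way to do it, not necessarily what the real code does.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// loopInGoroutine sketches the fix where the for loop runs in its own
// goroutine. It returns the results plus the largest number of worker
// goroutines that were alive (and still holding a token) at once.
func loopInGoroutine(items []int, limit int) ([]int, int32) {
	found := make(chan int)
	limitCh := make(chan struct{}, limit)
	var live, peak int32 // illustration only; not in the original code

	go func() {
		var wg sync.WaitGroup
		for _, it := range items {
			limitCh <- struct{}{} // blocks once limit workers hold tokens
			if n := atomic.AddInt32(&live, 1); n > peak {
				peak = n // only this goroutine ever writes peak
			}
			wg.Add(1)
			go func(it int) {
				defer wg.Done()
				defer func() {
					atomic.AddInt32(&live, -1)
					<-limitCh // release the token after the send
				}()
				found <- it * it // squaring stands in for getting a P
			}(it)
		}
		wg.Wait()
		close(found) // assumption: lets the range loop below finish
	}()

	// The caller starts receiving immediately, so workers never pile up.
	var results []int
	for r := range found {
		results = append(results, r)
	}
	return results, peak
}

func main() {
	items := make([]int, 1000)
	for i := range items {
		items[i] = i
	}
	results, peak := loopInGoroutine(items, 4)
	fmt.Printf("%d results; at most %d workers alive at once\n",
		len(results), peak)
}
```

The counter here ignores goroutines that have given back their token but not yet fully exited, so it can only show the 'limited to the concurrency level' part, not the few extra exiting goroutines.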

It's likely that none of this matters if the for loop only has to deal with a few hundred entries, and that's probably the case for this code (at least most of the time). But it makes for a useful illustration. When you're writing code with enforced limited concurrency it's probably worthwhile to think about where you want to limit the concurrency and what effects that has on things overall. As we can see here, small implementation choices can have potentially large impacts.

(Also, sometimes I think too much about this sort of thing.)

programming/GoConcurrencyLimitsWhere written at 00:57:04



This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.