Wandering Thoughts archives

2018-09-21

Why I mostly don't use ed(1) for non-interactive edits in scripts

One of the things that is frequently said about ed(1) is that it remains useful for non-interactive modifications to files, for example as part of shell scripts. I even mentioned this as a good remaining use of ed in my entry on why ed is not a good (interactive) editor today, and I stand by that. But, well, there is a problem with using ed this way, and that problem is why I only very rarely actually use ed for scripted modifications to files.

The fundamental problem is that non-interactive editing with ed has no error handling. This is perfectly reasonable, because ed was originally written for interactive editing, and in interactive editing the human behind the keyboard does the error handling. But when you apply this model to non-interactive editing, it means that your stream of ed commands is essentially flying blind. If the input file is in the state that you expected it to be, all will go well. If there is something different about the input file, so that your line numbers are off, or a '/search/' address doesn't match what you expect (or perhaps doesn't match at all), or any number of other things go wrong, then you can get a mess, sometimes a rapidly escalating one. Then you will get to the end of your ed commands and 'w' the resulting mess into your target file.

As a result of this, among other issues, ed tends to be my last resort for non-interactive edits in scripts. I would much rather use sed or something else that is genuinely focused on stream editing if I can, or put together some code in a language where I can include explicit error checking so I'll handle the situation where my input file is not actually the way I thought it was going to be.
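As a rough illustration of what "explicit error checking" can mean for a scripted edit, here is a minimal Python sketch. The names `EditError` and `replace_unique_line` are made up for this example, and the policy of requiring exactly one matching line is an assumption, not something taken from any real script.

```python
import re


class EditError(Exception):
    """The input was not in the state the scripted edit expected."""


def replace_unique_line(text: str, pattern: str, new_line: str) -> str:
    """Replace the one line matching `pattern` with `new_line`.

    Raises EditError if zero or several lines match, instead of
    guessing and writing out a mess.
    """
    lines = text.splitlines()
    hits = [i for i, line in enumerate(lines) if re.search(pattern, line)]
    if len(hits) != 1:
        raise EditError(
            f"expected exactly one line matching {pattern!r}, found {len(hits)}"
        )
    lines[hits[0]] = new_line
    # Note: the result always ends with a newline, even if the input did not.
    return "\n".join(lines) + "\n"
```

A calling script would catch `EditError`, report it, and exit with a non-zero status without ever writing to the target file, which is exactly the step a blind stream of ed commands cannot take.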

(If I did this very often, I would probably dust off my Perl.)

If I were creating an ideal version of ed for non-interactive editing, I would definitely have it include some form of conditionals and an 'abort with a non-zero exit status if ...' command. Perhaps you'd want to model a lot of this on what sed does here with command blocks, b, t (and T in GNU sed), and so on, but I can't help but think that there has to be a more readable and clear version with things like relatively explicit if conditions.

(I have a long standing sed script that uses some clever tricks with b and the pattern space and so on. I wrote it in sed to deliberately explore these features and it works, but it's basically a stunt and I would probably be better off if I rewrote the script in a language where the actual logic was not hiding in the middle of a Turing tarpit.)

PS: One place this comes up, or rather came up years ago and got dealt with then, is in what diff format people use for patch. In theory you can use ed scripts; in practice, everyone considers those to be too prone to problems and uses other formats. These days, about the only use I can think of for ed format diffs is to see a very compact version of the changes. Even then I'm not convinced of their merits compared to 'diff -u0', although we still use ed format diffs in our worklogs out of long standing habit.

Sidebar: Where you definitely need ed instead of sed

The obvious case is if you want to move text around (or copy it), especially if you need to move text backwards (to earlier in the file). As a stream editor, sed can change lines and it can move text to later in the file if you work very hard at it, but it can never move text backward. I think it's also easier to delete a variable range of lines in ed, for example 'everything from a start line up to but not including an end marker'.
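For comparison, both operations are simple once the whole file is in memory as a list of lines, which is roughly the model ed itself works with. The following is a minimal Python sketch under that assumption; the helper names and the "fail unless the match is unambiguous" rules are choices made for this example only.

```python
import re


def find_one(lines, pattern):
    """Return the index of the only line matching `pattern`, or raise."""
    hits = [i for i, line in enumerate(lines) if re.search(pattern, line)]
    if len(hits) != 1:
        raise ValueError(f"expected one match for {pattern!r}, found {len(hits)}")
    return hits[0]


def move_line_before(lines, line_pat, target_pat):
    """Move the line matching `line_pat` to just before the `target_pat` line.

    Works in either direction, including backwards (earlier in the file).
    """
    src = find_one(lines, line_pat)
    dst = find_one(lines, target_pat)
    out = list(lines)
    moved = out.pop(src)
    if src < dst:
        dst -= 1  # removing an earlier line shifts the target up by one
    out.insert(dst, moved)
    return out


def delete_until_marker(lines, start_pat, end_pat):
    """Delete from the start line up to, but not including, the end marker."""
    start = find_one(lines, start_pat)
    for end in range(start + 1, len(lines)):
        if re.search(end_pat, lines[end]):
            return lines[:start] + lines[end:]
    raise ValueError(f"no line matching {end_pat!r} after the start line")
```

Unlike the equivalent ed commands, each helper refuses to do anything if the file does not look the way the script expects.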

Ed will also do in-place editing without the need to write to a temporary file and then shuffle the temporary file into place. I'm neutral on whether this is a feature or not, and you can certainly get ed to write your results to a new file if you want to.

EdScriptErrorProblem written at 00:18:31

2018-09-12

A surprise discovery about procmail (and wondering about what next)

I've been using procmail for a very long time now, and over that time I generally haven't paid much attention to the program itself. It was there in the operating systems I used, it worked, and so everything was fine; it was just sort of there, like cat. Thus, I was rather surprised to stumble over the 2010 LWN article 'Reports of procmail's death are not terribly exaggerated' (found by a roundabout route via Planet Debian), which covers how procmail development and maintenance had stopped. Things don't exactly seem to have gotten more lively since 2010 (for example, the procmail domain seems to have mostly vanished, and then there's the message from Philip Guenther that's linked to from the Wikipedia page). This raises a number of questions.

The obvious question is whether this even matters (as LWN notes in the original article). Procmail still works fine, and just as importantly, it's still being packaged by Debian, Ubuntu, and so on. There are outstanding Debian bugs, but Debian appears to also be fixing issues in their patches (and there's a 2017 patch in there, so it's not all old stuff). While we have quite a few users that depend a lot on procmail and we'd thus have real problems if, say, Ubuntu stopped packaging it, this doesn't appear likely to happen any time soon.

(Actually, if Ubuntu dropped procmail our answer would likely be to start building the package ourselves. It's not like it changes much.)

But, well, procmail is sort of Internet software, and I've said before that Internet software decays if not actively maintained. Knowing that procmail is only sort of being looked after does make me a little bit uncomfortable. However, this raises the question of what alternatives I (and we) would have for equivalent mail filtering systems. Many people seem to use Sieve, but I believe that has to be integrated into your MTA instead of run through a program in the way that procmail operates, and I don't think it can run external programs (which is important for some people). The closest thing to procmail that I've read about is maildrop, but it's slightly more limited than procmail in several spots and I'm not sure it could fully cover the various ways people here use procmail for spam filtering and running spam filters.

Exim itself has its own filtering system (described in the Exim documentation). Exim filters are more powerful than Exim-based Sieve filters (they can deliver to external programs, for example), but of course they require Exim specifically and couldn't be moved to another mailer. They're still not quite as capable as procmail; specifically, Exim filters can't directly write to MH format directories (which matters to me because of how I now do a bunch of mail filtering).

We've historically declined to enable either Sieve based filtering or Exim's own filtering in our mail system on the grounds that we wanted to preserve our freedom to change mailers. In light of what I've now learned about procmail, I'm wondering if that's still the right choice. We also don't currently have maildrop installed on our central mail machine (where people already run procmail); perhaps we should change that as well, to give people the option (even if they most likely won't take it).

PS: A quick check suggests that we have around 195 people who are using procmail (in that they have it set up in their .forward), which is actually more than I expected. Not all of them are necessarily using our mail system much any more, though.
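A check along these lines could be sketched as follows. This is only a guess at the approach, not the actual check that was run; it assumes all home directories sit directly under a single root directory and treats any .forward that mentions "procmail" as a procmail user. Both function names are hypothetical.

```python
from pathlib import Path


def uses_procmail(forward_text: str) -> bool:
    """Crude test: does this .forward mention procmail anywhere?"""
    return "procmail" in forward_text


def count_procmail_users(home_root: Path) -> int:
    """Count home directories under `home_root` whose .forward uses procmail."""
    count = 0
    for forward in home_root.glob("*/.forward"):
        try:
            text = forward.read_text(errors="replace")
        except OSError:
            continue  # unreadable .forward files are skipped, not counted
        if uses_procmail(text):
            count += 1
    return count
```

A substring match like this will overcount if someone merely mentions procmail in a .forward without running it, so the result is an estimate at best.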

ProcmailWhatNext written at 01:48:09

2018-09-06

Our future IPv6 access control problems due to non-DHCP6 machines

Almost two years ago, I wrote about how I suspected a lot of IPv6 hosts wouldn't have reverse DNS because they would be using stateless address autoconfiguration (SLAAC), where they essentially assign themselves one or more random IPv6 addresses when they show up on your network. For us, this presents a problem much larger than just DNS, because control over what hosts DHCP will give addresses to (and what addresses it will assign) is how we force machines to be registered on our laptop network and our wireless network before we give them network access.
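To make the problem concrete, here is a deliberately simplified Python sketch of what a host doing SLAAC with randomized addresses ends up with: the network's advertised /64 prefix combined with 64 bits the host picks for itself. The function name is hypothetical, the prefix used below is from the IPv6 documentation range, and real implementations also do things this ignores, such as duplicate address detection.

```python
import ipaddress
import secrets


def random_slaac_address(prefix: str) -> ipaddress.IPv6Address:
    """Combine a /64 prefix with a random 64-bit interface identifier."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError(f"expected a /64 prefix, got /{net.prefixlen}")
    # The host chooses these bits itself; no server hands them out.
    interface_id = secrets.randbits(64)
    return ipaddress.IPv6Address(int(net.network_address) | interface_id)


if __name__ == "__main__":
    print(random_slaac_address("2001:db8:0:1::/64"))
```

Nothing in this process involves a server that could refuse to hand out an address to an unregistered machine, which is why DHCP-based access control has no place to hook in.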

The specific driver of IPv6 SLAAC is Android devices, which don't do DHCP6 at all; unfortunately this also includes ChromeOS, which means Chromebooks. But once you enable SLAAC on your network, any number of things may decide to grab themselves SLAAC addresses and then use them, even if they also do DHCP6 and so get whatever address you give them there (this is the iOS behavior I observed a couple of years ago; I don't know how Windows, macOS, and so on behave here). If the IPv6 address and routing they get via DHCP6 doesn't seem to work, I suspect that quite a lot of devices will be perfectly happy to route via their SLAAC address and route, and if that doesn't work, well, the Android and ChromeOS devices aren't getting on the Internet.

There are a number of approaches I can think of. One possible brute force answer is to simply not do SLAAC, only DHCP6 and (IPv4) DHCP. This would mean that SLAAC-only devices would only get IPv4 addresses, but that's not likely to be a practical problem for a long time to come. I think this is our most likely short term answer, because it's the easiest approach and we can always get more complicated later. The other brute force approach is some sort of MAC filtering on our firewalls, but we use OpenBSD and my understanding is that there are a number of issues around MAC filtering in OpenBSD PF.

The officially approved answer is probably to move to IEEE 802.1X on our networks that require this sort of access control. This is infeasible for multiple reasons, including that I believe it would require a wholesale replacement of our network switches on the affected networks. For extra bonus points we don't even run much of the infrastructure that provides our wireless network, which is one of the networks we need this access control on (this is not as crazy as it sounds, but that's another entry).

All of this is yet another reason why any migration to IPv6 will be neither fast nor easy for us, and thus why we still haven't done more than vaguely look in the direction of IPv6. Someday, maybe, when IPv6 appears to actually be important for something.

(And when we do start doing IPv6, it's highly likely to start out being only for a few servers with static IP addresses. Extending it to people's own 'client' devices is likely to be one of the last things we get around to.)

(I was reminded of all of this today by cweiske's question on my old entry.)

IPv6AccessControlProblem written at 00:00:52

