Wandering Thoughts archives

2016-07-31

I've become mostly indifferent to what language something is written in

In a comment on this entry, Opk wrote (in part):

Will be interesting to hear what you make of git-series. I saw the announcement and somewhat lost interest when I saw that it needed rust to build.

I absolutely get this reaction to git-series being written in Rust, and to some extent I share it. I have my language biases, and a program being written in certain languages will certainly make me turn my nose up at it even if it sounds attractive; in the past I was generally strongly dubious about things written in new or strange languages. However, these days I've mostly given up (and given in) on this, in large part because I've become lazy.

What I really care about these days is how much of a hassle a program is going to be to deal with. It's nice if a program is written in a language that I like, or at least one that I'm willing to look through to figure things out, but it's much more important that the program not be a pain in the rear to install and to operate. And these days, many language environments have become quite good at not being pains in the rear.

(The best case is when everything is already packaged for my OSes. Next best is when at least the basic language stuff is packaged and everything else has nice command line tools and can be used as a non-privileged user, so 'go get ...' and the like just work and don't demand to spray things all over system directories. The worst case is manual installation of things and things that absolutely demand to be installed in system directories; those get shown the exit right away.)

In short, I will (in theory) accept programs written in quite a lot of languages if all I have to do to deal with them is the equivalent of '<cmd> install whatever', maybe with a prequel of a simple install of the base language. I don't entirely enjoy having a $HOME/.<some-dir> populated by a pile of Python or Ruby or Perl or Rust or whatever artifacts that were dragged on to my system in order to make this magically work, but these days 'out of sight, out of mind'.

There are language environments that remain hassles and I'm unlikely to touch; the JVM is my poster child here. Languages that I've never had to deal with before add at least the disincentive of uncertainty; if I try out their package system and so on, will it work or will it blow up in my face and waste my time? As a result, although I'm theoretically willing to consider something written in node.js or Haskell or the like, in practice I don't think I've checked out any such programs. Someday something will sound sufficiently attractive to overcome my biases, but not today.

(As mentioned, I generally don't care what language a program is written in if it's available as a prebuilt package for my OS, because at that point there's almost no hassle; I just do 'apt-get install' or 'dnf install' and it's done. The one stumbling block is if I do 'dnf install' and suddenly three pages of dependent packages show up. That can make me decide I don't want to check out your program that badly.)

In the specific case of git-series and Rust, Rust by itself is not quite in this 'no hassle' zone just yet, at least on Fedora; if I had nothing else that wanted Rust, I probably would have ruled out actively looking into git-series as a result. But I'd already eaten the cost of building Rust and getting it working in order to keep being able to build Firefox, and thus at the start of the whole experience adding Cargo so I'd be able to use git-series seemed simple enough.

(Also, I can see the writing on the wall. People are going to keep on writing more and more interesting things in Rust, so sooner or later I was going to give in. It was just a question of when. I could have waited until Rust and Cargo made it into Fedora, but in practice I'm impatient and I sometimes enjoy fiddling around with this sort of thing.)

This casual indifference to the programming language that things are written in sort of offends my remaining purist instincts, but I'm a pragmatist these days. Laziness has trumped pickiness.

ProgramLanguageIndifference written at 19:47:07; Add Comment

2016-07-19

How not to set up your DNS (part 23)

Presented in the traditional illustrated form, more or less:

; dig ns megabulkmessage218.com @a.gtld-servers.net.
[...]
megabulkmessage218.com. IN NS ns1.megabulkmessage218.com.
megabulkmessage218.com. IN NS ns2.megabulkmessage218.com.
[...]
ns1.megabulkmessage218.com. IN A 5.8.32.218
ns2.megabulkmessage218.com. IN A 8.8.8.8
[...]

One of these two listed nameservers is not like the other.

8.8.8.8 is of course the famous open resolving DNS server that Google operates. It is in no way an authoritative DNS server for anyone, even if you try to use it as one. Lookups will probably fail: I believe that most DNS resolvers set the 'no recursion' flag in their queries to what they believe are authoritative DNS servers, and when 8.8.8.8 sees that flag it doesn't answer even when it almost certainly has the data in cache (instead it returns a SERVFAIL).

(This is thus an extreme case of an informal secondary, although I suppose it was probably inevitable and there are likely plenty of other people using 8.8.8.8 this way with other domains. After all, it appears to work if you test it by hand, since tools like dig normally set the recursive flag on their queries.)
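You can roughly reproduce what a resolver sees by turning the recursion-desired flag off by hand with dig's +norecurse option. Per the behaviour described above, the two queries come out quite differently (the exact failure status you get may vary):

; dig a mail.megabulkmessage218.com. @8.8.8.8
[we get an answer; this is dig's normal recursive query]
; dig +norecurse a mail.megabulkmessage218.com. @8.8.8.8
[this fails; it's roughly the query a resolver sends to what it thinks is an authoritative server]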

Since this is a spammer's DNS server (as you might have guessed from the domain name), things are a little bit peculiar with its results.

; dig ns megabulkmessage218.com. @5.8.32.218
[nothing; we get the standard 'no such data' response]
; sdig a gadswoonsg.megabulkmessage218.com. @5.8.32.218
178.235.61.115
; sdig mx gadswoonsg.megabulkmessage218.com. @5.8.32.218
10 mail.megabulkmessage218.com.
; sdig a mail.megabulkmessage218.com. @5.8.32.218
149.154.64.43

(The MX target is SBL295728, the A record is in the SBL CSS and listed in the CBL and so on. Basically, you name a DNS blocklist and 178.235.61.115 is probably in it. And the domain name is currently in the Spamhaus DBL.)

But:

; dig a randomname.megabulkmessage218.com. @5.8.32.218
[nothing; we get the standard 'no such data' response]

So this spammer is clearly making up random names for their spam run and running a very custom nameserver that only responds to them. Anything else gets a no such data response, including SOA and NS queries for the domain itself. Since there's nothing entirely new under the sun, we've seen this sort of DNS server cleverness before.

It's interesting that trying to get the NS records for the domain from your local resolving DNS server will fail even after you've looked up the A record for the hostname. The NS records (and glue) from the .com nameservers don't have particularly low TTLs, and given that the A record resolves, your local DNS server was able to get and use them. But these days it clearly throws them away again immediately to avoid cache poisoning attacks (or at least won't return them for direct queries).
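To illustrate, with 127.0.0.1 standing in here for your local resolving DNS server (the exact responses will depend on your resolver):

; dig a gadswoonsg.megabulkmessage218.com. @127.0.0.1
[we get the A record; the resolver used the .com NS records and glue to find it]
; dig ns megabulkmessage218.com. @127.0.0.1
[no NS records come back, even though the resolver just used them]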

HowNotToDoDNSXXIII written at 14:24:05; Add Comment

2016-07-17

A good solution to our Unbound caching problem that sadly won't work

In response to my entry on our Unbound caching problem with local zones, Jean Paul Galea left a comment with the good suggestion of running two copies of Unbound with different caching policies. One instance, with normal caching, would be used to resolve everything but our local zones; the second instance, with no caching, would simply forward queries to either the authoritative server for our local zones or the general resolver instance, depending on what the query was for.

(Everything would be running on a single host, so the extra hops queries and replies take would be very fast.)
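To make the setup concrete, here is a minimal sketch of what the frontend (non-caching) instance's configuration might look like. Everything in it is hypothetical: the zone name is just an example, the authoritative server and the general resolver are assumed to listen on 127.0.0.1 ports 5300 and 5353, I've used a stub-zone because that's how Unbound points at an authoritative server, and shrinking the caches to zero is my assumption about how you'd disable caching.

# frontend instance: listens for clients, does no caching of its own
server:
    do-not-query-localhost: no
    # attempt to disable caching by shrinking the caches to nothing
    msg-cache-size: 0
    rrset-cache-size: 0

# our local zones go straight to the internal authoritative server
stub-zone:
    name: "cs.toronto.edu"
    stub-addr: 127.0.0.1@5300

# everything else is handed to the general (caching) resolver instance
forward-zone:
    name: "."
    forward-addr: 127.0.0.1@5353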

In many organizational situations, this is an excellent solution. Even in ours, at first glance it looks like it should work perfectly, because the issue we'd have is pretty subtle. I need to set the stage by describing a bit of our networking.

In our internal networks we have some machines with RFC 1918 addresses that need to be publicly reachable, for example so that research groups can expose a web server on a machine that they run in their sandbox. This is no problem; our firewalls can do 'bidirectional NAT' to expose each such machine on its own public IP. However, this requires that external people see a different IP address for the machine's official name than internal people do, because internal people are behind the BINAT step. This too is no problem, as we have a full 'split horizon' DNS setup.

So let's imagine that a research group buys a domain name for some project or conference and has the DNS hosted externally. In that domain's DNS, they want to CNAME some name to an existing BINAT'd server that they have. Now have someone internally do a lookup on that name, say 'www.iconf16.org':

  1. the frontend Unbound sees that this is a query for an external name, not one of our own zones, so it sends it to the general resolver Unbound.
  2. the general resolver Unbound issues a query to the iconf16.org nameservers and gets back a CNAME to somehost.cs.toronto.edu.
  3. the general resolver must now look up somehost.cs itself and will wind up caching the result, which is exactly what we want to avoid.

This problem happens because DNS resolution is not segmented. Once we hand an outside query to the general resolver, there's no guarantee that it stays an outside query, and there's no mechanism I know of to make the resolving Unbound stop further resolution and hot-potato the CNAME back to the frontend Unbound. We can set the resolving Unbound instance up so that it gives correct answers here, but since there are no per-zone cache controls we can't make it not cache the answers.

This situation can come up even without split horizon DNS (although split horizon makes it more acute). All you need is for outside people to be able to legitimately CNAME things to your hosts for names in DNS zones that you don't control and may not even know about. If this is forbidden by policy, then you win (and I think you can enforce this by configuring the resolving Unbound to fail all queries involving your local zones).

UnboundZoneRefreshProblemII written at 23:05:07; Add Comment

2016-07-16

A caching and zone refresh problem with Unbound

Like many people, we have internal resolving DNS servers that everyone's laptops and so on are supposed to use for their DNS. These used to run Bind and now run Unbound, mostly because OpenBSD switched which nameservers they like. Also like many people, we have a collection of internal zones and internal zone views, which are managed from a private internal master DNS server. This has led to a problem with our Unbound setup that we actually don't know how to solve.

When we make an update to internal DNS and reload the private master with it, we want this to be reflected on the resolving DNS servers essentially immediately so that people see the DNS change right away. In the days when we ran Bind on the resolving servers, this was easy; we configured the resolving Bind to be a secondary for our internal zones and set the master up with also-notify entries for it. When the master detected a zone change, it sent notifications out and the resolving Bind immediately loaded the new data. Done.

It's not clear how to achieve something like this with Unbound. Unbound doesn't listen to NOTIFY messages and do anything with them (although it's listed as a TODO item in the source code). While you can force an incredibly low TTL on DNS records so that the new DNS information will be seen within a minute or two, this setting is global; you can't apply it to just some zones, like say your own, and leave everything else cached as normal. In theory we could set absurdly low TTLs on everything in the internal views on the private master, which would propagate through to Unbound. In practice, how the internal views are built makes this infeasible; it would be a major change in what is a delicate tangle of a complex system.

(With low TTLs there's also the issue of cached negative entries, since we're often adding names that didn't exist before but may have been looked up to see that nope, there is no such hostname.)

Unbound can be told to flush specific zones via unbound-control and in theory this works remotely (if you configure it explicitly, among other things). In practice I have a number of qualms about this approach, even if it's scripted, and Unbound's documentation explicitly says that flushing zones is a slow operation.
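For what it's worth, the mechanics of a remote flush would look something like the following. The resolver's address and the zone name here are made up, and the resolvers would need remote control (control-enable and the associated keys) configured before this works:

unbound-control -s 192.0.2.53 flush_zone our.internal.zone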

Given that Unbound explicitly doesn't support NOTIFY yet, there's probably no real good solution to this. That's sometimes how it goes.

(We have what I'm going to call a workaround that we currently feel we can live with, but I'm not going to tell you what it is because it's not really an appealing one.)

UnboundZoneRefreshProblem written at 00:30:58; Add Comment

2016-07-12

How we do MIME attachment type logging with Exim

Last time around I talked about the options you have for how to log attachment information in an Exim environment. Out of our possible choices, we opted to do attachment logging using an external program that's run through Exim's MIME ACL, and to report the result to syslog in the program. All of this is essentially the least-effort choice. Exim parses MIME for us, and having the program do the logging means that it gets to make the decisions about just what to log.

However, the details are worth talking about, so let's start with the actual MIME ACL stanza we use:

# used only for side effects
warn
  # only act on potentially interesting parts
  condition = ${if or { \
     {and{{def:mime_content_disposition}{!eq{$mime_content_disposition}{inline}}}} \
     {match{$mime_content_type}{\N^(application|audio|video|text/xml|text/vnd)\N}} \
    } }
  # decode this MIME part to a file on disk for the program to examine
  decode = default
  # set a dummy variable to get ${run} executed
  set acl_m1_astatus = ${run {/etc/exim4/alogger/alogger.py \
     --subject ${quote:$header_subject:} \
     --csdnsbl ${quote:$header_x-cs-dnsbl:} \
     $message_exim_id \
     ${quote:$mime_content_type} \
     ${quote:$mime_content_disposition} \
     ${quote:$mime_filename} \
     ${quote:$mime_decoded_filename} }}

(See my discussion of quoting for ${run} for what's happening here.)

The initial 'condition =' is an attempt to only run our external program (and only write decoded MIME parts out to disk) for MIME parts that are likely to be interesting. Guessing what is an attachment is complicated and the program makes the final decision, but we can pre-screen some things. The parts we consider interesting are any MIME parts that explicitly declare themselves as non-inline, plus any inline MIME parts that have a Content-Type that's not really an inline thing.

There is one complication here, which is our check that $mime_content_disposition is defined. You might think that there's always going to be some content-disposition, but it turns out that when Exim says the MIME ACL is invoked on every MIME part it really means every part. Specifically, the MIME ACL is also invoked on the message body in a MIME email that is not a multipart (just, eg, a text/plain or text/html message). These single-part MIME messages can be detected because they don't have a defined content-disposition; we consider this to basically be an implicit 'inline' disposition and thus not interesting by itself.

The entire warn stanza exists purely to cause the ${run} to execute (this is a standard ACL trick; warn stanzas are often used just as a convenient place to put conditions and modifiers that you only want for their side effects). The easiest way to get that to happen is to (nominally) set the value of an ACL variable, as we do here. Setting an ACL variable makes Exim do string expansion in a harmless context that we can basically make into a no-op, which is what we need here.

(Setting a random ACL variable to cause string expansion to be done for its side effects is a useful Exim pattern in general. Just remember to add a comment saying it's deliberate that this ACL variable is never used.)

The actual attachment logger program is written in Python because basically the moment I started writing it, it got too complicated to be a shell script. It looks at the content type, the content disposition, and any claimed MIME filename in order to decide whether this part should be logged about or ignored (using the set of heuristics I outlined here). It uses the decoded content to sniff for ZIP and RAR archives and get their filenames (slightly recursively). We could have run more external programs for this, but it turns out that there are handy Python modules (eg the zipfile module) that will do the work for us. Working in pure Python probably doesn't perform as well as some of the alternatives, but it works well enough for us with our current load.

(In accord with my general principles, the program is careful to minimize the information it logs. For instance, we log only information about extensions, not filenames.)
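As a rough illustration of the kind of decisions and ZIP sniffing involved, here is a minimal sketch in Python. This is not our actual code; the function name, its output format, and the exact skip rules are all made up for illustration.

import os
import zipfile

# illustrative sketch, not the real attachment logger
def describe_part(ctype, disposition, filename, decoded_path):
    """Return a short log fragment for a MIME part, or None to skip it."""
    ext = os.path.splitext(filename or '')[1].lower()
    # treat a missing disposition as an implicit 'inline'; plain inline
    # text is just the message body and not worth logging
    if disposition in ('', 'inline') and ctype == 'text/plain':
        return None
    desc = "%s ext=%s" % (ctype, ext or "none")
    # sniff ZIP archives and report the extensions of the files inside them
    if zipfile.is_zipfile(decoded_path):
        with zipfile.ZipFile(decoded_path) as zf:
            inner = sorted({os.path.splitext(n)[1].lower() or "none"
                            for n in zf.namelist()})
            desc += " zip-exts=" + ",".join(inner)
    return desc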

The program is also passed the contents of some of the email headers so that it can add important information from them to the log message. Our anti-spam system adds a spam or virus marker to the Subject: header for recognized bad stuff, so we look for that marker and log if the attachment is part of a message scored that way. This is important for telling apart file types in real email that users actually care about from file types in spam that users probably don't.

(We've found it useful to log attachment type information on inbound email both before and after it passes through our anti-spam system. The 'before' view gives us a picture of what things look like before virus attachment stripping and various rejections happen, while the 'after' view is what our users actually might see in their mailboxes, depending on how they filter things marked as spam.)

Sidebar: When dummy variables aren't

I'll admit it: our attachment logger program prints out a copy of what it logs and our actual configuration uses $acl_m1_astatus later, which winds up containing this copy. We currently immediately reject all messages with ZIP files with .exes in them, and rather than parse MIME parts twice it made more sense to reuse the attachment logger's work by just pattern-matching its output.

EximOurAttachmentLogging written at 00:51:59; Add Comment

2016-07-10

Some options for logging attachment information in an Exim environment

Suppose, not entirely hypothetically, that you use Exim as your mailer and you would like to log information about the attachments your users get sent. There are a number of different ways in Exim that you can do this, each with their own drawbacks and advantages. As a simplifying measure, let's assume that you want to do this during the SMTP conversation so that you can potentially reject messages with undesirable attachments (say ZIP files with Windows executables in them).

The first decision to make is whether you will scan and analyze the entire message in your own separate code, or let Exim break the message up into various MIME parts and look at them one-by-one. Examining the entire message at once means that you can log full information about its structure in one place, but it also means that you're doing all of the MIME processing yourself. The natural place to take a look at the whole message is with Exim's anti-virus content-scanning system; you would hook into it in a way similar to how we hooked our milter-based spam rejection into Exim.

(You'll want to use a warn stanza just to cause the scanner to run, and maybe to hand back some information that you then get Exim to log with the log_message ACL directive.)

If you want to let Exim section the message up into various different MIME parts for you, then you want a MIME ACL (covered in the Content scanning at ACL time chapter of the documentation). At this point you have another decision to make, which is whether you want to run an external program to analyze the MIME part or whether to rely only on Exim. The advantage of doing things entirely inside Exim is that Exim doesn't have to decode the MIME part to a file for your external program (and then run an outside program for each MIME part); the disadvantage is that you can only log MIME part information and can't do things like spot suspicious attempts to conceal ZIP files.

Mechanically, having Exim do it all means you'd just have a warn stanza in your MIME ACL that logged information like $mime_content_disposition, $mime_content_type, $mime_filename or its extension, and so on, using log_message =. You wouldn't normally use decode = because you have little use for decoding the part to a file unless you're going to have an outside program look at it. If you wanted to run a program against MIME parts, you'd use decode = default and then run the program with $mime_decoded_filename and possibly other arguments via ${run} in, for example, a 'set acl_m1_blah = ...' line.
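A minimal sketch of the Exim-only version might look like the following; the condition is just an example of pre-screening, and you would tune both it and the logged fields to taste:

warn
  condition   = ${if def:mime_filename}
  log_message = MIME part: $mime_content_type $mime_content_disposition $mime_filename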

(There are some pragmatic issues here that I'm deferring to another entry.)

Allowing Exim to section the message up for you is easier in many ways, but has two drawbacks. First, Exim doesn't really provide any way to get the MIME structure of the message, because you just get a stream of parts; you don't necessarily see, for example, how things are nested. The second is that processing things part by part obviously makes it harder to log all the information about a message's file types in a single line; the natural way is to log a separate line for each part, as you process it.

Speaking of logging, if you're running an external program (either for the entire message or for each MIME part) you need to decide whether your program will do the logging or whether you're going to have the program pass information back to Exim and have Exim log it. Passing information back to Exim is more work but means that you'll see your attachment information along with the other log lines for the message. Logging to a place like syslog may make the information more conveniently visible and it's generally going to be easier.

Sidebar: Exim's MIME parsing versus yours

Exim's MIME parsing is in C and is presumably done on an in-place version of the message that Exim already has on disk. It thus should be quite efficient (until you start decoding parts) and hopefully reasonably security hardened. Parsing a message's MIME structure yourself means relying on the speed, quality, resilience against broken MIME messages, and security of whatever code either you write or your language of choice already has for MIME parsing, and it requires Exim to reconstitute a full copy of the message for you.

My experience with Python's standard MIME parsing module was that it's at least somewhat fragile against malformed input. This isn't a security risk (it's Python), but it did mean that my code wound up spending a bunch of time recovering from MIME parsing explosions and trying to extract some information from the mess anyways. I wouldn't be surprised if other languages had standard packages that assumed well-formed input and threw errors otherwise (and it's hard to blame them; dealing with malformed MIME messages is a specialized need).

(Admittedly I don't know how well Exim itself deals with malformed MIME messages and MIME parts. Hopefully it parses them as much as possible, but it may just throw up its hands and punt.)

EximAttachmentLoggingOptions written at 01:07:41; Add Comment

2016-07-09

How Exim's ${run ...} string expansion operator does quoting

Exim has a complicated string expansion system with various expansion operations. One of these is ${run}, which runs a command to get its output (or just its exit status if you only care about that). The documentation for ${run} says, in part:

${run{<command> <args>}{<string1>}{<string2>}}
The command and its arguments are first expanded as one string. The string is split apart into individual arguments by spaces, [...]

Since the arguments are split by spaces, when there is a variable expansion which has an empty result, it will cause the situation that the argument will simply be omitted when the program is actually executed by Exim. If the script/program requires a specific number of arguments and the expanded variable could possibly result in this empty expansion, the variable must be quoted. [...]

What this documentation does not say is just how the command line is supposed to be quoted. For reasons to be covered later I have recently become extremely interested in this question, so I now have some answers.

The short answer is that the command is interpreted in the same way as it is in the pipe transport. Specifically:

Unquoted arguments are delimited by white space. If an argument appears in double quotes, backslash is interpreted as an escape character in the usual way. If an argument appears in single quotes, no escaping is done.

The usual way that backslashed escape sequences are handled is covered in character escape sequences in expanded strings.

Although the documentation for ${run} suggests using the sg operator to substitute dangerous characters, it appears that the much better approach is to use the quote operator instead. Using quote is simple and will allow you to pass through arguments unchanged, instead of either mangling characters with sg or doing complicated insertions of backslashes and so on. Note that this 'passing through unchanged' will include passing through literal newlines, which may be something you have to guard against in the command you're running. In fact, it appears that almost any time you're putting an Exim variable into a ${run} command line you should slap a ${quote:...} around it. Maybe the variable can't have whitespace or other dangerous things in it, but why take the chance?
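For example, a hypothetical ACL fragment that passes a header to a script safely might be (the script path is made up):

  set acl_m0_status = ${run {/usr/local/sbin/check-subject ${quote:$header_subject:}}}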

(I suspect that the ${run} documentation was written at a time that quote didn't exist, but I haven't checked this.)

This documentation situation is less than ideal, to put it one way. It's possible that you can work all of this out without reading the Exim source code if you read all of the documentation at once and can hold it all in your head, but that's often not how documentation is used; instead it gets consulted as a sporadic reference. The ${run} writeup should at least have pointers to the sections with specific information on quoting, and ideally would have at least a brief inline discussion of its quoting rules.

(I also believe that the rules surrounding ${run}'s handling of argument expansion are dangerous and wrong, but it's too late to fix them now. See this entry and also this one.)

Sidebar: where in the Exim source this is

Since I had to read the Exim source to get my answer, I might as well note down where I found things.

${run} itself is handled in the EITEM_RUN case in expand_string_internal in expand.c. The actual command handling is done by calling transport_set_up_command, which is in transport.c. This handles single quotes itself in inline code but defers double quote handling to string_dequote in string.c, which calls string_interpret_escape to handle backslashed escape sequences.

(It looks like transport_set_up_command is called by various different things under various circumstances that I'm not going to try to decode the Exim source code to nail down.)

EximRunAndQuoting written at 00:54:14; Add Comment

2016-07-06

Keeping around an index to the disk bays on our iSCSI backends

Today, for the first time in a year or perhaps more, we had a disk failure in one of our iSCSI backends (or at least we detected it today). As part of discovering that I was out of practice at dealing with this, I wound up having to hunt around to find our documentation on how iSCSI disk identifiers mapped to drive bays in our 16-bay Supermicro cases. This made me realize that probably we should do better at this.

The gold standard would be to label the actual drive sleds themselves with what iSCSI disk they are, but there are two problems with this. First, there's just not much good space on the front of each drive sled. Second and more severe, the actual assignment isn't tied to the hard drive's serial number or anything else about the drive sled itself, but to what bay the drive sled is in. Labeling the drive sleds thus has the same problem as comments in code: we must be absolutely certain that each drive sled is put in the proper bay or has its label removed or updated if it's moved. I'm not quite willing to completely believe this any more than I ever completely believe code comments, and that means we're always going to need to double check.

While we have documentation (as mentioned), what we don't have is a printed out version of that documentation in our machine room. The whole thing fits on a single printed page, so there are plenty of obvious places it could go; probably the best place is on the exposed side of one of the fileserver racks. The heart of the documentation is a little chart, so we could cut out a few copies and put them up on the front of some of the racks that hold the fileservers and iSCSI backends. That would make the full documentation accessible when we're in the machine room, and keep the most important part visible when we're in front of the iSCSI backends about to pull a disk.

This isn't the only bit of fileserver and iSCSI backend information that would be handy to have immediately accessible in paper form in the machine room, either. I think I have some thinking, some planning, and some printing out to do in my future.

(We used to keep printed documentation about some things in the machine room, but then time passed, it got out of date or irrelevant, and we removed the old racks that it had previously been attached to. And we're a lot less likely to need things like 'this is theoretically the order to turn things on after a total power failure' than more routine documentation like 'what disk maps to what drive bay'.)

DriveChassisBayLabels written at 22:51:43; Add Comment

2016-07-01

How backwards compatibility causes us pain with our IMAP servers

One of the drawbacks of operating a general purpose Unix environment for decades is that backwards compatibility can wind up causing you to get trapped in troublesome situations. In particular, the weight of backwards compatibility has wound up requiring us to configure our IMAP server environment in a way that causes various problems.

Unix IMAP servers generally have a setting for where the various IMAP mailboxes and folders are stored on disk. Back when we first set up UW-IMAP at least two decades ago, we wound up with a situation where UW-IMAP looked for and expected to find people's mail under their home directory, $HOME. People could manually put their mailboxes in a subdirectory of $HOME if they wanted, or they could just drop them straight in $HOME.

(Based on old historical UW-IMAP source, this wasn't even a configuration option at the time. It was just what UW-IMAP was hard-coded to assume about mailbox and folder layout for everything except your INBOX mailbox.)

Some people left things as they were and had various mailboxes in $HOME. Some people decided to be neater and put everything in a subdirectory, but which subdirectory they picked varied; some people used $HOME/Mail, some people used $HOME/IMAP, and so on. As we upgraded our IMAP server software over the years, eventually moving from UW-IMAP to Dovecot, we had to keep this configuration setting intact. If we dared change it, for example to say that all IMAP mailboxes and folders would henceforth be in $HOME/IMAP, we would be forcing lots of people to either change their client's IMAP configuration or relocate files and directories at the Unix level (and probably both for some people). This would have been a massive flag day and a massive disruption to our entire user base, not all of whom are even on campus, with serious effects on their access to much of their email if things didn't go exactly right.
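In Dovecot terms, the difference is roughly between these two sorts of mail_location settings. These lines are illustrative, not our actual configuration, and the INBOX location is an assumption:

# folders and mailboxes live directly in $HOME, as we're stuck with
mail_location = mbox:~/:INBOX=/var/mail/%u

# everything confined under $HOME/IMAP, as a fresh setup might prefer
#mail_location = mbox:~/IMAP:INBOX=/var/mail/%u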

Now, there are two problems with an IMAP server that thinks your mailboxes and folders start in $HOME. The lesser problem is that if you ask the IMAP server for a list of all of your top-level folders and mailboxes, you get an ls of $HOME (complete with all of your dotfiles). This is at least a bit annoying, and it turns out that some software doesn't cope well with it, including our webmail system.

(We wound up having to force our webmail system to confine itself to a subfolder of the IMAP namespace and thus a subdirectory of $HOME. People who wanted to use webmail had to do some Unix and IMAP rearrangement, but at least this was an opt-in change; people who didn't care about webmail were unaffected.)

The more serious problem is that there is an IMAP operation that requires recursively finding all of your folders, subfolders, and mailboxes. This obviously requires recursing through the actual directory structure, and Dovecot will do this without limit, following directory symlinks. If you have a symlink somewhere under your $HOME that creates a cycle, Dovecot will follow it endlessly. If you have a symlink that escapes from your $HOME into the wider filesystem, Dovecot will also follow it and start trying to walk around out there (where it may hit someone else's symlink cycle). In either case, your Dovecot process basically hangs there and hammers away at our fileservers.

We're very fortunate in that very few clients seem to invoke this IMAP operation and so hung Dovecot processes using up CPU and NFS bandwidth are pretty uncommon. But they're not unknown; we get a few every so often. And it's mostly because of this backwards compatibility need.

IMAPOurCompatibilityPain written at 23:17:53; Add Comment

