Some things about Dovecot, its index files, and the IMAP SELECT command

January 5, 2022

As part of our Prometheus and Grafana setup we monitor our IMAP servers (we have more than one for reasons), and as part of that monitoring we verify not just that we can log in but that we can do IMAP SELECT operations on various mailboxes that are on various different fileservers (using a custom Python program). When I added the mailbox SELECTs and their success metrics, I also decided to collect and report the time it took, since we're not certain how well the IMAP server performs. When I set this up, I somewhat unthinkingly assumed that the timing information I was getting was representative of how long it took to read each mailbox on its fileserver. Recently I had reasons to look into what Dovecot was actually doing in those IMAP SELECTs and it turns out that I was quite wrong.
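To make the measurement concrete, here is a minimal sketch of this kind of timing check using Python's standard imaplib. It is not our actual monitoring program; the function name time_select and the idea of passing in an already logged-in connection are assumptions made for illustration.

```python
import imaplib
import time


def time_select(conn, mailbox):
    """SELECT a mailbox read-only; return (succeeded, seconds_taken).

    conn is assumed to be an already logged-in imaplib.IMAP4 or
    imaplib.IMAP4_SSL object (or anything with a compatible select()).
    """
    start = time.monotonic()
    try:
        status, _data = conn.select(mailbox, readonly=True)
    except (imaplib.IMAP4.error, OSError):
        # Protocol and network errors count as a failed SELECT.
        status = "NO"
    elapsed = time.monotonic() - start
    return status == "OK", elapsed
```

In real use you would create the connection with something like imaplib.IMAP4_SSL(host) followed by conn.login(user, password), and export the two return values as a success metric and a duration metric. Note that the elapsed time covers whatever the server decides to do for the SELECT, which is the point of the rest of this entry.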

The first thing going on is that Dovecot likes to keep indexes for mbox mailboxes (see also the old v1 documentation, which gives the motivation) and does its best to answer questions about a mailbox from the mailbox's index when possible (Dovecot building these indexes has caused us problems). If you're frequently and repeatedly SELECTing an otherwise unchanging test mailbox, your index files are all but certain to be up to date, and so what you're actually testing and timing is a stat() of the mailbox and then reading the index.

(Closely related to this, if Dovecot doesn't think it needs to read the mailbox during a SELECT because the index is up to date, it doesn't attempt to lock the mailbox at all. This can make you scratch your head a lot if you're trying to test mailbox locking using your obvious test mailboxes and test program.)

At this point you might reasonably ask where Dovecot stores its index files, and the answer turns out to be that it depends. The normal location of index files is set by the INDEX key for the mail_location configuration setting. On our IMAP servers, this is always a local directory on the server's SSDs. However, there is a surprise in Dovecot's implementation of SELECT. In Dovecot, you can give SELECT either a name relative to your IMAP root or an absolute Unix path that can potentially be anywhere on the IMAP server's filesystem (if Dovecot itself is not confined by, e.g., chroot). If you SELECT an absolute path that Dovecot thinks is outside your IMAP root (and this is decided by text comparison on the paths), Dovecot actually stores the index under a hidden .imap/ subdirectory in the directory of the mailbox.

That's complicated, so let's do some examples. Let's suppose that your user's IMAP root is /u/fred/IMAP, where /u is a collection of symlinks to the real home directories. Then things go as follows:

  • 'SELECT testmbox' and 'SELECT /u/fred/IMAP/testmbox' both use your regular Dovecot INDEX= setting for their indexes.
  • 'SELECT INBOX' also uses the regular Dovecot INDEX= setting, even if you told Dovecot that people's inboxes were located somewhere else, such as /var/mail.
  • 'SELECT /h/100/fred/IMAP/testmbox' uses /h/100/fred/IMAP/.imap/, even if /u/fred is a symlink to /h/100/fred. Dovecot doesn't notice that this is really /u/fred and so under your IMAP root.
  • 'SELECT /w/101/shared/fred' tries to use /w/101/shared/.imap/, as you'd expect, but it may fail completely and block access to the mailbox.
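
The cases above can be summarized as a small Python model. This is only a sketch of the behavior described here, not Dovecot's actual logic; the function name index_location is made up, and the exact form of the text comparison (a simple prefix check on the IMAP root) is an assumption.

```python
import posixpath


def index_location(name, imap_root, index_dir):
    """Guess where the index for a SELECTed mailbox name ends up.

    Returns index_dir for names treated as being under the IMAP root,
    otherwise the .imap/ directory beside the mailbox. The check is a
    plain string comparison; symlinks are deliberately not resolved.
    """
    if not name.startswith("/"):
        # Relative names (including INBOX) use the INDEX= location.
        return index_dir
    root = imap_root.rstrip("/") + "/"
    if name.startswith(root):
        # Textually under the IMAP root, so also the INDEX= location.
        return index_dir
    # Anything else gets a .imap/ directory next to the mailbox.
    return posixpath.join(posixpath.dirname(name), ".imap") + "/"
```

With imap_root set to /u/fred/IMAP, this returns the INDEX= directory for 'testmbox', 'INBOX', and '/u/fred/IMAP/testmbox', but /h/100/fred/IMAP/.imap/ for '/h/100/fred/IMAP/testmbox', because nothing in the comparison knows that /u/fred is a symlink.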

The problem with mailboxes in shared directories is that if Dovecot has to create a .imap subdirectory, it makes it owned by the UID of the first person to access a mailbox in the directory (and only accessible by the owner). This means that if Barney comes along after Fred and does 'SELECT /w/101/shared/barney', Dovecot will attempt to access the Fred-owned /w/101/shared/.imap as Barney, fail, and refuse to let Barney read their mailbox at all (reporting a 'no permission' error, which may be puzzling since Barney does have permission on their own mailbox). I don't think there's any way around this for SELECT'ing arbitrary absolute paths.
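
If you want to check in advance whether a mailbox in a shared directory is likely to hit this, a rough sketch is to look at who owns the .imap subdirectory. The helper name shared_index_problem is hypothetical, and it only checks ownership, on the assumption (from the behavior described above) that the directory is accessible only by its owner.

```python
import os


def shared_index_problem(mailbox_path, uid=None):
    """Describe a likely .imap ownership problem, or return None.

    mailbox_path is the absolute path that would be SELECTed; uid is
    the user who would do the SELECT (default: the current user).
    """
    if uid is None:
        uid = os.getuid()
    imap_dir = os.path.join(os.path.dirname(mailbox_path), ".imap")
    try:
        st = os.stat(imap_dir)
    except FileNotFoundError:
        # No .imap directory yet, so there is no ownership conflict so far.
        return None
    if st.st_uid != uid:
        return f"{imap_dir} is owned by UID {st.st_uid}, not {uid}"
    return None
```

This does not cover the case where the .imap directory is missing and the shared directory is not writable by the user.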

(This does explain why people have never been able to SELECT their own 'oldmail' mailboxes, which store the last week or so of their email in a mailbox in a shared but non-writeable directory. Possibly I knew this at some point but then forgot it.)

So if you SELECT an unchanging mailbox that's (visibly) under your IMAP root and time this, you're mostly timing Dovecot's ability to read its index files from your INDEX setting, regardless of where the mailbox is. If you SELECT an unchanging mailbox by an absolute path that's outside the IMAP root, you're mostly timing Dovecot's ability to read the index file from a subdirectory of the same directory, which will at least normally be on the same filesystem as the mailbox.

(You're also timing Dovecot's ability to lock and unlock the index files.)

All of this is probably okay for us, but it does make the timing metrics I've been gathering less meaningful than I thought they were. Alternatively, it means that some timing blips (especially for SELECTs of INBOX) are even more alarming than they looked, because they probably don't come from actually reading files over NFS, just from stat()'ing them.

Written on 05 January 2022.

Last modified: Wed Jan 5 22:48:02 2022