2022-01-05
Some things about Dovecot, its index files, and the IMAP SELECT command
As part of our Prometheus and Grafana setup
we monitor our IMAP servers (we have more than one for reasons),
and as part of that monitoring we verify not just that we can log
in but that we can do IMAP SELECT operations on various mailboxes
that are on various different fileservers (using a custom Python
program). When I added the mailbox SELECTs
and their success metrics, I also decided to collect and report the
time it took, since we're not certain how well the IMAP server
performs. When I set this
up, I somewhat unthinkingly assumed that the timing information I
was getting was representative of how long it took to read each
mailbox on its fileserver. Recently I had reasons to look into what
Dovecot was actually doing in those IMAP SELECTs and it turns out
that I was quite wrong.
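For concreteness, here is a minimal sketch of the kind of timed SELECT probe I'm talking about. This is not our actual monitoring program; the function name is made up for illustration, and it assumes a connection object that behaves like Python's imaplib.IMAP4 (whose select() returns a (status, data) pair and issues a real SELECT when readonly is left at its default).

```python
import imaplib
import time


def timed_select(conn, mailbox):
    """SELECT a mailbox on an imaplib-style connection.

    Returns (ok, seconds). 'ok' is True only if the server answered OK.
    """
    start = time.monotonic()
    try:
        # imaplib returns ('NO', ...) for a refused SELECT instead of
        # raising, so the status has to be checked explicitly.
        status, _data = conn.select(mailbox)
        ok = status == "OK"
    except (imaplib.IMAP4.error, OSError):
        # Protocol errors (BAD, aborted connection) and socket errors.
        ok = False
    return ok, time.monotonic() - start


# Hypothetical usage against a real server (host and credentials are
# placeholders):
#   conn = imaplib.IMAP4_SSL("imap.example.org")
#   conn.login("fred", "password")
#   ok, seconds = timed_select(conn, "testmbox")
```

The point of the rest of this entry is that the 'seconds' value from something like this measures less than you might think.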
The first thing going on is that Dovecot likes to keep indexes for
mbox mailboxes (see also the old v1 documentation, which explains the
motivation) and does its best to answer questions about a mailbox from
the mailbox's index when possible (Dovecot building these indexes has
caused us problems). If you're frequently and repeatedly SELECTing an
otherwise unchanging test mailbox, your index files are all but
certain to be up to date, and so what you're actually testing and
timing is a stat() of the mailbox and then a read of the index.
(Closely related to this, if Dovecot doesn't think it needs to read
the mailbox during a SELECT because the index is up to date, it
doesn't attempt to lock the mailbox at all. This can make you scratch
your head a lot if you're trying to test mailbox locking using your
obvious test mailboxes and test program.)
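As a rough mental model of that fast path (and only a model; this is not Dovecot's code), you can think of it as comparing what stat() reports now against what was recorded when the index was last synced. The sketch below assumes the check looks at the file's size and modification time, and both function names are hypothetical.

```python
import os


def record_sync_state(mbox_path):
    """Return the (mtime_ns, size) pair an index might remember."""
    st = os.stat(mbox_path)
    return (st.st_mtime_ns, st.st_size)


def mbox_looks_unchanged(mbox_path, recorded):
    """True if a fresh stat() matches the recorded state.

    In this model, a True result means the mailbox itself is never
    opened, read, or locked; only the index is consulted.
    """
    return record_sync_state(mbox_path) == recorded
```

An unchanging test mailbox always takes the "unchanged" branch, which is why neither the mailbox's contents nor its locking get exercised.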
At this point you might reasonably ask where Dovecot stores its
index files, and the answer turns out to be that it depends. The
normal location of index files is set by the INDEX key for the
mail_location configuration setting. On our IMAP servers, this is
always a local
directory on the server's SSDs. However, there is a surprise in
Dovecot's implementation of SELECT. In Dovecot, you can give SELECT
either a name relative to your IMAP root or an absolute Unix path
that can potentially be anywhere on the IMAP server's filesystem (if
Dovecot itself isn't confined by, e.g., chroot). If you SELECT an
absolute path that Dovecot thinks is
outside your IMAP root (and this is done by text comparison on the
paths), Dovecot actually stores the index under a hidden .imap/
subdirectory in the directory of the mailbox.
That's complicated, so let's do some examples. Let's suppose that
your user's IMAP root is /u/fred/IMAP, where /u is a collection
of symlinks to the real home directories. Then things
go as follows:
- 'SELECT testmbox' and 'SELECT /u/fred/IMAP/testmbox' both use
  your regular Dovecot INDEX= setting for their indexes.
- 'SELECT INBOX' also uses the regular Dovecot INDEX= setting, even
  if you told Dovecot that people's inboxes were located somewhere
  else, such as /var/mail.
- 'SELECT /h/100/fred/IMAP/testmbox' uses /h/100/fred/IMAP/.imap/,
  even if /u/fred is a symlink to /h/100/fred. Dovecot doesn't notice
  that this is really /u/fred and so under your IMAP root.
- 'SELECT /w/101/shared/fred' tries to use /w/101/shared/.imap/, as
  you'd expect, but it may fail completely and block access to the
  mailbox.
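The cases above can be summarized as a small function. This is a sketch of the behavior as I've described it, not a transcription of Dovecot's logic; the function name and the index directory used in the usage lines are hypothetical. Note that the comparison is purely textual, with no symlink resolution, which is exactly what produces the /h/100/fred surprise.

```python
import posixpath


def index_location(selected, imap_root, index_dir):
    """Return the directory where the index for 'selected' would live.

    selected  -- the name given to SELECT
    imap_root -- the user's IMAP root, e.g. /u/fred/IMAP
    index_dir -- the directory configured through INDEX= (hypothetical)
    """
    # Relative names (including INBOX) always use the configured INDEX.
    if not selected.startswith("/"):
        return index_dir
    # Absolute paths count as "under the root" only if they textually
    # start with it; symlinks are deliberately not resolved here.
    root = imap_root.rstrip("/")
    if selected == root or selected.startswith(root + "/"):
        return index_dir
    # Everything else gets a .imap directory next to the mailbox.
    return posixpath.join(posixpath.dirname(selected), ".imap")


# Hypothetical usage, with a made-up INDEX directory:
#   index_location("/h/100/fred/IMAP/testmbox", "/u/fred/IMAP",
#                  "/var/local/dovecot-index/fred")
#   gives "/h/100/fred/IMAP/.imap"
```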
The problem with mailboxes in shared directories is that if Dovecot
has to create a .imap subdirectory, it makes it owned by the UID
of the first person to access a mailbox in the directory (and only
accessible by the owner). This means that if Barney comes along
after Fred and does 'SELECT /w/101/shared/barney', Dovecot will
attempt to access the Fred-owned /w/101/shared/.imap as Barney,
fail, and refuse to let Barney read their mailbox at all (reporting
a 'no permission' error, which may be puzzling since Barney does
have permission on their own mailbox). I don't think there's any
way around this for SELECT'ing arbitrary absolute paths.
(This does explain why people have never been able to SELECT their own 'oldmail' mailboxes, which store the last week or so of their email in a mailbox in a shared but non-writeable directory. Possibly I knew this at some point but then forgot it.)
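If you want to predict whether a given user will hit this before they try it, a check along the following lines, run as that user, should tell you. This is a sketch under the assumption that the only thing that matters is ordinary Unix permissions on the .imap directory (or on its parent, if .imap doesn't exist yet); the function name and the three status strings are made up.

```python
import os


def shared_index_dir_status(mailbox_path):
    """Classify the .imap directory beside mailbox_path for this user.

    Returns 'usable', 'creatable', or 'blocked'.
    """
    parent = os.path.dirname(mailbox_path)
    index_dir = os.path.join(parent, ".imap")
    if not os.path.isdir(index_dir):
        # No .imap yet: it can only be created if the shared directory
        # itself is writable and searchable by the current user.
        if os.access(parent, os.W_OK | os.X_OK):
            return "creatable"
        return "blocked"
    # .imap exists: someone else's owner-only directory fails this test.
    if os.access(index_dir, os.R_OK | os.W_OK | os.X_OK):
        return "usable"
    return "blocked"
```

In the Fred and Barney example, Fred would get 'usable' and Barney 'blocked' for mailboxes in /w/101/shared, and everyone would get 'blocked' for a mailbox in a non-writeable directory that has no .imap yet.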
So if you SELECT an unchanging mailbox that's (visibly) under
your IMAP root and time this, you're mostly timing Dovecot's ability
to read its index files from your INDEX setting, regardless of
where the mailbox is. If you SELECT an unchanging mailbox by an
absolute path that's outside the IMAP root, you're mostly timing
Dovecot's ability to read the index file from a subdirectory of the
same directory, which will at least normally be on the same filesystem
as the mailbox.
(You're also timing Dovecot's ability to lock and unlock the index files.)
All of this is probably okay for us, but it does make the timing
metrics I've been gathering less meaningful than I thought they
were. Alternately, it means that some timing blips (especially for
SELECTs of INBOX) are even more alarming than they looked because
they probably aren't for actually reading files over NFS, just
stat()'ing them.