Some things about Dovecot, its index files, and the IMAP LIST command

March 23, 2018

We have a backwards compatibility issue with our IMAP server, where people's IMAP roots are $HOME, their home directory, and then clients ask the IMAP server to search all through the IMAP namespace; this causes various bad things to happen, including running out of inodes. The reason we ran out of inodes is that Dovecot maintains some index files for every mailbox it looks at.

We have Dovecot store its index files on our IMAP server's local disk, in /var/local/dovecot/<user>. Dovecot puts these in a hierarchy that mirrors the actual Unix (and IMAP) hierarchy of the mailboxes; if there is a subdirectory Mail in your home directory with a mailbox Drafts, the Dovecot index files will be in .../<user>/Mail/.imap/Drafts/. It follows that you can hunt through someone's Dovecot index files to see what mailboxes their clients have looked at and what their active mailboxes are, although this may tell you less than you think.
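A minimal sketch of that sort of hunting in Python, assuming the layout is exactly as described above (each .imap directory holds one subdirectory per mailbox that Dovecot has built indexes for). The function name indexed_mailboxes is made up for illustration, and other Dovecot configurations may lay their index files out differently.

```python
import os

def indexed_mailboxes(index_root):
    """Return the mailboxes (relative to the IMAP root) that have index directories.

    Assumes the layout described in the text: <index_root>/Mail/.imap/Drafts/
    holds the index files for the mailbox Mail/Drafts.
    """
    found = []
    for dirpath, dirnames, _filenames in os.walk(index_root):
        if os.path.basename(dirpath) != ".imap":
            continue
        # The directory containing .imap is the mailbox's parent directory.
        parent = os.path.relpath(os.path.dirname(dirpath), index_root)
        for name in dirnames:
            found.append(name if parent == "." else os.path.join(parent, name))
        # Don't descend into the per-mailbox index directories themselves.
        dirnames[:] = []
    return sorted(found)
```

Pointing this at a user's directory under /var/local/dovecot should give one line per mailbox that Dovecot has looked at; a long list full of things that aren't plausible mailboxes is the warning sign.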

(One reason that Dovecot might look at a mailbox is that your client has explicitly asked it to, with an IMAP SELECT command or perhaps an APPEND, COPY, or MOVE operation. However, there are other reasons.)

When I began digging into our IMAP pain and working on our planned migration (which has drastically changed directions since then), I was operating under the charming idea that most clients used IMAP subscriptions and only a few of them asked the IMAP server to inventory everything in sight. One of the reasons for this is that only a few people had huge numbers of Dovecot index files, and I assumed that the two were tied together. It turns out that both sides of this are wrong.

Perhaps I had the idea that it was hard to do an IMAP LIST operation that asked the server to recursively descend through everything under your IMAP root. It isn't; it's trivial. Here's the IMAP command to do it:

m LIST "" "*"

That's all it takes (the unrestricted * is the important bit). The sort of good news is that this operation by itself won't cause Dovecot to actually look at those mailboxes and thus to build index files for them. However, there is a close variant of this LIST command that does force Dovecot to look at each file, because it turns out that you can ask your IMAP server to not just list all your mailboxes but to tell you which ones have unseen messages. That looks like this:

m LIST "" "*" RETURN (SPECIAL-USE STATUS (UNSEEN))

Some clients use one LIST version, some use the other, and some seem to use both. Importantly, the standard iOS Mail app appears to use the 'LIST UNSEEN' version at least some of the time. iDevices are popular around the department, and it's not all that easy to find the magic setting for what iOS calls the 'IMAP path prefix'.
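For experimenting with the basic recursive LIST from a script instead of typing it by hand, here is a rough sketch using Python's standard imaplib. The function name list_all_mailboxes and the simplified reply parsing are assumptions for illustration; real LIST replies can be more complicated (for example, mailbox names sent as literals), and this sketch just skips those.

```python
import re

# Matches one LIST reply as imaplib hands it back, e.g.
#   b'(\\HasNoChildren) "/" "Mail/Drafts"'
_LIST_LINE = re.compile(rb'\((?P<flags>[^)]*)\) (?P<delim>"[^"]*"|NIL) (?P<name>.*)')

def list_all_mailboxes(conn):
    """Issue the equivalent of 'm LIST "" "*"' and return the mailbox names.

    conn is expected to behave like a logged-in imaplib.IMAP4 connection.
    """
    typ, data = conn.list('""', '"*"')
    if typ != "OK":
        raise RuntimeError("LIST failed: %r" % (data,))
    names = []
    for item in data:
        if not isinstance(item, bytes):
            continue  # skip None (no mailboxes) and literal-carrying tuples
        m = _LIST_LINE.match(item)
        if m:
            names.append(m.group("name").strip(b'"').decode("utf-8", "replace"))
    return names
```

With a real server you would create the connection with imaplib.IMAP4_SSL(host), call login() on it, and then pass it in. If your IMAP root is $HOME, expect the result to include a great many things that are not mailboxes.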

For us, a user with a lot of Dovecot index files was definitely someone who had a client with the 'search all through $HOME' problem (especially if the indexes were for things that just aren't plausible mailboxes). However, a user with only a few index files wasn't necessarily someone without the problem, because their client could be using the first version of the LIST command and thus not creating all those tell-tale index files. As far as I know, stock Dovecot has no way of letting you find out about these people.

(We hacked logging in to the Ubuntu version of Dovecot, which involved some annoyances. In theory Dovecot has a plugin system that we might have been able to use for this; in practice, figuring out the plugin API seemed likely to be at least as much work as hacking the Dovecot source directly.)

Sidebar: Limited LISTs

IMAP LIST commands can be limited in two ways, both of which have more or less the same effect for us:

m LIST "" "mail/*"
m LIST "mail/" "*"

For information on what the arguments to the basic LIST command mean, I will refer you to the IMAP RFC. The extended form is discussed in RFC 5819 and is based on things from, I believe, RFC 5258. See also RFC 6154 for the special-use stuff.

(The unofficial IMAP protocol wiki may be something I'll be consulting periodically now that I've stumbled over it, eg its matrix of all of the IMAP RFCs.)

Comments on this page:

I don't think there is a way out-of-the-box to find out who's doing a particular query, but as the sysadmin you can enable rawlog, and that'll dump all IMAP traffic into a bunch of files. It works surprisingly well for debugging some IMAP issues.

By cks at 2018-03-23 08:45:51:

It's useful to learn about rawlog and maybe we'll have to use it someday, but as a sysadmin I can't imagine us using it except in emergencies because it appears to capture the text of email that people are reading. Even telling it to capture only 'in' traffic would (I think) capture email in APPEND operations, eg clients appending outgoing email to 'Sent' folders. Or does rawlog skip logging that information?

(Thanks for telling me about it, though!)

I think rawlog pipes the whole stream into the in/out files, so I think it would include APPEND & FETCH contents. It's useful for debugging weird issues.

Hrm, now that I think about it, the new stats stuff could help you get the info you are after:

Written on 23 March 2018.

Last modified: Fri Mar 23 01:53:22 2018