Some options for logging attachment information in an Exim environment
Suppose, not entirely hypothetically, that you use Exim as your mailer and you would like to log information about the attachments your users get sent. There are a number of different ways in Exim that you can do this, each with its own drawbacks and advantages. As a simplifying measure, let's assume that you want to do this during the SMTP conversation so that you can potentially reject messages with undesirable attachments (say ZIP files with Windows executables in them).
The first decision to make is whether you will scan and analyze the entire message in your own separate code, or let Exim break the message up into various MIME parts and look at them one-by-one. Examining the entire message at once means that you can log full information about its structure in one place, but it also means that you're doing all of the MIME processing yourself. The natural place to take a look at the whole message is with Exim's anti-virus content-scanning system; you would hook into it in a way similar to how we hooked our milter-based spam rejection into Exim.
(You'll want to use a warn stanza to just cause the scanner to run, and maybe to give you some stuff that you'll get Exim to log with the log_message ACL directive.)
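If you go the whole-message route, it is the scanner that has to recover the message's structure. As a minimal sketch, assuming the scanner is written in Python 3 and is handed the full message as bytes (the function name summarize_structure is made up for illustration), the standard email package can produce a one-line nested summary suitable for logging:

```python
import email
from email import policy


def summarize_structure(raw_bytes):
    """Return a one-line, nested summary of a message's MIME structure."""
    # policy.default gives EmailMessage objects, which have iter_parts().
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)

    def describe(part):
        ctype = part.get_content_type()
        if part.is_multipart():
            # Recurse so that nesting shows up as parentheses.
            inner = ", ".join(describe(p) for p in part.iter_parts())
            return f"{ctype}({inner})"
        fname = part.get_filename()
        return f"{ctype}[{fname}]" if fname else ctype

    return describe(msg)
```

For a message with a plain text body and a ZIP attachment called a.zip, this produces multipart/mixed(text/plain, application/zip[a.zip]). It does no error handling at all, so it is only as robust as the parser underneath it.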
If you want to let Exim section the message up into various different MIME parts for you, then you want a MIME ACL (covered in the Content scanning at ACL time chapter of the documentation). At this point you have another decision to make, which is whether you want to run an external program to analyze the MIME part or whether to rely only on Exim. The advantage of doing things entirely inside Exim is that Exim doesn't have to decode the MIME part to a file for your external program (and then run an outside program for each MIME part); the disadvantage is that you can only log MIME part information and can't do things like spot suspicious attempts to conceal ZIP files.
Mechanically, having Exim do it all means you'd just have a warn stanza in your MIME ACL that logged information like $mime_content_disposition, $mime_content_type, $mime_filename or its extension, and so on, using log_message =. You wouldn't normally use decode = because you have little use for decoding the part to a file unless you're going to have an outside program look at it. If you wanted to run a program against MIME parts, you'd use decode = default and then run the program with $mime_decoded_filename and possibly other arguments via ${run} in, for example, a 'set acl_m1_blah = ...' line.
(There are some pragmatic issues here that I'm deferring to another entry.)
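As an illustration of what a per-part external program might look like, here is a minimal sketch in Python 3. It assumes the program is given the decoded part's filename as its only argument and reports back by printing one short line. The name inspect_part, the extension list, and the output strings are all invented for this example.

```python
import sys
import zipfile

# Hypothetical list of extensions to treat as Windows executables.
BAD_EXTS = (".exe", ".scr", ".com", ".bat")


def inspect_part(path):
    """Return a short description of a decoded MIME part on disk."""
    # is_zipfile() looks at the file's contents, not its name, so a ZIP
    # hiding behind some other extension is still noticed.
    if not zipfile.is_zipfile(path):
        return "not-zip"
    try:
        with zipfile.ZipFile(path) as zf:
            names = zf.namelist()
    except zipfile.BadZipFile:
        return "zip:unreadable"
    bad = sorted(n for n in names if n.lower().endswith(BAD_EXTS))
    if bad:
        return "zip:executables=" + ",".join(bad)
    return "zip:ok"


if __name__ == "__main__" and len(sys.argv) > 1:
    print(inspect_part(sys.argv[1]))
```

Because the check is on file contents, this catches the simple case of a ZIP file that claims to be something else in its MIME filename or Content-Type. It does not look inside nested archives.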
Allowing Exim to section the message up for you is easier in many ways, but has two drawbacks. First, Exim doesn't really provide any way to get the MIME structure of the message, because you just get a stream of parts; you don't necessarily see, for example, how things are nested. The second is that processing things part by part obviously makes it harder to log all the information about a message's file types in a single line; the natural way is to log a separate line for each part, as you process it.
Speaking of logging, if you're running an external program (either for the entire message or for each MIME part) you need to decide whether your program will do the logging or whether you're going to have the program pass information back to Exim and have Exim log it. Passing information back to Exim is more work but means that you'll see your attachment information along with the other log lines for the message. Having the program log directly to a place like syslog is generally going to be easier, and may make the information more conveniently visible.
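If the program does its own logging, a minimal Python 3 sketch might look like the following. It assumes a Unix system with the standard syslog module, and it assumes you pass the program the Exim message ID yourself so that its lines can be matched up with Exim's own. The function names, the "attachlog" ident, and the line format are made up for illustration.

```python
import syslog


def format_attachment_line(msg_id, parts):
    """Build one log line from (content type, filename) pairs."""
    fields = [f"{ctype}:{fname or '-'}" for ctype, fname in parts]
    return f"attachments id={msg_id} " + " ".join(fields)


def log_attachments(msg_id, parts):
    # LOG_MAIL is the syslog facility that mail software conventionally uses.
    syslog.openlog("attachlog", syslog.LOG_PID, syslog.LOG_MAIL)
    syslog.syslog(syslog.LOG_INFO, format_attachment_line(msg_id, parts))
```

Keeping the formatting separate from the syslog call means the same line could instead be printed and handed back to Exim if you later decide to have Exim do the logging.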
Sidebar: Exim's MIME parsing versus yours
Exim's MIME parsing is in C and is presumably done on an in-place version of the message that Exim already has on disk. It thus should be quite efficient (until you start decoding parts) and hopefully reasonably security hardened. Parsing a message's MIME structure yourself means relying on the speed, quality, resilience against broken MIME messages, and security of whatever code either you write or your language of choice already has for MIME parsing, and it requires Exim to reconstitute a full copy of the message for you.
My experience with Python's standard MIME parsing module was that it's at least somewhat fragile against malformed input. This isn't a security risk (it's Python), but it did mean that my code wound up spending a bunch of time recovering from MIME parsing explosions and trying to extract some information from the mess anyways. I wouldn't be surprised if other languages had standard packages that assumed well-formed input and threw errors otherwise (and it's hard to blame them; dealing with malformed MIME messages is a specialized need).
(Admittedly I don't know how well Exim itself deals with malformed MIME messages and MIME parts. Hopefully it parses them as much as possible, but it may just throw up its hands and punt.)
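On the Python side, the kind of defensive wrapper this pushes you toward can be sketched as follows. This is an illustration under assumptions, not a known-good recipe: it assumes Python 3's email package with policy.default, the function name safe_part_info is invented, and the broad exception handling is deliberate so that one bad part doesn't lose information about the others.

```python
import email
from email import policy


def safe_part_info(raw_bytes):
    """Best-effort (content type, filename) pairs plus any problems seen."""
    parts = []
    problems = []
    try:
        msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    except Exception as exc:  # broad on purpose: salvage, don't crash
        return parts, [f"parse failed: {exc!r}"]
    for part in msg.walk():
        try:
            # Recoverable problems are recorded by the parser as defects
            # on the affected part instead of being raised.
            problems.extend(type(d).__name__ for d in part.defects)
            if not part.is_multipart():
                parts.append((part.get_content_type(), part.get_filename()))
        except Exception as exc:
            problems.append(f"part failed: {exc!r}")
    return parts, problems
```

Logging the problems list along with whatever parts could be extracted at least tells you which messages were malformed, even when the attachment information is incomplete.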