2017-04-05
Making your SMTP rejection messages be useful for you
Our external mail gateway will reject (some) incoming messages during the SMTP conversation if our anti-spam system thinks they have too high a spam score. Until today, they were rejected with a deliberately bland and uninformative SMTP error message:
550 Rejected: this message looks too much like spam
When I designed this message, I wrote a comment about it saying 'rejections for spam deliberately give the sender an uninformative message because I don't feel like giving spammers clues'. Then today we got called in to help troubleshoot an issue where a (valid) email message from outside had bounced, and all we had to go on was this message.
Well, you know what: spammers probably aren't reading our SMTP rejection messages anyway, but we certainly do every so often. When we're the ones reading the message, this version is exceedingly unhelpful; in fact it's so generic that it's not immediately clear whether it's from our system or some other system. So now our SMTP-time rejection message for spam says this:
550 Rejected: CSLab PMX spam score too high (milter id <something>)
This new form does several things. First, it clearly identifies to us that the message comes from our external mail gateway. Then, between the 'milter id' and the 'PMX spam score' wording, it tells us which SMTP-time rejection is being triggered here; it's our milter-based system. Finally, the <something> is the Exim (log) ID that was assigned to the proto-message as it was being received. Using this ID we can efficiently retrieve all of the other information about the message from our logs, including the specifics of its spam score (such as they are, given that Sophos PureMessage's spam scoring is basically a black box).
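As a rough illustration of the lookup this enables, here is a minimal Python sketch. It assumes the rejection text arrives pasted out of a bounce, that the ID consists only of letters, digits, and dashes, and that every relevant log line contains the ID as a whitespace-separated token. The function names are made up for the example; this is not our actual tooling.

```python
import re

# Matches the new rejection text and captures whatever follows 'milter id'.
# Assumption: the ID consists only of letters, digits, and dashes.
REJECT_RE = re.compile(
    r"550 Rejected: CSLab PMX spam score too high \(milter id ([A-Za-z0-9-]+)\)"
)


def extract_milter_id(bounce_text):
    """Return the ID quoted in a bounce, or None if the text isn't ours."""
    m = REJECT_RE.search(bounce_text)
    return m.group(1) if m else None


def log_lines_for(log_lines, msg_id):
    """Return every log line that mentions msg_id as a whole token."""
    return [line for line in log_lines if msg_id in line.split()]
```

The old generic message would simply return None from extract_milter_id(), which is more or less the problem we ran into: there was nothing in it to search the logs for.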
Having done this exercise for one SMTP rejection message, I'm sort of tempted to do it for others. If I start from the premise that someday a user will turn up saying 'someone trying to mail me got this message', what do I want to see in the message so we can explain the situation to people?
(The good news is that I took a quick look and almost all of our other SMTP rejection messages seem to include the crucial information. For example, our 'rejected because the sending IP is in Spamhaus' SMTP rejection message actually includes the IP address, so we don't have to try to correlate logs with whatever vague information we have about the rough time the message was sent to the particular user in order to find it.)
By the way, one consideration here is that you don't necessarily want these messages to be too long, because some SMTP senders will truncate your rejection message when they report it to users (or at least they used to). I believe I've seen ones that only report the first line, for example. This is why our current rejection message is going to be relatively cryptic to anyone but us; I cautiously squeezed it down to something that I felt had a relatively high chance of making it back to us intact.
I don't get many bounce messages these days, so it's possible that modern mail systems no longer suffer from this issue. Certainly mail providers like Google and Yahoo generate quite long and verbose multi-line SMTP rejections and temporary failures. Perhaps I should add a second line with a clear explanation aimed at normal people, for anyone who trips over this as a legitimate false positive.
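To make the truncation concern concrete, here is a small Python sketch of a two-line rejection where the cryptic-but-useful part stays on the first line. The multi-line reply format (every line but the last uses the code followed by a dash) is standard SMTP; the wording of the second line, the placeholder ID 'X', and the function names are hypothetical.

```python
def format_reply(code, lines):
    """Format an SMTP reply; every line but the last uses 'code-' (RFC 5321)."""
    if not lines:
        raise ValueError("an SMTP reply needs at least one line of text")
    out = [f"{code}-{text}" for text in lines[:-1]]
    out.append(f"{code} {lines[-1]}")
    return "\r\n".join(out) + "\r\n"


def first_line_only(reply):
    """Simulate a sender that reports only the first line, minus the code."""
    first = reply.split("\r\n", 1)[0]
    return first[4:]  # drop the three-digit code and its separator


if __name__ == "__main__":
    reply = format_reply(550, [
        "Rejected: CSLab PMX spam score too high (milter id X)",
        "Our spam filter rejected this message; please contact the recipient.",
    ])
    # Even a truncating sender still hands back the ID we need.
    print(first_line_only(reply))
```

The design choice is simply ordering: whatever we need for troubleshooting goes first, and the friendly explanation goes on a line that we can afford to lose.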
Why the modern chown command uses a colon to separate the user and group
In the beginning, all chown(1) did was change the owner of a file; if you wanted to change a file's group too, you had to use chgrp(1) as well. This is actually more unusual than I realized before I started to write this entry, because even in V7 Unix the chown(2) system call itself could change both user and group, per the V7 chown(2) manpage. Restricting chown(1) to only changing the owner did make the command itself pretty simple, though.
By the time of 4.1c BSD, chown(1) had become chown(8), because, per the manual page:
Only the super-user can change owner,
in order to simplify as yet unimplemented accounting procedures.
(The System V line of Unixes would retain an unrestricted chown(2) system call for some time and thus I believe they kept the chown command in section 1, for general commands anyone could use.)
In 4.3 BSD, someone decided that chown(8) might as well let you change the group at the same time, to match the system call. As the manual page covers, they used this syntax:
/etc/chown [ -f -R ] owner[.group] file ...
That is, to chown a file to user cks, group staff, you did 'chown cks.staff file'.
This augmented version of the chown command was picked up by various Unixes that descended from 4.x BSD, although not immediately (like many things from 4.3 BSD, it took a while to propagate around). Sometimes this was the primary version of chown, found in /usr/bin or the like; sometimes this was a compatibility version, in /usr/ucb (Solaris through fairly late, for example). Depending on how you set up your $PATH on such systems, you could wind up using this version of chown and thus get used to having 'user:group' rejected as an error.
Then, when it came time for POSIX to standardize this, someone woke up and created the modern syntax for changing both owner and group at once. As seen in the Single Unix Specification for chown, this is 'chown owner[:group] file', ie the separator is now a colon. Since POSIX and the SUS normally standardized existing practice (where it actually existed), you might wonder why they changed it. The answer is simple: a colon is not a valid character in a login, while a dot is.
Sure, dots are unusual in Unix logins in most places, but they're legal and they do show up in some environments (and they're legal in group names as well). Colons are outright illegal unless you like explosions, fundamentally because they're the field separator character in /etc/passwd and /etc/group. The SUS manpage actually has an explicit discussion of this in the RATIONALE section, although it doesn't tell you what it means by 'historical implementations'.
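The ambiguity is easy to see in a few lines of Python. This is only a sketch: it assumes the old BSD syntax split at the first dot, and the login 'john.smith' is a made-up example of a dotted login.

```python
def split_dot(spec):
    """4.3 BSD style 'owner[.group]': split at the first dot, if any.

    Assumption: the historical command split at the first dot.
    """
    owner, sep, group = spec.partition(".")
    return owner, (group if sep else None)


def split_colon(spec):
    """POSIX style 'owner[:group]': split at the first colon, if any."""
    owner, sep, group = spec.partition(":")
    return owner, (group if sep else None)


if __name__ == "__main__":
    # The original example works fine either way.
    print(split_dot("cks.staff"))             # ('cks', 'staff')
    print(split_colon("cks:staff"))           # ('cks', 'staff')
    # A dotted login (hypothetical) is misread by the dot syntax ...
    print(split_dot("john.smith"))            # ('john', 'smith')
    # ... but can never collide with a colon separator.
    print(split_colon("john.smith"))          # ('john.smith', None)
    print(split_colon("john.smith:staff"))    # ('john.smith', 'staff')
```

Since a colon can't appear in a login or a group name, split_colon() never has to guess which part of the string is which.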
(The SUS manpage also discusses a scenario where using chown and chgrp separately isn't sufficient, and you have to make the change in a single chown() system call.)
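For illustration, here is the difference between the two approaches in Python, whose os.chown() is a thin wrapper over the system call. The helper names are made up for this sketch, and it doesn't try to reproduce the scenario from the SUS manpage; it only shows one call versus two.

```python
import os


def chown_both(path, uid, gid):
    """Change the owner and group of path with a single chown() call."""
    os.chown(path, uid, gid)


def chown_separately(path, uid, gid):
    """Two calls: roughly what running chown and then chgrp amounts to."""
    os.chown(path, uid, -1)   # -1 leaves the group unchanged
    os.chown(path, -1, gid)   # -1 leaves the owner unchanged
```

In the two-call version the file passes through an intermediate state with the new owner and the old group, and the second call is checked for permission on its own. The single call has neither property.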
PS: Since I think I ran into this dot-versus-colon issue on our old Solaris 10 fileservers, I probably had /usr/ucb before /usr/bin in my $PATH there. I generally prefer UCB versions of things to the stock Solaris versions, but in this case it tripped me up.
PPS: It turns out that GNU chown accepts the dot form as well, provided that it's unambiguous, although this is covered only in the chown info file and is not mentioned in the normal manpage.
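One plausible way to implement 'accept the dot form when it's unambiguous' is sketched below in Python. This is my guess at the general idea, not GNU chown's actual code: the known_users set stands in for a real passwd lookup, the function name is made up, and an empty group part is simply treated as 'no group', which is a simplification.

```python
def parse_spec(spec, known_users):
    """Return (owner, group) for an 'owner[:group]' or 'owner[.group]' spec.

    known_users is a stand-in for looking logins up in the passwd database.
    """
    # A colon is never part of a login, so it always wins as the separator.
    if ":" in spec:
        owner, _, group = spec.partition(":")
        return owner, (group or None)
    # No colon: first try the whole string as a login (it may contain dots).
    if spec in known_users:
        return spec, None
    # Only if that fails, fall back to treating the first dot as a separator.
    if "." in spec:
        owner, _, group = spec.partition(".")
        if owner in known_users:
            return owner, (group or None)
    raise ValueError(f"invalid user: {spec!r}")


if __name__ == "__main__":
    users = {"cks", "john.smith"}  # hypothetical logins
    print(parse_spec("cks.staff", users))    # ('cks', 'staff')
    print(parse_spec("john.smith", users))   # ('john.smith', None)
```

The ordering is what makes the dot form safe here: a real dotted login is always preferred, so the dot is only read as a separator when no such login exists.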