Handling lines with something-separated fields for Python
As a system administrator, I spend a bunch of my time dealing with
files made up of lines that are composed of fields separated by some
character. A classical example is /etc/passwd, with colon-separated
fields. These file formats are ordered lists with named fields, which
should sound familiar, but they don't show
up as Python lists, they show up as lines of text and they want to
be output as text; that we use lists to represent them is just an
This only takes a little bit of extra work to implement on top of
    class FieldLine(SetMixin, list):
        separator = ":"
        def __init__(self, line):
            n = line.split(self.separator)
            super(FieldLine, self).__init__(n)
        def __str__(self):
            return self.separator.join(self)

    class PasswdLine(FieldLine):
        fields = gen_fields('name', 'passwd', 'uid', 'gid',
                            'gecos', 'dir', 'shell')
(gen_fields is basically the dict()-ized version of enum_args from here.)
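Since SetMixin and gen_fields are not shown in this entry, here is a minimal self-contained sketch of how the pieces might fit together. NamedFieldsMixin is a hypothetical stand-in for SetMixin, and this gen_fields is only my guess at a name-to-index mapping; neither is necessarily what the earlier entries did. The sample passwd line is made up.

```python
def gen_fields(*names):
    # Assumed behaviour: map each field name to its list index.
    return {name: idx for idx, name in enumerate(names)}


class NamedFieldsMixin(object):
    # Hypothetical stand-in for SetMixin: named access to list slots.
    fields = {}

    def __getattr__(self, name):
        # Only called when normal lookup fails, so list methods still work.
        try:
            idx = self.fields[name]
        except KeyError:
            raise AttributeError(name)
        return self[idx]

    def __setattr__(self, name, value):
        if name in self.fields:
            self[self.fields[name]] = value
        else:
            super().__setattr__(name, value)


class FieldLine(NamedFieldsMixin, list):
    separator = ":"

    def __init__(self, line):
        super().__init__(line.split(self.separator))

    def __str__(self):
        return self.separator.join(self)


class PasswdLine(FieldLine):
    fields = gen_fields('name', 'passwd', 'uid', 'gid',
                        'gecos', 'dir', 'shell')


# Made-up example line: read one field, change another, write it back out.
entry = PasswdLine("exampleuser:x:1000:1000:Example User:/home/exampleuser:/bin/sh")
entry.shell = "/bin/bash"
print(entry.uid)
print(str(entry))
```

The point of the design is that the object is still a real list, so indexing and iteration work as usual, while str() gives back a line that can be written straight to the file.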
Now that I've written these entries, I have a confession: this is
actually what I started out doing. I didn't first build a general
ordered list with named fields class and then realized it could be used
to deal with /etc/passwd lines; I started out needing to deal with
/etc/passwd lines, decided that I wanted read/write access to named
fields, and then built downwards. I just wrote it up backwards because
it looks neater that way.
(In fact this is the cleaned up and idealized version of this class. The
real one in my program does not subclass list; instead it is a normal
class with a private 'field_store' list and everything just directly
manipulates that. It also doesn't handle the slicing cases, because I
didn't need to. I did the new version for here for various reasons,
including that it was a good excuse to play around with subclassing
built in types.)
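For comparison, a version that does not subclass list might look roughly like the following. This is a sketch under assumptions and not the real code from my program: the class names, the leading underscore on _field_store, and the sample line are all invented for illustration, and slicing is deliberately not supported.

```python
class PlainFieldLine(object):
    separator = ":"
    fields = {}  # name -> index; filled in by subclasses

    def __init__(self, line):
        # '_field_store' is not a field name, so this is a plain attribute.
        self._field_store = line.split(self.separator)

    def __getattr__(self, name):
        # Look up the index first so a missing name cannot recurse.
        try:
            idx = self.fields[name]
        except KeyError:
            raise AttributeError(name)
        return self._field_store[idx]

    def __setattr__(self, name, value):
        if name in self.fields:
            self._field_store[self.fields[name]] = value
        else:
            object.__setattr__(self, name, value)

    def __str__(self):
        return self.separator.join(self._field_store)


class PlainPasswdLine(PlainFieldLine):
    fields = {'name': 0, 'passwd': 1, 'uid': 2, 'gid': 3,
              'gecos': 4, 'dir': 5, 'shell': 6}


line = PlainPasswdLine("exampleuser:x:1000:1000:Example User:/home/exampleuser:/bin/sh")
line.gid = "2000"
print(line.name)
print(str(line))
```

Everything goes through the private list here, so nothing list-like (indexing, len(), iteration) is available unless it is written out by hand.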
Weekly spam summary on March 3rd, 2007
This machine had a planned twelve hour power outage today, so many of these statistics are really only for six days. Having said that, this week we:
- got 16,376 messages from 272 different IP addresses.
- handled 20,396 sessions from 1,270 different IP addresses.
- received at least 212,857 connections from at least 63,943 different IP addresses.
- hit a highwater of 5 connections being checked at once.
This is down from last week, but not hugely so; we might have been in the same ballpark if not for the downtime.
This is reasonably similar to last week's, although smoother.
Kernel level packet filtering top ten (up to 02:26 am on March 3rd):
    Host/Mask            Packets   Bytes
    188.8.131.52/24        10914    495K
    184.108.40.206          9216    505K
    220.127.116.11          6660    346K
    18.104.22.168           5673    272K
    22.214.171.124          5360    286K
    126.96.36.199/24        4317    259K
    188.8.131.52            4051    189K
    184.108.40.206          3569    171K
    220.127.116.11          3019    149K
    18.104.22.168/15        2919    175K
This is down significantly from last week, and it seems unlikely that one more day would have made a major difference.
- 22.214.171.124/24 is Bellsouth, still hammering on us with advance fee fraud spammers through their webmail system. (Well, probably. Since we're not accepting their packets I can't be sure.)
- 126.96.36.199 and 188.8.131.52 return from last week.
- 184.108.40.206 returns from recently and is resuming its usual presence in the listing.
- 220.127.116.11 is mail.mydiscountoffer.com, and was blocked for being in AccelerateBiz network space; after too many spammers, we no longer accept connections from their IP ranges.
- 18.104.22.168 is a telecomitalia.it IP address, last seen in January.
- 22.214.171.124 and 126.96.36.199 kept trying with bad
Connection time rejection stats:
    66671 total
    40920 dynamic IP
    16846 bad or no reverse DNS
     5790 class bl-cbl
     1512 class bl-sbl
      462 acceleratebiz.com
      225 class bl-pbl
      109 class bl-njabl
      104 class bl-dsbl
       79 cuttingedgemedia.com
       64 class bl-sdul
This is pretty close to last week, and might even have been over it if not for the 12 hour downtime. I'd do a breakdown of the SBL rejections, but there's no real point; 1440 of them come from SBL50892, which is a colocentral.com spammer hosting escalation listing from February 6th, and the next highest one is 12 rejections. (The colocentral.com rejections were spread over 248 different IP addresses, with none of them having more than 9 rejections. The hostnames suggest that we didn't miss anything.)
Three of the top 30 most rejected IP addresses were rejected 100
times or more this week: 188.8.131.52 (181 times), 184.108.40.206
(176 times, a Covad something or other), and 220.127.116.11 (149
times, no reverse DNS). Twelve of the top 30 are currently in the
CBL, ten are currently in
bl.spamcop.net, eight are in the PBL, and a grand total of 15 of the 30 are in
This week Hotmail did:
- 4 messages accepted, at least two of them legitimate and one almost certainly spam.
- no messages rejected because they came from non-Hotmail email addresses.
- 41 messages sent to our spamtraps.
- 30 messages refused because their sender addresses had already hit our spamtraps.
- 8 messages refused due to their origin IP address (4 in the CBL, 3 from the Cote d'Ivoire, and one in SBL45516).
And the final numbers:
|what||# this week||(distinct IPs)||# last week||(distinct IPs)|
There was no particularly flagrant source of bad HELOs this week,
just the usual crowd with middle double digit rejections before we
dumped them in the kernel filters. Bad bounces once again came from
all over, although possibly with more North American sources than
Bad bounces were sent to 15 different usernames this week, once again mostly to real ex-users and plausible usernames (and one valid ex-user with some numbers glued on the front). The most popular target, with three bounces, was an ex-user.