Wandering Thoughts archives

2007-03-04

Handling lines with something-separated fields for Python

As a system administrator, I spend a bunch of my time dealing with files made up of lines that are composed of fields separated by some character. A classic example is /etc/passwd, with colon-separated fields. These file formats are ordered lists with named fields, which should sound familiar. But they don't show up as Python lists; they show up as lines of text and they want to be output as text. That we use lists to represent them is just an implementation detail.

This only takes a little bit of extra work to implement on top of our previous SetMixin class:

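class FieldLine(SetMixin, list):
    # Subclasses override this to change the field separator.
    separator = ":"
    def __init__(self, line):
        # Split the line into fields; they become the list's contents.
        n = line.split(self.separator)
        super(FieldLine, self).__init__(n)
    def __str__(self):
        # Join the fields back into a line of text.
        return self.separator.join(self)

class PasswdLine(FieldLine):
    # Field names, in /etc/passwd order.
    fields = gen_fields('name', 'passwd',
                        'uid', 'gid', 'gecos',
                        'dir', 'shell')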

(Where gen_fields is basically the dict()-ized version of enum_args from an earlier entry: it maps each field name to its position in the list.)
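Since SetMixin and gen_fields come from earlier entries and aren't reproduced here, the following is a rough self-contained sketch of how the pieces fit together. The SetMixin and gen_fields bodies below are stand-ins that I am assuming for illustration (named attribute access mapped onto list positions), not the originals, and the sample passwd line is made up.

```python
# Stand-in for gen_fields (an assumption): map field names to positions.
def gen_fields(*names):
    return dict((name, idx) for idx, name in enumerate(names))


# Stand-in for SetMixin (an assumption): named fields on top of a list.
class SetMixin(object):
    fields = {}

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self[self.fields[name]]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        if name in self.fields:
            self[self.fields[name]] = value
        else:
            super(SetMixin, self).__setattr__(name, value)


class FieldLine(SetMixin, list):
    separator = ":"

    def __init__(self, line):
        # The caller is expected to have stripped the trailing newline.
        super(FieldLine, self).__init__(line.split(self.separator))

    def __str__(self):
        return self.separator.join(self)


class PasswdLine(FieldLine):
    fields = gen_fields('name', 'passwd', 'uid', 'gid',
                        'gecos', 'dir', 'shell')


# Made-up sample line: read a field by name, change one, write it back out.
entry = PasswdLine("alice:x:1000:1000:Alice Example:/home/alice:/bin/sh")
entry.shell = "/bin/bash"
print(entry.name)
print(str(entry))
```

Note that every field stays a string, including uid and gid; converting them to integers would mean converting back in __str__.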

Now that I've written these entries, I have a confession: this is actually what I started out doing. I didn't first build a general 'ordered list with named fields' class and then realize it could be used to deal with /etc/passwd lines; I started out needing to deal with /etc/passwd lines, decided that I wanted read/write access to named fields, and then built downwards. I just wrote it up backwards because it looks neater that way.

(In fact this is the cleaned up and idealized version of this class. The real one in my program does not subclass list; instead it is a normal class with a private 'field_store' list and everything just directly manipulates that. It also doesn't handle the slicing cases, because I didn't need to. I wrote the new version for these entries for various reasons, including that it was a good excuse to play around with subclassing built-in types.)
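For comparison, a version in the style of the non-subclassing class might look something like the sketch below. This is a reconstruction from the description above, not the real code from my program; the class name PlainPasswdLine and everything beyond the 'field_store' list are assumptions made for illustration.

```python
class PlainPasswdLine(object):
    separator = ":"
    fields = {'name': 0, 'passwd': 1, 'uid': 2, 'gid': 3,
              'gecos': 4, 'dir': 5, 'shell': 6}

    def __init__(self, line):
        # Go around our own __setattr__ when creating the backing list.
        object.__setattr__(self, 'field_store',
                           line.split(self.separator))

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            idx = self.fields[name]
        except KeyError:
            raise AttributeError(name)
        return self.field_store[idx]

    def __setattr__(self, name, value):
        if name in self.fields:
            self.field_store[self.fields[name]] = value
        else:
            object.__setattr__(self, name, value)

    def __str__(self):
        return self.separator.join(self.field_store)


# Made-up sample line.
p = PlainPasswdLine("bob:x:1001:1001:Bob:/home/bob:/bin/sh")
p.gecos = "Bob Example"
print(str(p))
```

This gives up indexing, slicing and iteration in exchange for not having to think about how list methods interact with the named fields.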

python/LinesWithSeparatedFields written at 22:44:56

Weekly spam summary on March 3rd, 2007

This machine had a planned twelve hour power outage today, so many of these statistics are really only for six days. Having said that, this week we:

  • got 16,376 messages from 272 different IP addresses.
  • handled 20,396 sessions from 1,270 different IP addresses.
  • received at least 212,857 connections from at least 63,943 different IP addresses.
  • hit a highwater of 5 connections being checked at once.

This is down from last week, but not hugely so; we might have been in the same ballpark if not for the downtime.

Day         Connections   different IPs
Sunday           35,230         +12,485
Monday           32,638         +10,273
Tuesday          38,623         +11,238
Wednesday        34,186         +10,274
Thursday         36,476         +10,272
Friday           31,556          +9,401

This is reasonably similar to last week's, although smoother.

Kernel level packet filtering top ten (up to 02:26 am on March 3rd):

Host/Mask           Packets   Bytes
205.152.59.0/24       10914    495K
206.223.168.238        9216    505K
213.4.149.12           6660    346K
69.25.186.66           5673    272K
81.115.40.8            5360    286K
213.29.7.0/24          4317    259K
68.22.111.226          4051    189K
65.14.221.82           3569    171K
204.202.15.102         3019    149K
211.94.0.0/15          2919    175K

This is down significantly from last week, and it seems unlikely that one more day would have made a major difference.

  • 205.152.59.0/24 is Bellsouth, still hammering on us with advance fee fraud spammers through their webmail system. (Well, probably. Since we're not accepting their packets I can't be sure.)
  • 206.223.168.238 and 204.202.15.102 return from last week.
  • 213.4.149.12 returns from recently and is resuming its usual presence in the listing.
  • 69.25.186.66 is mail.mydiscountoffer.com, and was blocked for being in AccelerateBiz network space; after too many spammers, we no longer accept connections from their IP ranges.
  • 81.115.40.8 is a telecomitalia.it IP address, last seen in January.
  • 68.22.111.226 and 65.14.221.82 kept trying with bad HELOs.

Connection time rejection stats:

  66671 total
  40920 dynamic IP
  16846 bad or no reverse DNS
   5790 class bl-cbl
   1512 class bl-sbl
    462 acceleratebiz.com
    225 class bl-pbl
    109 class bl-njabl
    104 class bl-dsbl
     79 cuttingedgemedia.com
     64 class bl-sdul

This is pretty close to last week, and might even have been over it if not for the 12 hour downtime. I'd do a breakdown of the SBL rejections, but there's no real point; 1440 of them come from SBL50892, which is a colocentral.com spammer hosting escalation listing from February 6th, and the next highest one is 12 rejections. (The colocentral.com rejections were spread over 248 different IP addresses, with none of them having more than 9 rejections. The hostnames suggest that we didn't miss anything.)

Three of the top 30 most rejected IP addresses were rejected 100 times or more this week: 69.25.186.66 (181 times), 67.102.251.238 (176 times, a Covad something or other), and 210.176.52.139 (149 times, no reverse DNS). Twelve of the top 30 are currently in the CBL, ten are currently in bl.spamcop.net, eight are in the PBL, and a grand total of 15 of the 30 are in zen.spamhaus.org.

This week Hotmail did:

  • 4 messages accepted, at least two of them legitimate and one almost certainly spam.
  • no messages rejected because they came from non-Hotmail email addresses.
  • 41 messages sent to our spamtraps.
  • 30 messages refused because their sender addresses had already hit our spamtraps.
  • 8 messages refused due to their origin IP address (4 in the CBL, 3 from the Cote d'Ivoire, and one in SBL45516).

And the final numbers:

what          # this week   (distinct IPs)   # last week   (distinct IPs)
Bad HELOs             953               95           877              101
Bad bounces            17               16            16               12

There was no particularly flagrant source of bad HELOs this week, just the usual crowd with middle double digit rejections before we dumped them in the kernel filters. Bad bounces once again came from all over, although possibly with more North American sources than anywhere else.

Bad bounces were sent to 15 different usernames this week, once again mostly to real ex-users and plausible usernames (and one valid ex-user with some numbers glued on the front). The most popular target, with three bounces, was an ex-user.

spam/SpamSummary-2007-03-03 written at 00:04:19

