Wandering Thoughts archives

2005-12-18

Emulating C structs in Python

One of the few data types from C that I miss when writing Python code is structs. The simplest replacement is dictionaries, but that means you have to write thing['field'] instead of thing.field. I can't stand that (it's the extra characters).

If you want thing.field syntax in Python, you need an object. The simplest C struct emulation is just to use a blank object and set fields on it:

class MyStruct:
  pass

ms = MyStruct()
ms.foo = 10
ms.bar = "abc"

Some people will say that this is an abuse of objects, since they don't have any code, just data. I say to heck with such people; sometimes all I want is data.

(Avoid the temptation to just use 'ms = object()'. Instances of plain object do not accept new attributes, and even with a generic stand-in class you would hurt your ability to tell different types of structs apart via introspection.)
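A rough sketch of the introspection point, assuming a reasonably modern Python. The class names PointStruct and SizeStruct are made up for this illustration and are not from anything above:

```python
# Two hypothetical struct-like classes; the names are purely illustrative.
class PointStruct:
    pass

class SizeStruct:
    pass

p = PointStruct()
p.x, p.y = 1, 2

s = SizeStruct()
s.width, s.height = 3, 4

# Distinct classes let you tell the structs apart by type.
print(type(p).__name__)
print(isinstance(s, PointStruct))

# A bare object() instance has no per-instance attribute storage,
# so trying to set a field on it raises AttributeError.
try:
    o = object()
    o.foo = 10
    bare_object_works = True
except AttributeError:
    bare_object_works = False

print(bare_object_works)
```

With one class per kind of struct, type() and isinstance() give you something meaningful to check or print when debugging.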

Initialization this way is tedious, though. We can do it more easily and more compactly by using keyword arguments when we create the object, with a little help from the class. Like so:

class Struct:
  def __init__(self, **kwargs):
    for k, v in kwargs.items():
      setattr(self, k, v)

class MyStruct(Struct):
  pass

ms = MyStruct(foo = 10, bar = "abc")

(And look, now our objects have some code.)

It's possible to write the __init__ function as 'self.__dict__.update(kwargs)', but that is fishing a little too much into the implementation of objects for me. I would rather use the explicit setattr loop just to be clear about what's going on.
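One concrete case where the two approaches differ is a class with a property. This is only a sketch under assumptions: DictStruct, TempMixin, and the temp field are hypothetical names invented here, and the property setter syntax needs a modern Python.

```python
class Struct:
    def __init__(self, **kwargs):
        # Goes through normal attribute assignment, so properties get a say.
        for k, v in kwargs.items():
            setattr(self, k, v)

class DictStruct:
    def __init__(self, **kwargs):
        # Writes straight into the instance dictionary, bypassing properties.
        self.__dict__.update(kwargs)

class TempMixin:
    # Hypothetical property that normalizes its value to a float.
    @property
    def temp(self):
        return self._temp

    @temp.setter
    def temp(self, value):
        self._temp = float(value)

class ViaSetattr(TempMixin, Struct):
    pass

class ViaDict(TempMixin, DictStruct):
    pass

a = ViaSetattr(temp="21")
print(a.temp)          # the setter ran, so this is a float

b = ViaDict(temp="21")
print(vars(b))         # the raw string sits in __dict__; _temp was never set
try:
    b.temp
    dict_version_readable = True
except AttributeError:
    dict_version_readable = False
```

For plain data-only structs the two versions behave the same; the difference only shows up once a class defines properties or its own __setattr__.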

(I am absolutely sure people have been using this idiom for years before I got here.)

Sidebar: dealing with packed binary data

If you need to deal with packed binary data in Python, you want the struct module.

This is a much better tool than C has, because C structs are not good for this (contrary to what some people think); a C struct does not actually fully specify its memory layout. C compilers are free to insert padding to make field access more efficient, which makes struct memory layout machine and compiler dependent.
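A minimal sketch of what this looks like with the struct module. The record layout here (one byte, one 32-bit integer, one 16-bit integer) is an invented example, not a real file format:

```python
import struct

# '<' selects little-endian byte order with standard sizes and no padding.
# B = unsigned byte, I = 4-byte unsigned int, H = 2-byte unsigned short.
RECORD = "<BIH"

packed = struct.pack(RECORD, 1, 0xDEADBEEF, 513)
packed_size = struct.calcsize(RECORD)   # 1 + 4 + 2 bytes, on every platform
fields = struct.unpack(RECORD, packed)

# '@' asks for the platform's native sizes and alignment instead, which is
# roughly what a C compiler would do; the result can differ between machines.
native_size = struct.calcsize("@BIH")

print(packed_size, native_size, fields)
```

The explicit '<' (or '>' or '!') prefix is what pins down the layout; without it you get native alignment and inherit the same portability problem that C structs have.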

(I sometimes find it ironic that supposedly 'high level' languages like Python and Perl have better tools to deal with binary structures than 'low level' C.)

python/EmulatingStructsInPython written at 17:30:55

Weekly spam summary on December 17th, 2005

To start with, Hotmail's numbers:

  • 3 emails accepted from Hotmail, at least two of them likely spam.
  • 263 messages rejected because they came from non-Hotmail email addresses.
  • 106 messages sent to our spamtraps.
  • 33 messages refused because their sender addresses had already hit our spamtraps.
  • 6 messages refused due to their origin IP address (five for being in the SBL, and one from Nigeria).

Despite all of these crappy numbers, we've determined that we get at least some legitimate and wanted email from Hotmail, so we will not be blocking them entirely. Oh well. Dear Hotmail: please fix your spam problems.

On the rest of the numbers:

This week we received 16,179 email messages from 209 different IP addresses. Our SMTP server handled 23,552 sessions from 2,014 different IP addresses. Email volume is slightly down from last week, although session volume is up significantly and the number of sources has doubled.

Connection volume is up significantly from last week: 150,000 connections from at least 42,800 different IP addresses. Again there is a significant jump in the number of different IP addresses trying to talk to us.

Day         Connections   different IPs
Sunday           20,500          +6,330
Monday           18,490          +5,920
Tuesday          19,600          +5,110
Wednesday        17,850          +4,330
Thursday         16,950          +5,540
Friday           22,000          +8,030
Saturday         33,770          +7,550

Most of the week looks relatively ordinary (although overall higher than last week), but come Friday we see a significant upturn. I suspect that this trend will continue on through next week.

Kernel level packet filtering top ten:

Host/Mask           Packets   Bytes
213.140.2.68           6694    402K
210.215.122.10         5600    269K
207.145.162.56         5552    266K
83.170.21.250          5047    242K
222.166.82.174         4860    292K
212.216.176.0/24       3766    187K
195.135.141.22         2620    131K
217.34.169.49          2475    126K
194.102.202.34         2209    106K
81.193.116.226         2108    101K

Apart from Telecom Italia's outgoing mail servers, these are all individual hosts.

  • Only 222.166.82.174 returns from before.
  • 213.140.2.68 is a fastwebnet.it machine; we don't talk to any of them due to too much spam.
  • 83.170.21.250, 222.166.82.174, and 217.34.169.49 are all what we consider 'dialup' machines.
  • 207.145.162.56 is on the ORDB.
  • 195.135.141.22 is on the CBL.
  • 210.215.122.10 and 81.193.116.226 are both lacking in good reverse DNS.
  • 194.102.202.34 sent us too many unresolvable HELO greetings.

The overall packet counts are up somewhat over last week.

Connection time rejection stats:

  29999 total
  14435 dynamic IP
   8935 bad or no reverse DNS
   4243 class bl-cbl
    620 class bl-sbl
    497 class bl-ordb
    326 class bl-sdul
    249 class bl-dsbl
    222 class bl-spews
     54 class bl-njabl
     11 class bl-opm

The 'dynamic IP' and CBL numbers have jumped significantly, without any single source standing out. It looks like spammers have started targeting our users with significant spam runs, most of which we have hopefully refused.

what          # this week   (distinct IPs)   # last week   (distinct IPs)
Bad HELOs            2088              169           716               67
Bad bounces          2751              738           135               99

I'm not surprised by the sudden jump in both of these numbers, although I'm not thrilled either (especially by the jump in bad bounces, since that means spammers are back to forging us into the origin addresses of their spams).

spam/SpamSummary-2005-12-17 written at 00:47:46



This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.