2005-12-18
Emulating C structs in Python
One of the few data types from C that I miss when writing Python code
is structs. The simplest replacement is dictionaries, but that means
you have to write thing['field'] instead of thing.field. I can't
stand that (it's the extra characters).
If you want thing.field syntax in Python, you need an object. The
simplest C struct emulation is just to use a blank object and
set fields on it:

    class MyStruct:
        pass

    ms = MyStruct()
    ms.foo = 10
    ms.bar = "abc"
Some people will say that this is an abuse of objects, since they don't have any code, just data. I say to heck with such people; sometimes all I want is data.
(Avoid the temptation to just use 'ms = object()', because it hurts
your ability to tell different types of structs apart via
introspection.)
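As a small sketch of what telling structs apart via introspection can look like (the class names Point and Account and the describe() helper are made up for illustration):

```python
class Point:
    """Empty class used purely as a struct-like container."""
    pass

class Account:
    pass

p = Point()
p.x, p.y = 1, 2

a = Account()
a.balance = 10

def describe(thing):
    # The class name says which kind of struct this is; vars() lists its fields.
    fields = ", ".join("%s=%r" % kv for kv in sorted(vars(thing).items()))
    return "%s(%s)" % (type(thing).__name__, fields)

# A bare object() instance has no per-instance attribute dictionary,
# so trying to set a field on it raises AttributeError.
try:
    object().x = 1
except AttributeError:
    bare_object_accepts_fields = False
else:
    bare_object_accepts_fields = True
```

Here describe(p) reports the struct as a Point and isinstance(a, Point) is false, which is the sort of distinction you lose if everything is an instance of one anonymous type.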
Initialization this way is tedious, though. We can do it more easily and compactly by using keyword arguments when we create the object, with a little help from the class. Like so:
    class Struct:
        def __init__(self, **kwargs):
            for k, v in kwargs.items():
                setattr(self, k, v)

    class MyStruct(Struct):
        pass

    ms = MyStruct(foo=10, bar="abc")
(And look, now our objects have some code.)
It's possible to write the __init__ function as
'self.__dict__.update(kwargs)', but that is fishing a little
too much into the implementation of objects for me. I would rather
use the explicit setattr loop just to be clear about what's going on.
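To make the comparison concrete, here is a rough sketch of the two __init__ variants side by side. The class names are invented for the example, and the Validated mixin is a hypothetical case chosen to show one place where the two approaches behave differently:

```python
class SetattrStruct:
    def __init__(self, **kwargs):
        # Goes through normal attribute assignment for each field.
        for k, v in kwargs.items():
            setattr(self, k, v)

class DictStruct:
    def __init__(self, **kwargs):
        # Writes straight into the instance dictionary.
        self.__dict__.update(kwargs)

# For plain data fields the two end up with identical contents.
a = SetattrStruct(foo=10, bar="abc")
b = DictStruct(foo=10, bar="abc")
same_fields = vars(a) == vars(b)

class Validated:
    # Hypothetical mixin: a property that converts on assignment.
    @property
    def degrees(self):
        return self._degrees

    @degrees.setter
    def degrees(self, value):
        self._degrees = float(value)

class CelsiusA(Validated, SetattrStruct):
    pass

class CelsiusB(Validated, DictStruct):
    pass

via_setattr = CelsiusA(degrees="21")   # the property setter runs
via_dict = CelsiusB(degrees="21")      # the setter is bypassed
```

With setattr the value is converted by the setter, while the __dict__ version just stores the raw string under 'degrees' in the instance dictionary. For plain structs with no properties this difference never comes up.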
(I am absolutely sure people have been using this idiom for years before I got here.)
Sidebar: dealing with packed binary data
If you need to deal with packed binary data in Python, you want the struct module.
This is a much better tool than C has, because structs are not good
for this (contrary to what some people think); structs do not
actually fully specify the memory layout. C compilers are free to
insert padding to make field access more efficient, which makes
struct memory layout machine and compiler dependent.
(I sometimes find it ironic that supposedly 'high level' languages like Python and Perl have better tools to deal with binary structures than 'low level' C.)
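A minimal sketch of the struct module in use. The record layout here (a 16-bit integer, a 32-bit integer, and a four-byte tag) is an invented example, not any real format:

```python
import struct

# "<" selects little-endian byte order with standard sizes and no padding,
# so the layout is the same on every machine.
# H = unsigned 16-bit, I = unsigned 32-bit, 4s = four raw bytes.
RECORD = "<HI4s"

packed = struct.pack(RECORD, 1, 2005, b"abcd")
fields = struct.unpack(RECORD, packed)

# Size of the explicitly specified layout: 2 + 4 + 4 bytes.
size_standard = struct.calcsize(RECORD)

# "@" asks for the native C layout instead, including whatever alignment
# padding the local platform uses, so this number can vary by machine.
size_native = struct.calcsize("@HI4s")
```

Comparing size_standard with size_native on your own machine is a quick way to see the padding that a C compiler would insert there.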
Weekly spam summary on December 17th, 2005
To start with, Hotmail's numbers:
- 3 emails accepted from Hotmail, at least two of them likely spam.
- 263 messages rejected because they came from non-Hotmail email addresses.
- 106 messages sent to our spamtraps.
- 33 messages refused because their sender addresses had already hit our spamtraps.
- 6 messages refused due to their origin IP address (five for being in the SBL, and one from Nigeria).
Despite all of these crappy numbers, we've determined that we get at least some legitimate and wanted email from Hotmail, so we will not be blocking them entirely. Oh well. Dear Hotmail: please fix your spam problems.
On the rest of the numbers:
This week we received 16,179 email messages from 209 different IP addresses. Our SMTP server handled 23,552 sessions from 2,014 different IP addresses. Email volume is slightly down from last week, although session volume is up significantly and the number of sources has doubled.
Connection volume is up significantly from last week: 150,000 connections from at least 42,800 different IP addresses. Again there is a significant jump in the number of different IP addresses trying to talk to us.
Day | Connections | different IPs |
Sunday | 20,500 | +6,330 |
Monday | 18,490 | +5,920 |
Tuesday | 19,600 | +5,110 |
Wednesday | 17,850 | +4,330 |
Thursday | 16,950 | +5,540 |
Friday | 22,000 | +8,030 |
Saturday | 33,770 | +7,550 |
Most of the week looks relatively ordinary (although overall higher than last week), but come Friday we see a significant upturn. I suspect that this trend will continue on through next week.
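As a quick arithmetic check, the per-day table can be totalled to compare against the rounded weekly figures. This sketch assumes that the '+' column counts IP addresses newly seen on each day, so that summing it gives the weekly count of different IPs:

```python
# (connections, newly seen IPs) per day, copied from the table above.
daily = {
    "Sunday": (20500, 6330),
    "Monday": (18490, 5920),
    "Tuesday": (19600, 5110),
    "Wednesday": (17850, 4330),
    "Thursday": (16950, 5540),
    "Friday": (22000, 8030),
    "Saturday": (33770, 7550),
}

total_connections = sum(conns for conns, _ in daily.values())
total_new_ips = sum(ips for _, ips in daily.values())

# Day with the most connections.
busiest_day = max(daily, key=lambda day: daily[day][0])
```

The totals come out to 149,160 connections and 42,810 addresses, in line with the rounded '150,000' and 'at least 42,800' figures, and Saturday is the busiest day.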
Kernel level packet filtering top ten:
    Host/Mask           Packets   Bytes
    213.140.2.68           6694    402K
    210.215.122.10         5600    269K
    207.145.162.56         5552    266K
    83.170.21.250          5047    242K
    222.166.82.174         4860    292K
    212.216.176.0/24       3766    187K
    195.135.141.22         2620    131K
    217.34.169.49          2475    126K
    194.102.202.34         2209    106K
    81.193.116.226         2108    101K
Apart from Telecom Italia's outgoing mail servers, this is all individual hosts.
- Only 222.166.82.174 returns from before.
- 213.140.2.68 is a fastwebnet.it machine; we don't talk to any of them due to too much spam.
- 83.170.21.250, 222.166.82.174, and 217.34.169.49 are all what we consider 'dialup' machines.
- 207.145.162.56 is on the ORDB.
- 195.135.141.22 is on the CBL.
- 210.215.122.10 and 81.193.116.226 are both lacking in good reverse DNS.
- 194.102.202.34 sent us too many unresolvable HELO greetings.
The overall packet counts are up somewhat over last week.
Connection time rejection stats:
    29999 total
    14435 dynamic IP
     8935 bad or no reverse DNS
     4243 class bl-cbl
      620 class bl-sbl
      497 class bl-ordb
      326 class bl-sdul
      249 class bl-dsbl
      222 class bl-spews
       54 class bl-njabl
       11 class bl-opm
The 'dynamic IP' and CBL numbers have jumped significantly, without any single source being responsible. It looks like spammers have started targeting our users with significant spam runs, most of which we have hopefully refused.
what | # this week | (distinct IPs) | # last week | (distinct IPs) |
Bad HELOs | 2088 | 169 | 716 | 67 |
Bad bounces | 2751 | 738 | 135 | 99 |
I'm not surprised by the sudden jump in both of these numbers, although I'm not thrilled either (especially by the jump in bad bounces, since that means spammers are back to forging us into the origin addresses of their spams).