2007-07-15
Problems I see with the ATA-over-Ethernet protocol
I've been experimenting with AoE lately, and as a result I've been looking at the protocol more than I did in my earlier exposure. Unfortunately, the more I look at the AoE protocol, the more uncomfortable I get.
The AoE protocol is quite simple; requests and replies are simple Ethernet frames, and a request's result must fit in a single reply packet. This means that the maximum read and write sizes per request are bounded by the size of the Ethernet frame, and thus on a normal Ethernet the maximum is 1K per request. (AoE does all IO in 512-byte sectors.)
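To make the limit concrete, here is a back-of-the-envelope sketch; the header sizes are from my reading of the AoE specification (a 10-byte AoE header plus a 12-byte ATA command header inside the Ethernet payload), so treat the exact byte counts as approximate:

    # Sketch: how many data bytes one AoE read/write request can carry.
    # Header sizes are my reading of the AoE spec; treat as approximate.
    SECTOR = 512     # AoE does all IO in 512-byte sectors
    AOE_HDR = 10     # common AoE header
    ATA_HDR = 12     # ATA command header

    def max_bytes_per_request(mtu):
        # Whole sectors that fit in the Ethernet payload after headers.
        sectors = (mtu - AOE_HDR - ATA_HDR) // SECTOR
        return sectors * SECTOR

    print(max_bytes_per_request(1500))   # normal Ethernet: 1024 bytes (1K)
    print(max_bytes_per_request(9000))   # jumbo frames: 8704 bytes (17 sectors)

Even jumbo frames only get you to about 8.5K per request, which is still small by the standards of modern disk IO.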
So, the problems I see:
- AoE effectively requires the target to do buffering in order to bridge the gap between AoE's small requests and the large IO requests that modern disk systems need to see to get decent performance.
Buffering writes makes targets less transparent and more dangerous. Requiring read buffering means that target performance goes down dramatically if the target can't do it, either because it can't predict the necessary readahead pattern or because it's run out of spare memory.
(I am especially worried about readahead prediction because we will be using this for NFS servers that are used by a lot of people at once, so the targets will see what looks like random IO. I do not expect target-based readahead to do at all well in that situation; the sketch below illustrates why.)
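Here is a toy illustration of the problem (the detector is entirely hypothetical, not taken from any real AoE target): a readahead predictor that fires on sequential requests is blinded as soon as even two clients' sequential streams get interleaved at the target.

    import itertools

    # Toy sequential-run detector: counts requests that directly follow
    # the previous request on disk. Hypothetical, for illustration only.
    def sequential_runs(offsets, req_size=1024):
        return sum(1 for prev, cur in zip(offsets, offsets[1:])
                   if cur == prev + req_size)

    one_client = [i * 1024 for i in range(8)]
    print(sequential_runs(one_client))    # 7: trivially predictable

    # Two sequential streams interleaved at the target:
    a = [i * 1024 for i in range(4)]
    b = [10000000 + i * 1024 for i in range(4)]
    mixed = list(itertools.chain.from_iterable(zip(a, b)))
    print(sequential_runs(mixed))         # 0: looks random to the target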
- because AoE uses such small requests and replies, it must send and receive a huge number of packets a second to get full bandwidth. For example, on a normal Ethernet getting 100 Mbytes/sec of read bandwidth requires handling over 200,000 packets per second (about 100,000 pps sent and 100,000 pps received).
This is a problem because most systems are much better at handling high network bandwidth than they are at handling high numbers of packets per second. (And historically, the pps rate machines can handle has grown more slowly than network bandwidth has.)
The packets per second issue probably only really affects reads; there are few disk systems that can sustain 100 Mbytes/sec of writes, but it is not difficult to build one that can do 100 Mbytes/sec of reads.
(And the interesting thing for us is to build a system that will still manage to use the full network bandwidth when it is not one streaming read but 30 different people each doing their own streaming reads, all being mixed together on the target.)
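For concreteness, here is the packet rate arithmetic behind that 200,000 packets per second figure (a sketch, assuming one small request packet for every 1K reply on a normal Ethernet):

    # Packet rate needed for 100 Mbytes/sec of AoE reads at 1500 MTU.
    target_bw = 100 * 1024 * 1024    # 100 Mbytes/sec of read bandwidth
    per_reply = 1024                 # data bytes per reply on normal Ethernet

    replies_per_sec = target_bw // per_reply   # 102,400 pps received
    requests_per_sec = replies_per_sec         # one request per reply: 102,400 pps sent
    print(replies_per_sec + requests_per_sec)  # 204,800 pps total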
I find all of this unfortunate. I would like to like AoE, because it has an appealing simplicity; however, I'm a pragmatist, so simplicity without performance is not good enough.
Sidebar: the buffer count problem
There's a third, smaller problem. The 'Buffer Count' in the server configuration reply (section 3.2 of the AoE specification) cannot mean what it says it means. The protocol claims that this is a global limit, that it is:
The maximum number of outstanding messages the server can queue for processing.
The problem is that one initiator has no idea how many messages other initiators are currently sending the server. So this has to actually be the number of outstanding messages a single initiator can send the server, and it is the server's responsibility to divide up a global pool among all of the initiators.
(In practice this means that the server needs to be manually configured to know how many initiators it has.)
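In other words, the server ends up doing something like this division itself (the function and names here are mine, purely for illustration; nothing like it appears in the spec):

    # Per-initiator 'Buffer Count' carved out of a global pool; a sketch
    # of the interpretation argued above, not anything from the AoE spec.
    def buffer_count_reply(global_buffers, expected_initiators):
        return global_buffers // expected_initiators

    # A server with 64 queue slots, manually configured for 4 initiators:
    print(buffer_count_reply(64, 4))   # each initiator may keep 16 outstanding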
Weekly spam summary on July 14th, 2007
Our SMTP frontend died (twice) around 8am on Friday morning, so some of this week's stats are only partial and some are missing about two hours of data. That said, this week we:
- got 10,583 messages from 249 different IP addresses.
- handled 17,948 sessions from 1,258 different IP addresses.
- received 257,246 connections from over 50,000 different IP addresses.
- hit a highwater of 7 connections being checked at once.
This is pretty similar to last week. I've managed to more or less reconstruct the per-day information:
Day | Connections | different IPs |
Sunday | 39,600 | +11,157 |
Monday | 34,312 | +9,774 |
Tuesday | 37,764 | +10,198 |
Wednesday | 44,447 | +10,857 |
Thursday | 31,044 | +8,086 |
Friday | 41,368 | +11,090 |
Saturday | 28,711 | +8,448 |
(The one caution is that the 'different IPs' information is not reliable for Friday and Saturday, since it effectively starts from scratch.)
I continue to have no idea why spammers like Wednesday, but clearly they do.
Kernel level packet filtering top ten:
Host/Mask | Packets | Bytes |
68.230.240.0/23 | 35913 | 1744K | cox.net
213.4.149.12 | 27599 | 1435K | terra.es
205.152.59.0/24 | 19880 | 901K | bellsouth.net
24.155.195.124 | 18546 | 890K |
213.29.7.0/24 | 11509 | 691K | centrum.cz
68.167.174.246 | 10155 | 477K |
76.65.201.70 | 6896 | 317K |
68.168.78.0/24 | 5478 | 263K | adelphia.net
206.221.36.51 | 4443 | 204K |
69.94.123.79 | 4195 | 252K |
Volume is down from last week.
- 213.4.149.12 returns from last week and many previous appearances, and I'm probably going to stop explicitly noting it since it doesn't seem like it's going to go away any time soon.
- 24.155.195.124 is on the CBL.
- 68.167.174.246 is a covad.net address that we consider dynamic, and returns from the end of June.
- 76.65.201.70, 206.221.36.51, and 69.94.123.79 all kept trying with bad HELOs.
Connection time rejection stats:
104296 | total |
68773 | bad or no reverse DNS |
29517 | dynamic IP |
4092 | class bl-cbl |
492 | qsnews.net |
246 | class bl-pbl |
103 | class bl-dsbl |
80 | class bl-sbl |
23 | class bl-njabl |
4 | class bl-sdul |
The highest source of SBL rejections this week was a tie between SBL48694 (known spam source) and SBL44995 (hinet.net mail hosts for the ROKSO listed 'Mei Lung Handicrafts / Chang Wen-Sheng') with thirteen each. Following them is SBL56453 (0catch.com, listed as a repeat advance fee fraud spam source) with seven.
Twelve of the top 30 most rejected IP addresses were rejected 100 times or more this week. Rather than write them out, I'm going to make a table:
Rejections | IP address |
2567 | 58.186.29.226 |
752 | 58.69.147.80 |
484 | 121.97.172.73 |
419 | 200.69.153.217 |
414 | 216.213.172.11 |
368 | 61.252.110.3 |
282 | 86.76.43.248 |
263 | 125.234.232.88 |
194 | 41.250.128.243 |
126 | 85.107.94.89 |
124 | 83.214.74.133 |
102 | 59.95.207.131 |
With the exception of 216.213.172.11, all of these were rejected for bad or missing reverse DNS, although almost all are in the CBL and/or the PBL. In general, fifteen of the top 30 are currently in the CBL, four are currently in bl.spamcop.net, seventeen are currently in the PBL, and a grand total of 25 are in zen.spamhaus.org.
(Locally, 27 were rejected for bad or missing reverse DNS, two for being qsnews.net, and one for being a dynamic IP address.)
This week, Hotmail had:
- 6 messages accepted, and I am pretty sure that most of them were spam.
- no messages rejected because they came from non-Hotmail email addresses.
- 47 messages sent to our spamtraps.
- 2 messages refused because their sender addresses had already hit our spamtraps.
- 6 messages refused due to their origin IP address (two in the CBL, two in SBL52368, one from a United Arab Emirates satellite ISP, and one from the Cote d'Ivoire).
what | # this week | (distinct IPs) | # last week | (distinct IPs) |
Bad HELOs | 705 | 95 | 825 | 99 |
Bad bounces | 219 | 94 | 222 | 149 |
There is no really leading source of bad HELOs this week, by my standards (I draw the line somewhere around 50 to 75 rejections; no single one got over 45 this week).
Bad bounces were sent to 90 different bad usernames this week, with the most popular one being qp3902 with 82 attempts (the same as last week); the second most popular was actually an internal error, so I'm not going to list it (without it, we actually only had 181 bad bounces this week). The NoemiDotson bad username pattern is still popular, but it's joined by things like mikoponpon, d21terrano, and a number of ex-users.
The biggest single source of bad bounces was 194.242.226.91, with other contributions from all over (including some hinet.net machines; clearly the SBL hasn't listed all of their mail machines yet).