Problems I see with the ATA-over-Ethernet protocol

I've been experimenting with AoE lately, and as a result I've been looking at the protocol more than I did in my earlier exposure. Unfortunately, the more I look at the AoE protocol, the more uncomfortable I get.

The AoE protocol is quite simple; requests and replies are simple Ethernet frames, and a request's result must fit in a single reply packet. This means that the maximum read and write sizes per request are bounded by the size of the Ethernet frame, and thus on a normal Ethernet the maximum is 1K per request. (AoE does all IO in 512-byte sectors.)

So, the problems I see:
The packets per second issue probably only really affects reads; there are few disk systems that can sustain 100 Mbytes/sec of writes, but it is not difficult to build one that can do 100 Mbytes/sec of reads. (And the interesting thing for us is to build a system that will still manage to use the full network bandwidth when it is not one streaming read but 30 different people each doing their own streaming reads, all being mixed together on the target.)

I find all of this unfortunate. I would like to like AoE, because it has an appealing simplicity; however, I'm a pragmatist, so simplicity without performance is not good enough.

Sidebar: the buffer count problem

There's a third, smaller problem. The 'Buffer Count' in the server configuration reply (section 3.2 of the AoE specification) cannot mean what it says it means. The protocol claims that this is a global limit, that it is:
The problem is that one initiator has no idea how many messages other initiators are currently sending the server. So this has to actually be the number of outstanding messages a single initiator can send the server, and it is the server's responsibility to divide up a global pool among all of the initiators. (In practice this means that the server needs to be manually configured to know how many initiators it has.)
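To make the 'divide up a global pool' idea concrete, here is a minimal sketch of how a target might turn a global buffer pool into the per-initiator Buffer Count it advertises. The function name, the even split, and the example numbers are all my own assumptions for illustration; nothing here comes from the AoE specification or from any real AoE target.

```python
def per_initiator_buffer_count(global_buffers: int, initiators: int) -> int:
    """Hypothetical helper: the Buffer Count a target could advertise to
    each initiator so that all of them together never exceed the pool.

    Assumes the number of initiators is manually configured on the server
    and that the pool is split evenly; a real target might weight it.
    """
    if initiators < 1:
        raise ValueError("need at least one configured initiator")
    if global_buffers < initiators:
        raise ValueError("not enough buffers to give every initiator one")
    # Round down, so the sum of all per-initiator limits stays within the pool.
    return global_buffers // initiators


# Illustrative numbers only: a 64-buffer pool shared by 4 configured initiators.
advertised = per_initiator_buffer_count(64, 4)
print(advertised)
```

The split rounds down on purpose: any leftover buffers sit idle instead of risking more outstanding messages than the server can actually hold.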
This is part of CSpace, and is written by ChrisSiebenmann.