Possible limits on our port multiplied ESATA performance

June 27, 2009

Our iSCSI targets use port multiplied ESATA to talk to their data disks. This, for those who have not had to deal with it yet, is a way of undoing one of the advantages of SATA; instead of having one disk per port, channel and cable, you now have some number of disks sharing a single port (I've seen as many as five). The advantage is that if you want lots of disks, you don't have to find room for (say) 12 individual connectors on the back of your enclosure, and you don't have to somehow get 12 individual ESATA ports into your system (with the resulting proliferation of PCI cards); instead you can get by with just three or four.

It's clear that we are running into some sort of overall bandwidth limit when doing streaming reads. The question is where. I can think of a number of possible limits:

  • the PCI Express bus bandwidth of the ESATA controller card, since all 12 disks are being handled on one controller card (which is absolutely what you want in a 1U server).

    (I don't know enough about reading lspci -vv output to know how fast the card claims to be able to go. The theoretical bus bandwidth seems unlikely to be the limiting factor.)

  • channel contention or bandwidth limits, since we have four disks on each port. I did some earlier testing that didn't see any evidence of this, but it wasn't exhaustive and it was done on somewhat different hardware.

  • kernel performance limits, where either the overall kernel or the driver for the SiI 3124 based cards we're using can't drive things at full speed.

    (I'm dubious about this; issuing large block-aligned IOs straight to the disks does not seem like it would be challenging.)

  • some kind of general (hardware) contention issue, where there is so much going on at once that requests can't be issued and serviced at full speed for all disks.
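For the first possibility, the relevant lines in lspci -vv output are LnkCap (what the device says its PCI Express link can do) and LnkSta (what the link actually negotiated). Here is a minimal sketch of pulling those out and turning them into a rough per-direction bandwidth ceiling. The sample text is made up for illustration and is not output from our card, the function names are mine, and the per-lane figures assume the usual 8b/10b encoding overhead for 2.5 and 5 GT/s links.

```python
import re

# Approximate usable bandwidth per lane, per direction, in MB/s.
# 2.5 GT/s and 5 GT/s links use 8b/10b encoding, so 80% of raw bits are data.
PER_LANE_MB_S = {2.5: 250.0, 5.0: 500.0}

# Matches "LnkCap:" and "LnkSta:" lines but not "LnkCap2:" / "LnkSta2:".
LINK_RE = re.compile(
    r"(LnkCap|LnkSta):.*?Speed\s+([\d.]+)\s*GT/s.*?Width\s+x(\d+)"
)

def parse_links(lspci_text):
    """Return {'LnkCap': (speed_gt_s, width), 'LnkSta': (...)} for one device."""
    links = {}
    for line in lspci_text.splitlines():
        m = LINK_RE.search(line)
        if m:
            links[m.group(1)] = (float(m.group(2)), int(m.group(3)))
    return links

def link_bandwidth_mb_s(speed_gt_s, width):
    """Rough ceiling for a link of the given speed and lane count."""
    return PER_LANE_MB_S[speed_gt_s] * width

# Hypothetical excerpt, shaped like lspci -vv output for a single device.
SAMPLE = """\
        LnkCap: Port #0, Speed 2.5GT/s, Width x4, ASPM L0s, Latency L0 <1us
        LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive-
"""

if __name__ == "__main__":
    for name, (speed, width) in sorted(parse_links(SAMPLE).items()):
        mb_s = link_bandwidth_mb_s(speed, width)
        print(f"{name}: {speed} GT/s x{width} -> about {mb_s:.0f} MB/s")
```

To try it on a real machine you would feed it the output of lspci -vv -s <slot> for the controller. A LnkSta width narrower than the LnkCap width would suggest that the card ended up in a slot with fewer lanes than it can use, which would lower the ceiling accordingly.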

Fortunately, this performance shortfall is basically irrelevant in our environment; for sequential IO, the iSCSI targets will be limited by total network bandwidth well before they'll run into this limit, and for random IO you care far more about IO operations a second (per disk) than you do about bandwidth.

(We are not rich enough to afford 10G Ethernet. And I have seen an iSCSI target easily do 200 Mbytes/sec of read IO, saturating both of its iSCSI networks.)
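The arithmetic behind "limited by the network first" can be sketched roughly as follows. The 12 disks, the two iSCSI networks and the 200 Mbytes/sec figure come from the text above; that each network is a single 1 Gbit/sec link is my assumption here, and the function names are purely illustrative.

```python
# Assumption: each iSCSI network is one 1 Gbit/s Ethernet link.
GIGABIT_MB_S = 1000 / 8  # 125 MB/s raw per link, before protocol overhead

def network_ceiling_mb_s(links, per_link_mb_s=GIGABIT_MB_S):
    """Raw aggregate network bandwidth across all iSCSI links."""
    return links * per_link_mb_s

def per_disk_share_mb_s(total_mb_s, disks):
    """Bandwidth each disk must supply if the load is spread evenly."""
    return total_mb_s / disks

if __name__ == "__main__":
    ceiling = network_ceiling_mb_s(2)            # two iSCSI networks
    observed = 200.0                             # Mbytes/sec, from the text
    share = per_disk_share_mb_s(observed, 12)    # 12 data disks
    print(f"raw network ceiling: {ceiling:.0f} MB/s")
    print(f"observed is {observed / ceiling:.0%} of the raw ceiling")
    print(f"per-disk share at saturation: {share:.1f} MB/s")
```

Under those assumptions, saturating both networks asks each disk for well under 20 Mbytes/sec if the reads are spread evenly, which is why the aggregate ESATA limit does not come into play for us.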

Looking into this has shown me that I don't know as much about the current state of PC hardware and its performance characteristics as I'd like to. (Yes, I know, it's sort of like the weather; wait a bit and it will change again.)

Written on 27 June 2009.

Last modified: Sat Jun 27 02:49:27 2009