My view on iSCSI performance troubleshooting

July 8, 2011

I've been asked about troubleshooting iSCSI performance issues (i.e., slow IO over iSCSI) a few times now, so here are my views. First, a disclaimer: this is somewhat theoretical, as we haven't really had to troubleshoot iSCSI performance issues yet in our environment (and to the extent that we've had slow filesystem IO, we haven't nailed down a clear cause among all of the moving parts of our setup).

As always, the first thing you need to do is nail down just what IO is slow and when; the big questions are generally random IO vs sequential IO and reads vs writes (and also latency vs bandwidth). The easy situation is that the slow IO happens all of the time and you can reproduce it on demand. The difficult situation is when it only happens some of the time; that can get quite tricky, and sometimes a large part of the challenge is figuring out just why your system is slow and what's going on when it is.

(Also, one should not forget that the filesystem may be doing things that look like IO performance problems.)

Past that, my general rule of iSCSI troubleshooting is to remember that iSCSI is a bunch of disks coupled to a network (specifically, to a TCP stream). This means that before looking at iSCSI itself, you want to look at each of the parts separately to see if they're working fine and delivering the performance you expect.

(Doing so is much easier if you have full access to the targets and can run general programs on them. Closed target appliances make this much more challenging.)

Modern servers and gigabit switches and so on should reliably deliver gigabit wire speeds in both directions at once with basically zero packet loss (and negligible latency, at least for LANs). If your initiator and target cannot do this over your iSCSI network fabric, you have a network performance problem that you need to fix. Note that you do not need jumbo frames to saturate a gigabit network (even with iSCSI), and I think that you should turn them off to make life simpler. There are lots of programs to measure all sorts of aspects of network performance, but for streaming TCP I just use ttcp.
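If ttcp isn't handy, the same sort of streaming TCP check can be roughly sketched in Python. This is only an illustration, not a replacement for ttcp: the function names (sink, stream_test, loopback_demo), the 64 Kbyte chunk size and the loopback demo are all assumptions of the sketch, and it reports the sender-side rate, which on short runs can overstate what actually arrived at the far end.

```python
import socket
import threading
import time

CHUNK = 64 * 1024  # bytes per send/recv call; an arbitrary choice for this sketch


def sink(server_sock, result):
    """Accept one connection and count the bytes received until EOF."""
    conn, _ = server_sock.accept()
    total = 0
    with conn:
        while True:
            data = conn.recv(CHUNK)
            if not data:
                break
            total += len(data)
    result["bytes"] = total


def stream_test(host, port, total_bytes):
    """Send total_bytes of zeros to host:port; return sender-side Mbytes/sec."""
    payload = b"\0" * CHUNK
    sent = 0
    start = time.perf_counter()
    with socket.create_connection((host, port)) as s:
        while sent < total_bytes:
            n = min(CHUNK, total_bytes - sent)
            s.sendall(payload[:n])
            sent += n
    elapsed = max(time.perf_counter() - start, 1e-9)
    return sent / elapsed / 1e6


def loopback_demo(total_bytes=16 * 1024 * 1024):
    """Run sink and sender in one process over 127.0.0.1 as a smoke test."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the kernel pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    result = {}
    t = threading.Thread(target=sink, args=(server, result))
    t.start()
    rate = stream_test("127.0.0.1", port, total_bytes)
    t.join()
    server.close()
    return result["bytes"], rate
```

For a real measurement you would run sink on a listening socket on one machine and stream_test on the other, over the iSCSI network, and then swap the roles to check the other direction. The loopback demo only shows that the code works.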

(If you are working with 10G networking you absolutely want to do your own networking performance measurements and performance tuning before you even start looking at the iSCSI layer. But you did all of that before you decided to spend all of that money on 10G networking hardware, right?)

Modern disk systems vary tremendously based on what technology you're using, so there is no real substitute for measuring your system. My rule of thumb is that a modern SATA drive will do 60 to 100 Mbytes/sec of streaming IO (read or write doesn't seem to make much of a difference) and somewhat over 100 seeks a second. However, drives that have quietly gone bad can perform much worse, IO to multiple drives at once may slow things down, RAID implementations can be slower than you think, and so on. When checking this stuff I prefer to start out measuring things as close to the raw hardware as possible and then move up the target's software stack if I think there's a need.
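As a rough illustration of the two numbers in that rule of thumb, here is a minimal Python sketch that measures streaming read bandwidth and random reads per second for a given path. The function names, block sizes and counts are assumptions made for the example. It uses ordinary buffered reads on a Unix system, so anything already in the page cache will look far faster than the disk really is; for real numbers you would point it at a raw device (or a file much larger than memory) on the target.

```python
import os
import random
import time


def streaming_read(path, block=1024 * 1024, limit=256 * 1024 * 1024):
    """Read up to `limit` bytes sequentially; return (bytes_read, Mbytes/sec)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        total = 0
        start = time.perf_counter()
        while total < limit:
            data = os.read(fd, min(block, limit - total))
            if not data:  # end of file or device
                break
            total += len(data)
        elapsed = max(time.perf_counter() - start, 1e-9)
    finally:
        os.close(fd)
    return total, total / elapsed / 1e6


def random_reads(path, count=200, block=4096, seed=0):
    """Do `count` block-aligned reads at random offsets; return reads/sec."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.lseek(fd, 0, os.SEEK_END)
        blocks = size // block
        if blocks == 0:
            raise ValueError("path is smaller than one block")
        rng = random.Random(seed)  # fixed seed so runs are repeatable
        start = time.perf_counter()
        for _ in range(count):
            os.pread(fd, block, rng.randrange(blocks) * block)
        elapsed = max(time.perf_counter() - start, 1e-9)
    finally:
        os.close(fd)
    return count / elapsed
```

Running both against each drive separately, and then against several drives at once, is one way to see whether simultaneous IO to multiple drives slows things down on your hardware.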

Once you've verified that the disks and the network are both fine, it's time to move up to iSCSI. Unfortunately this is where I start waving my hands vaguely through lack of hard experience; I can only suggest obvious things like getting network traces to see how long various iSCSI operations are taking. Given what I've read about iSCSI and its tuning parameters, my current opinion is that iSCSI tuning parameters aren't likely to make a significant performance difference in normal circumstances unless they're catastrophically mis-set.
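To make the network trace idea slightly more concrete, here is a sketch of turning a trace into per-command latencies. It assumes you have already decoded a capture into simple records; the Pdu record, its field names and the "command"/"response" labels are hypothetical, and the decoding itself is not shown. Commands are paired with their responses by Initiator Task Tag, which assumes the trace covers a single iSCSI session.

```python
from collections import namedtuple
from statistics import median

# Hypothetical record, one per decoded iSCSI PDU:
#   time: capture timestamp in seconds
#   itt:  Initiator Task Tag from the PDU header
#   kind: "command" or "response"
Pdu = namedtuple("Pdu", "time itt kind")


def command_latencies(pdus):
    """Pair each command with its response by ITT; return latencies in seconds."""
    pending = {}    # itt -> time the command was seen
    latencies = []
    for pdu in sorted(pdus, key=lambda p: p.time):
        if pdu.kind == "command":
            pending[pdu.itt] = pdu.time
        elif pdu.kind == "response" and pdu.itt in pending:
            # pop() so that a later reuse of the same tag starts fresh
            latencies.append(pdu.time - pending.pop(pdu.itt))
    return latencies


def summarize(latencies):
    """Return count, median and worst-case latency for a list of latencies."""
    if not latencies:
        return {"count": 0, "median": None, "max": None}
    ordered = sorted(latencies)
    return {"count": len(ordered), "median": median(ordered), "max": ordered[-1]}
```

Comparing the summary for a slow period against a normal one would at least show whether the time is going into individual iSCSI operations or somewhere else.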

(In an ideal world your iSCSI initiator and target would both support iSCSI-specific performance statistics, either built in or through tracing hooks. I'm not holding my breath on that.)

At least some iSCSI target software can create dummy targets, ones that have no physical disk behind them. Such targets can be useful for testing the iSCSI protocol overhead introduced by the initiator and the target, although you need a test environment for this. Sometimes you can put real test data on the dummy target, and sometimes it's just the iSCSI equivalent of /dev/null and /dev/zero combined; the former is obviously more useful.

(In theory the iSCSI target software could have some mismatch between it and the real disk drivers that introduces extra overhead only when you're talking to real disks. Testing this may need some sort of ultra-fast disk such as an SSD.)

On a side note, one thing that may be different between iSCSI and local disk IO (and between filesystems and raw iSCSI IO) is the presence of write barriers. If you see fast local writes and slow remote writes, this is one possible cause of the difference. Since there are quite a lot of moving parts involved in generating real write barriers over iSCSI, it's possible for software updates to suddenly cause them to be generated (or to stop being generated, but that's harder to notice).
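One crude way to see whether forced flushes are what separates the fast case from the slow one is to time the same writes with and without an fsync() after each block. The sketch below is an assumption-laden stand-in: fsync() is not itself a write barrier, and whether it turns into a cache flush on the far side of iSCSI depends on the OS, the filesystem, the initiator and the target. The function name and parameters are made up for the example.

```python
import os
import time


def timed_writes(path, count=100, block=4096, sync_each=False):
    """Write `count` blocks to `path`, optionally fsync()ing after each one.

    Returns the elapsed time in seconds. The file is created or truncated.
    """
    data = b"\0" * block
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, data)
            if sync_each:
                os.fsync(fd)  # ask the OS to push this write to stable storage
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return elapsed
```

If the fsync'd run is dramatically slower on an iSCSI-backed filesystem than on a local one while the plain run is about the same, that points towards flushes (and possibly barriers) rather than raw bandwidth.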

Last modified: Fri Jul 8 22:48:54 2011