iSCSI Enterprise Target and disk write caches (and ZFS)

June 12, 2010

First, a note about IO modes. IET has two ways of doing IO to whatever actual storage is behind the iSCSI targets it advertises, called 'fileio' and 'blockio' respectively. Fileio more or less does regular filesystem IO, as if you had opened the backing storage at user level and were doing read() and write() to it. Blockio does low-level direct block IO to the backing storage. All of the following is specific to blockio.

There are two levels of possible write caching in any Linux iSCSI target implementation: any write caching done in the target server's RAM, and the write caches of the physical disks themselves. IET in blockio mode has no in-memory caching; blockio LUNs are always in writethrough mode, where all writes are immediately sent to the underlying storage (whatever it is). IET does nothing in particular about any write caching in the underlying storage.

IET advertises all blockio iSCSI LUNs as having write caching disabled ('WCD'); this is theoretically justifiable because it doesn't do any in-memory write caching. IET doesn't allow the initiator to change this; the write cache status of a LUN (whether blockio or fileio) is a local configuration decision that is not subject to remote overrides. Besides, it can't turn on an in-RAM write cache for blockio LUNs, as there simply isn't any code for it.

(In actual fact the code is generic and advertises WCD versus WCE based on whether the LUN is in writethrough or writeback mode. Since blockio LUNs are always writethrough, IET always advertises them as WCD.)
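
To make the WCD versus WCE distinction concrete, here is a minimal Python sketch of how the flag appears in the SCSI caching mode page (page code 0x08, where WCE is bit 2 of byte 2). This is a toy model and not IET's code; the function names are made up for illustration.

```python
# Toy model of the SCSI Caching mode page (page code 0x08); not IET's code.
CACHING_PAGE_CODE = 0x08
CACHING_PAGE_LENGTH = 0x12  # 18 bytes follow the two-byte page header
WCE_BIT = 0x04              # byte 2, bit 2: Write Cache Enable


def build_caching_page(writeback: bool) -> bytes:
    """Return a 20-byte caching mode page with WCE set only for writeback LUNs."""
    page = bytearray(2 + CACHING_PAGE_LENGTH)
    page[0] = CACHING_PAGE_CODE
    page[1] = CACHING_PAGE_LENGTH
    if writeback:
        page[2] |= WCE_BIT
    return bytes(page)


def reports_wce(page: bytes) -> bool:
    """True if the page advertises WCE; False means the LUN reads as WCD."""
    return bool(page[2] & WCE_BIT)


# blockio LUNs are always writethrough, so the WCE bit is never set for them.
blockio_page = build_caching_page(writeback=False)
print("blockio advertises WCE:", reports_wce(blockio_page))
```

An initiator that reads this page sees WCE clear and concludes that the LUN has no write cache to worry about.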

IET ignores cache flush operations on blockio LUNs. Unlike a MODE SELECT that tries to change the write cache setting, cache flush commands do not get errors, but they also don't have any effect. In particular, cache flushes do not get passed to the underlying disk. This is somewhat unfortunate; while IET itself has no cached writes to flush, the underlying physical disk may have its write cache enabled, and if so you would like it to get flushed.
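
The consequence of a silently ignored flush can be shown with a small simulation. This is only a sketch under stated assumptions: `ToyDisk` and `ToyBlockioTarget` are hypothetical stand-ins, and the `pass_flush_down` switch exists purely to compare the two behaviours.

```python
class ToyDisk:
    """A disk with a volatile write cache in front of stable media."""

    def __init__(self, cache_enabled: bool = True):
        self.cache_enabled = cache_enabled
        self.cache = {}   # block number -> data, lost on power failure
        self.media = {}   # block number -> data, survives power failure

    def write(self, block: int, data: bytes) -> None:
        if self.cache_enabled:
            self.cache[block] = data
        else:
            self.media[block] = data

    def flush(self) -> None:
        self.media.update(self.cache)
        self.cache.clear()

    def power_fail(self) -> None:
        self.cache.clear()


class ToyBlockioTarget:
    """Hypothetical stand-in for a blockio LUN; not IET's actual code."""

    def __init__(self, disk: ToyDisk, pass_flush_down: bool):
        self.disk = disk
        self.pass_flush_down = pass_flush_down

    def write(self, block: int, data: bytes) -> None:
        # Writethrough: nothing is held in the target's RAM.
        self.disk.write(block, data)

    def synchronize_cache(self) -> str:
        if self.pass_flush_down:
            self.disk.flush()
        return "GOOD"  # the initiator sees success either way


def survives_power_loss(pass_flush_down: bool) -> bool:
    disk = ToyDisk(cache_enabled=True)
    lun = ToyBlockioTarget(disk, pass_flush_down)
    lun.write(7, b"important")
    lun.synchronize_cache()
    disk.power_fail()
    return disk.media.get(7) == b"important"


print("flush ignored, data survives:", survives_power_loss(False))
print("flush passed down, data survives:", survives_power_loss(True))
```

In the first case the initiator was told its flush succeeded, yet the write only ever lived in the disk's volatile cache.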

Thus, the only truly safe way to use the current version of IET (or any prior version) in blockio mode is to turn the write caches off on your physical disks, unless your 'disks' themselves have some sort of non-volatile write cache (for example, a hardware RAID card with NVRAM). Nor is there any way for an initiator to discover the true end-to-end state of write caching, since IET always claims blockio LUNs have write caching disabled.
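
As a rough sketch of what turning the write caches off might look like on the target server, the following builds `hdparm -W0` commands, which disable the drive write cache on ATA/SATA disks. The device names are placeholders, and this assumes plain ATA/SATA disks (SCSI or SAS disks and RAID controllers need other tools). It only prints the commands and runs nothing.

```python
def disable_write_cache_commands(devices):
    """Build one 'hdparm -W0 <device>' command per disk; nothing is executed."""
    return [["hdparm", "-W0", dev] for dev in devices]


# Placeholder device names; substitute the disks that actually back your LUNs.
for argv in disable_write_cache_commands(["/dev/sdb", "/dev/sdc"]):
    print(" ".join(argv))
```

Running `hdparm -W` with no value against a device reports the current setting, which is a useful check after making the change.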

(There is a patch under development to add real cache flush support for blockio, but I don't know if or when it will appear in an IET release.)

PS: note that IET cannot sanely support allowing initiators to turn on and off the disk write caches of the disk(s) behind a LUN. Since a single physical disk may be shared between multiple LUNs and it only has one global setting for its write cache, allowing this would allow one initiator on one LUN to change the write cache setting behind another LUN's back.
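
The sharing problem can be sketched in a few lines. The classes here are hypothetical, and `set_write_cache` models what would happen if IET did pass an initiator's request through to the disk, which it does not.

```python
class PhysicalDisk:
    def __init__(self, name: str):
        self.name = name
        self.write_cache_enabled = False  # one global setting for the whole disk


class Lun:
    """A LUN carved out of (part of) a physical disk."""

    def __init__(self, name: str, disk: PhysicalDisk):
        self.name = name
        self.disk = disk

    def set_write_cache(self, enabled: bool) -> None:
        # Hypothetical pass-through of an initiator's request to the disk.
        self.disk.write_cache_enabled = enabled

    def write_cache_enabled(self) -> bool:
        return self.disk.write_cache_enabled


disk = PhysicalDisk("sdb")
lun_a = Lun("lun-a", disk)
lun_b = Lun("lun-b", disk)

lun_a.set_write_cache(True)   # the initiator using lun-a asks for a write cache
print("lun-b now has write cache enabled:", lun_b.write_cache_enabled())
```

The initiator using lun-b never asked for anything, but its writes are now going through a volatile cache.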

Sidebar: how ZFS interacts with all of this

Earlier I wrote:

(The situation with ZFS, IET, and write caches is beyond the scope of this entry, but ZFS's inability to nominally enable the nominal disk write caches is not currently a problem for us.)

Now I can explain what I meant by that. For blockio LUNs, the only thing that enabling or disabling the nominal disk write cache could do is change whether IET reports the LUN as WCD or WCE. Per yesterday's entry, the net effect of reporting LUNs as WCD is to cause ZFS to not send cache flush requests for them. In turn this is 'okay', because IET wouldn't do anything with the cache flush requests even if ZFS did send them.

The patch being developed for IET currently makes blockio LUNs report as WCE if the underlying storage appears to support cache flush operations. With the patch, ZFS sees that the 'disks' need cache flushes, sends them, and the cache flushes are propagated to the physical disks. As a result, you can safely run the physical disks with their write caches enabled.
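
The whole chain of reasoning, with and without the patch, can be written out as a small decision model. This is a sketch of the logic described in this entry, not of ZFS's or IET's actual code; all of the function names are invented for illustration.

```python
def lun_reports_wce(patched: bool, storage_supports_flush: bool) -> bool:
    """Unpatched blockio LUNs always report WCD; patched ones report WCE
    when the underlying storage appears to support cache flushes."""
    return patched and storage_supports_flush


def initiator_sends_flush(reports_wce: bool) -> bool:
    """ZFS-like behaviour: only send cache flushes to LUNs reporting WCE."""
    return reports_wce


def flush_reaches_disk(patched: bool, storage_supports_flush: bool) -> bool:
    sends = initiator_sends_flush(lun_reports_wce(patched, storage_supports_flush))
    # Only the patched target passes flushes down to the physical disk.
    return sends and patched and storage_supports_flush


def safe_with_disk_write_cache(patched: bool, storage_supports_flush: bool) -> bool:
    """Enabled volatile disk caches are only safe if flushes reach the disk."""
    return flush_reaches_disk(patched, storage_supports_flush)


for patched in (False, True):
    print("patched:", patched,
          "safe with disk write cache on:",
          safe_with_disk_write_cache(patched, storage_supports_flush=True))
```

Without the patch the chain breaks at the first step, which is why the disk write caches have to be off.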


Last modified: Sat Jun 12 01:28:05 2010