A peculiar use of ZFS L2ARC that we're planning

August 16, 2013

In our SAN-based fileserver infrastructure we have a relatively small but very important and very busy pool. We need to be able to fail over this pool to another physical fileserver, so its data storage has to live on our iSCSI backends. But even with it on SSDs on the backends, going over the network with iSCSI adds latency and probably reduces bandwidth somewhat. We're not willing to move the pool to local storage on a fileserver; it's much more important that the pool stay up than that it be blindingly fast (especially since it's basically fast enough now). Oh, and it's generally much more important that reads be fast than writes.

But there is a way around this, assuming that you're willing to live with failover taking manual work (which we are): a large local L2ARC plus the regular SAN data storage. This particular pool is small enough that we can basically fit all of its data into an affordable L2ARC SSD (and certainly all of the active data). A local L2ARC gives us local (read) IO for speed and effectively reduces the actual backend data storage to a persistence mechanism.
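As a rough illustration of the setup step, here is a minimal sketch that only builds (and prints) the `zpool add` command for attaching a local SSD as a cache device. The pool name `fastpool` and the device name `c5t0d0` are made up for the example, and nothing is actually run against a pool.

```python
import shlex

def l2arc_add_command(pool, cache_device):
    """Return the argv that attaches a cache (L2ARC) device to a pool."""
    return ["zpool", "add", pool, "cache", cache_device]

# Hypothetical pool and device names, purely for illustration.
cmd = l2arc_add_command("fastpool", "c5t0d0")
print(shlex.join(cmd))  # prints: zpool add fastpool cache c5t0d0
```

Building the argument list separately from running it makes it easy to eyeball the command before it touches a production pool.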

What makes this work is that a pool will import and run without its L2ARC device(s). Because L2ARC is only a cache, ZFS is willing to bring up a pool with missing L2ARC devices. If we have to fail over the pool to another fileserver it will come up without L2ARC and be slower, but at least it will come up.
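The manual failover might look something like the following sketch, which just assembles the commands one would run on the new fileserver. The exact procedure is an assumption on my part rather than a tested runbook: import the pool (forcing it if the old fileserver died without exporting), drop the now-missing cache device from the pool configuration, and optionally add a cache device that is local to the new machine. All pool and device names are hypothetical.

```python
import shlex

def failover_commands(pool, old_cache, new_cache=None, force=False):
    """Build, in order, the zpool commands to run on the new fileserver."""
    # -f is needed if the pool was not cleanly exported by the old host.
    import_cmd = ["zpool", "import"] + (["-f"] if force else []) + [pool]
    # The pool comes up without its old L2ARC; remove the stale entry.
    cmds = [import_cmd, ["zpool", "remove", pool, old_cache]]
    if new_cache is not None:
        # Optionally give the pool a fresh L2ARC local to the new host.
        cmds.append(["zpool", "add", pool, "cache", new_cache])
    return cmds

# Hypothetical names, purely for illustration.
for argv in failover_commands("fastpool", "c5t0d0", new_cache="c6t0d0", force=True):
    print(shlex.join(argv))
```

A freshly added L2ARC starts out empty, so reads will still go to the SAN until it warms up.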

(A local L2ARC plus SAN data storage works for any pool and is what we're planning in general when we renew our fileserver infrastructure (hopefully soon). But it may have limited effectiveness for large pools, based on usage patterns and so on. What makes this particular pool special is that it's small enough that the L2ARC can basically store all of it. And the L2ARC doesn't need to be mirrored or anything expensive.)
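To put rough numbers on why pool size matters, here is a back-of-the-envelope sketch. It assumes reads are spread evenly over the active data (real usage patterns are rarely this uniform), ignores the in-memory ARC entirely, and uses invented latency figures of 100 microseconds for a local SSD read and 400 microseconds for a read over iSCSI. None of these numbers are measurements from our environment.

```python
def cacheable_fraction(active_gb, l2arc_gb):
    """Fraction of the active data that fits in the L2ARC (at most 1.0)."""
    return min(1.0, l2arc_gb / active_gb)

def expected_read_latency_us(hit_fraction, local_us, san_us):
    """Average read latency given the fraction of reads served locally."""
    if not 0.0 <= hit_fraction <= 1.0:
        raise ValueError("hit_fraction must be between 0 and 1")
    return hit_fraction * local_us + (1.0 - hit_fraction) * san_us

# Invented figures: a 256 GB L2ARC SSD, 100 us local reads, 400 us SAN reads.
small = cacheable_fraction(active_gb=200, l2arc_gb=256)   # whole pool fits
large = cacheable_fraction(active_gb=4096, l2arc_gb=256)  # only 1/16th fits

print(expected_read_latency_us(small, 100.0, 400.0))  # 100.0
print(expected_read_latency_us(large, 100.0, 400.0))  # 381.25
```

Under these assumptions the small pool reads at essentially local speed, while the large pool barely improves unless its reads are heavily concentrated on a small hot set.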

PS: given that this pool is already on SSDs, I don't think that there's any point to a separate log device. Since a SLOG is essential to the pool, it would have to live in the SAN and be mirrored; we couldn't get away with a local SLOG plus the data in the SAN.

Comments on this page:

From at 2013-08-16 17:42:07:

While fairly atypical, when the L2ARC and ZIL concepts came out in OpenSolaris 2008.05, one of the then-Sun folks came up with just this idea:


I certainly thought it was a clever trick then too.

On a parenthetical note: since I've been living in the Linux world for the last little while, I find it so annoying that I can't make use of these ideas because there is no practical file system that implements them (FUSE is not very practical). Given that I use a Mac at home, I really wish that Apple and Sun/Oracle could have come to some agreement on ZFS on Mac OS X too.

All of these wonderful features and concepts, and some of the most mainstream operating systems are still limping along with things like LVM and such. Sigh.

By trs80 at 2013-08-17 00:34:10:

Linux has fs caching layers now (two actually - bcache and dm-cache, in 3.10 and 3.9 respectively)

Written on 16 August 2013.

Last modified: Fri Aug 16 11:52:51 2013