2017-12-22
Our next generation of fileservers will not use any sort of SAN
We've been using SAN-based fileservers here for a long time, partly for reasons that I once wrote about in Painless long term storage management without disturbing users. Our current and past generations of ZFS fileservers have been based around an iSCSI SAN, and before that we had at least one generation of Fibre Channel based fileservers using Solaris (with DiskSuite and relatively inexpensive hardware RAID-5 boxes). Some of the things we've wanted from a SAN haven't worked out lately, but others have, and I wouldn't say we're unhappy with our current SAN setup.
We're in the process of putting together our next generation of fileservers, and despite everything I just wrote, we've decided that they won't use a SAN. The core reason is that a SAN isn't necessary for us any more, and moving away from one both simplifies our life and means we need less hardware (so everything costs less, which is an important consideration for us). It does matter that we want smaller fileservers, since that affects the economics, but our decision goes beyond that; we have no regrets about the shift and don't feel we're being forced into it.
One not insignificant reason for this is that our ideas about long term storage management simply haven't worked out in practice (as I once theorized might happen). Even if we kept iSCSI in our next generation, it was clear to us that the migration would once again involve copying all of the data, with user-visible impact, just as it did the last time around. But beyond that, while I won't say that the iSCSI network has been useless, we haven't actually needed any of the advantages a SAN gives us in this generation. With solid hardware this time around, we haven't had a backend or a fileserver fail, or at least we've never had one fail for hardware reasons. Nor have we needed two iSCSI networks, as we've never had a switch or network failure.
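(As a concrete illustration of what "copying all of the data" means in ZFS terms, here is a minimal sketch of a one-shot snapshot copy from an old pool to a new one. The pool and filesystem names are hypothetical, and this shows the general mechanics only, not our actual migration procedure; a real migration would involve incremental sends, property handling, and much more.)

    # Minimal sketch of a one-shot ZFS copy between pools, assuming
    # hypothetical pool/filesystem names and that both pools are visible
    # from the same host.
    import subprocess

    OLD_FS = "oldpool/homes"    # hypothetical source filesystem
    NEW_FS = "newpool/homes"    # hypothetical destination (must not exist yet)
    SNAP = OLD_FS + "@migrate"  # 'zfs send' can only send snapshots

    # Snapshot the source so we have a stable point-in-time copy to send.
    subprocess.run(["zfs", "snapshot", SNAP], check=True)

    # Pipe 'zfs send' into 'zfs receive' to materialize the data on the
    # new pool ('zfs send -R' would also carry child filesystems along).
    send = subprocess.Popen(["zfs", "send", SNAP], stdout=subprocess.PIPE)
    subprocess.run(["zfs", "receive", NEW_FS], stdin=send.stdout, check=True)
    send.stdout.close()
    if send.wait() != 0:
        raise subprocess.CalledProcessError(send.returncode, send.args)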
Using iSCSI has unfortunately complicated our lives. It requires two extra networks and two extra sets of cabling, switches, and so on. It has to be monitored and software configurations have to be fiddled with, and we've actually had software issues because we have two iSCSI networks (every so often an OmniOS fileserver will refuse to use both iSCSI networks, especially after a reboot). And of course the split between fileservers and backends means more machines to look after.
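(To give a feel for the extra monitoring this implies, here is a minimal sketch of the kind of cross-check two iSCSI networks invite: verifying that every backend still answers on both of them. The backend names and addresses are made up for illustration; 3260 is the standard iSCSI port.)

    # Check that each (hypothetical) iSCSI backend answers on both of its
    # network addresses by attempting a TCP connection to the standard
    # iSCSI port.
    import socket

    BACKENDS = {
        "backend1": ["192.168.100.10", "192.168.200.10"],
        "backend2": ["192.168.100.11", "192.168.200.11"],
    }
    ISCSI_PORT = 3260

    def portal_up(addr, timeout=2.0):
        """Return True if a TCP connection to the iSCSI portal succeeds."""
        try:
            with socket.create_connection((addr, ISCSI_PORT), timeout=timeout):
                return True
        except OSError:
            return False

    for name, addrs in sorted(BACKENDS.items()):
        for addr in addrs:
            print(name, addr, "ok" if portal_up(addr) else "DOWN")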
(It also reduces the IO bandwidth we can get, which is an issue for various things including ZFS scrubs and resilvers, and means there are extra spots to monitor for performance impacts.)
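(To put rough, purely illustrative numbers on the bandwidth point: if each iSCSI network is, say, 1G Ethernet, a fileserver tops out at roughly 2 × ~115 MBytes/sec, or about 230 MBytes/sec, of aggregate bandwidth to all of its backends combined. A single modern disk can stream somewhere around 150 to 200 MBytes/sec sequentially, so a chassis of 16 or more local disks on SAS can in principle deliver several times that to a scrub or resilver that wants to touch every disk at once. These figures are assumptions for illustration, not measurements from our environment.)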
A non-SAN fileserver environment is just going to be simpler, with fewer moving parts (in the sysadmin sense), and these days we can build one without needing to use anything that we consider chancy or unproven. Our existing iSCSI backends have provided us with the basic template: a server case with somewhere in the range of 16 to 24 disks and dual power supplies, a suitable motherboard, and some combination of SAS controllers and motherboard SAS and SATA ports to connect all of the disks (these days we no longer need to resort to chancy stuff like eSATA, the way we had to in our first generation). Using moderately sized servers with moderate numbers of disks goes well with our overall goal of smaller individual fileservers, and all of the pieces are well understood and generally work well (and are widely used, unlike eg iSCSI).
Will I miss having a SAN? My honest answer is that I won't. Like my co-workers, I'm looking forward to a simpler and more straightforward overall fileserver environment, with more isolation between fileservers and less to worry about and look at.