Our never-used system for user-provided NFS accessible storage
Reading most of my entries here on Wandering Thoughts, you might get the impression that all of the projects we do are good ideas and successful. This is not in fact the case. Instead, it's selection bias in what I write about, partly because it's often not very interesting to write about things we decided we couldn't implement, or that we implemented and then never went anywhere until we quietly decommissioned them. But today I have a good lead-in to talk about one particular quiet failure, especially since it also showcases a sysadmin approach to dealing with new problems by reducing them to previously solved ones.
We allow professors to purchase reliable NFS storage, usually backed up, in the form of space on our fileservers. However, this space is significantly more expensive than just buying raw disks, even if you permanently forgo backups in exchange for a discount. People are perennially unhappy about this, for natural reasons, and every so often we try to do something about it. The general form our attempted solutions have taken is a model where you pay for the physical disks and we put them in a server that we operate for a bit of an extra fee. You buy however many disks you want, you specify the redundancy level you want given the disks, and your storage lasts as long as your disks do, or at least as long as their warranty does. One of the things that people have traditionally wanted to do with this user-provided storage is NFS export it to at least their own machines, which leaves us with the problem of operating an NFS server (or several) built on top of people's random disks.
In late 2014, we went through an iteration of seeing this need (again) and trying to come up with a design and architecture that worked for us, one that we felt we could administer and operate with reasonable confidence. This was just after we had deployed our second generation of ZFS NFS fileservers, where OmniOS frontends did the NFS and ZFS but talked to their disks over iSCSI, with the physical disks living on Linux backends. In a triumph of brute force design, we decided that we would reduce our user-provided NFS server problem to the already solved problem of doing NFS fileservice with OmniOS and iSCSI backends.
Of course the user-provided NFS storage would not use the full-scale setup of our OmniOS fileservers; instead it would use a much simpler brute force version. Each 'fileserver' would be an OmniOS frontend (running on a Dell 1U server instead of our regular OmniOS fileserver hardware) that was directly connected (with a single network cable) to a single Linux iSCSI backend that would hold all of the user-provided disks. This gave us a setup that looked and operated like the OmniOS NFS fileserver environment we already had confidence in, at relatively low hardware cost. By reducing things this way, we didn't have to worry about NFS service on Linux or putting lots of disks on OmniOS, and in theory everything would be great.
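To make the single-backend design concrete, here is a hypothetical sketch of what the Linux backend's iSCSI export configuration might have looked like if it used tgtd (since we never actually built the backend, the target name, device path, and initiator address here are all invented for illustration):

```
# /etc/tgt/targets.conf on the Linux backend (all names hypothetical)
# One target per user-provided disk, passed through whole to the frontend.
<target iqn.2014-11.example.com:userdisks.prof1>
    # the professor's purchased disk
    backing-store /dev/disk/by-id/scsi-SATA_EXAMPLE_DISK1
    # only the directly-connected OmniOS frontend may log in
    initiator-address 192.168.3.10
</target>
```

On the OmniOS side, the frontend would then build a pool from these iSCSI LUNs at whatever redundancy level the owner had chosen (plain 'zpool create' with mirror or raidz vdevs) and NFS export the resulting filesystems via ZFS's 'sharenfs' property, exactly as on our full-scale fileservers.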
We definitely built a single OmniOS machine to be the initial NFS frontend. I'm not sure we ever built an iSCSI backend for it, because in practice we never went anywhere with actually selling this idea to professors and having them buy disks for it. Instead, a few years later (in 2016), we quietly decommissioned the single OmniOS frontend we'd built. The last lingering relic of this entire cycle of design, build, and decommissioning was a third iSCSI network we noticed recently.
(I believe the plan was that all NFS frontends and all iSCSI backends for this project would have used the same 'iscsi3' network, even though they weren't all networked together and so in some sense each pair should have had its own network. Probably we would have still used unique IP addresses, just in case.)