The difference between a SAN and a cluster filesystem
Both SANs and cluster filesystems have multiple machines talking to multiple shared disks that all of them can see. The difference is that SANs are designed for a single machine to talk to a given disk at a time, while cluster filesystems allow multiple machines to talk to the disk at once.
Which raises the big question: why do you need cluster filesystems at all? Why can't multiple systems share a single disk without doing anything special?
There are two problems with shared disks: caches and coordinating updates. These days, pretty much all filesystems cache bits of the on-disk filesystem in memory, ranging from file data to parts of the filesystem metadata like directories. None of this caching works very well if something else is changing the data on the disk, because the system has no idea that its cached copy is stale; it keeps serving the old data from cache instead of throwing it out and fetching the current data again.
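To make the cache problem concrete, here is a toy simulation of it. Everything in it is made up for illustration: `SharedDisk` and `CachingHost` are hypothetical stand-ins, not any real filesystem's or SAN's interface, and the "disk" is just a dictionary of blocks.

```python
class SharedDisk:
    """Hypothetical stand-in for a disk that several hosts can all see."""

    def __init__(self):
        self.blocks = {}

    def read(self, block_no):
        return self.blocks.get(block_no, b"")

    def write(self, block_no, data):
        self.blocks[block_no] = data


class CachingHost:
    """Hypothetical host with a private, write-through block cache."""

    def __init__(self, disk):
        self.disk = disk
        self.cache = {}

    def read(self, block_no):
        # Only go to the disk on a cache miss; nothing ever tells us
        # that another host has changed the block behind our back.
        if block_no not in self.cache:
            self.cache[block_no] = self.disk.read(block_no)
        return self.cache[block_no]

    def write(self, block_no, data):
        self.cache[block_no] = data
        self.disk.write(block_no, data)


def demo_stale_read():
    disk = SharedDisk()
    host_a = CachingHost(disk)
    host_b = CachingHost(disk)

    host_a.write(0, b"v1")
    first = host_b.read(0)    # host B reads block 0 and caches it
    host_a.write(0, b"v2")    # host A updates the same block on disk
    second = host_b.read(0)   # host B answers from its now-stale cache
    return first, second, disk.read(0)
```

After host A's second write, the disk holds the new contents but host B still returns the old ones, and it has no way to know that anything is wrong.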
In theory you could get around that by doing no caching (although the performance loss would probably be pretty impressive). However, this still leaves you with the problem of coordinating several systems that are all trying to update the filesystem at the same time. Without some sort of locking, you are going to wind up with a pretty scrambled filesystem in short order, as systems gleefully allocate the same data block to several files, overwrite each other's directory updates, and so on.
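The double allocation case can be sketched the same way. This is a toy model under loud assumptions: the "disk" is a dictionary holding a free-block bitmap and a file-to-block table, the layout and the names (`first_free`, `demo_double_allocation`) are invented for the example, and the interleaving of the two hosts is written out by hand rather than produced by real concurrency. Neither host caches anything; each reads the bitmap fresh from the disk.

```python
def first_free(bitmap):
    """Return the index of the first unallocated block."""
    return bitmap.index(False)


def demo_double_allocation():
    # Hypothetical on-disk state: four free blocks, no files yet.
    disk = {"bitmap": [False] * 4, "owner": {}}

    # Both hosts read the bitmap straight from disk, with no caching,
    # before either one has written its update back.
    seen_by_a = list(disk["bitmap"])
    seen_by_b = list(disk["bitmap"])

    # Each host independently picks the first free block it saw.
    block_a = first_free(seen_by_a)
    block_b = first_free(seen_by_b)

    # Host A writes back its updated bitmap and its file's block.
    seen_by_a[block_a] = True
    disk["bitmap"] = seen_by_a
    disk["owner"]["file_a"] = block_a

    # Host B does the same, overwriting A's bitmap update.
    seen_by_b[block_b] = True
    disk["bitmap"] = seen_by_b
    disk["owner"]["file_b"] = block_b

    return disk["owner"]
```

Both files end up pointing at block 0. Avoiding this takes some way for a host to say "I am changing the allocation bitmap, everyone else wait", which is exactly the coordination that a cluster filesystem adds.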
Further, there's nothing that the SAN storage can do to fix either issue because both problems happen well out of its sphere of operations. Without cooperation from the systems talking to it, the most it can do to help is to enforce exclusive access to disks. (Of course, exclusive access to disks is exactly what you don't want if you really do have a cluster filesystem, or even some sorts of failover depending on how exactly the exclusive access is implemented.)