Naming disk devices: drive IDs versus drive locations

October 31, 2013

From my perspective there are two defensible ways of naming disk drives at the operating system level. You can do it by a stable identifier tied to the physical drive somehow, such as a drive serial number or WWN, or by a stable identifier based on its connection topology and thus ultimately the drive's physical location (such as the 'port X on card Y' style of name). I don't want to get into an argument about which one is 'better' because I don't think that argument is meaningful; the real question to ask is which form of naming is more useful under what circumstances.

(Since the boundaries between the two sorts of names may be fuzzy, my rule of thumb is that it is clearly a drive identifier if you have to ask the drive for it. Well, provided that you are actually speaking to the drive instead of a layer in between. The ultimate drive identifiers are metadata that you've written to the drive.)
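To make the distinction concrete, here is a minimal sketch that sorts device names into the two camps by their prefix. It is only an illustration: the prefixes are my assumption, borrowed from the sort of names Linux puts under /dev/disk/by-id and /dev/disk/by-path, and `classify_name` is a hypothetical helper rather than anything an OS provides.

```python
# Hypothetical classifier. The prefixes are assumed examples modelled on
# Linux /dev/disk/by-id and /dev/disk/by-path style names.
ID_PREFIXES = ("wwn-", "ata-", "scsi-", "nvme-")
LOCATION_PREFIXES = ("pci-", "platform-")


def classify_name(name: str) -> str:
    """Return 'identifier', 'location', or 'unknown' for a device name."""
    if name.startswith(ID_PREFIXES):
        # You had to ask the drive (for a WWN, serial number, etc).
        return "identifier"
    if name.startswith(LOCATION_PREFIXES):
        # Built from how the drive is connected to the system.
        return "location"
    return "unknown"
```

A name such as 'sda' falls into neither camp here, which is the fuzzy boundary at work.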

Before I get started, though, let me put one inconvenient fact front and center: in almost all environments today, you're ultimately going to be dealing with drives in terms of their physical location. For all the popularity of drive identifiers as a source of disk names (among OS developers and storage technologies), there are very few environments right now where you can tell your storage system 'pull the drive with WWN <X> and drop it into my hands' and have that happen. As I tweeted, I really do need to know where a particular disk actually is.

This leads to my bias, which is that using drive identifiers makes the most sense when the connection topology either changes frequently or is completely opaque, or both. If your connection topology rearranges itself on a regular basis then it can't be a source of stable identifiers, because the topology itself isn't stable. However, you can sometimes get around this by finding a stable point in the topology; for example, iSCSI target names (and LUNs) are a stable point whereas the IP addresses or network interfaces involved may not be.
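As a rough sketch of what 'finding a stable point' can mean for iSCSI: build the name only from the target name and LUN and deliberately ignore how you reached them. The `IscsiPath` record and `stable_key` function are hypothetical names for illustration, and the addresses and target name are made up.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class IscsiPath:
    """One way of reaching a LUN (hypothetical record for illustration)."""
    portal_ip: str     # may change if the network is rearranged
    interface: str     # may change if cabling or NICs change
    target_name: str   # the stable point in the topology
    lun: int           # also stable


def stable_key(path: IscsiPath) -> Tuple[str, int]:
    """Name the disk by the stable part of the topology only."""
    return (path.target_name, path.lun)


# Two different routes to the same disk get the same name.
a = IscsiPath("10.0.0.5", "eth0", "iqn.2013-10.example:backend1", 3)
b = IscsiPath("10.0.1.5", "eth1", "iqn.2013-10.example:backend1", 3)
same_disk = stable_key(a) == stable_key(b)
```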

(Topology rearrangement can be physical rearrangement, ranging from changing cabling all the way up to physically transferring disks between enclosures for whatever reason.)

Conversely, physical location makes the most sense when topology is fixed (and drives aren't physically moved around). With stable locations and stable topology to map to locations, all of the important aspects of a drive's physical location can be exposed to you so you can see where it is, what the critical points are for connecting to it, what other drives will be affected if some of those points fail or become heavily loaded, and so on. Theoretically you don't have to put this in the device name if it's visible in some other way, but in practice visible names matter.
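Here is a minimal sketch of the sort of questions a fixed topology lets you answer. Everything in it is an assumption for illustration: the `TOPOLOGY` table, the card and bay labels, and the helper names are invented, and a real system would have to discover this mapping rather than hard-code it.

```python
from typing import Dict, List, Tuple

# Hypothetical fixed topology: disk name -> (card, port, physical bay).
TOPOLOGY: Dict[str, Tuple[str, int, str]] = {
    "disk0": ("card0", 0, "front, bay 1"),
    "disk1": ("card0", 1, "front, bay 2"),
    "disk2": ("card1", 0, "front, bay 3"),
    "disk3": ("card1", 1, "front, bay 4"),
}


def locate(disk: str) -> str:
    """Where do I physically go to pull this disk?"""
    return TOPOLOGY[disk][2]


def drives_behind(card: str) -> List[str]:
    """Which disks are affected if this card fails or is overloaded?"""
    return sorted(d for d, (c, _port, _bay) in TOPOLOGY.items() if c == card)
```

With this table, `drives_behind("card0")` gives the two disks that share that card and `locate("disk2")` gives the bay to walk to.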

My feeling is that stable topology is much more common than variable topology, at least once you identify the useful fixed points in connection topology. Possibly this is an artifact of the environment I work in; on the other hand, I think that relatively small and simple environments like mine are much more common than large and complex ones.

Sidebar: the cynic's view of OS device naming

It's much easier to give disks an identifier-based device name than it is to figure out how to decode a particular topology and then represent the important bits of it in a device name, especially if you're working in a limited device naming scheme (such as 'cXtYdZ'). And you can almost always find excuses for why the topology might be unstable in theory (eg 'the sysadmin might move PCI cards between slots and oh no').
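To show how little room a 'cXtYdZ' style name has, here is a small sketch that parses one. It assumes the simple form with three decimal numbers and nothing else; `parse_ctd` and `CtdName` are hypothetical names, and real systems may use variants that this pattern does not accept.

```python
import re
from typing import NamedTuple, Optional


class CtdName(NamedTuple):
    controller: int
    target: int
    disk: int


# Assumed simple form: c<number>t<number>d<number>, nothing else.
_CTD = re.compile(r"c(\d+)t(\d+)d(\d+)")


def parse_ctd(name: str) -> Optional[CtdName]:
    """Split a cXtYdZ name into its three numbers, or return None."""
    m = _CTD.fullmatch(name)
    if m is None:
        return None
    return CtdName(*(int(g) for g in m.groups()))
```

Three small integers are all you get, so any topology detail beyond 'controller, target, disk' has to be squeezed into them or left out.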



Last modified: Thu Oct 31 01:14:16 2013