Old Unix filesystems and byte order

February 8, 2016

It all started with a tweet by @JeffSipek:

illumos/solaris UFS don't use a fixed byte order. SPARC produces structs in BE, x86 writes them out in LE. I was happier before I knew this.

As they say, welcome to old-time Unix filesystems. Solaris UFS is far from the only filesystem defined this way; in fact, most old-time Unix filesystems are probably defined in host byte order.
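To make 'host byte order' concrete, here is a minimal sketch of what it does to the bytes that land on disk. The three-field 'superblock' layout, the helper names, and the magic value are all made up for illustration and are not the real UFS format; the `<` and `>` prefixes simply stand in for what a little-endian (x86) or big-endian (SPARC) host would naturally write.

```python
import struct

# Hypothetical, heavily simplified "superblock": a 32-bit magic number,
# a 32-bit block size, and a 64-bit block count. Not the real UFS layout.
FIELDS = "IIQ"
MAGIC = 0x011954  # illustrative magic value


def write_superblock(byte_order, block_size, block_count):
    """Pack the superblock the way a host with this byte order would.

    byte_order is '<' (little-endian, as on x86) or '>' (big-endian,
    as on SPARC).
    """
    return struct.pack(byte_order + FIELDS, MAGIC, block_size, block_count)


def read_superblock(byte_order, raw):
    """Unpack the superblock, interpreting it in the given byte order."""
    return struct.unpack(byte_order + FIELDS, raw)


sparc_disk = write_superblock(">", 8192, 1_000_000)
x86_disk = write_superblock("<", 8192, 1_000_000)

# Same logical contents, different bytes on disk.
print(sparc_disk[:4].hex())  # 00011954
print(x86_disk[:4].hex())    # 54190100

# A little-endian host that naively reads the big-endian disk in its own
# byte order sees a magic number (and everything else) scrambled.
magic, block_size, _ = read_superblock("<", sparc_disk)
print(hex(magic), block_size)  # 0x54190100 2097152
```

Nothing on the disk itself records which order was used; the format is only well defined relative to the machine that wrote it.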

Today this strikes us as crazy, but that's because we now live in a quite different hardware environment than the old days had. Put simply, we now live in a world where storage devices not only can be moved between dissimilar systems but routinely are. In fact, it's an even more radical world than that; it's a world where almost everyone uses the same few storage interconnect technologies, and those interconnects are common across all sorts of systems. Today we take it for granted that we connect storage to systems through some defined, vendor-neutral specification that many people implement, but this was not at all the case originally.

(There are all sorts of storage standards: SATA, SAS, NVMe, USB, SD cards, and so on.)

In the beginning, storage was close to 100% system-specific. Not only did you not think of moving a disk from a VAX to a Sun, you probably couldn't; the entire peripheral interconnect system was almost always different, from the disk-to-host cabling to the kind of backplane that the controller boards plugged into. Even as some common disk interfaces emerged, larger servers often stayed with faster proprietary interfaces and proprietary disks.

(SCSI is fairly old as a standard, but it was also a slow interface for a long time so it didn't get used on many servers. As late as the early 1990s it still wasn't clear that SCSI was the right choice.)

In this environment of system-specific disks, it was no wonder that Unix kernel programmers didn't think about byte order issues in their on-disk data structures. Just saying 'everything is in host byte order' was clearly the simplest approach, so that's what people by and large did. When vendors started facing potential bi-endian issues, they tried very hard to duck them (I think that this was one reason endian-switchable RISCs were popular designs).

In theory, vendors could have decided to define their filesystems as being in their current endianness before they introduced another architecture with a different endianness (here Sun, with SPARC, would have defined UFS as BE). In practice I suspect that no vendor wanted to go through filesystem code to make it genuinely fixed endian. It was just simpler to say 'UFS is in host byte order and you can't swap disks between SPARC Solaris and x86 Solaris'.
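As a rough illustration of what 'genuinely fixed endian' means, here is a sketch where the on-disk format is declared big-endian regardless of the host. The layout, the magic value, and the `store`/`load` names are again hypothetical, not anything from real UFS code. The point is that every read and write of an on-disk field has to go through an explicit conversion, which is exactly the sort of change that touches code all over a filesystem.

```python
import struct
import sys

# Hypothetical layout: magic, block size, block count. The '>' declares
# the on-disk format to be big-endian no matter what the host is.
SUPERBLOCK = struct.Struct(">IIQ")
MAGIC = 0x011954  # illustrative magic value


def store(block_size, block_count):
    """Serialize the superblock in the fixed (big-endian) disk order."""
    return SUPERBLOCK.pack(MAGIC, block_size, block_count)


def load(raw):
    """Parse a superblock; the same code works on any host."""
    magic, block_size, block_count = SUPERBLOCK.unpack(raw)
    if magic != MAGIC:
        raise ValueError("bad magic number: not this filesystem")
    return block_size, block_count


raw = store(8192, 1_000_000)
print(raw[:4].hex())  # 00011954 on every host
print(load(raw))      # (8192, 1000000)

# For contrast, the 'host byte order' approach: '=' means native order,
# so the bytes produced depend on the machine running this.
native = struct.pack("=I", MAGIC)
print(sys.byteorder, native.hex())
```

In kernel C the equivalent is wrapping every on-disk field access in a byte-swapping helper, which costs nothing on hosts that already match the disk order and a swap everywhere else.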

(Since vendors did learn, genuinely new filesystems were much more likely to be specified as having a fixed and host-independent byte order. But filesystems like UFS trace their roots back a very long way.)
