Wandering Thoughts archives


A surprise about which of our machines has the highest disk write volume

Once upon a time, hard drives and SSDs just had time-based warranties. These days, many SSDs have warranties that are more like the ones on cars; they're good for so much time or so many terabytes written, whichever comes first, and different SSD makers and models can have decidedly different maximum figures for this. So, as part of investigating what SSDs to get for future use here, we've been looking into what sort of write volume we see on both our ZFS fileservers (well, on the iSCSI backends for them) and on the system SSDs of those of our Ubuntu servers that have them. The result was a bit surprising.
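To make the 'whichever comes first' part concrete, here is a minimal sketch of the arithmetic. The function name and all of the numbers in it are made up for illustration; they are not from any real drive's datasheet or from our machines.

```python
def warranty_limit(rated_tbw, warranty_years, tb_written_per_year):
    """Return (years_covered, limiting_factor) for a 'time or TB written' warranty.

    rated_tbw is the maker's terabytes-written figure, warranty_years is the
    time limit, and tb_written_per_year is the write rate you expect to see.
    """
    if tb_written_per_year <= 0:
        return warranty_years, "time"
    years_to_tbw = rated_tbw / tb_written_per_year
    if years_to_tbw < warranty_years:
        # The write limit is reached before the time limit.
        return years_to_tbw, "writes"
    return warranty_years, "time"


# Hypothetical drive: 150 TB written or 5 years, at 40 TB of writes a year.
years, limit = warranty_limit(rated_tbw=150, warranty_years=5, tb_written_per_year=40)
print(f"covered for about {years:.2f} years, limited by {limit}")
```

With a low enough write rate the time limit is all that matters, which is why the actual write volume is worth measuring before picking a drive.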

Before I started looking into this, I probably would have guessed that the highest write volume would be for the SSDs of the ZFS pool that holds our /var/mail filesystem. I might also have guessed that some of the oldest disks for ZFS pools on our most active fileserver would see a lot of writes. While both of these are up in the write volume rankings, neither has our highest write volume.

Our highest write volume turns out to happen on the system SSDs in our central mail machine; they see about 32 TB of writes a year, compared to about 23 TB of writes a year on the busiest iSCSI backend disks on our most active fileserver. The oldest and most active SSDs involved in the mail spool have seen only about 10 TB of writes a year, which is actually below many of our more active ZFS pool disks (on several fileservers). The central mail machine's IO activity is also heavily unbalanced in favour of writes; with some hand-waving about the numbers, the machine runs about 80% writes (by the amount of data involved) or more. The disks in the ZFS pools show much lower write to read ratios; an extreme case is the mail spool's disks, which see only 12% writes by IO volume.
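For anyone who wants rough figures of this sort for their own Linux machines, one possible approach is to read the since-boot counters in /proc/diskstats and scale them by uptime. This is only a sketch and is not necessarily how the numbers above were gathered; the function names are my own invention, and since-boot counters will be misleading if the machine was rebooted recently or its load varies a lot over the year.

```python
import os

SECTOR_BYTES = 512  # /proc/diskstats counts in 512-byte sectors


def parse_diskstats(text):
    """Map device name -> (bytes_read, bytes_written) since boot."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 10:
            continue
        name = fields[2]
        sectors_read = int(fields[5])     # 6th field: sectors read
        sectors_written = int(fields[9])  # 10th field: sectors written
        stats[name] = (sectors_read * SECTOR_BYTES, sectors_written * SECTOR_BYTES)
    return stats


def write_fraction(bytes_read, bytes_written):
    """Fraction of total IO volume that was writes."""
    total = bytes_read + bytes_written
    return bytes_written / total if total else 0.0


def tb_per_year(bytes_written, uptime_seconds):
    """Extrapolate since-boot writes to decimal terabytes per year."""
    seconds_per_year = 365 * 24 * 3600
    return bytes_written / 1e12 * seconds_per_year / uptime_seconds


if __name__ == "__main__" and os.path.exists("/proc/diskstats"):
    with open("/proc/diskstats") as f:
        stats = parse_diskstats(f.read())
    with open("/proc/uptime") as f:
        uptime = float(f.read().split()[0])
    for name, (rd, wr) in sorted(stats.items()):
        print(f"{name}: {tb_per_year(wr, uptime):.2f} TB/year written, "
              f"{write_fraction(rd, wr):.0%} writes by volume")
```

Both figures are by volume of data, not by number of IO operations, which matches how the percentages above are stated.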

My current theory is that this huge write volume is because Exim does a lot of small writes to things like message files and log files and then fsync()s them out to disk all the time. Exim uses three files for each message and updates two of them frequently as message deliveries happen; updates almost certainly involve fsync(), and then on top of that the filesystem is busy making all the necessary file creations, renames, and deletions durable. We're using ext4, but even there the journal has to be forced to disk at every step.
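To illustrate the general pattern I have in mind, here is a minimal sketch of a durable small-file update. This is not Exim's actual code or file layout; the function and file names are hypothetical, and it simply shows the usual write, fsync(), rename, fsync()-the-directory sequence that a program has to go through if it wants a small file to survive a crash.

```python
import os
import tempfile


def durable_write(directory, name, data):
    """Write data to directory/name durably; return how many fsync() calls it took."""
    tmp_path = os.path.join(directory, name + ".tmp")
    final_path = os.path.join(directory, name)
    syncs = 0
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # force the file's data to disk
        syncs += 1
    os.rename(tmp_path, final_path)
    # The rename itself is only durable once the directory is synced too.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
        syncs += 1
    finally:
        os.close(dir_fd)
    return syncs


with tempfile.TemporaryDirectory() as spool:
    count = durable_write(spool, "message-header", b"a few hundred bytes of headers\n")
    print(f"{count} fsync() calls for one small file update")
```

Even in this simplified form, one tiny update forces data and filesystem metadata to disk more than once, and a mailer that does something like this several times per message will generate far more disk writes than the size of the messages would suggest.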

(This certainly seems to be something involving Exim, as our external mail gateway has the same highly unbalanced writes to reads ratio. The gateway is doing roughly 4 TB of writes a year, but that's still quite high for our Ubuntu system SSDs.)

PS: All of these figures for SSDs are before any internal write amplification that the SSD itself does. My understanding is that SSD warranty figures are also quoted before write amplification, as the volume of data that the host writes to the drive.

sysadmin/MTAHighWriteVolume written at 03:17:09
