Wandering Thoughts archives

2019-08-23

What happens in ZFS when you have 4K sector disks in an ashift=9 vdev

Suppose, not entirely hypothetically, that you've somehow wound up with some 4K 'Advanced Format' disks (disks with a 4 KByte physical sector size but 512 byte emulated (aka logical) sectors) in a ZFS pool (or vdev) that has an ashift of 9 and thus expects disks with a 512 byte sector size. If you import or otherwise bring up the pool, you get slightly different results depending on the ZFS implementation.
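(As a rough illustration of the mismatch, here is a minimal Python sketch. An ashift is the base-2 logarithm of the sector size, so 512 bytes is ashift 9 and 4 KBytes is ashift 12. The helper names are mine, not anything from ZFS, and the sysfs paths assume a Linux machine that exposes /sys/block/<dev>/queue/; treat this as a way of eyeballing your disks, not as what ZFS itself does.)

```python
from pathlib import Path
from typing import Tuple


def ashift_for(sector_size: int) -> int:
    """Return log2 of a power-of-two sector size (512 -> 9, 4096 -> 12)."""
    if sector_size <= 0 or sector_size & (sector_size - 1):
        raise ValueError(f"not a power of two: {sector_size}")
    return sector_size.bit_length() - 1


def disk_sector_sizes(dev: str, sysfs: Path = Path("/sys/block")) -> Tuple[int, int]:
    """Read (logical, physical) sector sizes for e.g. 'sda' from Linux sysfs."""
    queue = sysfs / dev / "queue"
    logical = int((queue / "logical_block_size").read_text())
    physical = int((queue / "physical_block_size").read_text())
    return logical, physical


def exceeds_vdev_ashift(physical_size: int, vdev_ashift: int) -> bool:
    """True if the disk's physical sectors are bigger than the vdev assumes."""
    return ashift_for(physical_size) > vdev_ashift
```

A 512e disk would report (512, 4096) from disk_sector_sizes(), and exceeds_vdev_ashift(4096, 9) is then true, which is the situation this entry is about.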

In ZFS on Linux, you'll get one ZFS Event Daemon (zed) event for each disk, with a class of vdev.bad_ashift. I don't believe this event carries any extra information about the mismatch; it's up to you to use the information on the specific disk and the vdev in the event to figure out who has what ashift values. In the current Illumos source, it looks like you get a somewhat more straightforward message, although I'm not sure how it trickles out to user level. At the kernel level it says:

Disk, '<whatever>', has a block alignment that is larger than the pool's alignment.

This error is not completely correct, since it's the vdev ashift that matters here, not the pool ashift, and it also doesn't tell you what the vdev ashift or the device ashift are; you're once again left to look those up yourself.

(I was going to say that the only likely case is a 4K Advanced Format disk in an ashift=9 vdev, but these days you might find some SSDs or NVMe drives that advertise a physical sector size larger than 4K.)

This is explicitly a warning, not an error. Both the ZFS on Linux and Illumos code have a comment to this effect (differing only in 'post an event' versus 'issue a warning'):

/*
 * Detect if the alignment requirement has increased.
 * We don't want to make the pool unavailable, just
 * post an event instead.
 */

This is a warning despite the fact that your disks can accept IO for 512-byte sectors because what ZFS cares about (for various reasons) is the physical sector size, not the logical one. A vdev with ashift=9 really wants to be used on disks with real 512-byte physical sectors, not on disks that just emulate them.

(In a world of SSDs and NVMe drives that have relatively opaque and complex internal sizes, this is rather less of an issue than it is (or was) with spinning rust. Your SSD is probably lying to you no matter what nominal physical sector size it advertises.)

The good news is that as far as I can tell, this warning has no further direct effect on pool operation. At least in ZFS on Linux, the actual disk's ashift is only looked up in one place, when the disk is opened as part of a vdev, and the general 'open a vdev' code discards it after this warning; it doesn't get saved anywhere for later use. So I believe that ZFS IO, space allocations, and even uberblock writes will continue as before.

(Interested parties can look at vdev_open in vdev.c. Disks are opened in vdev_disk.c.)

That ZFS continues operating after this warning doesn't mean that life is great, at least if you're using HDs. Since no ZFS behavior changes here, using disks with 4K physical sectors in an ashift=9 vdev will likely leave your disk (or disks) doing a lot of read/modify/write operations when ZFS does unaligned writes (as it can often do). This both performs relatively badly and leaves you potentially exposed to damage to unrelated data if there's a power loss part way through.
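(To make the read/modify/write issue concrete, here is a small Python sketch that counts how many physical sectors a write only partly covers. Each such sector is one the disk has to read, patch, and write back. This is a simplified model with a made-up function name; it says nothing about how any particular drive's firmware actually behaves.)

```python
def partial_physical_sectors(offset: int, length: int, physical_size: int = 4096) -> int:
    """Count physical sectors that a write of `length` bytes at byte
    `offset` touches without completely overwriting them."""
    if length <= 0:
        return 0
    end = offset + length                 # first byte past the write
    first = offset // physical_size       # first physical sector touched
    last = (end - 1) // physical_size     # last physical sector touched
    partial = 0
    # Only the first and last sectors can be partly covered; everything
    # in between is overwritten in full. A set handles first == last.
    for sector in {first, last}:
        sector_start = sector * physical_size
        sector_end = sector_start + physical_size
        if offset > sector_start or end < sector_end:
            partial += 1
    return partial
```

In this model, an aligned 4 KByte write (offset 0, length 4096) gives 0, a single 512-byte write gives 1, and a 1 KByte write at offset 3584 straddles two physical sectors and gives 2. With ashift=9, ZFS is free to issue writes at any 512-byte boundary, so the latter cases can come up routinely.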

(But, as before, it's a lot better than not being able to replace old dying disks with new working ones. You just don't want to wind up in this situation if you have a choice, which is a good part of why I advocate for creating basically all pools as 'ashift=12' from the start.)

PS: ZFS events are sort of documented in the zfs-events manpage, but the current description of vdev.bad_ashift is not really helpful. Also, I wish that the ZFS on Linux project itself had the current manpages online (well, apart from as manpage source in the GitHub repo, since most people find manpages in their raw form hard to read).

solaris/ZFS4KDiskWithAshift9 written at 21:29:48

Link: GNOME Terminal Cursor Blinking Saga

Geoff Greer's GNOME Terminal Cursor Blinking Saga is about how to turn off cursor blinking in gnome-terminal, because the Gnome people are still in love with having their cursor blink despite it being a terrible idea and have progressively made it harder and harder to turn off.

Needing to look this up yet again did cause me to check the Gnome bug to expose a preference UI for this, which caused me to discover that you can actually easily turn this off these days, although not globally; you have to do it for each profile, turning off 'Cursor blinking'. If you have a lot of profiles and are the right sort of person, you may want to write a shell script that uses the gsettings approach, which still works.
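(For the right sort of person, here is a sketch of the gsettings approach, written in Python instead of shell. The schema names, the profile path layout, and the 'cursor-blink-mode' key are what I believe gnome-terminal uses, but they are assumptions here; check them with gsettings on your own system before trusting this.)

```python
import ast
import subprocess

# Assumed gnome-terminal schema names and dconf path layout.
PROFILES_SCHEMA = "org.gnome.Terminal.ProfilesList"
PROFILE_SCHEMA = "org.gnome.Terminal.Legacy.Profile"
PROFILE_PATH = "/org/gnome/terminal/legacy/profiles:/:{uuid}/"


def parse_profile_list(output: str) -> list:
    """Parse what 'gsettings get ... list' prints, e.g. "['uuid1', 'uuid2']".
    An empty list is printed with a GVariant type prefix, as "@as []"."""
    text = output.strip()
    if text.startswith("@as"):
        text = text[len("@as"):].strip()
    return list(ast.literal_eval(text))


def blink_off_command(uuid: str) -> list:
    """Build the gsettings command that turns blinking off for one profile."""
    target = f"{PROFILE_SCHEMA}:{PROFILE_PATH.format(uuid=uuid)}"
    return ["gsettings", "set", target, "cursor-blink-mode", "off"]


def disable_blinking() -> None:
    """Turn off cursor blinking in every gnome-terminal profile."""
    result = subprocess.run(
        ["gsettings", "get", PROFILES_SCHEMA, "list"],
        check=True, capture_output=True, text=True,
    )
    for uuid in parse_profile_list(result.stdout):
        subprocess.run(blink_off_command(uuid), check=True)
```

Calling disable_blinking() from inside your desktop session should then cover all existing profiles; any profile you create later would need it run again.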

(Why I needed to know this is that for my own reasons I'm doing a from-scratch Fedora 30 install in a virtual machine, and of course it came up with a gnome-terminal setup where the cursor blinks.)

links/GnomeTerminalBlinkingSaga written at 11:06:17