Wandering Thoughts archives


I won't be trying out ZFS's new TRIM support for a while

ZFS on Linux's development version has just landed support for issuing TRIM commands to SSDs, which keeps their performance up as you write more data to them and the SSD thinks it's more and more full. You can see the commit here, and there's more discussion in the pull request. This is an exciting development in general, and since ZoL 0.8.0 is in the release candidate stage at the moment, this TRIM support might even make its way into a full release in the not too distant future.

Normally, you might expect me to give this a try, as I have with other new things like sequential scrubs. I've tracked the ZoL development tree on my own machines for years basically without problems, and I definitely have fairly old pools on SSDs that could likely benefit from being TRIM'd. However, I haven't so much as touched the new TRIM support and probably won't for some time.

Some projects have a relatively unstable development tree where running it can routinely or periodically destabilize your environment and expose you to bugs. ZFS on Linux is not like this; historically the code that has landed in the development version has been quite stable and problem free. Code in the ZoL tree is almost always less 'in development' and more 'not in a release yet', partly because ZoL has solid development practices along with significant amounts of automated tests. As you can read in the 'how has this been tested?' section of the pull request, the TRIM code has been carefully exercised both through specific new tests and random invocation of TRIM through other tests.

All of this is true, but then there is the small fact that in practice, ZFS encryption is not ready yet despite having been in the ZoL development tree for some time. This isn't because ZFS encryption is bad code (or untested code); it's because ZFS encryption turns out to be complicated and to interact with lots of other things. The TRIM feature is probably less complicated than encryption, but it's not simple, there are plenty of potential corner cases, and life is complicated by potential issues in how real SSDs do or don't cope well with TRIM commands being issued in the way that ZoL will. Also, an errant TRIM operation inherently destroys some of your data, because that's what TRIM does.

All of this makes me feel that TRIM is inherently much more dangerous than the usual new ZoL feature, dangerous enough that I don't feel confident trying it. This time around, I'm going to let other people do the experimentation and collect the arrows in their backs. I will probably only start using ZFS TRIM once it's in a released version and a number of people have used it for a while without explosions.

If you feel experimental despite this, I note that according to the current manpage an explicit 'zpool trim' can apparently be limited to a single disk. I would definitely suggest using it that way (on a pool with redundancy); TRIM a single disk, wait for the disk to settle and finish everything, and then scrub your pool to verify that nothing got damaged in your particular setup. This is definitely how I'm going to start with ZFS TRIM, when I eventually do.
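The cautious single-disk sequence described above can be sketched as a short ordered plan. This is only an illustration, not a tested procedure: the pool name `tank` and device name `sda` are hypothetical, and it assumes that `zpool status -t` is how your version reports per-device TRIM progress (check your own manpage). The function only builds the command lines; it deliberately does not run them.

```python
def trim_then_scrub_plan(pool, device):
    """Return the commands, in order, for a cautious single-disk TRIM test.

    Nothing is executed here; the caller decides when to run each step.
    """
    return [
        ["zpool", "trim", pool, device],  # TRIM only this one device
        ["zpool", "status", "-t", pool],  # assumed: re-run until the TRIM is finished
        ["zpool", "scrub", pool],         # then verify the whole pool
        ["zpool", "status", pool],        # re-run until the scrub reports its result
    ]


if __name__ == "__main__":
    # "tank" and "sda" are hypothetical names used only for illustration.
    for argv in trim_then_scrub_plan("tank", "sda"):
        print(" ".join(argv))
```

Keeping the steps as data rather than running them directly makes it easy to read the plan over before anything touches a disk, which seems in keeping with the level of caution involved.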

(On my work machine, I'm still tracking the ZoL tree so I'm using a version with TRIM available; I'm just not enabling it. On my home machine, for various reasons, I've currently frozen my ZoL version at a point just before TRIM landed, just in case. I have to admit that stopping updating ZoL does make the usual kernel update dance an easier thing, especially since WireGuard has stopped updating so frequently.)

linux/ZFSNoTrimForMeYet written at 21:19:02

It's always DNS (a story of our circular dependency)

Our building and in fact much of the University of Toronto downtown campus had a major power failure tonight. When power came back on I wasn't really expecting our Ubuntu servers to come back online, but to my surprise they started pinging (which meant not just that the actual servers were booting but that the routers, the firewall, the switches, and so on had come back). However, when I started ssh'ing in, our servers were not in a good state. For a start, I didn't have a home directory. In fact none of our NFS filesystems were mounted, and the machines were only part-way through boot, stalled trying to NFS mount our central administrative filesystem.

My first thought was that our fileservers had failed to boot up, either our new Linux ones or our old faithful OmniOS ones, but when I checked they were mostly up. Well, that's getting ahead of things, because what actually happened when I started to check is that the system I was logged in to reported something like 'cannot resolve host <X>'. That would be a serious problem.

(I could resolve our hostnames from an outside machine, which turned out to be very handy since I needed some way to get their IPs so I could log into them.)

We have a pair of recursive OpenBSD-based resolvers; they had booted and could resolve external names, but they couldn't resolve any of our own names. Our configuration uses Unbound backed by NSD, where the NSD on each resolver is supposed to hold a cached copy of our local zones that is refreshed from our private master. In past power shutdowns, this has allowed the resolvers to boot and serve DNS data from our zones even without the private master being up, but this time around it didn't; both NSDs returned SERVFAIL when queried and in 'nsd-control zonestatus' reported things like:

zone: <our-zone>
      state: refreshing
      served-serial: none
      commit-serial: none
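
Status output of this shape is easy to check mechanically. The following is a minimal sketch that picks out zones with no served copy, assuming the 'key: value' layout shown above. The zone names are placeholders, and the format of the healthy zone's serial lines in the sample is an assumption rather than captured output.

```python
def unserved_zones(zonestatus_text):
    """Return the names of zones whose served-serial is 'none'."""
    zones = {}
    current = None
    for line in zonestatus_text.splitlines():
        # Split on the first colon only, so values containing colons survive.
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "zone":
            current = value
            zones[current] = {}
        elif current is not None:
            zones[current][key] = value
    return [name for name, fields in zones.items()
            if fields.get("served-serial") == "none"]


# Placeholder zone names; the second zone's serial format is assumed.
sample = """\
zone: example.org
      state: refreshing
      served-serial: none
      commit-serial: none
zone: example.net
      state: ok
      served-serial: "2019032701 since 2019-03-27T01:00:00"
      commit-serial: "2019032701 since 2019-03-27T01:00:00"
"""

print(unserved_zones(sample))
```

A zone that shows up in this list has nothing to serve, which matches the SERVFAIL answers described above.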

Our private master was up, but like everything else it was stalled trying to NFS mount our central administrative filesystem. Since this central filesystem is where our nameserver data lives, this was a hard dependency. This NFS mount turned out to be stalled for two reasons. The obvious and easy to deal with one was that the private master couldn't resolve the hostname of the NFS fileserver. When I tried to mount by IP address, I found the second one; the fileserver itself was refusing mounts because, without working DNS, it couldn't map IP addresses to names to verify NFS mount permission.
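
The loop described here can be made explicit by writing the boot-time dependencies down as a graph and searching it for a cycle. This is only a sketch: the node names are made up labels for the components in the story, and the edges are a reading of the narrative as things stood that night (with the resolvers holding no cached zones), not a real configuration.

```python
def find_cycle(deps):
    """Return one dependency cycle as a list of nodes, or None if there is none."""
    visiting = []   # current depth-first path
    done = set()    # nodes fully explored and known to be cycle-free

    def visit(node):
        if node in done:
            return None
        if node in visiting:
            # Found a back edge: the cycle is the path from node back to itself.
            return visiting[visiting.index(node):] + [node]
        visiting.append(node)
        for dep in deps.get(node, ()):
            cycle = visit(dep)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in deps:
        cycle = visit(start)
        if cycle:
            return cycle
    return None


# Hypothetical labels; "A": ["B"] means A cannot finish starting without B.
boot_deps = {
    "resolvers": ["private-master"],                 # zones must be refreshed from the master
    "private-master": ["admin-nfs-mount"],           # nameserver data lives on NFS
    "admin-nfs-mount": ["fileserver", "resolvers"],  # mount needs the server and its hostname
    "fileserver": ["resolvers"],                     # maps client IPs to names to check permission
}

print(find_cycle(boot_deps))
```

Removing any one edge breaks the loop, for example by letting the fileserver authorize a client by IP address so that it no longer depends on the resolvers for that mount.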

(To break this dependency I wound up adding NFS export permission for the IP address of the private master, then manually mounting the filesystem from the fileserver's IP on the private master. This let the boot continue, our private master's nameserver started, our local resolvers could refresh their zones from it, and suddenly internal DNS resolution started working for everyone. Shortly afterward, everyone could at least get the central administrative filesystem mounted.)

So, apparently it really always is DNS, even when you think it won't be and you've tried to engineer things so that your DNS will always work (and when it's worked right in the past).

sysadmin/OurDNSCircularDependency written at 01:42:40


This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.