Wandering Thoughts archives

2018-09-04

Some views on the Go 2 Error Inspection early draft proposal

The big news in Go lately has been the announcement of Go 2 Draft Designs for improved error handling, error values, and generics. The details of the proposed improvements to error values have been split into two parts, error printing and error inspection. Today I have some views on the error inspection draft (and you should also read the error values problem overview). Broadly, I like the proposal but I feel that it doesn't go far enough and the result of not doing so is going to be inconvenience and people reinventing the wheel repeatedly. So here are some rambling thoughts on the whole thing.

The error inspection draft proposes a standard interface for unwrapping nested errors and then some standard functions to conveniently do things with (wrapped) errors, errors.Is(), which checks to see if a specific error is somewhere in your wrapped error, and errors.As(), which tries to recover a specific type of error from your wrapped error. A standard way of wrapping errors and then checking through the error chain deals with my issue where Go's net package has undocumented wrapped errors that are a pain to deal with. In the new world, if I want to see if an error is ultimately because of an EHOSTUNREACH from some system call, I would write something like:

if errors.Is(err, syscall.EHOSTUNREACH) {
   ....
}

I like this code. It's short and it directly expresses what I'm checking. Also, this automatically works with any level of error nesting; I can be dealing with a *net.OpError that has been wrapped by my own code for better reporting of the details, and it all just works.

As a side note, in a world with automatic error unwrapping via errors.Is() and errors.As(), it becomes even more important to actively and explicitly document what errors can be wrapped inside your errors. People will actively want to know what error types there's any point to look for, and making them dump your errors or read your source code to find out is just annoying them. This should lead to significantly more documentation on this in the Go standard library.

In a world with errors.Is(), it will be extremely annoying if people needlessly erase errors by turning them into text strings by running them through the current fmt.Errorf(). As a result, I believe strongly that there should be a standard interface for doing the equivalent of fmt.Errorf() for wrapping errors in a way that is transparent to errors.Is(). I am neutral on the overview's open question of whether this should be a clever version of fmt.Errorf(), but certainly going that way would get fast adoption.

Given that errors.Is() makes checking sentinel values one of the easiest things you can do with errors (especially wrapped errors), I expect to see a significant push towards more error sentinel values. Today, I see a reasonable amount of code that doesn't have constant sentinel values and always makes errors with fmt.Errorf(); in a future world with errors.Is(), I expect there to be a lot more sentinel error values, even if they're immediately wrapped up in another layer of errors that provide those custom messages.

Now let's talk about where I feel this proposal doesn't go far enough and as a result leaves people to reinvent wheels. The first annoyance comes in if you have multiple sentinel errors that you want to check against. For example, my check of EHOSTUNREACH is actually incomplete; I should really check ENETUNREACH and ETIMEDOUT as well, and perhaps some others. The current proposal has no API for this, and as a result I expect that we'll see people repeatedly writing their own version of a multi-error check. I believe that there should be a standard API for this; the natural interface is a varargs version of errors.Is(), looking like this:

if errors.IsOneOf(err, syscall.EHOSTUNREACH, syscall.ENETUNREACH, syscall.ETIMEDOUT) {
   ....
}

All of that is all very well and good for checking sentinel values, but what if you want to check that you have a network timeout error? In the current proposal, you wind up writing something like this:

if ne, ok := errors.As(*net.OpError)(err); ok && ne.Temporary() {
   ....
}

We're going through all of this work simply to see if this is a temporary network error, which is a simple 'is this a ...' query. In a world where errors themselves could answer 'Is()' queries, it would make sense for the net package to have a net.TemporaryError sentinel value and then to simply write this check as the much more natural:

if errors.Is(err, net.TemporaryError) {
   ....
}

I expect that pretty much every boolean 'is this a ....' method call on errors today would likely be more ergonomic if it was expressed as a sentinel value this way.

Under the hood, no net error would ever actually be the net.TemporaryError sentinel, because they have to wrap other errors. Instead they would have an Is() check method that would say that they matched net.TemporaryError if .Temporary() was true.

This doesn't make errors.As() unneeded, because there are things you can want to recover from specific types of errors other than boolean answers to questions. One case, drawn from my prior experience, is knowing what system call failed:

oe, ok := errors.As(*os.SyscallError)(err); 
if ok && oe.Syscall == "connect" {
   ....
}

You would also want to use errors.As() or something like it if you had an interface of your own that you wanted to check against. For example:

type temperror interface {
    Temporary() bool
}

if te, ok := errors.As(temperror)(err); ok && te.Temporary() {
   ....
}

Here we're asking 'is it a temporary error in general' instead of our earlier question of 'is it a temporary network error'. There are a number of error classes even in the standard library that have this interface (including syscall.Errno itself), so you might well want to match a wrapped error this way.

Now, this example pushes somewhat against my errors.Is(err, net.TemporaryError) example, because you can't generalize a sentinel value the way you can a method call (which you can turn into an interface, as here). If Go people specifically want to preserve this general style of error method interface, then it probably makes sense to avoid generalizing sentinels. Once general sentinels are available, people may be tempted to implement 'is this a temporary error' and the like purely through sentinels, which would make the error method interface less and less useful.

Perhaps the Go standard library should document some common error method patterns, such as .Temporary() and .Timeout(), partly to encourage people to implement them more often. In a world with errors.As(), where it is much easier to check if something in a wrapped stack of errors thinks the error is temporary or whatever, we might see people doing this much more often than I think they do today.

PS: Whatever exactly happens with Go error inspection, I think it's going to cause a real change in how people both create and deal with Go errors in future code. People are highly motivated to do whatever is easiest, so we can expect them to steadily adopt whatever approach is easiest to use with error inspection. This holds true regardless of what you want them to do with the new way. I also suspect that we aren't going to like some of the hacks that people come up with to bend errors.Is() and so on to what they want to do.

programming/Go2ErrorInspectionViews written at 23:45:00; Add Comment

ZFS quietly discards all-zero blocks, but only sometimes

On the ZFS on Linux mailing list, a question came up about whether ZFS discards writes of all-zero blocks (as you'd get from 'dd if=/dev/zero of=...'), turning them into holes in your files or, especially, holes in your zvols. This is especially relevant for zvols, because if ZFS behaves this way it provides you with a way of returning a zvol to a sparse state from inside a virtual machine (or other environment using the zvol):

$ dd if=/dev/zero of=fillfile
[... wait for the disk to fill up ...]
$ rm -f fillfile

The answer turns out to be that ZFS does discard all-zero blocks and turn them into holes, but only if you have some sort of compression turned on (ie, that you don't have the default 'compression=off'). This isn't implemented as part of ZFS ZLE compression (or other compression methods); instead, it's an entirely separate check that looks only for an all-zero block and returns a special marker if that's what it has. As you'd expect, this check is done before ZFS tries whatever main compression algorithm you set.

Interestingly, there is a special compression level called 'empty' (ZIO_COMPRESS_EMPTY) that only does this special 'discard zeros' check. You can't set it from user level with something like 'compression=empty', but it's used internally in the ZFS code for a few things. For instance, if you turn off metadata compression with the zfs_mdcomp_disable tunable, metadata is still compressed with this 'empty' compression. Comments in the current ZFS on Linux source code suggest that ZFS relies on this to do things like discard blocks in dnode object sets where all the dnodes in the block are free (which apparently zeroes out the dnode).

There are two consequences of this. The first is that you should always set at least ZLE compression on zvols, even if their volblocksize is the same as your pool's ashift block size and so they can't otherwise benefit from compression (this would also apply to filesystems if you set an ashift-sized recordsize). The second is that it reinforces how you should basically always turn compression on on filesystems, even if you think you have mostly incompressible data. Not only do you save space at the end of files, but you get to drop any all-zero sections of sparse or pseudo-sparse files.

(Looking back, Richard Laager mentioned this zero block discarding for zvols back in a comment on this entry of mine, but apparently it didn't stick in my mind. Also, now I know the details.)

I took a quick look back through the history of ZFS's code, and as far as I could see, this zero-block discarding has always been there, right back to the beginnings of compression (which I believe came in with ZFS itself). ZIO_COMPRESS_EMPTY doesn't quite date back that far; instead, it was introduced along with zfs_mdcomp_disable, back in 2006.

(All of this is thanks to Gordan Bobic for raising the question in reply to me when I was confidently wrong, which led to me actually looking it up in the code.)

solaris/ZFSZeroBlockDiscarding written at 00:33:46; Add Comment


Page tools: See As Normal.
Search:
Login: Password:
Atom Syndication: Recent Pages, Recent Comments.

This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.