Wandering Thoughts archives

2014-10-31

With ZFS, rewriting a file in place might make you run out of space

Here's an interesting little issue that I confirmed recently: if you rewrite an existing file in place with random IO on a plain ZFS filesystem, you can wind up using extra space and even run out of space. This is a little bit surprising but is not a bug; it's just fallout from how ZFS works.

It's easy to see how this can happen if you have compression or deduplication enabled on the filesystem and you rewrite the file with different data; the new data might compress or deduplicate less well than the old data and so use up more space. Deduplication might be especially prone to this if you initialize your file with something simple (zeroes, say) and then rewrite it with actual data.

(The corollary to this is that continuously rewritten files like the storage for a database can take up a fluctuating amount of disk space over time on such a filesystem. This is one of several reasons that we're unlikely to ever turn on compression on our fileservers.)

But this can happen even on filesystems without dedup or compression. The cause is the ZFS 'record size' (what many filesystems would call their block size). ZFS has a variable record size, ranging from the minimum block size of your disks up to the recordsize parameter, which is usually 128 KB. When you write data, especially sequential data, ZFS will transparently aggregate it together into large blocks; this makes both writes and reads more efficient and so is a good thing.

So you start out by writing a big file sequentially, which aggregates things together into 128 KB on-disk blocks, puts pointers to those blocks into the file's metadata, and so on. Now you come back later and rewrite the file using, say, 8 KB random IO. Because ZFS is a copy-on-write filesystem, it can't overwrite the existing data in place. Instead, every time you write over a chunk of an existing 128 KB block, the block winds up effectively fragmented and your new 8 KB chunk consumes some amount of extra space for extra block pointers and so on (and perhaps extra metaslab space due to fragmentation).
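If you want to try this yourself, here is a minimal sketch of that access pattern in Python. It is not the test I actually ran; the function names, the sizes, and the choice of random data are all assumptions made for illustration. It only generates the IO; to see the effect you would compare the output of 'zfs list' for the filesystem before and after the rewrite pass.

```python
import os
import random


def write_sequential(path, total_bytes, chunk=1024 * 1024):
    """Create the file with large sequential writes (hypothetical helper)."""
    with open(path, "wb") as f:
        remaining = total_bytes
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(os.urandom(n))
            remaining -= n
        # Push the data to disk so the filesystem actually allocates space.
        f.flush()
        os.fsync(f.fileno())


def rewrite_random(path, io_size=8 * 1024, count=1000, seed=0):
    """Overwrite `count` randomly chosen io_size-aligned chunks in place.

    The 8 KB default mirrors the example in the text; it is an assumption,
    not a recommended value. Returns the number of chunks written.
    """
    rng = random.Random(seed)
    slots = os.path.getsize(path) // io_size
    if slots == 0:
        raise ValueError("file is smaller than one IO of %d bytes" % io_size)
    # "r+b" opens for update without truncating, so the file size is unchanged.
    with open(path, "r+b") as f:
        for _ in range(count):
            f.seek(rng.randrange(slots) * io_size)
            f.write(os.urandom(io_size))
        f.flush()
        os.fsync(f.fileno())
    return count
```

Something like write_sequential("/tank/test/bigfile", 10 * 1024**3) followed by rewrite_random("/tank/test/bigfile", count=100000) would exercise the pattern; the path here is made up. Random data is used so that compression and dedup, if they happened to be on, would not hide the effect.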

To be honest, actually pushing a filesystem or a pool out of space requires you to be doing a lot of rewrites and to already be very close to the space limit. And if you hit the limit, it doesn't seem to cause more than occasional 'out of space' errors for the rewrite IO; things will go to 0 bytes available but the rewrites will continue to mostly work (new write IO will fail, of course). Given comments I've seen in the code while looking into the extra space reservation in ZFS pools, I suspect that ZFS usually estimates that an overwrite takes no extra space and so usually allows it through. But I'm guessing at this point.

(The other thing I don't know is what such a partially updated block looks like on disk. Does the entire original 128 KB block get fully read, split and rewritten somehow, or is there something more clever going on? Decoding the kernel source would tell me, provided I can find and understand the right spot, but I'm not that curious at the moment.)

solaris/ZFSRewriteSpaceUsage written at 22:50:21

A drawback to handling errors via exceptions

Recently I discovered an interesting and long-standing bug in DWiki. DWiki is essentially a mature program, so this one was uncovered through the common mechanism of someone using invalid input, in this case a specific sort of invalid URL. DWiki creates time-based views of this blog through synthetic parts of the URLs that end in things like '.../2014/10/' for entries from October 2014. Someone came along and requested a URL that looked like '.../2014/99/', and DWiki promptly hit an uncaught Python exception (well, technically it was caught and logged by my general error code).

(A mature program usually doesn't have bugs handling valid input, even uncommon valid input. But the many forms of invalid input are often much less well tested.)

To be specific, it promptly coughed up:

calendar.IllegalMonthError: bad month number 99; must be 1-12

Down in the depths of the code that handled a per-month view I was calling calendar.monthrange() to determine how many days a given month has, which was throwing an exception because '99' is of course not a valid month of the year. The exception escaped because I wasn't doing anything in my code to either catch it or not let invalid months get that far in the code.
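As a rough illustration of the 'catch it' option, here is a minimal sketch. This is not DWiki's actual code; days_in_month() is a hypothetical helper, and returning None for a bad month is just one possible policy. It relies on calendar.IllegalMonthError being a subclass of ValueError.

```python
import calendar


def days_in_month(year, month):
    """Return the number of days in the month, or None for an invalid month.

    Hypothetical helper for illustration only.
    """
    try:
        # monthrange() returns (weekday of the first day, number of days).
        _, ndays = calendar.monthrange(year, month)
    except ValueError:
        # calendar.IllegalMonthError is a subclass of ValueError, so this
        # catches a month such as 99.
        return None
    return ndays
```

With this, days_in_month(2014, 10) gives 31 and days_in_month(2014, 99) gives None, which the caller could turn into a 'no such page' response instead of a logged exception.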

The standard advantage of handling errors via exceptions definitely applied here. Even though I had totally overlooked this error possibility, the error did not get quietly ignored and go on to corrupt further program state; instead I got smacked over the nose with the existence of this bug so I could find it and fix it. But it also exposes a drawback of handling errors with exceptions, which is that it makes it easier to overlook the possibility of errors because that possibility isn't explicit.

The calendar module doesn't document what exceptions it raises, either in general or specifically in the documentation for monthrange() (where it would be easy to spot while reading about the function). Because an exception is effectively an implicit extra return 'value' from functions, it's easy to overlook the possibility that you'll actually get an exception; in Python, there's nothing there to rub your nose in it and make you think about it. And so I never even thought about what happened if monthrange() was handed invalid input, in part because of the usual silent assumption that the code would only be called with valid input because of course DWiki doesn't generate date range URLs with bad months in them.

Explicit error returns may require a bunch of inconvenient work to handle them individually instead of letting you aggregate exception handling together, but the mere presence of an explicit error return in a method's or function's signature serves as a reminder that yes, the function can fail and so you need to handle it. Exceptions for errors are more convenient and safer for at least casual programming, but they do mean you need to ask yourself what-if questions on a regular basis (here, 'what if the month is out of range?').
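To make the contrast concrete, here is a small sketch of what an explicit error return might look like if you wrote it in Python in roughly the style Go uses. The parse_month() function and its error strings are invented for illustration; nothing like this exists in the calendar module or in DWiki.

```python
def parse_month(text):
    """Return (month, error); exactly one of the two is None.

    Hypothetical function: the error is part of the return value, so every
    caller has to unpack it and at least look at it.
    """
    try:
        month = int(text)
    except ValueError:
        return None, "month is not a number: %r" % (text,)
    if not 1 <= month <= 12:
        return None, "month out of range: %d" % month
    return month, None
```

A caller writes 'month, err = parse_month(urlpart)' and then has to decide what to do with err right there. That is more typing at every call site, but the possibility of failure is sitting in plain view instead of being something you have to remember to ask about.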

(It turns out I've run into this general issue before, although that time the documentation had a prominent notice that I just ignored. The general issue of error handling with exceptions versus explicit returns is on my mind these days because I've been doing a bunch of coding in Go, which has explicit error returns.)

python/ExceptionsOverlookProblem written at 01:00:38

