Wandering Thoughts archives

2018-02-04

A surprise in how ZFS grows a file's record size (at least for me)

As I wound up experimentally verifying, in ZFS all files are stored as a single block of varying size up to the filesystem's recordsize, or using multiple recordsize blocks. If a file has more than one block, all blocks are recordsize, no more and no less. If a file is a single block, the size of this block is based on how much data has been written to the file (or technically the maximum offset that's been written to the file). However, how the block size grows as you write data to the file turns out to be somewhat surprising (which makes me very glad that I actually did some experiments to verify what I thought I knew before I wrote this entry, because I was very wrong).

Rather than involving the ashift or growing in powers of two, ZFS always grows the (logical) block size in 512-byte chunks until it reaches the filesystem recordsize. The actual physical space allocated on disk is in ashift sized units, as you'd expect, but this is not directly related to the (logical) block size used at the file level. For example, here is a 16896 byte file (of incompressible data) on an ashift=12 pool:

 Object  lvl   iblk   dblk  dsize  dnsize  lsize   %full  type
4780566    1   128K  16.5K    20K     512  16.5K  100.00  ZFS plain file
[...]
0 L0 DVA[0]=<0:444bbc000:5000> [L0 ZFS plain file] [...] size=4200L/4200P [...]

The DVA records an 0x5000 byte allocation (20 Kb), but the logical and physical-logical size are only 0x4200 bytes (16.5 Kb).

In thinking about it, this makes a certain amount of sense because the ashift is really a vdev property, not a pool property, and can vary from vdev to vdev within a single pool. As a result, the actual allocated size of a given block may vary from vdev to vdev (and a block may be written to multiple vdevs if you have copies set to more than 1 or it's metadata). The file's current block size thus can't be based on the ashift, because ZFS doesn't necessarily have a single ashift to base it on; instead ZFS bases it on 512-byte sectors, even if this has to be materialized differently on different vdevs.

Looking back, I've already sort of seen this with ZFS compression. As you'd expect, a file's (logical) block size is based on its uncompressed size, or more exactly on the highest byte offset in the file. You can write something to disk that compresses extremely well, and it will still have a large logical block size. Here's an extreme case:

; dd if=/dev/zero of=testfile bs=128k count=1
[...]
# zdb -vv -bbbb -O ssddata/homes cks/tmp/testfile

 Object  lvl   iblk   dblk  dsize  dnsize  lsize   %full  type
956361    1   128K   128K      0     512   128K    0.00  ZFS plain file
[...]

This turns out to have no data blocks allocated at all, because the 128 Kb of zeros can be recorded entirely in magic flags in the dnode. But it still has a 128 Kb logical block size. 128 Kb of the character 'a' does wind up requiring a DVA allocation, but the size difference is drastic:

Object  lvl   iblk   dblk  dsize  dnsize  lsize   %full  type
956029    1   128K   128K     1K     512   128K  100.00  ZFS plain file
[...]
0 L0 DVA[0]=<0:3bbd1c00:400> [L0 ZFS plain file] [...] size=20000L/400P [...]

We have a compressed size of 1 Kb (and a 1 Kb allocation on disk, as this is an ashift=9 vdev), but once again the file block size is 128 Kb.

(If we wrote 127.5 Kb of 'a' instead, we'd wind up with a file block size of 127.5 Kb. I'll let interested parties do that experiment themselves.)

What this means is that ZFS has much less wasted space than I thought it did for files that are under the recordsize. Since such files grow their logical block size in 512-byte chunks, even with no compression they waste at most almost all of one physical block on disk (if you have a file that is, say, 32 Kb plus one byte, you'll have a physical block on disk with only one byte used). This has some implications for other areas of ZFS, but those are for another entry.

(This is one of those entries that I'm really glad that I decided to write. I set out to write it as a prequel to another entry just to have how ZFS grew the block size of files written down explicitly, but wound up upending my understanding of the whole area. The other lesson for me is that verifying my understanding with experiments is a really good idea, because every so often my folk understanding is drastically wrong.)

solaris/ZFSRecordsizeGrowth written at 22:28:55; Add Comment

More notes on using uMatrix in Firefox 56 (in place of NoScript)

I wrote my first set of notes very early on in my usage of uMatrix, before things had really settled down and I mostly knew what I was doing it. Since then I've been refining my configuration and learning more about what works and how, and I've accumulated more stuff I want to record.

The first thing, the big thing, is that changing from NoScript to uMatrix definitely seems to have mostly solved my Firefox memory issues. My Firefox still slowly grows its memory usage over time, even with a stable set of windows, but it's doing so far less than it used to and as a result it's now basically what I consider stable. I certainly no longer have to restart it once every day or two. By itself this is a huge practical win and I'm far less low-key irritated with my Firefox setup.

(I'm not going to say that this memory growth was NoScript's fault, because it may well have been caused by some interaction between NS and my other extensions. It's also possible that my cookie blocker had something to do with it, since uMatrix also replaced it.)

It turns out that one hazard of using a browser for a long time is that you can actually forget how you have it configured. I had initial problems getting my uMatrix setup to accept cookies from some new sites I wanted to do this for (such as bugzilla.kernel.org). It turned out that I used to have Firefox's privacy settings set to refuse all cookies except ones from sites I'd specifically allowed. Naturally uMatrix itself letting cookies through wasn't doing anything when I'd told Firefox to refuse them in the first place. In the uMatrix world, I want to accept cookies in general and then let it manage them.

Well, more or less. uMatrix's approach is to accept all cookies but only let them be sent when you allow it. I decided I didn't entirely like having cookies hang around, so I've also added Self-Destructing Cookies to clean those cookies up later. SDC will also remove LocalStorage data, which I consider a positive since I definitely don't want random websites storing random amounts of things there.

(I initially felt grumpy about uMatrix's approach but have since come around to feeling that it's probably right for uMatrix, partly because of site-scoped rules. You may well have a situation where the same cookies are 'accepted' and sent out on some sites but blocked on others. uMatrix's approach isn't perfect here but it more or less allows this to happen.)

Another obvious in retrospect thing was YouTube videos embedded in other sites. Although you wouldn't know it without digging under the surface, these are in embedded iframes, so it's not enough to just allow YT's JavaScript on a site where you want them; you also need to give YT 'frame' permissions. I've chosen not to do this globally, because I kind of like just opening YT videos in another window using the link that uMatrix gives me.

I have had one annoying glitch in my home Firefox with uMatrix, but once I dug deep enough it appears that there's something unusual going on in my home Firefox 56. At first I thought it was weird network issues with Google (which I've seen before in this situation), but now I'm not sure; in any case I get a consistent NS_ERROR_FAILURE JavaScript failure deep in Google Groups' 'loaded on the fly' JS code. This is un-debuggable and un-fixable by me, but at least I have my usual option to fall back on.

('Things break mysteriously if you have an unusual configuration and even sometimes if you don't' is basically the modern web experience anyway.)

PS: A subtle benefit of using uMatrix is that it also exists for Chrome, so I can have the same interface and even use almost the same ruleset in my regular mode Chrome.

PPS: I'll have to replace Self-Destructing Cookies with something else when I someday move to Firefox Quantum, but as covered, I already have a candidate.

web/FirefoxUMatrixNotesII written at 01:59:09; Add Comment


Page tools: See As Normal.
Search:
Login: Password:
Atom Syndication: Recent Pages, Recent Comments.

This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.