Wandering Thoughts

2018-05-21

Bad versions of packages in the context of minimal version selection

Recently, both Sam Boyer and Matt Farina have made the point that Go's proposed package versioning lacks an explicit way for packages to declare known version incompatibilities with other packages. Suppose that you have a package A and it uses package X, initially at v1.5.0. The package X people release v1.6.0, which is fine, and then in v1.7.0 they introduce an API behavior change that is incompatible with how your package uses the API (Matt Farina's post has a real world example of this). By the strict rules of semantic versioning this is a no-no, but in real life it happens for all sorts of reasons. People would like the ability to have their own package say 'I'm not compatible with v1.7.0 (and later versions)', which Russ Cox's proposal doesn't provide.

The first thing to note is that in a minimal version selection environment, this incompatibility doesn't even come up as long as you're only building package A itself, or something that uses package A and has no other direct or indirect dependencies on package X. If you're only using package A, package A says it wants X@v1.5.0 and that's what MVS picks. MVS will never advance to the incompatible version v1.7.0 on its own; it must be forced to do so. Even if you're also using package B and B requires X@v1.6.0, you're still okay; MVS will advance the version of X but only to v1.6.0, the new minimal version.

(This willingness to advance the version at all is a pragmatic tradeoff. We can't be absolutely sure that v1.6.0 is really API compatible with A's required X@v1.5.0, but requiring everyone to use exactly the same version of a package is a non-starter in practice. In order to make MVS useful at all, we have to hope that advancing the version here is safe enough (by default, and if we lack other information).)
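
To make the mechanics concrete, here is a minimal sketch in Python of the core rule involved (the package names and version requirements are just the ones from this example, and the real algorithm in Russ Cox's proposal walks the full transitive requirement graph; this only shows the 'take the highest declared minimum' part):

# A sketch of the core of minimal version selection. Versions compare
# as tuples; real MVS also walks the transitive requirement graph.

def parse(v):
    # "v1.5.0" -> (1, 5, 0)
    return tuple(int(p) for p in v.lstrip("v").split("."))

def mvs_select(requirements):
    # requirements: {module: {dependency: minimum version required}}
    chosen = {}
    for module, deps in requirements.items():
        for dep, minver in deps.items():
            if dep not in chosen or parse(minver) > parse(chosen[dep]):
                chosen[dep] = minver
    return chosen

# Package A alone: MVS stays at A's declared minimum for X.
print(mvs_select({"A": {"X": "v1.5.0"}}))
# -> {'X': 'v1.5.0'}

# A plus B: MVS advances X only as far as B's minimum, v1.6.0,
# and never to v1.7.0 on its own.
print(mvs_select({"A": {"X": "v1.5.0"}, "B": {"X": "v1.6.0"}}))
# -> {'X': 'v1.6.0'}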

So this problem with incompatible package versions only comes up in MVS if you also have another package B that explicitly requires X@v1.7.0. The important thing here is that this version incompatibility is not a solvable situation. We cannot build a system that works; package A doesn't work with v1.7.0 while package B only works with v1.7.0, and we need both. The only question is whether MVS or an MVS-like algorithm will actually tell us about this problem and abort the build, or whether it will quietly build a system that doesn't work (if we're lucky, the system will fail our tests).

To me, this changes how critical the problem is to address. Failure to build a working system where it's possible would be one thing, but we don't have that; instead we merely have the question of whether you're going to get told up front that what you want isn't possible.

The corollary to this is that when package A publishes information that it's incompatible with X version v1.7.0, it's doing so almost entirely as a service for other people, not something it needs for itself. Since A's manifest only requires X@v1.5.0, MVS will generally use v1.5.0 when building A alone (let's assume that none of A's other dependencies also use X and will someday advance to requiring X@v1.7.0). It's only when A gets bundled together with B that problems happen, and so this is mostly when A's information about version incompatibility is useful. Should this information be published in a machine readable form? Well, I think it would be nice, but it depends on what else we have to give up for it.

(The developers of A may want to leave themselves a note about the situation in their version manifest, of course, just so that no developer accidentally tries advancing X's version and then gets surprised by the results.)

PS: There is also an argument that such incompatible version blocks should only be advisory warnings or the like. As the person building the overall system, you may actually know that the end result will work anyway; perhaps you've taken steps to compensate for the API incompatibility in your own code. Since the failure is an overall system failure, package A can't necessarily be absolutely sure about things.

(Things might be easier to implement as advisory warnings. One approach would be to generate the MVS versions as usual, then check to see if anyone declared an incompatibility with the concrete versions chosen. Resolving the situation, if it's even possible, would be up to you.)
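
To illustrate, here's a small Python sketch of that advisory approach; the package names, chosen versions, and declared incompatibilities are all made up for the example, and no real tool works exactly this way:

# A sketch of the advisory approach: run MVS as usual, then compare
# the concrete versions chosen against declared incompatibilities.
# All of the data here is invented for the example.

chosen = {"X": "v1.7.0", "Y": "v2.3.1"}   # what MVS picked

# Declared incompatibilities: package -> {dependency: set of bad versions}
excludes = {
    "A": {"X": {"v1.7.0", "v1.7.1"}},
}

for pkg, bad in excludes.items():
    for dep, versions in bad.items():
        if chosen.get(dep) in versions:
            print(f"warning: {pkg} declares {dep}@{chosen[dep]} incompatible")

# It's then up to you to decide whether the combination actually works.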

programming/MVSAndBadVersions written at 22:41:57

2018-05-20

'Minimal version selection' accepts that semantic versioning is fallible

Go has been quietly wrestling with package versioning for a long time. Recently, Russ Cox brought forward a proposal for package versioning; one of the novel things about it is what he calls 'minimal version selection', which I believe has been somewhat controversial.

In package management and versioning, the problem of version selection is the problem of what version of package dependencies you'll use. If your package depends on another package A, and you say your minimum version of A is 1.1.0, and package A is available in 1.0.0, 1.1.0, 1.1.5, 1.2.0, and 2.0.0, version selection is picking one of those versions. Most package systems will pick the highest version available within some set of semantic versioning constraints; generally this means either 1.1.5 or 1.2.0 (but not 2.0.0, because the major version change is assumed to mean API incompatibilities exist). In MVS, you short-circuit all of this by picking the minimum version allowed; here, you would pick 1.1.0.
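
As a concrete (and simplified) illustration, here's a small Python sketch of the difference between the two approaches on that list of versions, treating a change of major version as incompatible:

# A simplified sketch of the two selection strategies. Only versions at
# or above the minimum and with the same major version count as allowed.

available = ["1.0.0", "1.1.0", "1.1.5", "1.2.0", "2.0.0"]
minimum = "1.1.0"

def parse(v):
    return tuple(int(p) for p in v.split("."))

allowed = [v for v in available
           if parse(v) >= parse(minimum) and parse(v)[0] == parse(minimum)[0]]

# Most package systems: pick the highest allowed version.
print(max(allowed, key=parse))   # -> 1.2.0

# Minimal version selection: pick the minimum version allowed.
print(min(allowed, key=parse))   # -> 1.1.0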

People have had various reactions to MVS, but as a grumpy sysadmin my reaction is positive, for a simple reason. As I see it, MVS is a tacit acceptance that semantic versioning is not perfect and fails often enough that we can't blindly rely on it. Why do I say this? Well, that's straightforward. The original version number (our minimum requirement) is the best information we have about what version of the dependency our package definitely works with. Any scheme that advances the version number is relying on that new version to be sufficiently compatible with the original version that it can be substituted for it; in other words, it's counting on people to have completely reliably followed semantic versioning.

The reality of life is that this doesn't happen all of the time. Sometimes mistakes are made; sometimes people have a different understanding of what semantic versioning means because semantic versioning is ultimately a social thing, not a technical one. In an environment where semver is not infallible (ie, in the real world), MVS is our best option to reliably select package versions with the highest likelihood of working.

(Some package management systems arrange to also record one or more 'known to work' package version sets. I happen to think that MVS is more straightforward than such two-sided schemes for various reasons, including practical experience with some Rust stuff.)

I understand that MVS is not very aesthetic. People really want semver to work and to be able to transparently take advantage of it working (and I agree that it would be great if it did work). But as a grumpy sysadmin, I have seen a non-zero amount of semver not working in these situations, and I would rather have things that I can build reliably even if they are not using all of the latest sexy bits.

programming/FallibleSemverAndMVS written at 22:30:49

Modern CPU power usage varies unpredictably based on what you're doing

I have both an AMD machine and an Intel machine, both of them using comparable CPUs that are rated at 95 watts TDP (although that's misleading), and I gathered apples to apples power consumption numbers for them. In the process I discovered a number of anomalies in relative power usage between the two CPUs. As a result I've wound up with the obvious realization that modern CPUs have complicated and unpredictable power usage (in addition to all of the other complicated things about them).

In the old days, it was possible to have a relatively straightforward view of how CPU usage related to power draw, where all you really needed to care about was how many CPUs were in use and maybe whether it was integer or floating point code. Not only is that clearly no longer the case, but the factors that change power usage vary from CPU model to CPU model. My power consumption numbers show one CPU to CPU anomaly right away, where an infinite loop in two shells has one shell using more power on a Ryzen 1800X and the other shell using more power on an i7-8700K. These two shells are running the same code on both CPUs and each shell's code is likely to be broadly similar to the other, but the CPUs are responding to it quite differently, especially when the code is running on all of the CPUs.

Beyond this anomaly, there is also the fact that this simple 'infinite shell loop' power measurement showed a different (and higher) power usage than a simple integer loop in Go. I can make up theories for why, but it's clear that even if you restrict yourself to integer code, a simple artificial chunk of code may not have anywhere near the same power usage as more complex real code. The factors influencing this are unlikely to be simple, and they also clearly vary from CPU to CPU. 'Measure your real code' has always been good advice, but it clearly matters more than ever today if you care about power usage.

(The corollary of 'measure your real code' is probably that you have to measure real usage too; otherwise you may be running into something like my Bash versus rc effect. This may not be entirely easy, to put it one way.)

It's not news these days that floating point operations and especially the various SIMD instructions such as AVX and AVX-512 use more power than basic integer operations; that's why people reach for mprime as a heavy-duty CPU stress test, instead of just running integer code. MPrime's stress test itself is a series of different tests, and it will probably not surprise you to hear that which specific tests seemed to use the most power varied between my AMD Ryzen 1800X machine and my Intel i7-8700K machine. I don't know enough about MPrime's operation to know if the specific tests differ in what CPU operations they use or only in how much memory they use and how they stride through memory.

(One of the interesting differences was that on my i7-8700K, the test series that was said to use the most power seemed to use less power than the 'maximum heat and FPU stress' tests. But it's hard to say too much about this, since power usage could swing drastically from sub-test to sub-test. I saw swings of 20 to 30 watts from sub-test to sub-test, which does make reporting an 'mprime power consumption' number a bit misleading.)

Trying to monitor the specific power usage of MPrime sub-tests is about where I decided both that I'd run out of patience and that the specific details were unlikely to be interesting. It's clear that what uses more or less power varies significantly between the Ryzen 1800X system and the i7-8700K system, and really that's all I need to know. I suspect that it basically varies between every CPU micro-architecture, although I wouldn't be surprised if each company's CPUs are broadly similar to each other (on the basis that the micro-architectures and the design priorities are probably often similar to each other).

PS: Since I was measuring system power usage, it's possible that some of this comes from the BIOS deciding to vary CPU and system fan speeds, with faster fan speeds causing more power consumption. But I suspect that fan speed differences don't account for all of the power draw difference.

tech/VaryingCPUPowerDraws written at 01:13:11

2018-05-18

ZFS spare-N spare vdevs in your pool are mirror vdevs

Here's something that comes up every so often in ZFS and is not as well publicized as perhaps it should be (I most recently saw it here). Suppose that you have a pool, there's been an issue with one of the drives, and you've had a spare activate. In some situations, you'll wind up with a pool configuration that may look like this:

[...]
   wwn-0x5000cca251b79b98    ONLINE  0  0  0
   spare-8                   ONLINE  0  0  0
     wwn-0x5000cca251c7b9d8  ONLINE  0  0  0
     wwn-0x5000cca2568314fc  ONLINE  0  0  0
   wwn-0x5000cca251ca10b0    ONLINE  0  0  0
[...]

What is this spare-8 thing, beyond 'a sign that a spare activated here'? This is sometimes called a 'spare vdev', and the answer is that spare vdevs are mirror vdevs.

Yes, I know, ZFS says that you can't put one vdev inside another vdev and these spare-N vdevs are inside other vdevs. ZFS is not exactly wrong, since it doesn't let you and me do this, but ZFS itself can break its own rules and it's doing so here. These really are mirror vdevs under the surface and, as you'd expect, they're implemented by exactly the same code in the ZFS kernel.

(If you're being sufficiently technical these are actually a slightly different type of mirror vdev, which you can see being defined in vdev_mirror.c. But while they have different nominal types they run the same code to do various operations. Admittedly, there are some other sections in the ZFS code that check to see whether they're operating on a real mirror vdev or a spare vdev.)

What this means is that these spare-N vdevs behave like mirror vdevs. Assuming that both sides are healthy, reads can be satisfied from either side (and will be balanced back and forth as they are for mirror vdevs), writes will go to both sides, and a scrub will check both sides. As a result, if you scrub a pool with a spare-N vdev and there are no problems reported for either component device, then both old and new device are fine and contain a full and intact copy of the data. You can keep either (or both).

As a side note, it's possible to manually create your own spare-N vdevs even without a fault, because spare activation is actually a user-level thing in ZFS. Although I haven't tested this recently, you generally get a spare-N vdev if you do 'zpool replace <POOL> <ACTIVE-DISK> <NEW-DISK>' and <NEW-DISK> is configured as a spare in the pool. Abusing this to create long-term mirrors inside raidZ vdevs is left as an exercise for the reader.

(One possible reason to have a relatively long term mirror inside a raidZ vdev is if you don't entirely trust one disk but don't want to pull it immediately, and also have a handy spare disk. Here you're effectively pre-deploying a spare in case the first disk explodes on you. You could also do the same if you don't entirely trust the new disk and want to run it in parallel before pulling the old one.)

PS: As you might expect, the replacing-N vdev that you get when you replace a disk is also a mirror vdev, with the special behavior that when the resilver finishes, the original device is normally automatically detached.

solaris/ZFSSparesAreMirrors written at 22:44:19

How I usually divide up NFS (operation) metrics

When you're trying to generate metrics for local disk IO, life is generally relatively simple. Everyone knows that you usually want to track reads separately from writes, especially these days when they may have significantly different performance characteristics on SSDs. While there are sometimes additional operations issued to physical disks, they're generally not important. If you have access to OS-level information it can be useful to split your reads and writes into synchronous versus asynchronous ones.

Life with NFS is not so simple. NFS has (data) read and write operations, like disks do, but it also has a large collection of additional protocol operations that do various things (although some of these protocol operations are strongly related to data writes, for example the COMMIT operation, and should probably be counted as data writes in some way). If you're generating NFS statistics, how do you want to break up or aggregate these other operations?

One surprisingly popular option is to ignore all of them on the grounds that they're obviously unimportant. My view is that this is a mistake in general, because these NFS operations can have an IO impact on the NFS server and create delays on the NFS clients if they're not satisfied fast enough. But if we want to say something about these and we don't want to go to the extreme of presenting per-operation statistics (which is probably too much information, and in any case can hide patterns in noise), we need some sort of breakdown.

The breakdown that I generally use is to split up NFS operations into four categories: data reads, data writes (including COMMIT), operations that cause metadata writes such as MKDIR and REMOVE, and all other operations (which are generally metadata reads, for example READDIRPLUS and GETATTR). This split is not perfect, partly because some metadata read operations are far more common (and are far more cached on the server) than other operations; specifically, GETATTR and ACCESS are often the backbone of a lot of NFS activity, and it's common to see GETATTR as by far the most common single operation.

(I'm also not entirely convinced that this is the right split; as with other metrics wrestling, it may just be a convenient one that feels logical.)
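
To make the split concrete, here's a rough Python sketch of how I'd bucket NFSv3 operations into these four categories; exactly which bucket some operations belong in is a judgment call, as I said:

# A sketch of splitting NFSv3 operations into the four categories.
# The exact assignment of some operations is a judgment call.

CATEGORIES = {
    "data reads":      {"READ"},
    "data writes":     {"WRITE", "COMMIT"},
    "metadata writes": {"CREATE", "MKDIR", "SYMLINK", "MKNOD",
                        "REMOVE", "RMDIR", "RENAME", "LINK", "SETATTR"},
}

def categorize(op):
    for category, ops in CATEGORIES.items():
        if op.upper() in ops:
            return category
    # GETATTR, ACCESS, LOOKUP, READDIRPLUS and so on all land here.
    return "other (mostly metadata reads)"

for op in ("READ", "COMMIT", "MKDIR", "GETATTR", "READDIRPLUS"):
    print(op, "->", categorize(op))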

Sidebar: Why this comes up less with local filesystems and devices

If what you care about is the impact that IO load is having on the system (and how much IO load there is), you don't entirely care why an IO request was issued, you only care that it was. From the disk drive's perspective, a 16 KB read is a 16 KB read, and it takes just as much work to read 16 KB of a file as it does to read 16 KB of a directory or of a free space map. This doesn't work for NFS because NFS is more abstracted, and neither the number of operations nor the number of bytes that flow over the wire necessarily gives you a true picture of the impact on the server.

Of course, in these days of SSDs and complicated disk systems, just having IO read and write information may not be giving you a true picture either. With SSDs especially, we know that bursts of writes are different from sustained writes, that writing to a full disk is often different from writing to an empty one, and apparently giving drives some idle time to do background processing and literally cool down may change their performance. But many things are simplifications, so we do the best we can.

(Actual read and write performance is a 'true picture' in one sense, in that it is giving you information about what results the OS is getting from the drive. But it doesn't necessarily help to tell you why, or what you can do to improve the situation.)

tech/NFSMyMetricsSplit written at 01:44:01

2018-05-17

I'm worried about Wayland but there's not much I can do about it

In a comment on my entry about how I have a boring desktop, Opk asked a very good question:

Does it concern you at all that Wayland may force change on you? It may be a good few years away yet and perhaps fvwm will be ported.

Oh my yes, I'm definitely worried about this (and it turns out that I have been for quite some time, which also goes to show how long Wayland has been slowly moving forward). The FVWM people have said that they're not going to try to write a Wayland version of fvwm, which means that when Wayland inevitably takes over I'm going to need a new 'window manager' (in Wayland this is a lot more than just what it is in X) and possibly an entirely new desktop environment to go with it.

The good news is that apparently XWayland provides a reasonably good way to let X programs still display on a Wayland server, so I won't be forced to abandon as many X things as I expected. I may even be able to continue to run remote X programs via SSH and XWayland, which is important for my work desktop. This X to Wayland bridge will mean that I can keep not just programs with no Wayland equivalent but also old favorites like xterm, where I simply don't want to use what will be the Wayland equivalent (I don't like gnome-terminal or konsole very much).

The bad news for me is two-fold. First, I'm not attracted to tiling window managers at all, and since tiling window managers are the in thing, they're the most common alternate window managers for Wayland (based on various things, such as the Arch list). There seems to be a paucity of traditional stacking Wayland WMs that are as configurable as fvwm is, although perhaps there will be alternate methods in Wayland to do things like have keyboard and mouse bindings. It's possible that this will change when Wayland starts becoming more dominant, but I'm not holding my breath; heavily customized Linux desktop environments have been feeling more and more like extreme outliers over the years.

Second, it seems at least reasonably likely that a lot of current tray applets and notification systems will stop being general and start becoming tightly bound to mainstream desktop environments like Gnome 3, KDE, and Cinnamon. We've already seen this with Gnome 3 and Cinnamon, which have 'applets' that are now JavaScript extensions that run in the context of the Gnome and Cinnamon shells and simply can't be used outside them. In a Wayland world that focuses attention more than ever on a few mainstream desktop environments, will there be any equivalent of stalonetray and things for it like pnmixer?

(The people writing tiling Wayland window managers like Sway will almost certainly want there to be, because it will be hard to have a viable alternate environment without them. The question is whether major projects like NetworkManager will oblige or whether NM will use its limited development resources elsewhere.)

So yes, I worry about all of this. But in practice it's a very abstracted worry. To start with, Wayland is still not really here yet. Fedora is using it more, but it's by no means universal even for Gnome (where it's the default), and I believe that KDE (and other supported desktop environments) don't even really try to use it. At this rate it will be years and years before anyone is seriously talking about abandoning X (since Gnome programs will still face pressure to be usable in KDE, Cinnamon, and other desktop environments that haven't yet switched to Wayland).

(I believe that Fedora is out ahead of other Linux distributions, too. People like Debian will probably be trying to support X and pressure people to support X for years to come.)

More significantly, there's nothing I can do about all of this. How Wayland in general and Wayland environments develop is far beyond my ability to influence; in practice I'm a far outlier in window manager and desktop land, and so I'll have to make do with whatever is available. If I'm lucky it will be something generally comparable to my current environment; if I'm not, well, I can use Cinnamon and it will probably survive in a Wayland-only world. I might even learn enough Cinnamon shell and JavaScript to customize it a bit.

(If I had a lot of energy and enthusiasm, perhaps I would be trying to write the stacking, construction kit style Wayland window manager and compositor of my dreams. I don't have anything like that energy. I do hope other people do, and while I'm hoping I hope that they like textual icon managers as much as I do.)

linux/WaylandWorries written at 01:33:05

2018-05-16

How you run out of inodes on an extN filesystem (on Linux)

I've mentioned that we ran out of inodes on a Linux server and covered what the high level problem was, but I've never described the actual mechanics of how and why you can run out of inodes on a filesystem, or more specifically on an extN filesystem. I have to be specific about the filesystem type, because how this is handled varies from filesystem to filesystem; some either have no limit on how many inodes you can have or have such a high limit that you're extremely unlikely to run into it.

The fundamental reason you can run out of inodes on an extN filesystem is that extN statically allocates space for inodes; in every extN filesystem, space is reserved for a fixed number of inodes, and you can never have any more than that. If you use 'df -i' on an extN filesystem, you can see this number for the filesystem, and you can also see it with dumpe2fs, which will tell you other important information. Here, let's look at an ext4 filesystem:

# dumpe2fs -h /dev/md10
[...]
Block size:               4096
[...]
Blocks per group:         32768
[...]
Inodes per group:         8192
[...]

I'm showing this information because it leads to the important parameter for how many inodes any particular extN filesystem has, which is the bytes/inode ratio (mke2fs's -i argument). By default this is 16 KB, ie there will be one inode for every 16 KB of space in the filesystem (you can see it in the numbers above: 32768 blocks per group at 4 KB per block is 128 MB per group, which divided by 8192 inodes per group gives 16 KB per inode). As the mke2fs manpage covers, it's not too sensible to set it below 4 KB (the usual extN block size).

The existence of the bytes/inode ratio gives us a straightforward answer for how you can run a filesystem out of inodes: you simply create lots of files that are smaller than this ratio. ExtN implicitly assumes that each inode will on average use at least 16 KB of disk space; if on average your inodes use less, you will run out of inodes before you run out of disk space. One tricky thing here is that this space doesn't have to be used up by regular files, because other sorts of inodes can be small too. Probably the easiest other source is directories; if you have lots of directories with a relatively small number of subdirectories and files in each, it's quite possible for many of them to be smaller than 16 KB, and in some cases you can have a great many subdirectories.

(In our problem directory hierarchy, almost all of the directories are 4 KB, although a few are significantly larger. And the hierarchy can have a lot of subdirectories when things go wrong.)

Another case is symbolic links. Most symbolic links are quite small, and in fact ext4 may be able to store your symbolic link entirely in the inode itself. This means that you can potentially use up a lot of inodes without using any disk space (well, beyond the space for the directories that the symbolic links are in). There are other sorts of special files that also use little or no disk space, but you probably don't have tons of them in an extN filesystem unless something unusual is going on.

(If you do have tens of thousands of Unix sockets or FIFOs or device files, though, you might want to watch out. Or even tons of zero-length regular files that you're using as flags and a persistence mechanism.)

Most people will never run into this on most filesystems, because most filesystems have an average inode size usage that's well above 16 KB. There are usually plenty of files over 16 KB, not that many symbolic links, and relatively few (small) directories compared to the regular files. For instance, one of my relatively ordinary Fedora root filesystems has a bytes/inode ratio of roughly 73 KB per inode, and another is at 41 KB per inode.

(You can work out your filesystem's actual bytes/inode usage simply by dividing the space used in KB by the number of inodes used.)
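
As a small worked example, here's a Python sketch that computes this figure for a mounted filesystem using os.statvfs (dividing the space actually used by the number of inodes actually used):

# A sketch of computing the actual KB-per-used-inode figure for a
# mounted filesystem with os.statvfs.
import os

def kb_per_inode(path):
    st = os.statvfs(path)
    used_kb = (st.f_blocks - st.f_bfree) * st.f_frsize / 1024
    used_inodes = st.f_files - st.f_ffree
    return used_kb / used_inodes

# For a typical root filesystem this comes out well above 16 KB,
# like the roughly 73 KB and 41 KB figures mentioned above.
print(round(kb_per_inode("/")))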

linux/HowInodesRunOut written at 01:10:42

2018-05-15

I have a boring desktop and I think I'm okay with that

Every so often I wind up either reading about or looking at pictures of beautifully customized Unix desktops. These are from people who have carefully picked colors and themes, often set up a highly customized window manager environment, set up all sorts of panels and information widgets, and so on (one endless source of these is on reddit). Sometimes this involves using a tiling window manager with various extension programs. I look at these things partly because there's something in me that feels drawn to them and that envies those setups.

My desktop is unconventional, but within that it's boring. It has colours that are either mismatched or at best vaguely matched, various font choices picked somewhat at random, and assorted decorations and so on that are there mostly because they're what's easy to do in the plain old standby of fvwm. There's very little design and very little use of interesting programs; I mean, I still use xclock and xload, and I don't think fvwm is considered an interesting window manager these days.

(Fvwm certainly has limitations in terms of what you can easily do in it. A dedicated person could expand fvwm's capabilities by use of modern X programs like wmutils and wmctrl, but I'm not such a person.)

I wound up thinking about this when I was recently reading another article about this (via, itself via), and this time around I came to a straightforward realization, one that I could have arrived at years ago: I'm not that dedicated and I don't care that much. My mismatched, assorted desktop is boring, but it works okay for me, and I've become the kind of pragmatic person that is okay with that.

I wouldn't mind a nicer desktop and every so often I make a little stab in that direction (I recently added some fvwm key bindings that were inspired by Cinnamon), but I'm never going to do the kind of work that's required to build a coherent custom desktop or make the kind of sacrifices required. Tiling window managers, programmable window managers, highly custom status bars, all of that stuff is neat to read about and look at, but it's not something for me. The best I'm likely to ever do is minor changes around the edges (at least until Wayland forces me to start over from scratch). And so my desktop is always going to be boring. I think I'm finally okay with that.

(There's a freedom in giving up in this sense. One way to put it is that I can stop feeling slightly guilty about not having a nicer, more coherent desktop environment, or in having something that's the kind of advanced environment you might expect a serious Unix person to have. I know this is an irrational feeling, but no one said feelings are rational.)

PS: This also means that I can give up thinking about switching to another window manager. It's quite unlikely I could find one that I want to use other than fvwm (fvwm's icon manager is extremely important to me), but I've long had the thought that there might be a better one out there somewhere. Maybe there is, but even if there is, it's probably way too much effort to switch.

sysadmin/MyBoringDesktop written at 01:20:46

2018-05-13

My GDPR pessimism

The latest great hope of various people, more or less including myself, is that the European GDPR will come along and put an end to various sorts of annoying email marketing activities and other intrusive ad and marketing activities. Under the GDPR, so goes the theory, companies like Yubico and Red Hat will not be able to abuse email addresses they happen to have sitting around to send marketing email; in fact they may not even be allowed to have those email addresses sitting around at all.

(At least for people in the EU. The further great hope of the GDPR is that many companies affected by it won't bother trying to tackle the near-impossible task of figuring out who's in the EU and who's not.)

I'd like to believe this, but I'm not sure that I do. I'm not basing this on any examination of the GDPR or on what people have written about it. Instead, my pessimism comes from the cynical version of the Golden Rule and my simple observation that regardless of what they say, governments very rarely actually kill off entire decent-sized industries and slap large fines on a wide variety of prosperous and perfectly normal corporations that are conducting business as usual. It might happen, but it seems much more likely that there will be delays and 'clarifications' and so on that in the end cause the GDPR to have little practical effect on this sort of activity. If there is change, I expect it to happen only very slowly, as countries slow-walk things like fines as much as possible in favour of 'consulting' and 'advising' and so on with companies.

(In other words, a lot of stern letters and not many other effects. And I expect companies to take advantage of this to stall as much as possible, and to plead implementation difficulties and other things that tragically mean they can't comply quite yet. It may all be very theatrical, in the 'security theater' sense.)

Partly I come by this pessimism by watching what's happened with Canada's theoretically relatively strong anti-spam law. One of the strong features of this law was that it created a private right of action, where you could start a civil case against violators and thus you didn't have to depend on the official regulator getting around to doing something. Since Canada is a loser-pays legal system, this was always going to be a reasonably risky activity, but then in 2017 this provision was quietly suspended, complete with this charming quote:

The Government supports a balanced approach that protects the interests of consumers while eliminating any unintended consequences for organizations that have legitimate reasons for communicating electronically with Canadians.

This provision has yet to be revived, and there have been no 2018 enforcement actions by the CRTC under CASL (at least none that appear in the CRTC's public records).

It's possible that the EU will be more aggressive and determined about the GDPR and violations of it than Canada has been about our lauded (at the time) anti-spam law, especially in today's climate with increased public concern about these sorts of issues, but I'm not going to hold my breath.

PS: It turns out that there has been some activity on the CASL front (and, and, and, and) and there may be good news someday. But if so, it will probably be significantly later than the already slow timeline that CASL itself specified. Applying this to the likely speed of GDPR enforcement is left as an exercise for the reader.

spam/GDPRPessimism written at 22:08:13

2018-05-12

ZFS on Linux's development version now has much better pool recovery for damaged pools

Back in March, I wrote about how much better ZFS pool recovery was coming, along with what turned out to be some additional exciting features, such as the long-awaited feature of shrinking ZFS pools by removing vdevs. The good news for people using ZFS on Linux is that most of both features have very recently made it into the ZFS on Linux development source tree. This is especially relevant and important if you have a damaged ZFS on Linux pool that either doesn't import or panics your system when you do import it.

(These changes are OpenZFS 9075 and its dependencies such as OpenZFS 8961, and the vdev removal changes, although there are followup fixes to them such as OpenZFS 9290.)

These changes aren't yet in any ZFS on Linux release and I suspect that they won't appear until 0.8.0 is released someday (ie, they won't be ported into the current 0.7.x release branch). However, it's fairly easy to build ZFS on Linux from source if you need to temporarily run the latest version in order to recover or copy data out of a damaged pool that you can't otherwise get at. I believe that some pool recovery can be done as a one-time import and then you can revert back to a released version of ZFS on Linux to use the now-recovered pool, but certainly not all pool import problems can be repaired like this.

(As far as vdev removal goes, it currently requires permanently using a version of ZFS that supports it, because it adds a device_removal feature to your pool that will never deactivate, per zpool-features. This may change at some point in the future, but I wouldn't hold my breath. It seems miraculous enough that we've gotten vdev removal after all of these years, even if it's only for single devices and mirror vdevs.)

I haven't tried out either of these features, but I am running a recently built development version of ZFS on Linux with them included and nothing has exploded so far. As far as things go in general, ZFS on Linux has a fairly large test suite and these changes added tests along with their code. And of course they've been tested upstream and OmniOS CE had enough confidence in them to incorporate them.

linux/ZFSOnLinuxBetterPoolImport written at 22:26:45

Sorting out some of my current views on operator overloading in general

Operator overloading is a somewhat controversial topic in programming language design and programming language comparisons. To somewhat stereotype both sides, one side thinks that it's too often abused to create sharp-edged surprises where familiar operators do completely surprising things (such as << in C++ IO). The other side thinks that it's a tool that can be used to create powerful advantages when done well, and that its potential abuses shouldn't cause us to throw it out entirely.

In general, I think that operator overloading can be used for at least three things:

  1. implementing the familiar arithmetic operations on additional types of numbers or very strongly number-like things, where the new implementations respect the traditional arithmetic properties of the operators; for example + and * are commutative.

  2. implementing these operations on things which already use these operators in their written notation, even if how the operators are used doesn't (fully) preserve their usual principles. Matrix multiplication is not commutative, for example, but I don't think many people would argue against using * for it in a programming language.

  3. using these operators simply for convenient, compact notation in ways that have nothing to do with arithmetic, mathematical notation, or their customary uses in written form for the type of thing you're dealing with.

I don't think anyone disagrees with the use of operator overloading for the first case. I suspect that there is some but not much disagreement over the second case. It's the third case that I think people are likely to be strongly divided over, because it's by far the most confusing one. As an outside reader of the code, even once you know the types of objects involved, you don't know anything about what's actually happening; you have to read the definition of what that type does with that operator. This is the 'say what?' moment of << in C++ IO and % with Python strings.

Languages are partly a cultural thing, not purely a technical one, and operator overloading (in its various sorts) can be a better or a worse fit for different languages. Operator overloading probably would clash badly with Go's culture, for example, even if you could find a good way to add it to the language (and I'm not sure you could without transforming Go into something relatively different).

(Designing operator overloading into your language pushes its culture in one direction but doesn't necessarily dictate where you wind up in the end. And there are design decisions that you can make here that will influence the culture, for example requiring people to define all of the arithmetic operators if they define any of them.)

Since I'm a strong believer in both the pragmatic effects and aesthetic power of syntax, I believe that even operator overloading purely to create convenient notation for something can be a good use of operator overloading in the right circumstances and given the right language culture. Generally the right circumstances are going to be when the operator you're overloading has some link to what the operation is doing. I admit that I'm biased here, because I've used the third sort of operator overloading from time to time in Python and I think it made my code easier to read, at least for me (and it certainly made it more compact).

(For example, I once implemented '-' for objects that were collections of statistics, most (but not all) of them time-dependent. Subtracting one object from another gave you an object that had the delta from one to the other, which I then processed to print per-interval statistics.)
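
Here's an illustrative Python sketch of that general sort of thing; it's not the actual code I wrote back then, just the shape of the idea:

# An illustrative sketch: overloading '-' so that subtracting one
# statistics snapshot from another gives you the delta between them.

class StatsSnapshot:
    def __init__(self, counters):
        self.counters = dict(counters)

    def __sub__(self, other):
        return StatsSnapshot({k: v - other.counters.get(k, 0)
                              for k, v in self.counters.items()})

    def __repr__(self):
        return f"StatsSnapshot({self.counters})"

before = StatsSnapshot({"reads": 100, "writes": 40})
after = StatsSnapshot({"reads": 180, "writes": 55})
print(after - before)   # the per-interval delta: reads 80, writes 15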

In thinking about this now, one thing that strikes me is that an advantage of operators over function calls is that operators tend to be written with whitespace, whereas function calls often run everything together in a hard-to-read blur. We know that whitespace helps readability, so if we're going to lean heavily on function calls in a language (including in the form of method calls), perhaps we should explore ways of adding whitespace to them. But I'm not sure whitespace alone is quite enough, since operators also stand out visually in a way that letters don't.

(I believe this is where a number of functional languages poke their heads up.)

programming/SomeOverloadingViews written at 00:39:36
