Wandering Thoughts

2018-04-17

Go and the pragmatic problems of having a Python-like with statement

In a comment on my entry on finalizers in Go, Aneurin Price asked:

So there's no deterministic way to execute some code when an object goes out of scope? Does Go at least have something like Python's "with" statement? [...]

For those who haven't seen it, the Python with statement is used like this:

with open("output.txt", "w") as fp:
    ... do things with fp ...

# fp is automatically closed by
# the time we get here.

Python's with gives you reliable and automatic cleanup of fp or whatever resource you're working with inside the with block. Your code doesn't have to know anything or do anything; all of the magic is encapsulated inside with and things that speak its protocol.

Naturally, Go has no equivalent; sure, we have the defer statement, but it's not anywhere near the same thing. In my opinion this is the right call for Go, because of two issues you would have if you tried to have something like Python's with in Go.

The obvious issue is that you would need some sort of protocol to handle initialization and cleanup, which would be a first for Go. You need the protocol because a big point of Python's with is that it magically handles everything for you without you having to remember to write any extra code; it's part of the point that using with is easier and shorter than trying to roll your own version (which encourages people to use it). If you're willing to write extra code, Go has everything today in the form of defer.
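To make the comparison concrete, here's the Python example redone with defer. This is a minimal sketch of my own (the writeOutput function and its error handling aren't from anywhere in particular):

package main

import (
    "fmt"
    "os"
)

func writeOutput() error {
    fp, err := os.Create("output.txt")
    if err != nil {
        return err
    }
    // Unlike Python's with, nothing here is implicit; we have to
    // remember to write this cleanup line ourselves.
    defer fp.Close()

    _, err = fmt.Fprintln(fp, "do things with fp")
    return err
}

func main() {
    if err := writeOutput(); err != nil {
        fmt.Fprintf(os.Stderr, "writeOutput: %s\n", err)
        os.Exit(1)
    }
}

The cleanup is just as reliable as Python's, but only because we remembered to ask for it.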

But beyond that there is a broader philosophical issue that's exposed by Aneurin Price's first question. In a language like Go where your local data may escape into functions you call, what does it mean for something to go out of scope? One answer is that things only go out of scope when there's no remaining reference to them. Unfortunately I believe that this is more or less impossible to implement efficiently without either going to Rust's extremes of ownership tracking in the language or forcing a reference counting garbage collector (where you know immediately when something is no longer referenced). This leaves you with the finalizer problem, where you're not actually cleaning up the resource promptly.

The other answer is that 'going out of scope' simply means 'execution reaches the end of the relevant block'. As in Python, you always invoke the cleanup actions at this point regardless of whether your resource may have escaped into things you've called and thus may still be alive somewhere. This implicit, hidden cleanup is a potentially dangerous trap for your code; if you forget and pass the resource to something that retains a reference to it, you may get explosions (much) later when that now-dead resource is used. If you're lucky, this use is deterministic so you can find it in tests. If you're unlucky, this use only happens in, say, an error path.

Using defer instead of an implicit cleanup doesn't stop this problem from happening, but it makes explicit what's going on. When you write or see a defer fp.Close(), you're pointedly reminded that at the end of the function, the resource will be dead. There is no implicit magic, only explicit actions, and hopefully this creates enough warning and awareness. Given Go's design goals, being explicit here as part of the language design makes complete sense to me. You can still get it wrong, but at least the wrongness is more visible.
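Here's a contrived sketch of getting it wrong (readLines and everything in it is my own illustration, not code from anywhere real):

package main

import (
    "bufio"
    "fmt"
    "os"
)

// readLines hands back a *bufio.Scanner that still references fp,
// but the deferred Close means fp is dead by the time the caller
// uses the scanner.
func readLines(path string) (*bufio.Scanner, error) {
    fp, err := os.Open(path)
    if err != nil {
        return nil, err
    }
    // The explicit defer is the visible warning that fp will be
    // closed when readLines returns, escaped reference or not.
    defer fp.Close()
    return bufio.NewScanner(fp), nil
}

func main() {
    scanner, err := readLines("input.txt")
    if err != nil {
        os.Exit(1)
    }
    // Scan() fails here because fp is already closed; since we
    // never check scanner.Err(), the loop silently does nothing.
    for scanner.Scan() {
        fmt.Println(scanner.Text())
    }
}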

(I don't think being explicit is necessarily better in general than Python's implicit magic. Go and Python are different languages with different goals; what's appropriate for one is not necessarily appropriate for the other. Python has both language features and cultural features that make with a good thing for it.)

GoVersusPythonWith written at 00:40:28

2018-04-06

Using Go finalizers can be a better option than not using them

Go has finalizers, which let you have some code be invoked just as an object is about to be garbage collected. However, plenty of people don't like them and the usual advice is to completely avoid them (for example). Recently, David Crawshaw wrote The Tragedy of Finalizers (via), in which he points out various drawbacks of finalizers and shows a case where relying on them causes failures. I more or less agree with all of this, but at the same time, I've used finalizers myself in a Go package for accessing Solaris/Illumos kstats, and I'll defend that usage.

What I use finalizers for is to avoid an invisible leak if people don't use my API correctly. In theory when you call my package you get back a magic token, which holds the only reference to some C-allocated memory. When you're done with the token, you're supposed to call a method to close it down, which will free the C-allocated memory. In practice, well, people make API usage and object lifetime mistakes. Without a finalizer, if a token went out of scope and was lost to garbage collection we'd permanently leak that C-allocated memory. As with all memory and resource leaks of this nature, this would be an especially annoying and pernicious leak because it would be completely invisible from the Go level. None of the usual Go level memory leak tools would help you at all (and I suspect that the usual C leak finding tools would have serious problems due to the presence of Go).
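In outline, the pattern looks something like the following sketch. The names and the C allocation are made up for illustration; this is not the real package's code:

package kstat

// #include <stdlib.h>
import "C"

import (
    "runtime"
    "unsafe"
)

// Token holds the only reference to some C-allocated memory.
type Token struct {
    cdata unsafe.Pointer
}

func Open() *Token {
    t := &Token{cdata: C.malloc(1024)}
    // If the caller loses the token without calling Close, the
    // finalizer (eventually) frees the C memory instead of leaking
    // it permanently and invisibly.
    runtime.SetFinalizer(t, (*Token).Close)
    return t
}

// Close frees the C-allocated memory. Callers are supposed to call
// it themselves when they're done with the token.
func (t *Token) Close() {
    if t.cdata != nil {
        C.free(t.cdata)
        t.cdata = nil
        runtime.SetFinalizer(t, nil)
    }
}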

At one level, using a finalizer here is a pragmatic decision; it protects people using my package from certain usage errors that would cause problems that are hard to deal with. At another level, though, I can argue that using finalizers here is actually within the broad spirit of Go. As a garbage collected language, Go has essentially made a decision that explicitly managing object lifetimes is too hard, too much work, and too error-prone. It's a bit peculiar to be perfectly fine with this for memory, but not fine with this for other resources for anything other than purely pragmatic reasons.

(At the same time, those pragmatic reasons are real; as David Crawshaw explains, relying on memory garbage collection to garbage collect other resources before you run out of them is at best dangerous. Even my case is a bit dubious, since C-allocated memory doesn't apply pressure to the Go garbage collector.)

David Crawshaw followed up his article with Sharp-Edged Finalizers in Go, where he advocated using finalizers in this situation to force panics when people fail to use your APIs correctly. You can do this, but it feels somewhat un-Go-like to me. As a result I think you should only resort to this if the consequences of not using your API correctly are quite severe (for example, potential data loss because you forgot to commit a database transaction and then check for errors in it).
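In sketch form, a sharp-edged finalizer might look like this (my own illustration of the idea, with a made-up db package and Tx type, not code from his article):

package db

import "runtime"

// Tx is a made-up transaction type for illustration.
type Tx struct {
    finished bool
}

func Begin() *Tx {
    tx := &Tx{}
    // A panic in a finalizer is unrecoverable, so abandoning a
    // transaction crashes the whole program instead of silently
    // losing data.
    runtime.SetFinalizer(tx, func(tx *Tx) {
        if !tx.finished {
            panic("db: Tx garbage collected without Commit or Rollback")
        }
    })
    return tx
}

func (tx *Tx) Commit() error {
    tx.finished = true
    runtime.SetFinalizer(tx, nil)
    return nil
}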

As a general note, I wouldn't say that my sort of use of finalizers is intended to avoid resource leaks as such. You will have a resource leak in practice from the time when you stop needing the resource (the kstat token, the open file, or what have you) until the Go garbage collector calls your finalizer (if it ever does), because the resource is still there but neither in use nor wanted. What finalizers do is make that leak theoretically a temporary one, instead of definitely permanent. In other words, it's a recoverable leak instead of an unrecoverable one.

PS: This isn't original to me, of course. For example, this unofficial Go FAQ says that this is the main use of finalizers, and there's the example of the *os.File finalizer in the standard library.

(This has been on my mind for a while, but David Crawshaw's articles provide a convenient prompt and I hadn't thought of using finalizers to force a hard error in this situation.)

GoFinalizersStopLeaks written at 01:32:14

2018-03-02

Frequent versus infrequent developers (in languages and so on)

Yesterday I mentioned the phrase 'infrequent developer' in an aside in my entry. Today I'm writing about what I mean by that and by its opposite, the frequent developer, and why I care about this.

What I'm calling frequent developers here in the context of, say, a language (such as Go) are people who routinely work with code or programs written in that language. When you're a frequent developer, you naturally develop expertise in that language's operation and often a development environment for it, because you use it often. You know the commands, you remember their options (or at least the ones that you need), you've run into some of the somewhat obscure corners and things that can go wrong. You know your way around things. You'll naturally learn and master even relatively complex procedures.

For a frequent developer, setting up and running some special piece of software to help work on the language is both okay and perfectly sensible. It may take a bit more time to learn and operate, but you use things frequently enough that the extra overhead is only a small portion of the time you spend dealing with the language. It's worth setting up caches and CI and so on, because you'll get enough benefit out of them. You are well up the XKCD 'is it worth the time' table. Frequent developers tend to accumulate a halo of tools that make their lives easier and often improve their results; they know about the linters, the checkers, the formatters, and so on.

An infrequent developer is someone who does not fit this profile. Sure, they have some software written in Go, or Python, or using Django, or whatever, but mostly it sits there working and they don't have to think about it very often. They only modify it or rebuild it or update its dependencies or the like once in a while. Since they're only occasional users of a language environment, infrequent developers generally don't maintain expertise in the finer details of the language's operation, although they can probably remember (or look up) how to do the common things and the basics. They won't remember how to deal with the unusual cases, and in fact may never have run into them. Complex procedures will probably have to be re-learned nearly every time they're needed (or re-Googled for).

Since infrequent developers spend relatively little time dealing with the language, setting up and running additional pieces of software is a much higher overhead for them and is generally not worth it if they have a choice. They get hit on both sides compared to frequent developers; they're less familiar with the software so working on it takes longer, and they use the language much less, so the same amount of absolute time spent on additional software is proportionally much higher. Infrequent developers object strongly to things like 'just run this caching proxy, it only takes a bit of time to manage'. Overheads that are small to frequent developers loom very big for infrequent ones. Infrequent developers usually do not have the halo of tools that frequent developers do, and mostly stick to the basics (and as a result they miss out on various things).

It's quite easy and natural for a language community to think first and foremost about frequent developers. Frequent developers are your most active and best users, and generally they are the ones that talk to you most, have the most to say, and are the best informed about the current state of affairs and what their options are. But at the same time, focusing on frequent developers is a limited point of view and will cause you to miss what causes pain for infrequent developers. Worse, it can cause you to design only for frequent developers.

If you're only thinking about frequent developers, it's easy to create a system that assumes that of course people will set up this or that software, or that some particular pain point doesn't really matter because everyone will have tools that cover it over, or that a complex procedure is the right answer because of the power it exposes. To pick on something other than Go, it won't matter that your language refuses to mix spaces and tabs because everyone can just run an editor plugin to fix it automatically (or to automatically indent only with spaces).

(As far as complex procedures go, well, Git is famously full of them. And I say this as someone who considers himself in the 'frequent developer' camp with git, including having tools for dealing with it.)

As I mentioned in my aside yesterday, I have wound up feeling that the perspective of these infrequent developers is often overlooked and not widely heard from. I think that this is not a great thing; to summarize why, there are probably more infrequent developers for any popular language than you might expect.

(The perspective of infrequent developers is similar to beginners in the language, but I don't think it's quite the same and I'm not sure that being beginner friendly will make you friendly to infrequent developers too.)

FrequentVsInfrequentDevs written at 22:50:43

A sysadmin's perspective on Go vendoring and vgo

One big thing in the Go world lately has been Russ Cox's writings on adding package versioning to the core of Go through what is currently being called Versioned Go, vgo for short. His initial plans were for vgo to completely drop Go's current vendoring feature. If you wanted to capture a local copy of your external dependencies, you would have to set up your own proxy server (per his article on modules, vgo would come with one). According to the vgo & vendoring golang-dev thread (via), opinions have since changed on this and the Go team accepts that some form of vendoring will stay. My interest in vendoring is probably different from what normal Go developers care about, so I want to explain my usage case, why vendoring is important to us, and why the initial proxy solution would not have made me very happy.

We are never going to be doing ongoing Go development, with a nice collection of Go programs and tooling that we work on regularly and build frequently. Instead, we're going to have a few programs written in Go because Go is the right language (enough so to overcome our usual policy against it). If we're going to have local software in a compiled language, we need to be able to rebuild it on demand, just in case (otherwise it's a ticking time bomb). More specifically, we want people who aren't Go specialists to be able to reliably rebuild the program following some simple and robust process. The closer the process is to 'copy this directory tree into /tmp, cd there, and run a standard <something> build command', the better.

Today you can get most of the way there with vendoring, but as I discovered, this only works if you're working from within a $GOPATH. This is less than ideal because it means that the build instructions are more involved than 'cd here and run go build'. However, setting up a $GOPATH is a lot better than having to find and run an entire proxy just to satisfy vgo. A proxy makes sense if you routinely build Go programs (and running it in that case is not a big deal), but we're only likely to be building this program (or any Go program) once every few years. Adding an entire daemon that we have to run in order to do our builds would not make us happy, and even magic $GOPROXY settings would be kind of a pain (especially if we had to manually populate and maintain the cache directory).

The good news for me is that Russ Cox's posting in golang-dev is pretty much everything I want here. It appears to let me create entirely self contained directory trees (of source code, with no magic binary files) that include the full external dependencies and that can be built with a simple standard command with no setup required.

(This entry was basically overtaken by events. When Russ Cox published his series of articles, my immediate reaction was that I hated losing vendoring and I was going to object loudly in an entry. Now that the Go team has already had enough feedback to change their minds, the entry is less objecting and more trying to explain why I care about this and describe our somewhat unusual perspective on things as what I'll call 'infrequent developers', a perspective that I think is often not widely heard from.)

GoVendoringAndVgo written at 01:47:16

2018-02-28

egrep's -o argument is great for extracting unusual fields

In Unix, many files and other streams of text are nicely structured so you can extract bits of them with straightforward tools like awk. Fields are nicely separated by whitespace (or by some simple thing that you can easily match on), the information you want is only in a single field, and the field is at a known and generally fixed offset (either from the start of the line or the end of the line). However, not all text is like this. Sometimes it's because people have picked bad formats. Sometimes it's just because that's how the data comes to you; perhaps you have full file paths and you want to extract one component of the path that has some interesting characteristic, such as starting with a '.'.

For example, recently we wanted to know if people here stored IMAP mailboxes in or under directories whose name started with a dot, and if they did, what directory names they used. We had full paths from IMAP subscriptions, but we didn't care about the whole path, just the interesting directory names. Tools like awk are not a good match for this; even with 'awk -F/' we'd have to dig out the fields that start with a dot.

(There's a UNIX pipeline solution to this problem, of course.)

Fortunately, these days I have a good option for this, and that is (e)grep's -o argument. I learned about it several years ago due to a comment on this entry of mine, and since then it's become a tool that I reach for increasingly often. What -o does is described by the manpage this way (for GNU grep):

Print only the matched (non-empty) parts of a matching line, with each such part on a separate output line.

What this really means is extract and print regular expression based field(s) from the line. The straightforward use is to extract a full field, for example:

egrep -o '(^|/)\.[^/]+' <filenames

This extracts just the directory name of interest (or I suppose the file name, if there is a file that starts with a dot). It also shows that we may need to post-process the result of an egrep -o field extraction; in this case, some of the names will have a '/' on the front and some won't, and we probably want to remove that /.

Another trick with egrep -o is to use it to put fields into consistent places. Suppose that our email system's logs have a variety of messages that can be generated when a sending IP address is in a DNS blocklist. The full log lines vary but they all contain a portion that goes 'sender IP <IP> [stuff] in <DNSBL>'. We would like to extract the sender IP address and perhaps the DNSBL. Plain 'egrep -o' doesn't do this directly, but it will put the two fields we care about into consistent places:

egrep -o 'sender IP .* in .*(DNSBL1|DNSBL2)' <logfile |
   awk '{print $3, $(NF)}'

Another option for extracting fields from in the middle of a large message is to use two or more egreps in a pipeline, with each egrep successively refining the text down to just the bits you're interested in. This is useful when the specific piece you're interested in occurs at some irregular position inside a longer portion that you need to use as the initial match.

(I'm not going to try to give an example here, as I don't have any from stuff I've done recently enough to remember.)

Since you can use grep with multiple patterns (by providing multiple -e arguments), you can use grep -o to extract several fields at once. However, the limitation of this is that each field comes out on its own line. There are situations where you'd like all fields from one line to come out on the same line; basically you want the original line with all of the extraneous bits removed (all of the bits except the fields you care about). If you're in this situation, you probably want to turn to sed instead.

In theory sed can do all of these things with the s substitution command, regular expression sub-expressions, and \<N> substitutions to give back just the sub-expressions. In practice I find egrep -o to almost always be the simpler way to go, partly because I can never remember the exact way to make sed regular expressions do sub-expression matching. Perhaps I would if I used sed more often, but I don't.

(I once wrote a moderately complex and scary sed script mostly to get myself to use various midlevel sed features like the pattern space. It works and I still use it every day when reading my email, but it also convinced me that I didn't want to do that again and sed was a write-mostly language.)

In short, any time I want to extract a field from a line and awk won't do it (or at least not easily), I turn to 'egrep -o' as the generally quick and convenient option. All I have to do is write a regular expression that matches the field, or enough more than the field that I can then either extract the field with awk or narrow things down with more use of egrep -o.

PS: grep -o is probably originally a GNU-ism, but I think it's relatively widespread. It's in OpenBSD's grep, for example.

PPS: I usually think of this as an egrep feature, but it's in plain grep and even fgrep (and if I think about it, I can actually think of uses for it in fgrep). I just reflexively turn to egrep over grep if I'm doing anything complicated, and using -o definitely counts.

Sidebar: The Unix pipeline solution to the filename problem

In the spirit of the original spell implementation:

tr '/' '\n' <fullpaths | grep '^\.' | sort -u

All we want is path components, so the traditional Unix answer is to explode full paths into path components (done here with tr). Once they're in that form, we can apply the full power of normal Unix things to them.

EgrepOFieldExtraction written at 23:29:47

2018-02-23

Github and publishing Git repositories

Recently I got into a discussion on Twitter where I mentioned that I'd like a simple way to publish Git repositories on my own web server. You might reasonably ask why I need such a thing, since Github exists and I even use it. For me, a significant part of the answer is social. To put it one way, Github has become a little bit too formal, or at least I perceive it as having done so.

What has done this to Github is that more and more, people will look at your Github presence and form judgements based on what they see. They will go through your list of repositories and form opinions, and then look inside some of the repositories and form more opinions. At least part of this exploration is natural and simply comes from stumbling over something interesting; more than once, I've wound up on someone's repository and wondered what else they work on and if there's anything interesting there. But a certain amount of it is the straightforward and logical consequence of the common view that Github is part of your developer resume. We curate our resumes, and if our Github presence is part of that, well, we're going to curate that too. A public portfolio of work always tries to put your best foot forward, and even if that's not necessarily my goal with my Github presence, I still know that that's how people may take it.

All of this makes me feel uncomfortable about throwing messy experiments and one-off hacks up on Github. If nothing else, they feel like clutter that gets in the way of people seeing (just) the repositories that I'm actively proud of, want to attract attention to, and think that people might find something useful in. Putting something up on Github just so people can get a copy of it feels not so much wrong as out of place; that's not what I use my Github presence for.

(A strongly related issue is the signals that I suspect your Github presence sends when you file issues in other people's Github repositories. Some of the time people are going to look at your profile, your activities, and your repositories to assess your clue level, especially if you're reporting something tangled and complex. If you want people to take your issues seriously, a presence that signals 'I probably know what I'm doing' is pretty useful.)

A separate set of Git repositories elsewhere, in a less formal space, avoids all of these issues. No one is going to mistake a set of repositories explicitly labeled 'random stuff I'm throwing up in case people want to look' for anything more than that, and to even find it in the first place they would have to go on a much more extensive hunt than it takes to get to my Github presence (which I do link in various places because, well, it's my Github presence, the official place where I publish various things).

Sidebar: What I want in a Git repository publishing program

The minimal thing I need is something you can do git clone and git pull from, because that is the very basic start of publishing a Git repository. What I'd like is something that also gave a decent looking web view as well, with a description and showing a README, so that people don't have to clone a repository just to poke around in it. Truly ideal would be also providing tarball or zip archive downloads. All of this should be read-only; accepting git push and other such operations is an anti-feature.

It would be ideal if the program ran as a CGI, because CGIs are easy to manage and I don't expect much load. I'll live with a daemon that runs via FastCGI, but it can't be its own web server unless it can work behind another web server via a reverse proxy, since I already have a perfectly good web server that is serving things I care a lot more about.

(Also, frankly I don't trust random web server implementations to do HTTPS correctly and securely, and HTTPS is no longer optional. Doing HTTPS well is so challenging that not all dedicated, full scale web servers manage it.)

It's possible that git http-backend actually does what I want here, if I can set it up appropriately. Alternately, maybe cgit is what I want. I'll have to do some experimentation.

GithubAndGitRepoPublishing written at 00:59:49

2018-02-21

Sorting out what exec does in Bourne shell pipelines

Today, I was revising a Bourne shell script. The original shell script ended by running rsync with an exec like this:

exec rsync ...

(I don't think the exec was there for any good reason; it's a reflex.)

I was adding some filtering of errors from rsync, so I fed its standard error to egrep and in the process I removed the exec, so it became:

rsync ... 2>&1 | egrep -v '^(...|...)'

Then I stopped to think about this, and realized that I was working on superstition. I 'knew' that combining exec and anything else didn't work, and in fact I had a memory that it caused things to malfunction. So I decided to investigate a bit to find out the truth.

To start with, let's talk about what we could think that exec did here (and what I hoped it did when I started digging). Suppose that you end a shell script like this:

#!/bin/sh
[...]
rsync ... 2>&1 | egrep -v '...'

When you run this shell script, you'll wind up with a hierarchy of three processes; the shell is the parent process, and then generally the rsync and the egrep are siblings. Linux's pstree will show the shell with rsync and egrep as its two children, and my favorite tool shows it like so:

pts/10   |      17346 /bin/sh thescript
pts/10    |     17347 rsync ...
pts/10    |     17348 egrep ...

If exec worked here the way I was sort of hoping it would, you'd get two processes instead of three, with whatever you exec'd (either the rsync or the egrep) taking over from the parent shell process. Now that I think about it, there are some reasonably decent reasons to not do this, but let's set that aside for now.

What I had a vague superstition of exec doing in a pipeline was that it might abruptly truncate the pipeline. When it got to the exec, the shell just did what you told it to, ie exec the process, and since it had turned itself into that process it didn't go on to set up the rest of the pipeline. That would make 'exec rsync ... | egrep' be the same as just 'exec rsync ...', with the egrep effectively ignored. Obviously you wouldn't want that, hence me automatically taking the exec out.

Fortunately this is not what happens. What actually does happen is not quite that the exec is ignored, although that's what it looks like in simple cases. To understand what's going on, I had to start by paying careful attention to how exec is described, for example in Dash's manpage:

Unless command is omitted, the shell process is replaced with the specified program [...]

I have emphasized the important bit. The magic trick is what 'the shell process' is in a pipeline. If we write:

exec rsync ... | egrep -v ...

When the shell gets to processing the exec, what it considers 'the shell process' is actually the subshell running one step of the pipeline, here the subshell that exists to run rsync. This subshell is normally invisible here because for simple commands like this, the (sub)shell will immediately exec() rsync anyway; using exec just instructs this subshell to do what it was already going to do.

We can cause the shell to actually materialize a subshell by putting multiple commands here:

(/bin/echo hi; sleep 120) | cat

If you look at the process tree for this, you'll probably get:

pts/9    |      7481 sh
pts/9     |     7806 sh
pts/9      |    7808 sleep 120
pts/9     |     7807 cat

The subshell making up the first step of the pipeline could end by just exec()ing sleep, but it doesn't (at least in Dash and Bash); once the shell has decided to have a real subshell here, it stays a real subshell.

If you use exec in the context of such an actual subshell, it will indeed replace 'the shell process' of the subshell with the command you exec:

$ (exec echo hi; echo ho) | cat
hi
$

The exec replaced the entire subshell with the first echo, and so it never went on to run the second echo.

(Effectively you've arranged for an early termination of the subshell. There are probably times when this is useful behavior as part of a pipeline step, but I think you can generally use exit and what you're actually doing will be clearer.)

(I'm sure that I once knew all of this, but it fell out of my mind until I carefully worked it out again just now. Perhaps this time around it will stick.)

Sidebar: some of this behavior can vary by shell

Let's go back to '(/bin/echo hi; sleep 120) | cat'. In Dash and Bash, the first step's subshell sticks around to be the parent process of sleep, as mentioned. Somewhat to my surprise, both the Fedora Linux version of official ksh93 and FreeBSD 10.4's sh do optimize away the subshell in this situation. They directly exec the sleep, as if you wrote:

(/bin/echo hi; exec sleep 120) | cat

There's probably a reason that Bash skips this little optimization.

BourneExecInPipeline written at 22:30:27

2018-02-05

I should remember that sometimes C is a perfectly good option

Recently I found myself needing a Linux command that reported how many CPUs are available for you to use. On Linux, the official way to do this is to call sched_getaffinity and count how many 1 bits are set in the CPU mask that you get back. My default tool for this sort of thing these days is Go and I found some convenient support for this (in the golang.org/x/sys/unix package), so I wrote the obvious Go program:

package main

import (
    "fmt"
    "os"
    "golang.org/x/sys/unix"
)

func main() {
    var cpuset unix.CPUSet
    err := unix.SchedGetaffinity(0, &cpuset)
    if err != nil {
        fmt.Printf("numcpus: cannot get affinity: %s\n", err)
        os.Exit(1)
    }
    fmt.Printf("%d\n", cpuset.Count())
}

This compiled, ran on most of our machines, and then reported an 'invalid argument' error on some of them. After staring at strace output for a while, I decided that I needed to write a C version of this so I understood exactly what it was doing and what I was seeing. I was expecting this to be annoying (because it would involve writing code to count bits), but it turns out that there's a set of macros for this so the code is just:

#define _GNU_SOURCE
#include    <sched.h>
#include    <unistd.h>
#include    <stdio.h>
#include    <stdlib.h>

#define MAXCPUS 0x400

int main(int argc, char **argv) {
    cpu_set_t *cpuset;
    cpuset = CPU_ALLOC(MAXCPUS);

    if (sched_getaffinity(0, CPU_ALLOC_SIZE(MAXCPUS), cpuset) < 0) {
        fprintf(stderr, "numcpus: sched_getaffinity: %m\n");
        exit(1);
    }
    printf("%d\n", CPU_COUNT(cpuset));
}

(I think I have an unnecessary include file in there but I don't care. I spray standard include files into my C programs until the compiler stops complaining. Also, I'm using a convenient glibc printf() extension since I'm writing for Linux.)

This compiled, worked, and demonstrated that what I was seeing was indeed a bug in the x/sys/unix package. I don't blame Go for this, by the way. Bugs can happen anywhere, and they're generally more likely to happen in my code than in library code (that's one reason I like to use library code whenever possible).

The Go version and the C version are roughly the same number of lines and wound up being roughly as complicated to write (although the C version fails to check for an out of memory condition that's extremely unlikely to ever happen).

The Go version builds to a 64-bit Linux binary that is 1.1 Mbytes on disk. The C version builds to a 64-bit Linux binary that is 5 Kbytes on disk.

(This is not particularly Go's fault, lest people think that I'm picking on it. The Go binary is statically linked, for example, while the C version is dynamically linked; statically linking the C version results in an 892 Kbyte binary. Of course, in practice it's a lot easier to dynamically link and run a program written in C than in anything else because glibc is so pervasive.)

When I started writing this entry, I was going to say that what I took from this is that sometimes C is the right answer. Perhaps it is, but that's too strong a conclusion for this example. Yes, the C version is the same size in source code and much smaller as a binary (and that large Go binary does sort of offend my old time Unix soul). But if the Go program had worked I wouldn't have cared enough about its size to write a C version, and if the CPU_SET macros didn't exist with exactly what I needed, the C version would certainly have been more annoying to write. And there is merit in focusing on a small set of tools that you like and know pretty well, even if they're not the ideal fit for every situation.

But still. There is merit in remembering that C exists and is perfectly useful, and that many things, especially low-level operating system things, are probably quite direct to do in C. I could probably write more C than I do, and sometimes it might be no more work than doing it in another language. And I'd get small binaries, which a part of me cares about.

(At the same time, these days I generally find C to be annoying. It forces me to care about things that I mostly don't want to care about any more, like memory handling and making sure that I'm not going to blow my foot off.)

PS: I'm a little bit surprised and depressed that the statically linked C program is so close to the Go program in size, because the Go program includes a lot of complex runtime support in that 1.1 Mbytes (including an entire garbage collector). The C program has no such excuses.

CSometimesGoodAnswer written at 23:34:16

2018-01-28

Adding 'view page in no style' to the WebExtensions API in Firefox Quantum

After adding a 'view page in no style' toggle to the Firefox context menu, I still wasn't really satisfied. Having it in the context menu is better than nothing, but I'm very used to using a gesture for it and I really wanted that in Firefox Quantum (even if I'm not using Firefox 57+ today, I'm going to have to upgrade someday). My first hope was that the WebExtensions API exposed some way to call Firefox's sendAsyncMessage(), so I skimmed through the Firefox WebExtensions API documentation. Unsurprisingly, Firefox does not expose such a powerful internal API to WebExtensions, but I did find browser.tabs.toggleReaderMode(). At this point I had a dangerous thought: how difficult would it be to add a new WebExtensions API to Firefox?

It turns out that it's not that difficult and here is how I did it, building on the base of my context menu hack. Since I already have the core code to toggle 'view page in no style' on and off, all I need is a new API function. The obvious place to start is with how toggleReaderMode is specified and implemented, because it should be at least very close to what I need.

Unfortunately there are a lot of mentions of 'toggleReaderMode' in various files, so we need to look around and guess a bit:

; cd mozilla-central
; rg toggleReaderMode
[...]
browser/components/extensions/schemas/tabs.json
991:        "name": "toggleReaderMode",
[...]

The WebExtensions API has to be defined somewhere and looking at tabs.json pretty much confirms that this is it, as you can see from the full listing. The relevant JSON definition starts out:

{
  "name": "toggleReaderMode",
  "type": "function",
  "description": "Toggles reader mode for the document in the tab.",
  "async": true,
  "parameters": [
  [...]

It's notable that this doesn't seem to define any function name to be called when this API point is invoked, which suggests that the function just has the same name. So we can search for some function of that name:

; rg toggleReaderMode
[...]
browser/components/extensions/ext-tabs.js
1045:        async toggleReaderMode(tabId) {
[...]

This is very likely to be what we want, given the file path, and indeed it is. Looking at the full source to toggleReaderMode() shows that it works roughly how I would expect, in that it winds up calling sendAsyncMessage("Reader:ToggleReaderMode"). This means that all we need to turn it into our new toggleNoStyleMode() API is the trivial modification of changing the async message sent, and of course the name.

First, the implementation, more or less blindly copied from toggleReaderMode:

// <cks>: toggle View Page in No Style
async toggleNoStyleMode(tabId) {
  let tab = await promiseTabWhenReady(tabId);
  tab = getTabOrActive(tabId);

  tab.linkedBrowser.messageManager.sendAsyncMessage("PageStyle:Toggle");
},

toggleReaderMode() has code to deal with the possibility that the tab can't be put in Reader mode, which we've taken out; as before, we toggle things unconditionally without caring if it's applicable. This is probably not what you should do for a proper API, but this is a hack.

Having implemented the API function, we now need to make it part of the actual WebExtensions API by changing tabs.json. We use a straight copy of toggleReaderMode's API specification with the name and description changed (the latter just for neatness):

{
  "name": "toggleNoStyleMode",
  "type": "function",
  "description": "Toggles view page in no style mode for the document in the tab.",
  "async": true,
  "parameters": [
    {
      "type": "integer",
      "name": "tabId",
      "minimum": 0,
      "optional": true,
      "description": "Defaults to the active tab of the $(topic:current-window)[current window]."
    }
  ]
},

With my Firefox Quantum rebuilt with these changes included, I could now add a user script to Foxy Gestures to test and use this. Following the examples of common user scripts, especially the 'Go to URL' example, the necessary JavaScript code is:

 executeInBackground(() => {
    getActiveTab(tab => browser.tabs.toggleNoStyleMode(tab.id));
 });

(Since the tabId argument is optional and probably defaults to what I want, I could probably simplify this down to just calling browser.tabs.toggleNoStyleMode() with no argument and without the getActiveTab() dance. But I'm writing all of this mostly by superstition, so carefully copying an existing working example doesn't hurt.)

Actually doing all of this and testing it immediately showed me a significant limitation of 'view page in no style' in Firefox Quantum, which is that when Firefox disables CSS for a page this way, it really disables all CSS. Including and especially the CSS that Foxy Gestures has to inject in order to show mouse trails for mouse gestures that you're in the process of making. The gestures still work, but I have to make them blindly. This is probably okay for my usage, but it's an unfortunate limitation in general. Perhaps Mozilla would accept a bug report that View → Page Style → No Style also affects addons.

(If I want another option, uMatrix allows you to disable CSS temporarily, but it takes a lot more work. And in Quantum it might still affect addon-injected CSS; I haven't experimented.)

(This and previous entries extend, at great length, my tweets. They definitely took more time to write than it took me to actually do both hacks.)

PS: I don't know how WebExtensions permissions are set up and controlled, but apparently however it works, Foxy Gestures didn't need any new permissions to access my new API. The MDN page on WebExtensions permissions suggests that because I put my new API in browser.tabs, it's available without any special permissions.

FirefoxNewWebExtsAPI written at 22:50:42

Adding 'view page in no style' to Firefox Quantum's context menu

Yesterday I covered the limitations of Firefox's Reader mode, why I like 'view page in no style' enough to make it a personal FireGestures user script gesture, and how that's impossible in Firefox 57+ because WebExtensions doesn't expose an API for it to addons. If I couldn't have this accessible through Foxy Gestures, I decided that I cared enough to hack my personal Firefox build to add something to toggle 'view page in no style' to the popup context menu (so that it'd be accessible purely through the mouse in my setup). Today I'm going to write up more or less how I did it, as a guide to anyone else who wants to do such modifications.

An important but not widely known thing about Firefox that makes this much more feasible than it seems is that a great deal of Firefox's UI (and a certain amount of its actual functionality) is implemented in JavaScript and various readable configuration files (XUL and otherwise), not in C++ code. This makes it much easier to modify and add to these things relatively blindly, and as a bonus you're far less likely to crash Firefox or otherwise cause problems if you get something wrong. I probably wouldn't have attempted this hack if it had involved writing or modifying C++ code, but as it was I was pretty confident I could do it all in JavaScript.

First we need to find out how and where Firefox handles viewing pages in no style. When I'm spelunking in the Firefox code base, my approach is to start from a string that I know is used by the functionality (for example in a menu entry that involves whatever it is that I want) and work backward. Here the obvious string to look for is 'No Style'.

; cd mozilla-central
; rg 'No Style'
[...]
browser/locales/en-US/chrome/browser/browser.dtd
708:<!ENTITY pageStyleNoStyle.label "No Style">

(rg is ripgrep, which has become my go-to tool for this kind of thing. You could use any recursive grep command.)

Firefox tends not to directly use strings in the UI; instead it adds a layer of indirection in order to make translation easier. Looking at browser.dtd shows that it just defines some text setup stuff, with no actual code, so we want to find where pageStyleNoStyle is used. Some more searching finds browser/base/content/browser-menubar.inc, where we can find the following fairly readable configuration text:

<menupopup onpopupshowing="gPageStyleMenu.fillPopup(this);">
  <menuitem id="menu_pageStyleNoStyle"
            label="&pageStyleNoStyle.label;"
            accesskey="&pageStyleNoStyle.accesskey;"
            oncommand="gPageStyleMenu.disableStyle();"
            type="radio"/>
[...]

The important bit here is oncommand, which is the actual JavaScript that gets run to disable the page style. Unfortunately this just runs a function, so we need to search the codebase again to find the function, which is in browser/base/content/browser.js:

 disableStyle() {
   let mm = gBrowser.selectedBrowser.messageManager;
   mm.sendAsyncMessage("PageStyle:Disable");
 },

Well, that's not too helpful, since all it does is send a message that's handled by something else. We need to find where this message is handled, which we can do by searching for 'PageStyle:Disable'. That turns up browser/base/content/tab-content.js, where we find:

var PageStyleHandler = {
 init() {
   addMessageListener("PageStyle:Switch", this);
   addMessageListener("PageStyle:Disable", this);
   [...]

 receiveMessage(msg) {
   switch (msg.name) {
   [...]
     case "PageStyle:Disable":
       this.markupDocumentViewer.authorStyleDisabled = true;
       break;
   [...]

Since I wanted my new context menu entry to toggle this setting, I decided that the simple way was to add a new message for this. That needs one line in init() to register the message:

   addMessageListener("PageStyle:Toggle", this); // <cks>

and then a new case in receiveMessage() to handle it:

     // <cks>
     case "PageStyle:Toggle":
       this.markupDocumentViewer.authorStyleDisabled = !this.markupDocumentViewer.authorStyleDisabled;
       break;

(I tend to annotate my own additions and modifications to Firefox's code so that I can later clearly identify that they're my work. Possibly this case should have a longer comment so that I can later remember why it's there and what might make it unneeded in the future.)

We can definitely disable the page style by setting authorStyleDisabled to true, but it's partly a guess that we can re-enable the current style by resetting authorStyleDisabled to false. However, it's a well informed guess since Firefox 56 and before worked this way (which I know from my FireGestures user script). It's worth trying, though, because duplicating what PageStyle:Switch does would be much more complicated.

Next I need to hook this new functionality up to the context menu, which means that I have to find the context menu. Once again I'll start from some text that I know appears there:

; rg 'Save Page As'
[...]
browser/locales/en-US/chrome/browser/browser.dtd
567:<!ENTITY savePageCmd.label            "Save Page As…">

There's a number of uses of savePageCmd in the Firefox source code, because there's a number of places where you can save the page, but the one I want is in browser/base/content/browser-context.inc (which we can basically guess from the file's name, if nothing else). Here's the full menu item where it's used:

<menuitem id="context-savepage"
          label="&savePageCmd.label;"
          accesskey="&savePageCmd.accesskey2;"
          oncommand="gContextMenu.savePageAs();"/>

At this point I had a choice in how I wanted to implement my new context menu item. As we can see from inspecting the oncommand for this entry (and others), the proper way is to add a new toggleNoStyle() function to gContextMenu that sends our new PageStyle:Toggle message. The hack way is to simply write the necessary JavaScript inline in the oncommand for our new menu entry. Let's do this the proper way, which means we need to find gContextMenu and friends.

Searching for savePageAs and hidePlugin (from another context menu entry) says that they're defined in browser/base/content/nsContextMenu.js. So I added, right after hidePlugin(),

 // <cks> Toggle view page in no style
 toggleNoStyle() {
   let mm = gBrowser.selectedBrowser.messageManager;
   mm.sendAsyncMessage("PageStyle:Toggle");
 },

(This is simply disableStyle() modified to send a different asynchronous message. Using gBrowser here may or may not be entirely proper, but this is a hack and it seems to work. Looking more deeply at other code in nsContextMenu.js suggests that perhaps I should be using this.browser.messageManager instead, and indeed that works just as well as using gBrowser. I'm preserving my improper code here so you can see my mis-steps as well as the successes.)

Now I can finally add in a new context menu entry to invoke this new gContextMenu function. Since this is just a hack, I'm not going to define a new DTD entity for it so it can be translated; I'm just going to stick the raw string in, and it's not going to have an access key. I'm putting it just before 'Save Page As' for convenience and so I don't have to worry that it's in a section of the context menu that might get disabled on me. The new menu item is thus just:

<menuitem id="context-pagestyletoggle"
          label="Toggle Page Style"
          oncommand="gContextMenu.toggleNoStyle();"/>

(I deliberately made the menu string short so that it wouldn't force the context menu to be wider than it already is. Probably I could make it slightly more verbose without problems.)

After rebuilding my local Firefox Quantum tree and running it, I could test that I had a new context menu item and that it did in fact work. I even opened up the browser console to verify that my various bits of JavaScript code didn't seem to be reporting any errors.

(There were lots of warnings and errors from other code, but that's not my problem.)

This is a hack (for reasons beyond a hard-coded string). I've made no attempt to see if the current page has a style that we can or should disable; the menu entry is unconditionally available and it's up to me to use it responsibly and deal with any errors that come up. It's also arguably in the wrong position in the context menu; it should probably really go at the bottom. I just want it more accessible than that, so I don't have to move my mouse as far down.

(Not putting it at the bottom also means that I don't need to worry about how to properly interact with addons that also add context menu entries. Probably there is some easy markup for that in browser-context.inc, but I'm lazy.)

PS: My first implementation of this used direct JavaScript in my oncommand. I changed it to the more proper approach for petty reasons, namely that putting it in this entry would have resulted in an annoyingly too-wide line of code.

FirefoxNoStyleInContext written at 20:28:55
