Sorting out what exec does in Bourne shell pipelines
Today, I was revising a Bourne shell script. The original shell script
ended by running
rsync with an
exec like this:
exec rsync ...
(I don't think the
exec was there for any good reason; it's a reflexive habit.)
I was adding some filtering of errors from
rsync, so I fed its
standard error to
egrep and in the process I removed the
exec, so it became:
rsync ... 2>&1 | egrep -v '^(...|...)'
Then I stopped to think about this, and realized that I was working
on superstition. I 'knew' that mixing
exec with a pipeline didn't work, and in fact I had
a memory that it caused things to malfunction. So I decided to
investigate a bit to find out the truth.
To start with, let's talk about what we could think that
exec does here (and what I hoped it did when I started digging). Suppose that
you end a shell script like this:
#!/bin/sh
[...]
rsync ... 2>&1 | egrep -v '...'
When you run this shell script, you'll wind up with a hierarchy of
three processes; the shell is the parent process, and then generally
rsync and the
egrep are siblings. Linux's
pstree can show this three-process hierarchy, and my favorite tool shows it like so:
pts/10 | 17346 /bin/sh thescript
pts/10 |  17347 rsync ...
pts/10 |  17348 egrep ...
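As a quick sanity check of my own (not something from the original entry), you can see that a pipeline step really is a separate process by comparing PIDs; a fresh sh started inside a pipeline step reports a different PID than the parent script's shell:

```sh
#!/bin/sh
# My sketch: a shell started inside a pipeline step is a separate
# process from the parent script's shell, so their PIDs differ.
echo "parent: $$"
sh -c 'echo "pipeline step: $$"' | cat
```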
If exec worked here the way I was sort of hoping it would, you'd
get two processes instead of three, with whatever you exec'd (the
rsync or the
egrep) taking over from the parent
shell process. Now that I think about it, there are some reasonably
decent reasons to not do this, but let's set that aside for now.
What I had a vague superstition of
exec doing in a pipeline was
that it might abruptly truncate the pipeline. When it got to the
exec, the shell just did what you told it to, ie
exec'd the process,
and since it had turned itself into a process it didn't go on to
set up the rest of the pipeline. That would make '
exec rsync ... | egrep' be the same as just '
exec rsync ...', with the
egrep effectively ignored. Obviously you wouldn't want that,
hence me automatically taking the
exec out.
Fortunately this is not what happens. What actually does happen is
not quite that the
exec is ignored, although that's what it looks
like in simple cases. To understand what's going on, I had to start
by paying careful attention to how
exec is described, for example
in Dash's manpage:
Unless command is omitted, the shell process is replaced with the specified program [...]
The important bit is 'the shell process'. The magic trick is what that means in a pipeline. If we write:
exec rsync ... | egrep -v ...
When the shell gets to processing the
exec, what it considers
'the shell process' is actually the subshell running one step of
the pipeline, here the subshell that exists to run the
rsync. This
subshell is normally invisible here because for simple commands
like this, the (sub)shell will immediately
exec the command; our
exec just instructs this subshell to do what it was already
going to do.
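We can check this claim directly. Here is a little demonstration of my own (not from the original entry): a script that uses exec at the start of a pipeline and then keeps going, showing that only the pipeline step's subshell was replaced.

```sh
#!/bin/sh
# My demonstration: the exec only replaces the subshell running that
# pipeline step, so the main script carries on afterward.
exec echo hi | cat
echo "the parent shell is still here"
```

Run under dash or bash, both lines print, because the shell that exec replaced was only the one-step subshell.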
We can cause the shell to actually materialize a subshell by putting multiple commands here:
(/bin/echo hi; sleep 120) | cat
If you look at the process tree for this, you'll probably get:
pts/9 | 7481 sh
pts/9 |  7806 sh
pts/9 |   7808 sleep 120
pts/9 |  7807 cat
The subshell making up the first step of the pipeline could end by
exec'ing the
sleep, but it doesn't (at least in Dash and
Bash); once the shell has decided to have a real subshell here, it
stays a real subshell.
If you use
exec in the context of such an actual subshell, it
will indeed replace 'the shell process' of the subshell with the
command:
$ (exec echo hi; echo ho) | cat
hi
$
The exec replaced the entire subshell with the first
echo, so it never went on to run the second
echo.
(Effectively you've arranged for an early termination of the subshell.
There are probably times when this is useful behavior as part of a
pipeline step, but I think you can generally use
exit instead, and what you're
actually doing will be clearer.)
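For comparison, here is a side-by-side sketch of mine: both forms cut the subshell short after the first command, but the exit version states the intent directly.

```sh
# Both of these print only "hi"; the trailing "echo ho" never runs.
(exec echo hi; echo ho) | cat
(echo hi; exit 0; echo ho) | cat
```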
(I'm sure that I once knew all of this, but it fell out of my mind until I carefully worked it out again just now. Perhaps this time around it will stick.)
Sidebar: some of this behavior can vary by shell
Let's go back to '
(/bin/echo hi; sleep 120) | cat'. In Dash
and Bash, the first step's subshell sticks around to be the parent
of the
sleep, as mentioned. Somewhat to my surprise, both the
Fedora Linux version of official ksh93
and FreeBSD 10.4's /bin/sh
optimize away the subshell in this situation. They directly
exec the
sleep, as if you wrote:
(/bin/echo hi; exec sleep 120) | cat
There's probably a reason that Bash skips this little optimization.
(I could more or less do this before in NoScript as a one-off temporary thing, but generally it wasn't quite worth it and I always had lingering concerns. uMatrix lets me set it once and leave it, and then I get to enjoy it afterward.)
PPS: There are some setting differences that also turn out to matter, to my surprise. If you use NoScript in a default-block setup and almost always use temporary permissions, I suggest that you tell NoScript to only reload the current tab on permission changes so that the effects of temporarily allowing something are much more contained. If I had realized how much of a difference it makes, especially with NoScript's global permissions, I would have done it years ago.
Sidebar: Cookie handling also benefits from scoped permissions
I hate Youtube's behavior of auto-playing the next video when I've watched one, because generally I'm only on YouTube to watch exactly one video. You can turn this off, but to make it stick you need to accept cookies from YouTube, which will then quietly follow you around the web anywhere someone embeds some YouTube content. uMatrix's scoped permissions let me restrict when YouTube can see those cookies to only when I'm actually on YouTube looking at a video. I can (and do) do similar things with cookies from Google Search.
(I also have Self-Destructing Cookies set to throw out YouTube's cookies every time I close down Firefox, somewhat limiting the damage of any tracking cookies. This means I have to reset the 'no auto-play' cookie every time I restart Firefox, but I only do that infrequently.)
I've now received my first spam email over IPv6
One of the machines that I run my sinkhole SMTP server on has an IPv6 address, one that's currently present in DNS as the target of an MX record (not literally, of course, it's the AAAA record of the host that's the MX target). Back in October, in a somewhat different DNS setup, it saw its first IPv6 probe, and has seen a number since then. A bit over a week ago it got its first actual spam email over IPv6.
The source IPv6 address is in Asia, which doesn't surprise me; my impression is that APNIC is one of the most active areas for IPv6 usage for various reasons (including IPv4 address exhaustion). The sending host appears to also have an IPv4 address, but apparently it prefers to use IPv6 if it can (which is again not surprising). Its IPv4 address is listed in a couple of DNS blocklists, including b.barracudacentral.org, but is not in Spamhaus or the CBL.
(Spamhaus has an old policy statement about IPv6 that gives an example of querying them for IPv6 addresses. Out of curiosity I tried it for this IPv6 address and, not surprisingly, got nothing. I don't know if Spamhaus, or anyone else, is actually serving IPv6 address DNS blocklist information or if everyone has punted so far.)
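For what it's worth, IPv6 DNS blocklist queries use the same reversed-nibble form as ip6.arpa reverse DNS. Here is a sketch of mine that builds such a query name; the function name and the DNSBL zone are made up, and it assumes a fully expanded address (no '::' handling):

```sh
# Sketch: build a reversed-nibble DNSBL query name for an IPv6 address.
# $1 must be a fully expanded IPv6 address; $2 is the DNSBL zone to use.
dnsbl_name() {
    echo "$1" | tr -d ':' | awk -v zone="$2" '{
        # Emit each hex digit in reverse order, dot-separated,
        # then append the blocklist zone.
        for (i = length($0); i >= 1; i--)
            printf "%s.", substr($0, i, 1)
        print zone
    }'
}

dnsbl_name 2001:0db8:0000:0000:0000:0000:0000:0001 dnsbl.example.org
```

You would then look the resulting name up with dig or host; getting an answer means the address is listed.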
The actual spam is your standard variety advance fee fraud spam,
claiming to be from a completely unrelated email address in cox.net
with replies directed to another address at 'net-c.com'. The spam
message claims to be from someone with 'Egmont group, USA', which
probably explains the choice of cox.net as the
From: and sender address.
(The spammer probably means this Egmont group, which is plausible given the rest of the spam message, which is a typical 'we believe you have been scammed, we have some compensation to give you' thing. Since I didn't know about the Egmont group before this, I can't say that spam isn't educational.)
I have some vague thoughts on IPv6 and spam, but I've decided that they're for another entry. I have seen periodic IPv6 connections, but they appear to mostly be TLS scanners.
(My logs say that Google tried to deliver email over IPv6 back in early December, but I refused it because email from GMail to this sinkhole server is far too likely to be boring spam, usually advance fee fraud attempts. Perhaps I should declare that all spam received over IPv6 is interesting enough to capture.)
Some consumer SSDs are moving to a 4k 'advanced format' physical block size
Earlier this month I wrote an entry about consumer SSD nominal physical block sizes, because I'd noticed that almost all of the recent SSDs we had advertised a 512 byte physical block size (the exceptions were Intel 'DC' SSDs). In that entry, I speculated that consumer SSD vendors might have settled on just advertising them as 512n devices and we'd see this on future SSDs too, since the advertised 'physical block size' on SSDs is relatively arbitrary anyways.
Every so often I write a blog entry that becomes, well, let us phrase it as 'overtaken by events'. Such is the case with that entry. Here, let me show you:
$ lsblk -o NAME,TRAN,MODEL,PHY-SEC --nodeps /dev/sdf /dev/sdg
NAME TRAN MODEL            PHY-SEC
sdf  sas  Crucial_CT2050MX     512
sdg  sas  CT2000MX500SSD1     4096
The first drive is a 512n 2 TB Crucial MX300. We bought a number of them in the fall for a project, but then Crucial took them out of production in favour of the new Crucial MX500 series. The second drive is a 2 TB Crucial MX500 from a set of them that we just started buying to fill out our drive needs for the project. Unlike the MX300s, this MX500 advertises a 4096 byte physical block size and therefore demonstrates quite vividly that the thesis of my earlier entry is very false.
(I have some 750 GB Crucial MX300s and they also advertise 512n physical block sizes, which led to a ZFS pool setup mistake. Fixing this mistake is now clearly pretty important, since if one of my MX300s dies I will probably have to replace it with an MX500.)
My thesis isn't just false because different vendors have made different decisions; this example is stronger than that. These are both drives from Crucial, and successive models at that; Crucial is replacing the MX300 series with the MX500 series in the same consumer market segment. So I already have a case where a vendor has changed the reported physical block size in what is essentially the same thing. It seems very likely that Crucial doesn't see the advertised physical block size as a big issue; I suspect that it's primarily set based on whatever the flash controller being used works best with or finds most convenient.
(By today, very little host software probably cares about 512n versus 4k drives. Advanced format drives have been around long enough that most things are probably aligning to 4k and issuing 4k IOs by default. ZFS is an unusual and somewhat unfortunate exception.)
I had been hoping that we could assume 512n SSDs were here to stay because it would make various things more convenient in a ZFS world. That is now demonstrably wrong, which means that once again forcing all ZFS pools to be compatible with 4k physical block size drives is very important if you ever expect to replace drives (and you should, as SSDs can die too).
PS: It's possible that not all MX500s advertise a 4k physical block size; it might depend on capacity. We only have one size of MX500s right now so I can't tell.
Memories of MGR
I recently got into a discussion of MGR on Twitter (via), which definitely brings back memories. MGR is an early Unix windowing system, originally dating from 1987 to 1989 (depending on whether you date it from the Usenix presentation, when people got to hear about it, or the comp.sources.unix posting, when people could get their hands on it). If you know the dates for Unix windowing systems you know that this overlaps with X (both X10 and then X11), which is part of what makes MGR special and nostalgic and what gave it its peculiar appeal at the time.
MGR was small and straightforward at a time when that was not what other Unix window systems were (I'd say it was slipping away with X10 and X11, but let's be honest, Sunview was not small or straightforward either). Given that it was partially inspired by the Blit and had a certain amount of resemblance to it, MGR was also about as close as most people could come to the kind of graphical environment that the Bell Labs people were building in Research Unix.
(You could in theory get a DMD 5620, but in reality most people had far more access to Unix workstations that you could run MGR on than they did to a 5620.)
On a practical level, you could use MGR without having to set up a complicated environment with a lot of moving parts (or compile a big system). This generally made it easy to experiment with (on hardware it supported) and to keep it around as an alternative for people to try out or even use seriously. My impression is that this got a lot of people to at least dabble with MGR and use it for a while.
Part of MGR being small and straightforward was that it also felt like something that was by and for ordinary mortals, not the high peaks of X. It ran well on ordinary machines (even small machines) and it was small enough that you could understand how it worked and how to do things in it. It also had an appealingly simple model of how programs interacted with it; you basically treated it like a funny terminal, where you could draw graphics and do other things by sending escape sequences. As mentioned in this MGR information page, this made it network transparent by default.
MGR was not a perfect window system and in many ways it was a quite
limited one. But it worked well in the 'all the world's a terminal'
world of the late 1980s and early 1990s, when almost all of what
you did even with X was run
xterms, and it was often much faster
and more minimal than the (fancier) alternatives (like X), especially
on basic hardware.
Thinking of MGR brings back nostalgic memories of a simpler time in Unix's history, when things were smaller and more primitive but also bright and shiny and new and exciting in a way that's no longer the case (now they're routine and Unix is everywhere). My nostalgic side would love a version of MGR that ran in an X window, just so I could start it up again and play around with it, but at the same time I'd never use it seriously. Its day in the sun has passed. But it did have a day in the sun, once upon a time, and I remember those days fondly (even if I'm not doing well about explaining why).
(We shouldn't get too nostalgic about the old days. The hardware and software we have today is generally much better and more appealing.)
Using lsblk to get extremely useful information about disks
Every so often I need to know the serial number of a disk, generally
because it's the only way to identify one particular disk out of
two (or more) identical ones. As one example, perhaps I need to
replace a failed drive
that's one of a pair. You can get this information from the disks
with
smartctl, but the process is somewhat annoying if you
just want the serial number, especially if you want it for multiple
disks.
(Sometimes you have a dead disk so you need to find it by process of elimination starting from the serial numbers of all of the live disks.)
I've used
lsblk for some time to get disk UUIDs and raid UUIDs,
but I never looked very deeply at its other options. Recently I
discovered that
lsblk can do a lot more, and in particular it can
report disk serial numbers (as well as a bunch of other handy
information) in an extremely convenient form. It's simplest to just
show you an example:
$ lsblk -o NAME,SERIAL,HCTL,TRAN,MODEL --nodeps /dev/sd?
NAME SERIAL          HCTL    TRAN MODEL
sda  S21NNXCGAxxxxxH 0:0:0:0 sata Samsung SSD 850
sdb  S21NNXCGAxxxxxE 1:0:0:0 sata Samsung SSD 850
sdc  Zxxxxx4E        2:0:0:0 sata ST500DM002-1BC14
sdd  WD-WMC5K0Dxxxxx 4:0:0:0 sata WDC WD1002F9YZ-0
sde  WD-WMC5K0Dxxxxx 5:0:0:0 sata WDC WD1002F9YZ-0
(For obscure reasons I don't feel like publishing the full serial numbers of our disks. It might be harmless to do so, but let's not find out otherwise the hard way.)
You can get a full list of possible fields with 'lsblk --help',
along with generally what they mean, although you'll find that some
of them are less useful than you might guess.
VENDOR is always
'ATA' for me, for example, and
KNAME is the same as
NAME.
TRAN is usually 'sata', as here, but we have some
machines where it's different. Looking for a
PHY-SEC that's not
512 is a convenient way to find advanced format drives, which may be surprisingly
uncommon in some environments.
SIZE is another surprisingly handy field; if you
know you're looking for a disk of a specific size, it lets you
filter disks in and out without checking serial numbers or even the
specific model, if you have multiple different sized drives from
one vendor such as WD or Seagate.
(You want to use --nodeps to get
lsblk to just report on the devices that you
gave it and not also include their partitions, software RAID devices
that use them, and so on.)
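As an example of putting this to use, here is a little helper of mine (not from the entry; the serial number shown is a made-up placeholder): feed it 'NAME SERIAL' lines and it prints the device with a given serial.

```sh
# find_by_serial: read 'NAME SERIAL' lines on stdin and print the
# device name(s) whose serial matches the first argument.
find_by_serial() {
    awk -v want="$1" '$2 == want { print $1 }'
}

# Typical use; -n suppresses lsblk's header line, and the serial
# here is a placeholder:
#   lsblk -o NAME,SERIAL --nodeps -n /dev/sd? | find_by_serial WD-WMC5K0Dxxxxx
```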
This
lsblk output is great for summarizing all of the
disks on a machine in something that's easy to print out and use.
Pretty much everything I need to know is in one spot and I can easily
use this to identify specific drives. I'm quite happy to have
stumbled over this additional use of
lsblk, and I plan to make
much more use of it in the future. Possibly I should routinely
collect this output for my machines and save it away.
(This entry is partly to write down the list of
lsblk fields that
I find useful so I don't have to keep remembering them or sorting
through 'lsblk --help' and trying to remember the fields that are
less useful than they sound.)
How I tend to label bad hardware
Every so often I wind up dealing with some piece of hardware that's bad, questionable, or apparently flaky. Hard disks are certainly the most common thing, but the most recent case was a 10G-T network card that didn't like coming up at 10G. For a long time I was sort of casual about how I handled these; generally I'd set them aside with at most a postit note or the like. As you might suspect, this didn't always work out so great.
These days I have mostly switched over to doing this better. We have a labelmaker (as everyone should), so any time I wind up with some piece of hardware I don't trust any more, I stick a label on it to mark it and say something about the issue. Labels that have to go on hardware can only be so big (unless I want to wrap the label all over whatever it is), so I don't try to put a full explanation; instead, my goal is to put enough information on the label so I can go find more information.
My current style of label looks broadly like this (and there's a flaw in this label):
volary  2018-02-12
no 10g problem
The three important elements are the name of the server the hardware came from (or was in when we ran into problems), the date, and some brief note about what the problem was. Given the date (and the machine) I can probably find more details in our email archives, and the remaining text hopefully jogs my memory and helps confirm that we've found the right thing in the archives.
As my co-workers gently pointed out, the specific extra text on this label is less than ideal. I knew what it meant, but my co-workers could reasonably read it as 'no problem with 10G' instead of the intended meaning of 'no 10g link', ie the card wouldn't run a port at 10G when connected to our 10G switches. My takeaway is that it's always worth re-reading a planned label and asking myself if it could be misread.
A corollary to labeling bad hardware is that I should also label good hardware that I just happen to have sitting around. That way I can know right away that it's good (and perhaps why it's sitting around). The actual work of making a label and putting it on might also cause me to recycle the hardware into our pool of stuff, instead of leaving it sitting somewhere on my desk.
(This assumes that we're not deliberately holding the disks or whatever back in case we turn out to need them in their current state. For example, sometimes we pull servers out of service but don't immediately erase their disks, since we might need to bring them back.)
Many years ago I wrote about labeling bad disks that you pull out of servers. As demonstrated here, this seems to be a lesson that I keep learning over and over again, and then backsliding on for various reasons (mostly that it's a bit of extra work to make labels and stick them on, and sometimes it irrationally feels wasteful).
PS: I did eventually re-learn the lesson to label the disks in your machines. All of the disks in my current office workstation are visibly labeled so I can tell which is which without having to pull them out to check the model and serial number.
DTrace being GPL (and thrown into a Linux kernel) is just the start
The exciting news of the recent time interval comes from Mark J. Wielaard's dtrace for linux; Oracle does the right thing. To summarize the big news, I'll just quote from the Oracle kernel commit message:
This changeset integrates DTrace module sources into the main kernel source tree under the GPLv2 license. [...]
This is exciting news and I don't want to rain on anyone's parade, but it's pretty unlikely that we're going to see DTrace in the Linux kernel any time soon (either the kernel.org main tree or in distribution versions). DTrace being GPL compatible is just the minimum prerequisite for it to ever be in the main kernel, and Oracle putting it in their kernel only helps move things forward so much.
The first problem is simply the issue of integrating foreign code originally written for another Unix into the Linux kernel. For excellent reasons, the Linux kernel people have historically been opposed to what I've called 'code drops', where foreign code is simply parachuted into the kernel more or less intact with some sort of compatibility layer or set of shims. Getting them to accept DTrace is very likely to require modifying DTrace to be real native Linux kernel code that does things in the Linux kernel way and so on. This is a bunch of work, which means that it requires people who are interested in doing the work (and who can navigate the politics of doing so).
(I wrote more on this general issue when I talked about practical issues with getting ZFS into the main Linux kernel many years ago.)
Oracle could do this work, and it's certainly a good sign that they've at least got DTrace running in their own kernel. But since it is their own vendor kernel, Oracle may have just done a code drop instead of a real port into the kernel. Even if they've tried to do a port, similar efforts in the past (most prominently with XFS) took a fairly long time and a significant amount of work before the code passed muster with the Linux kernel community and was accepted into the main kernel.
A larger issue is whether DTrace would even be accepted in any form. At this point the Linux kernel has a number of tracing systems, so the addition of yet another one with yet another set of hooks and so on might not be viewed with particularly great enthusiasm by the Linux kernel people. Their entirely sensible answer to 'we want to use DTrace' might be 'use your time and energy to improve existing facilities and then implement the DTrace user level commands on top of them'. If Oracle followed through on this, we would effectively still get DTrace in the end (I don't care how it works inside the kernel if it works), but this also might cause Oracle to not bother trying to upstream DTrace. From Oracle's perspective, putting a relatively clean and maintainable patchset into their vendor kernel is quite possibly good enough.
(It's also possible that this is the right answer at a technical level. The Linux kernel probably doesn't need three or four different tracing systems that mostly duplicate each other's work, or even two systems that do. Reimplementing the DTrace language and tools on top of, say, kprobes and eBPF would not be as cool as porting DTrace into the kernel, but it might be better.)
Given all of the things in the way of DTrace being in the main kernel, getting it included is unlikely to be a fast process (if it does happen). Oracle is probably more familiar with how to work with the main Linux kernel community than SGI was with XFS, but I would still be amazed if getting DTrace into the Linux kernel took less than a year. Then it would take more time before that kernel started making it into Linux distributions (and before distributions started enabling DTrace and shipping DTrace tools). So even if it happens, I don't expect to be able to use DTrace on Linux for at least the next few years.
(Ironically the fastest way to be able to 'use DTrace' would be for someone to create a version of the language and tools that sat on top of existing Linux kernel tracing stuff. Shipping new user-level programs is fast, and you can always build them yourself.)
PS: To be explicit, I would love to be able to use the DTrace language or something like it to write Linux tracing stuff. I may have had my issues with D, but as far as I can tell it's still a far more casually usable environment for this stuff than anything Linux currently has (although Linux is ahead in some ways, since it's easier to do sophisticated user-level processing of kernel tracing results).
Some things about ZFS block allocation and ZFS (file) record sizes
As I wound up experimentally verifying,
in ZFS all files are stored as a single block of varying size up
to the filesystem's
recordsize, or using multiple recordsize
blocks. For a file under the recordsize, the block size turns
out to be a multiple of 512 bytes, regardless
of the pool's
ashift or the physical sector size of the drives
the pool is using.
Well, sort of. While everything I've written is true, it also turns out to be dangerously imprecise (as I've seen before). There are actually three different sizes here and the difference between them matters once we start getting into the fine details.
To talk about these sizes, I'll start with some illustrative
output for a file data block, as before:
0 L0 DVA=<0:444bbc000:5000> [L0 ZFS plain file] [...] size=4200L/4200P [...]
The first size of the three is the logical block size, before
compression. This is the first
size= number ('4200L' here, in hex
and L for logical). This is what grows in 512-byte units up to the
recordsize and so on.
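In other words (my arithmetic here, not the entry's): the logical block size of a small file is its size rounded up to the next 512-byte boundary, until it reaches the recordsize.

```sh
# Round a hypothetical file size up to the next 512-byte boundary,
# the way the logical block size grows (until it hits the recordsize).
filesize=100000
blocksize=$(( (filesize + 511) / 512 * 512 ))
echo "$blocksize"    # 100352
```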
The second size is the physical size after compression, if any;
this is the second
size= number ('4200P' here, P for physical).
It's a bit weird. If the file can't be compressed, it is the same
as the logical size and because the logical size goes in 512-byte
units, so does this size, even on
ashift=12 pools. However, if
compression happens this size appears to go by the
ashift, which
means it doesn't necessarily go in 512-byte units. On an
ashift=9
pool you'll see it go in 512-byte units (so you can have a compressed
size of '400P', ie 1 KB), but the same data written in an
ashift=12
pool winds up being in 4 KB units (so you wind up with a compressed
size of '1000P', ie 4 KB).
The third size is the actual allocated size on disk, as recorded
in the DVA's asize field (which
is the third subfield in the
DVA portion). This is always in
ashift-based units, even if the physical size is not. Thus you
can wind up with a 20 KB DVA but a 16.5
KB 'physical' size, as in our example (the DVA is '5000' while the
block physical size is '4200').
(I assume this happens because ZFS insures that the physical size is never larger than the logical size, although the DVA allocated size may be.)
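To make the example concrete, here is a side calculation of mine decoding the hex sizes from the zdb line above:

```sh
# 0x4200 is the logical/physical size; 0x5000 is the DVA's allocated
# size on an ashift=12 pool (4 KiB allocation units).
echo "logical/physical: $((0x4200)) bytes ($((0x4200 / 512)) 512-byte units)"
echo "allocated asize:  $((0x5000)) bytes ($((0x5000 / 4096)) 4 KiB units)"
```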
For obvious reasons, it's the actual allocated size on disk (the DVA asize) that matters for things like rounding up raidz allocation to N+1 blocks, fragmentation, and whether you need to use a ZFS gang block. If you write a 128 KB (logical) block that compresses to a 16 KB physical block, it's 16 KB of (contiguous) space that ZFS needs to find on disk, not 128 KB.
On the one hand, how much this matters depends on how compressible your data is and much modern data isn't (because it's already been compressed in its user-level format). On the other hand, as I found out, 'sparse' space after the logical end of file is very compressible. A 160 KB file on a standard 128 KB recordsize filesystem takes up two 128 KB logical blocks, but the second logical block has 96 KB of nothingness at the end and that compresses down to almost nothing.
PS: I don't know if it's possible to mix vdevs with different
ashifts in the same pool. If it is, I don't know how ZFS would
decide which
ashift to use for the physical block size. The minimum
ashift in any vdev? The maximum one?
(This is the second ZFS entry in a row where I thought I knew what was going on and it was simple, and then discovered that I didn't and it isn't.)
Writing my first addon for Firefox wasn't too hard or annoying
(The pre-WebExtensions NoScript does a lot of magic, which has good and bad aspects. uMatrix is a lot more focused and regular.)
To cut a long story short, today I wrote a Firefox WebExtensions-based addon to fix this, which I have imaginatively called gsearch-urlfix. It's a pretty straightforward fix because Google embeds the original URL in their transformed URL as a query parameter, so you just pull it out and rewrite the link to it. Sane people would probably do this as a GreaseMonkey user script, but for various reasons I decided it was simpler and more interesting to write an addon.
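The core trick can be sketched outside JavaScript too. Here is a shell rendition of mine of the same idea; the query parameter name ('url') and the sample link are illustrative, not taken from the actual addon:

```sh
# Pull the real destination out of a Google-style redirect link's
# query string. The parameter name and the link are illustrative.
link='https://www.google.com/url?sa=t&url=https%3A%2F%2Fexample.org%2Fpage&usg=abc'
echo "$link" | sed -n 's/.*[?&]url=\([^&]*\).*/\1/p'
# -> https%3A%2F%2Fexample.org%2Fpage (still percent-encoded)
```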
The whole process was reasonably easy. Mozilla has very good
documentation
that will walk you through most of the mechanics of an addon, and
it's easy enough to test-load your addon into a suitable Firefox.
The actual link rewriting was
up to me, which made me a bit worried about needing to play around
with regular expressions and string manipulation and parsing URLs,
but it turned out that JavaScript has convenient functions
and objects that do all of the hard work; all I had to do was glue
them together correctly. I had to do a little bit of debugging
because of things that I got wrong, but
console.log() worked fine
to give me my old standby of print based debugging.
There are a couple of things about your addon that
the MDN site won't tell you directly. The first is that if you want
to make your addon into an unsigned XPI and load it permanently
into your developer or nightly Firefox, it must have an explicit id
(see the example here
and the discussion here).
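For illustration, a minimal manifest.json along these lines might look like the following; the id, version, and script file name are made-up examples, not the addon's actual values:

```json
{
  "manifest_version": 2,
  "name": "gsearch-urlfix",
  "version": "0.1",
  "applications": {
    "gecko": { "id": "gsearch-urlfix@example.org" }
  },
  "content_scripts": [
    {
      "matches": ["*://www.google.com/*", "*://www.google.ca/*"],
      "js": ["fixup.js"]
    }
  ]
}
```

The "applications" block with a "gecko" id is the part that lets an unsigned XPI load permanently in developer or nightly builds.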
The second is that the
matches globs for what websites your content
scripts are loaded into cannot be used to match something like 'any
domain with .google. in it'; they're very limited.
I assume that this restriction is there because
matches feeds into the
permissions dialog for your addon.
(It's possible to have Firefox further filter what sites your content
scripts will load into, see here,
but the design of the whole system insures that your content scripts can
only be loaded into fewer websites than the user approved permissions for,
not more. If you need to do fancy matching, or even just something broad,
you'll probably have to ask for permission for all websites.)
This limitation is part of the reason why gsearch-urlfix currently only acts on www.google.com and www.google.ca; those are the two that I need and going further is just annoying enough that I haven't bothered (partly because I want to actually limit it to Google's sites, not have it trigger on anyone who happens to have 'google' as part of their website name). Pull requests are welcome to improve this.
I initially wasn't planning to submit this to AMO to be officially signed so it can be installed in normal Firefox versions; among other things, doing so feels scary and probably calls for a bunch of cleanup work and polish. I may change my mind about that, if only so I can load it into standard OS-supplied versions of Firefox that I wind up using. Also, I confess that it would be nice to not have my own Firefox nag at me about the addon being unsigned, and the documentation makes the process sound not too annoying.
(This is not an addon that I imagine there's much of an audience for, but perhaps I'm wrong.)