2020-05-22
Mixed feelings about Firefox Addons' new non-Recommended extensions warning
I don't look at addons on addons.mozilla.org very often, so I didn't know until now that Mozilla has started showing a warning on the page for many addons, such as Textern (currently). Let me just quote what I see now (more or less):
[! icon] This is not monitored for security through Mozilla's Recommended Extensions program. Make sure you trust it before installing.
Learn more
(Textern is among the Firefox addons that I use.)
This has apparently been going on since at least the start of March, per this report, or even further back (reddit), so I'm late to the party here.
On the one hand, I can see why Mozilla is doing this. Even in their more limited WebExtensions form, Firefox addons can do a great deal of damage to the security and privacy of the people who use them, and Mozilla doesn't have the people (or the interest) to audit them all or keep a close eye on what they're doing. Firefox addons aren't quite the prominent target that Chrome addons are, but things like the "Stylish" explosion demonstrate that people are happy to target Firefox too. What happened with Stylish also fairly convincingly demonstrates that requiring people to approve addon permissions isn't useful in practice, for various reasons.
On the other hand, this is inevitably going to lead to two bad outcomes. First, some number of people will be scared away from perfectly fine addons that simply aren't popular enough for Mozilla to bring them into the Recommended Extensions program. The second order consequence is that getting people to use a better version of an existing addon has implicitly gotten harder if the existing addon is a 'Recommended Extension'; yours may be better, but it also has a potentially scary warning on it.
(Arguably this is the correct outcome from a security perspective; yours may be better, but it's not necessarily enough better to make up for the increased risk of it not being more carefully watched.)
Second, some number of people will now be trained to ignore another security related warning because in practice it's useless noise to them. I think that this is especially likely if they've been directly steered to an addon by a recommendation or plug from somewhere else, and aren't just searching around on AMO. If you're searching on AMO for an addon that does X, the warning may steer you to one addon over another or sell you on the idea that the risk is too high. If you've come to AMO to install specific addon Y because it sounds interesting, well, the warning is mostly noise; it is a 'do you want to do this thing you want to do' question, except it's not even a question.
(And we know how those questions get answered; people almost always say 'yes I actually do want to do the thing I want to do'.)
Unfortunately I think this is a case where there is no good answer. Mozilla can't feasibly audit everything, they can't restrict AMO to only Recommended Extensions, and they likely feel that they can't just do nothing because of the harms to people who use Firefox Addons, especially people who don't already understand the risks that addons present.
Working out how frequently your ICMP pings fail in Prometheus
Suppose, not hypothetically, that your Prometheus setup pings a bunch of machines (through the blackbox exporter) and some of those pings seem to fail some of the time. If they fail continuously for long enough, you'll raise an alert, but beyond that you may want to know how often they've flaked out over some time period for use in a Grafana dashboard. Today, I wanted this both as a failure percentage and then as a count of how many pings had failed.
Our Blackbox setup reports ICMP results as a probe_success metric with a probe="icmp" label (among others); it has a 1 value if the probe was successful and a 0 value if it failed. For such metrics that are either 1 or 0, the classical way to determine the percentage of time they're up (or successful) is to use avg_over_time, as covered in Robust Perception's "What percentage of time is my service down for?". The straightforward PromQL query for 'what percent is this down' as a 0.0 to 1.0 value is thus:
1 - avg_over_time( probe_success{ probe="icmp" }[$__range] )
(This uses the Grafana range variable, covered here in the 'Using interval and range variables' section.)
However, I forgot this when I was initially setting up our new ping status dashboard today and used a different approach. In general if you're looking for the 0.0 to 1.0 percentage of a subset of your data, you want the subcount divided by the total count. The total count of ICMP ping probes over time is:
count_over_time( probe_success{ probe="icmp" }[$__range] )
Because successful ping probes have a value of 1, we can get the count of them with sum_over_time, making the full expression be:
1 - ( sum_over_time( probe_success{ probe="icmp" }[$__range] ) / count_over_time( probe_success{ probe="icmp" }[$__range] ) )
Of course, this 'sum / count' is just the average and so we can replace this with the more efficient avg_over_time expression.
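To convince yourself that the two forms agree, here is a minimal Python sketch of the arithmetic. The sample list and the function names are made up for illustration; this only mimics what avg_over_time, sum_over_time and count_over_time compute over a single series, and is not how Prometheus itself evaluates them.

```python
# Hypothetical probe_success samples within one range:
# 1 means the ping succeeded, 0 means it failed.
samples = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1]


def failure_fraction_avg(values):
    """Mimics: 1 - avg_over_time(probe_success[range])"""
    average = sum(values) / len(values)
    return 1 - average


def failure_fraction_sum_count(values):
    """Mimics: 1 - (sum_over_time(...) / count_over_time(...))"""
    successes = sum(values)   # sum_over_time: each success contributes 1
    total = len(values)       # count_over_time: every sample counts
    return 1 - (successes / total)


if __name__ == "__main__":
    print(failure_fraction_avg(samples))
    print(failure_fraction_sum_count(samples))
```

With two failures out of ten samples, both functions come out at roughly 0.2 (give or take floating point rounding), which is the 0.0 to 1.0 failure value the dashboard wants.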
(We don't want to use a subquery to count up how many times probe_success was zero, because a subquery won't necessarily get the same number of metric points as count_over_time will. You might even have different Blackbox ping frequencies for different targets.)
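The mismatch is easier to see with numbers. The following Python sketch is a deliberately simplified model, not real Prometheus behaviour: it assumes a subquery evaluates at fixed steps and picks up the most recent sample at or before each step, and it ignores staleness handling and step alignment. The 15 second ping interval, the 60 second subquery step and the single failed ping are all invented for the example.

```python
def value_at(t, samples):
    """Most recent sample value at or before time t (simplified model)."""
    latest = None
    for ts, value in samples:
        if ts <= t:
            latest = value
        else:
            break
    return latest


RANGE_S = 3600          # look at one hour
PING_INTERVAL_S = 15    # hypothetical Blackbox ping frequency
SUBQUERY_STEP_S = 60    # hypothetical subquery resolution

# One failed ping at t=30, everything else succeeded.
samples = [(t, 0 if t == 30 else 1) for t in range(0, RANGE_S, PING_INTERVAL_S)]

# What count_over_time and friends see: every real sample.
real_count = len(samples)
real_failed = sum(1 for _, value in samples if value == 0)

# What the modelled subquery sees: one value per step.
sub_values = [value_at(t, samples) for t in range(0, RANGE_S, SUBQUERY_STEP_S)]
sub_count = len(sub_values)
sub_failed = sum(1 for value in sub_values if value == 0)

if __name__ == "__main__":
    print(real_count, real_failed)
    print(sub_count, sub_failed)
```

In this model the real data has 240 samples with one failure, while the subquery view has only 60 points and misses the failure entirely, because the failed ping fell between two steps. A subquery step shorter than the ping interval would go wrong the other way and count the same sample more than once.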
This version of our 'percentage of pings that failed' expression points the way to giving us the total number of failed pings. This is the total number of pings minus the successful pings, which is the parts of our complicated percentage expression flipped around:
count_over_time( probe_success{ probe="icmp" }[$__range] ) - sum_over_time( probe_success{ probe="icmp" }[$__range] )
Note that this is not the amount of time that pings were failing for. In general, it's impossible to work out a completely accurate number for that for various reasons, including that we may have metric points that are missing entirely for whatever reason. If we assume that we have metric points evenly covering the entire time range, the amount of time (in seconds) which pings were failing for is the total range of time in seconds times the 0.0 to 1.0 failure percentage. In Grafana again, this would be:
$__range_s * (1 - avg_over_time( probe_success{ probe="icmp" }[$__range] ))
(There may be a better way to compute this. I haven't thought much about it because I think 'amount of time down' is misleading here in a way that 'percentage of pings that failed' is not.)
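As a worked example of both numbers, here is a small Python sketch of the failed ping count and the estimated failure time. The samples, the ten minute range and the function names are hypothetical, and the time estimate carries the same assumption as the text: metric points evenly cover the whole range.

```python
# Hypothetical probe_success samples: 1 = success, 0 = failure.
samples = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1]
RANGE_S = 600  # hypothetical dashboard range of ten minutes, like $__range_s


def failed_pings(values):
    """Mimics: count_over_time(...) - sum_over_time(...)"""
    return len(values) - sum(values)


def estimated_failed_seconds(values, range_s):
    """Mimics: $__range_s * (1 - avg_over_time(...))

    Only meaningful if samples evenly cover the whole range.
    """
    failure_fraction = 1 - (sum(values) / len(values))
    return range_s * failure_fraction


if __name__ == "__main__":
    print(failed_pings(samples))
    print(estimated_failed_seconds(samples, RANGE_S))
```

Here two of ten pings failed, so the estimate is about 120 seconds out of 600. Nothing in the data says the pings were actually failing for two solid minutes, which is part of why the count of failed pings is the less misleading number.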