Wandering Thoughts archives


My Firefox addons as of Firefox '74' (the current development version)

As I write this, Firefox 72 is the just released version of Firefox and 73 is in beta, but my primary Firefox is still a custom hacked version that I build from the development tree, so it most closely corresponds to what will be released as Firefox 74 in a certain amount of time (I've lost track of how fast Firefox makes releases). Since it's been about ten versions of Firefox (and more than a year) since the last time I covered my addons, it's time for another revisit of this perennial topic. Many of the words will be familiar from the last time, because my addons seem to have stabilized now.

My core addons, things that I consider more or less essential for my experience of Firefox, are:

  • Foxy Gestures (Github) is probably still the best gestures extension for me for modern versions of Firefox (but I can't say for sure, because I no longer investigate alternatives).

    (I use some custom gestures in my Foxy Gestures configuration that go with some custom hacks to my Firefox to add support for things like 'view page in no style' as part of the WebExtensions API.)

  • uBlock Origin (Github) is my standard 'block ads and other bad stuff' extension, and also what I use for selectively removing annoying elements of pages (like floating headers and footers).

  • uMatrix (Github) is my primary tool for blocking Javascript and cookies. uBlock Origin could handle the Javascript, but not really the cookies as far as I know, and in any case uMatrix gives me finer control over Javascript which I think is a better fit with how the web does Javascript today.

  • Cookie AutoDelete (Github) deals with the small issue that uMatrix doesn't actually block cookies, it just doesn't hand them back to websites. This is probably what you want in uMatrix's model of the world (see my entry on this for more details), but I don't want a clutter of cookies lingering around, so I use Cookie AutoDelete to get rid of them under controlled circumstances.

    (However unaesthetic it is, I think that the combination of uMatrix and Cookie AutoDelete is necessary to deal with cookies on the modern web. You need something to patrol around and delete any cookies that people have somehow managed to sneak in.)

  • Stylus has become necessary for me after Google changed their non-Javascript search results page to basically be their Javascript search results without Javascript, instead of the much nicer and more useful old version. I use Stylus to stop search results escaping off the right side of my browser window.

Additional fairly important addons that would change my experience if they weren't there:

  • Textern (Github) gives me the ability to edit textareas in a real editor. I use it all the time when writing comments here on Wandering Thoughts, but not as much as I expected on other places, partly because increasingly people want you to write things with all of the text of a paragraph run together in one line. Textern only works on Unix (or maybe just Linux) and setting it up takes a bit of work because of how it starts an editor (see this entry), but it works pretty smoothly for me.

    (I've changed its key sequence to Ctrl+Alt+E, because the original Ctrl+Shift+E no longer works great on Linux Firefox; see issue #30. Textern itself shifted to Ctrl+Shift+D in recent versions.)

  • Open in Browser (Github) allows me to (sometimes) override Firefox's decision to save files so that I see them in the browser instead. I mostly use this for some PDFs and some text files. Sadly its UI isn't as good and smooth as it was in pre-Quantum Firefox.

  • Cookie Quick Manager (Github) allows me to inspect, manipulate, save, and reload cookies and sets of cookies. This is kind of handy every so often, especially saving and reloading cookies.

The remaining addons I use I consider useful or nice, but not all that important on the large scale of things. I could lose them without entirely noticing the difference in my Firefox:

  • Certainly Something (Github) is my TLS certificate viewer of choice. I occasionally want to know the information it shows me, especially for our own sites.

  • HTTP/2 Indicator (Github) does what it says; it provides a little indicator as to whether HTTP/2 was active for the top-level page.

  • Link Cleaner cleans the utm_ fragments and so on out of URLs when I follow links. It's okay; I mostly don't notice it and I appreciate the cleaner URLs.

    (It also prevents some degree of information leakage to the target website about where I found their link, but I don't really care about that. I'm still sending Referer headers, after all.)

  • HTTPS Everywhere, basically just because. But in a web world where more and more sites are moving to using things like HSTS, I'm not sure HTTPS Everywhere is all that important any more.
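To make the URL cleaning concrete, here is a minimal Python sketch of stripping utm_ query parameters from a link. This is not Link Cleaner's actual code; the strip_tracking function and the TRACKING_PREFIXES tuple are names made up for illustration, and real cleaners handle many more kinds of tracking parameters.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption for this sketch: only utm_* query parameters count as tracking.
TRACKING_PREFIXES = ("utm_",)


def strip_tracking(url):
    """Return url without query parameters whose names start with a tracking prefix."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if not name.startswith(TRACKING_PREFIXES)
    ]
    # urlencode() re-encodes the surviving parameters, so unusual escaping in
    # the original query string may come back in a normalized form.
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

Calling strip_tracking("https://example.org/post?id=42&utm_source=feed#top") leaves the id parameter and the fragment alone and drops only the utm_source part.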

Some of my previous extensions have stopped being useful since last time. They are:

  • My Google Search URL Fixup, because Google changed its search pages (as covered above for Stylus) and it became both unnecessary and non-functional. I should probably update its official description to note this, but Google's actions made me grumpy and lazy.

  • Make Medium Readable Again (also, Github) used to deal with a bunch of annoyances for Medium-hosted stuff, but then Medium changed their CSS and it hasn't been updated for that. I can't blame the extension's author; keeping up with all of the things that sites like Medium do to hassle you is a thankless and never-ending job.

I still have both of these enabled in my Firefox, mostly because it's more work to remove them than to let them be. In the case of MMRA, perhaps its development will come back to life again and a new version will be released.

(There are actually some branches in the MMRA Github repo and a bunch of forks of it, some of which are ahead of the main one. Possibly people are quietly working away on this.)

I have some Firefox profiles that are for when I want to use Javascript (they actually use the official Mozilla Linux Firefox release these days, which I just updated to Firefox 72). In these profiles, I also use Decentraleyes (also), which is a local CDN emulation so that less of my traffic is visible to CDN operators. I don't use it in my main Firefox because I'm not certain how it interacts with my setup for blocking (most) Javascript, and also much of what's fetched from CDNs is Javascript, which obviously isn't applicable to me.

(There are somewhat scary directions in the Decentraleyes wiki on making it work with uMatrix. I opted to skip them entirely.)

web/Firefox74Addons written at 01:56:48


eBPF based tools are still a work in progress on common Linuxes

These days, it seems that a lot of people are talking about and praising eBPF and tools like BCC and bpftrace, and you can read about places such as Facebook and Netflix routinely using dozens of eBPF programs on systems in production. All of these are true, by and large; if you have the expertise and in the right environment, eBPF can do great things. Unfortunately, though, a number of things get in the way of more ordinary sysadmins being able to use eBPF and these eBPF tools for powerful things.

The first problem is that these tools (and especially good versions of these tools) are fairly recent, which means that they're not necessarily packaged in your Linux distribution. For instance, Ubuntu 18.04 doesn't package bpftrace or anything other than a pretty old version of BCC. You can add third party repositories to your Ubuntu system to (try to) fix this, but that comes with various sorts of maintenance problems and anyway a fair number of nice eBPF features also require somewhat modern kernels. Ubuntu LTS's standard server kernel doesn't necessarily qualify. The practical result is that eBPF is off the table for us until 20.04 or later, unless we have a serious enough problem that we get desperate.

(Certainly we're very unlikely to try to use eBPF on 18.04 for the kinds of routine monitoring and so on that Facebook, Netflix, and so on use it for.)

Even on distributions with recent packages, such as Fedora, you can run into issues where people working in the eBPF world assume you're in a very current environment. The Cloudflare ebpf_exporter (also) is a great way to get things like local disk latency histograms into Prometheus, but the current code base assumes you're using a version of BCC that was released only in October. That's a bit recent, even for Fedora.

(The ebpf_exporter does have pre-built release binaries available, so that's something.)

Then there's the fact that sometimes all of this is held together with unreliable glue because it's not really designed to all work together. Fedora has just updated Fedora 31 to a 5.4.x kernel, and now all BCC programs (including the examples) fail to compile with a stream of "error: expected '(' after 'asm'" reports for various bits of the 5.4 kernel headers. Based on some Internet reading, this is apparently a sign of clang attempting to interpret inline assembly that was written for gcc (which is what the Linux kernel is compiled with). Probably this will get fixed at some point, but for now Fedora people get to choose either 5.4 or BCC but not both.

(bpftrace still works on the Fedora 5.4 kernel, at least in light testing.)

Finally, there's the general problem (shared with DTrace on Solaris and Illumos) that a fair number of the things you might be interested in require hooking directly into the kernel code and the Linux kernel code famously can change all the time. My impression is that eBPF is slowly getting more stable tracepoints over time, but also that a lot of the time you're still directly attaching eBPF hooks to kernel functions.

In time, all of this will settle down. Both eBPF and the eBPF tools will stabilize, current enough versions of everything will be in all common Linux distributions, even the long term support versions, and the kernel will have stable tracepoints and so on that cover most of what you need. But that's not really the state of things today, and it probably won't be for at least a few years to come (and don't even ask about Red Hat Enterprise 7 and 8, which will be around for years to come in some places).

(This more or less elaborates on a tweet of mine.)

linux/EBPFStillInProgress written at 00:03:38


How I move files between iOS devices and Unix machines (using SSH)

Suppose, not hypothetically, that you're a Unix person with some number of iOS devices, such as a phone and a tablet, and you wind up with files in one environment that you would like to move to or access from the other. On the iOS devices you may have photos and videos you want to move to Unix to deal with them with familiar tools, and on Unix you may have files that you edit or read or refer to and you'd like to do that on your portable devices too. There are a variety of ways of doing this, such as email and Nextcloud, but the way I've come around to is using SSH (specifically SFTP) through the Secure Shellfish iOS app.

Secure Shellfish's fundamental pitch is nicely covered by its tagline of 'SSH file transfers on iOS' and its slightly longer description of 'SSH and SFTP support in the iOS Files app', although the Files app is not the only way you can use it. Its narrow focus makes it pleasantly minimalistic and quite straightforward, and it works just as it says it does; it uses SFTP to let you transfer files between a Unix account (or anything that supports SFTP) and your iOS devices, and also to look at and modify in place Unix files from iOS, through Files-aware programs like Textastic. As far as (SSH) authentication goes, it supports both passwords and SSH keys (these days it will generate RSA keys and supports importing RSA, ECDSA, and ed25519 keys).

If the idea of theoretically allowing Secure Shellfish full access to your Unix account makes you a bit nervous, there are several things you can do. On machines that you fully control, you can set up a dedicated login that's used only for transferring things between your Unix machine and your iOS devices, so that they don't even have access to your regular account and its full set of files. Then, if you use SSH keys, you can set your .ssh/authorized_keys to force the Secure Shellfish key to always run the SFTP server instead of allowing access to an ordinary shell. For example:

command="/usr/libexec/openssh/sftp-server",restrict ssh-rsa [...]

(sftp-server has various command line flags that may be useful here for the cautious. As I found out the hard way, different systems have different paths to sftp-server, and you don't get good diagnostics from Secure Shellfish if you get it wrong. On at least some versions of OpenSSH, you can use the special command name 'internal-sftp' to force use of the built-in SFTP server, but then I don't think you can give it any command line flags.)
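Since the path to sftp-server varies between systems, one way to avoid guessing is to check a few likely locations before writing the authorized_keys line. The following Python sketch does that. The candidate list is an assumption based on common defaults (it is not exhaustive), and the function names are invented for illustration. The authoritative answer on any given machine is the 'Subsystem sftp' line in its sshd_config.

```python
import os

# Assumed candidate locations; common defaults, not an exhaustive list.
CANDIDATES = [
    "/usr/libexec/openssh/sftp-server",  # Fedora, RHEL and relatives
    "/usr/lib/openssh/sftp-server",      # Debian, Ubuntu
    "/usr/libexec/sftp-server",          # the BSDs, macOS
]


def find_sftp_server(candidates=CANDIDATES):
    """Return the first candidate that exists and is executable, or None."""
    for path in candidates:
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None


def authorized_keys_line(public_key, server_path):
    """Build an authorized_keys entry that forces this key to run the SFTP server."""
    return 'command="%s",restrict %s' % (server_path, public_key.strip())
```

You would run find_sftp_server() on the Unix machine, and if it returns a path, pass that path and the public key exported from Secure Shellfish to authorized_keys_line() to get a line in the same form as the example above.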

To avoid accidents, you can also configure an initial starting directory in Secure Shellfish itself and thereby restrict your normal view of the Unix account. This can also be convenient if you don't want to have to navigate through a hierarchy of directories to get to what you actually want; if you know you're only going to use a particular configured server to work in some directory, you can just set that up in advance.

As I've found, there are two ways to transfer iOS things like photos to your Unix account with Secure Shellfish. In an iOS app such as Photos, you can either directly send what you want to transfer to Secure Shellfish in the strip of available apps (and then pick from there), or you can use 'Save to Files' and then pick Secure Shellfish and go from there. The advantage and drawback of directly picking Secure Shellfish from the app strip is that your file is transferred immediately and that you can't do anything more until the transfer finishes. If you 'save to files', your file is transferred somewhat asynchronously. As a result, if you want to immediately do something with your data on the Unix side and it's a large file, you probably want to use the app route; at least you can watch the upload progress and know immediately when it's done.

(Secure Shellfish has a free base version and a paid 'Pro' upgrade, but I honestly don't remember what's included in what. If it was free when I initially got it, I upgraded to the Pro version within a very short time because I wanted to support the author.)

PS: Secure Shellfish supports using jump (SSH) servers, but I haven't tested this and I suspect that it doesn't go well with restricting your Secure Shellfish SSH key to only doing SFTP.

tech/IOSUnixFileTransfer written at 00:45:26


Why I prefer the script exporter for exposing script metrics to Prometheus

Suppose that you have some scripts that you use to extract and generate Prometheus metrics for targets, and these scripts run on your Prometheus server. These metrics might be detailed SNTP metrics of (remote) NTP servers, IMAP and POP3 login performance metrics, and so on. You have at least three methods to expose these script metrics to Prometheus; you can run them from cron and publish through either node_exporter's textfile collector or Pushgateway, or you can use the third party script_exporter to run your scripts in response to Prometheus scrape requests (and return their metrics). Having used all three methods to generate metrics, I've come to usually prefer using the script exporter except in one special case.

Conceptually, in all three methods you're getting metrics from some targets. In the cron-based methods, what targets you're getting what metrics from (and how frequently) is embedded in and controlled by scripts, cron.d files, and so on, not in your Prometheus configuration the way your other targets are. In the script exporter method, all of that knowledge of targets and timing is in your Prometheus configuration, just like your other targets. And just like other targets, you can configure additional labels on some of your script exporter scrapes, or have different timings, or so on, and it's all controlled in one place. If some targets need some different checking options, you can set that in your Prometheus configuration as well.

You can do all of this with cron based scripts, but you start littering your scripts and cron.d files and so on with special cases. If you push it far enough, you're basically building your own additional set of target configurations, per-target options, and so on. Prometheus already has all of that ready for you to use (and it's not that difficult to make it general with the usual tricks, or the label based approach).

There are two additional benefits from directly scraping metrics. First, the metrics are always current instead of delayed somewhat by however long Prometheus takes to scrape Pushgateway or the host agent. Related to this, you get automatic handling of staleness if something goes wrong and scrapes start failing. Second, you have a directly exposed metric for whether the scrape worked or whether it failed for some reason, in the form of the relevant up and script_success metrics. With indirect scraping you have to construct additional things to generate the equivalents.

The one situation where this doesn't work well is when you want a relatively slow metric generation interval. Because you're scraping directly, you have the usual Prometheus limitation where it considers any metric more than five minutes old to be stale. If you want to do your checks and generate your metrics only once every four or five minutes or slower, you're basically stuck publishing them indirectly so that they won't regularly disappear as stale, and this means one of the cron-based methods.

sysadmin/PrometheusScriptExporterWhy written at 01:57:07


Three ways to expose script-created metrics in Prometheus

In our Prometheus environment, we've wound up wanting (and creating) a bunch of custom metrics that are most naturally created through a variety of scripts. Some of these are general things that are simply not implemented to our tastes in existing scripts and exporters, such as SMART disk metrics, and some of these are completely custom metrics for our environment, such as our per-user, per-filesystem disk space usage information or information from our machine room temperature sensors (which come from an assortment of vendors and have an assortment of ways of extracting information from them). When you're generating metrics in scripts, you need to figure out how to get these metrics from the script into Prometheus. I know of three different ways to do this, and we've used all three.

The first and most obvious way is to have the script publish the metrics to Pushgateway. This requires very little from the host that the script is running on; it has to be able to talk to your Pushgateway host and it needs a HTTP client like curl or wget. This makes Pushgateway publication the easiest approach when you're running as little as possible on the script host. It has various drawbacks that can be boiled down to 'you're using Pushgateway', such as you having to manually check for metrics going stale because the script that generates them is now failing.
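As an illustration of how little the script host needs, here is a minimal Python sketch of a Pushgateway push using only the standard library. Pushgateway accepts metrics in the text exposition format on its /metrics/job/<job> path (with optional /<label>/<value> grouping pairs after it). The function names, host name, and metric name are hypothetical. This sketch also ignores Pushgateway's special encoding for label values that contain slashes.

```python
import urllib.request
from urllib.parse import quote


def build_push(base_url, job, metrics_text, grouping=None):
    """Build (but do not send) a PUT request that replaces one job's metric group."""
    path = "/metrics/job/" + quote(job, safe="")
    for name, value in (grouping or {}).items():
        path += "/%s/%s" % (quote(name, safe=""), quote(value, safe=""))
    if not metrics_text.endswith("\n"):
        metrics_text += "\n"  # the text format wants a trailing newline
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=metrics_text.encode("utf-8"),
        method="PUT",
    )


def push(request, timeout=10):
    """Send a request built by build_push(); raises on HTTP or network errors."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A script would call something like push(build_push("http://pushgateway.example.org:9091", "smart_disks", text, {"instance": "fileserver1"})) at the end of its run. Nothing here notices when the script stops running, which is the staleness drawback mentioned above.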

On servers where you're running node_exporter, the Prometheus host agent, the simplest approach is usually to have scripts expose their metrics through the textfile collector, where they write a text file of metrics into a particular directory. We wrote a general wrapper script to support this, which handles locking, writing the script's output to a temporary file, and so on, so that our metrics generation scripts only have to write everything to standard output and exit with a success status.

(If a script fails, our wrapper script removes that particular metrics text file to make the metrics go stale. Now that I'm writing this entry, I've realized that we should also write a script status metric for the script's exit code, so we can track and alert on that.)
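A minimal sketch of such a wrapper in Python might look like the following. This is not our actual wrapper script; the publish function name is invented, and it leaves out locking entirely. It relies on the textfile collector only reading files that end in .prom, so the temporary file is ignored until it is renamed into place.

```python
import os
import subprocess
import tempfile


def publish(command, textfile_dir, name):
    """Run command and publish its standard output as <name>.prom.

    On success the output is installed atomically. On failure any existing
    metrics file is removed so that the old metrics go stale. Returns the
    command's exit status.
    """
    target = os.path.join(textfile_dir, name + ".prom")
    result = subprocess.run(command, stdout=subprocess.PIPE)
    if result.returncode != 0:
        try:
            os.remove(target)
        except FileNotFoundError:
            pass
        return result.returncode

    # Create the temporary file in the same directory so that the final
    # rename stays on one filesystem and is atomic.
    fd, tmp = tempfile.mkstemp(dir=textfile_dir, prefix="." + name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(result.stdout)
        os.chmod(tmp, 0o644)  # mkstemp creates mode 0600; the host agent must be able to read it
        os.replace(tmp, target)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
    return 0
```

A cron job would then invoke this with the metrics script's command line, the textfile collector directory that node_exporter was started with, and a short name for the output file.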

Both of these methods generally run the scripts through cron, which generally means that you generate metrics at most once a minute and they'll be generated at the start of any minute that the scripts run on. If you scrape your Pushgateway and your host agents frequently, Prometheus will see updated metrics pretty soon after they're generated (a typical host agent scrape interval is 15 seconds).

The final way we expose metrics from scripts is through the third party script_exporter daemon. To quote its Github summary, it's a 'Prometheus exporter to execute scripts and collect metrics from the output or the exit status'. Essentially it's like the Blackbox exporter, except instead of a limited and hard-coded set of probes you have a whole collection of scripts generating whatever metrics that you want to write and configure. The script exporter lets these scripts take parameters, for example to select what target to work on (how this works is up to each script to decide).
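To give a feel for what such a script can look like, here is a hedged Python sketch of a probe that times a TCP connection to a target and prints metrics in the Prometheus text format. It assumes the exporter is configured to hand the target to the script as command line arguments; the metric names and function names are made up for illustration.

```python
import socket
import sys
import time


def probe_tcp(host, port, timeout=5.0):
    """Return (success, seconds) for one TCP connection attempt."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            ok = True
    except OSError:
        ok = False
    return ok, time.monotonic() - start


def render(host, port, ok, seconds):
    """Format the probe result in the Prometheus text exposition format."""
    labels = 'host="%s",port="%d"' % (host, port)
    return "".join([
        "# TYPE tcp_connect_success gauge\n",
        "tcp_connect_success{%s} %d\n" % (labels, 1 if ok else 0),
        "# TYPE tcp_connect_duration_seconds gauge\n",
        "tcp_connect_duration_seconds{%s} %.6f\n" % (labels, seconds),
    ])


def main(argv):
    """Entry point: argv is expected to be [program, host, port]."""
    if len(argv) != 3:
        sys.stderr.write("usage: tcpcheck HOST PORT\n")
        return 2
    host, port = argv[1], int(argv[2])
    ok, seconds = probe_tcp(host, port)
    sys.stdout.write(render(host, port, ok, seconds))
    return 0
```

As a standalone script this would end with sys.exit(main(sys.argv)). The script exits with status 0 even when the connection fails, so that a failed probe is reported through its own metric and a non-zero exit is reserved for the script itself being broken or misused.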

Unlike the other two methods, which are mostly configured on the machines running the scripts, generating metrics through the script exporter has to be set up in Prometheus by configuring a scrape configuration for it with appropriate targets defined (just like for Blackbox probes). This has various advantages that I'm going to leave for another entry.

Because you have to set up an additional daemon for the script exporter, I think it works best for scripts that you don't want to run on multiple hosts (they can target multiple hosts, though). In this it's much like the Blackbox exporter; you normally run one Blackbox exporter and use it to check on everything (or a few of them if you need to check from multiple vantage points or check things that are only reachable from some hosts). You certainly could run a script exporter on each machine and doing so has some advantages over the other two ways, but it's likely to be more work compared to using the textfile collector or publishing to Pushgateway.

(It also has a different set of security issues, since the script exporter has to be exposed to scraping from at least your Prometheus servers. The other two approaches don't take outside input in any way; the script exporter minimally allows the outside to trigger specific scripts.)

PS: All of these methods assume that your metrics are 'the state of things right now' style metrics, where it's harmless and desired to overwrite old data with new data. If you need to accumulate metrics over time that are generated by scripts, see using the statsd exporter to let scripts update metrics.

sysadmin/PrometheusScriptMetricsHow written at 00:49:25


How job control made the SIGCHLD signal useful for (BSD) Unix

On modern versions of Unix, the SIGCHLD signal is sent to a process when one of its child processes terminates or some other status changes happen. Catching SIGCHLD is a handy way to find out about child processes exiting while your program is doing other things, and ignoring it will make them vanish instead of turning into zombies. Despite all of these useful things, SIGCHLD was not in V7 Unix; it was added in 4.2 BSD and independently in System III (as SIGCLD).

In V7, programs like the shell had fairly straightforward handling of waiting for children that was basically synchronous. When they ran something (or you ran the 'wait' shell builtin), they called wait() until either it returned an error or possibly the process ID they were looking for came up. If you wanted to interrupt this, you used ^C and the shell's signal handler for SIGINT did some magic. This was sufficient in V7 because the V7 shell didn't really need to know about changes in child status except when you asked for it.
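The shape of that synchronous loop can be sketched in Python, whose os.wait() is a thin layer over the same system call. This is only an illustration of the pattern, not a transcription of the V7 shell; the wait_for and spawn helpers are invented names.

```python
import os
import time


def wait_for(target_pid):
    """Block in wait() until target_pid is reaped or no children remain.

    Returns (exit_status, pids_seen); exit_status is None if wait() failed
    before the target showed up.
    """
    seen = []
    while True:
        try:
            pid, status = os.wait()
        except ChildProcessError:
            return None, seen  # wait() returned an error: no children left
        seen.append(pid)
        if pid == target_pid:
            return os.WEXITSTATUS(status), seen


def spawn(exit_code, delay=0.0):
    """Fork a child that optionally sleeps and then exits with exit_code."""
    pid = os.fork()
    if pid == 0:
        time.sleep(delay)
        os._exit(exit_code)
    return pid
```

If you spawn() a quick child and then a slower one and call wait_for() on the slower one, the loop reaps the quick child along the way and keeps going. The caller is blocked for the whole time, which is exactly the property that stops being acceptable once job control arrives.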

When BSD added job control, it also made it so that background programs that tried to do terminal input or output could be automatically suspended. This is necessary when programs can move between being foreground programs and background ones; you might start a program in the foreground, background it when it takes too long to respond, and then want to foreground it again once it's ready to talk to you. However, adding this feature means that a job control shell now needs to know about process state changes asynchronously. If the shell is waiting at the shell prompt for you to enter the next command (ie, reading from the terminal itself and blocked in read()) and a background program gets suspended for terminal activity, you probably want the shell to tell you about this right away, not when you next provide a line of input to the shell. To get this immediate notification, you need a signal that's sent when child processes change their status, including when they get suspended this way. That's SIGCHLD.

Broadly, SIGCHLD enables programs to react to their children exiting even when they're doing other things too. This is useful in general, which is probably why System III added its own version even though it didn't have job control.

(Shells that use job control don't have to immediately react this way, and some don't. It may even depend on shell settings, which is sensible; some people don't like getting interrupted by shell messages when they're typing a command or thinking, and would rather tap the Return key when they want to see.)

PS: 4.2 BSD also introduced wait3(), which for the first time allowed processes to check for child process status without blocking. This is what you'd use in a shell if you're only checking for exited and suspended children right before you print the next prompt.
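Here is a small Python sketch that puts the two pieces together: a SIGCHLD handler that uses a non-blocking wait to collect every child with news, including suspended ones. Python's os.waitpid() with os.WNOHANG stands in for wait3() here, and the function names and the reaped list are invented for the illustration. It needs to run in the main thread, since that is where Python delivers signals.

```python
import os
import signal
import time

reaped = []  # (pid, description) pairs collected by the handler


def describe(status):
    """Turn a wait status into a short human-readable description."""
    if os.WIFSTOPPED(status):
        return "stopped by signal %d" % os.WSTOPSIG(status)
    if os.WIFSIGNALED(status):
        return "killed by signal %d" % os.WTERMSIG(status)
    return "exited with status %d" % os.WEXITSTATUS(status)


def on_sigchld(signum, frame):
    # One signal may stand for several children, so keep asking until the
    # kernel has nothing more to report.
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG | os.WUNTRACED)
        except ChildProcessError:
            return  # no children at all
        if pid == 0:
            return  # children exist, but none has changed state
        reaped.append((pid, describe(status)))


def run_demo():
    """Fork one child and notice its exit while 'busy' doing something else."""
    del reaped[:]
    previous = signal.signal(signal.SIGCHLD, on_sigchld)
    pid = os.fork()
    if pid == 0:
        os._exit(3)
    # Stand-in for a shell sitting blocked at its prompt.
    deadline = time.monotonic() + 5
    while not reaped and time.monotonic() < deadline:
        time.sleep(0.05)
    signal.signal(signal.SIGCHLD, previous)
    return pid, list(reaped)
```

A shell that would rather not interrupt you could skip the signal handler and call the same waitpid() loop just before printing each prompt.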

unix/SIGCHLDAndJobControl written at 02:46:19


The good and bad of errno in a traditional Unix environment

I said recently in passing that errno was a generally good interface in pre-threading Unix. That may raise some eyebrows, so today let's talk about the good and the bad parts of errno in a traditional Unix environment, such as V7 Unix.

The good part of errno is that it's about the simplest thing that can work to provide multiple values from system calls in C, which doesn't directly have multi-value returns (especially in early C). Using a global variable to 'return' a second value is about the best you can do in basic C unless you want to pass a pointer to every system call and C library function that wants to provide an errno value (this would include much of stdio, for example). Passing around such a pointer all the time doesn't just create uglier code; it also creates more code and more stack (or register) usage for the extra parameter.

(Modern C is capable of tricks like returning a two-element structure in a pair of registers, but this is not the case for the older and simpler version of C used by Research Unix through at least V7.)

Some Unix C library system call functions in V7 could have returned error values as special values, but perhaps not all of them (V7 didn't allow many files, but it did have quite constrained address space on the PDP-11 series). Even if they did, this would result in more code when you actually had to check the return value of things like open() or sbrk(), since the C code would have had to check the range or other qualities of the return value.

(The actual system calls in V7 Unix and before used an error signaling method designed for assembly language, where the kernel arranged to return either the system call result or the error number in register r0 and set a condition code depending on which it was. You can read this documented in eg the V4 dup manpage, where Unix still had significant assembly language documentation. The V7 C library arranged to turn this into setting errno and returning -1 on errors; see eg libc/sys/dup.s along with libc/crt/cerror.s.)

The bad part of errno is that it's a single global value, which means that it can be accidentally overwritten by new errors between the time it's set by a failing system call and when you want to use it. The simple way to have this happen is just to do another failing system call in your regular code, either directly or indirectly. A classical mistake was to do something that checked whether standard output (or standard error) was a terminal by trying to do a TTY ioctl() on it; when the ioctl failed, your original errno value would be overwritten by ENOTTY and the reason your open() or whatever failed would be listed as the mysterious 'not a typewriter' message (cf).
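This classic mistake can be reproduced from Python by calling the C library directly through ctypes, as in the sketch below. It assumes a Unix system where ctypes.CDLL(None) gives access to libc and where a terminal check on a pipe fails with ENOTTY (as it does on Linux). The function name is invented for the illustration. With use_errno=True, ctypes keeps its own copy of the C errno across calls, so the interpreter's own system calls don't disturb the demonstration.

```python
import ctypes
import errno
import os

# use_errno=True makes ctypes save the C errno after each foreign call and
# restore it before the next one, readable through ctypes.get_errno().
libc = ctypes.CDLL(None, use_errno=True)


def failing_open_then_tty_check(save_errno):
    """Fail an open(), then do a terminal check that also fails.

    Returns (open_result, errno_value, message). If save_errno is true, the
    errno from open() is copied immediately, which is the usual fix.
    """
    ctypes.set_errno(0)
    fd = libc.open(b"/no/such/file", os.O_RDONLY)  # fails with ENOENT
    saved = ctypes.get_errno() if save_errno else None

    read_end, write_end = os.pipe()
    try:
        libc.isatty(read_end)  # a pipe is not a terminal: errno becomes ENOTTY
    finally:
        os.close(read_end)
        os.close(write_end)

    code = saved if save_errno else ctypes.get_errno()
    return fd, code, os.strerror(code)
```

Without saving, the error you would report for the failed open() is the terminal check's ENOTTY; with save_errno set you get the real ENOENT back. The same 'copy errno right away' habit is what C code has to follow.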

Even if you avoided this trap you could have issues with signals, since signals can interrupt your program at arbitrary points, including immediately after you returned from a system call and before you've looked at errno. These days you're basically not supposed to do anything in signal handlers, but in the older days of Unix it was common to do any number of things in them. In particular, a SIGCHLD handler might call wait() to collect the exit status of children until it failed with some errno, which would overwrite your original one if the timing was bad. A signal handler could arrange to deal with this if the programmer remembered the issue, but they might not; people often overlook timing races, especially if they have narrow windows and rarely happen.

(SIGCHLD wasn't in V7, but it was in BSD; this is because BSD introduced job control, which made it necessary. But that's another entry.)

On the whole I think that errno was a good interface for the constraints of traditional Unix, where you didn't have threads or good ways of returning multiple values from C function calls. While it had drawbacks and bad sides, it was generally possible to work around them and they usually didn't come up too often. The errno API only started to get really awkward when threads were introduced and you could have multiple things making system calls at once in the same address space. Like much of Unix (especially in the Research Unix era through V7), it's not perfect but it's good enough.

unix/ErrnoGoodBad written at 23:28:50


Things I've stopped using in GNU Emacs for working on Go

In light of my switch to basing my GNU Emacs Go environment on lsp-mode, I decided to revisit a bunch of .emacs stuff that I was previously using and take out things that seemed outdated or that I wasn't using any more. In general, my current assumption is that Go's big switch to using modules will probably break any tool for dealing with Go code that hasn't been updated, so all of them are suspect until proven otherwise. For my own reasons, I want to record everything I remove.

My list, based on an old copy of my .emacs that I saved, is:

  • go-guru, which was provided through go-mode; one of the things that I sort of used from it was a minor mode to highlight identifiers. To the extent that I care about such highlighting, it's now provided by lsp-mode.

  • gorename and the go-rename bindings for it in go-mode. In practice I never used it to automatically rename anything in my code, so I don't miss it now. Anyway, lsp-mode and gopls do support renaming things, although I have to remember that this is done through the lsp-rename command and there's no key or menu binding for it currently.

  • godoctor, which was another path to renaming and other operations. I tried this out early on but found some issues with it, then mostly never used it (just like gorename).

  • go-eldoc, which provided quick documentation summaries that lsp-mode will now also do (provided that you tune lsp-mode to your tastes).

  • I previously had M-. bound to godef-jump (which comes from go-mode), but replaced it with an equivalent lsp-mode binding to lsp-ui-peek-find-definitions.

  • I stopped using company-go to provide autocompletion data for Go for company-mode in favour of company-lsp, which uses lsp-mode as a general source of completion data.

All of these dropped Emacs things mean that I've implicitly stopped using gocode, which was previously the backend for a number of these things.

In general I've built up quite a bunch of Go programming tools from various sources, such as gotags, many of which I installed to poke at and then never got around to using actively. At some point I should go through everything and weed out the tools that haven't been updated to deal with modules or that I simply don't care about.

(The other option is that I should remove all of the Go programs and tools I've built up in ~/go/bin and start over from scratch, adding only things that I turn out to actively use and want. Probably I'm going to hold off on doing this until Go goes to entirely modular builds and I have to clean out my ~/go/src tree anyway.)

I should probably investigate various gopls settings that I can set either through lsp-go or as experimental settings as covered in the gopls emacs documentation. Since I get the latest Emacs packages from Melpa and compile the bleeding edge gopls myself, this is more or less an ongoing thing (with occasional irritations).

programming/GoEmacsDroppedTools written at 18:33:59


A retrospective on our OmniOS ZFS-based NFS fileservers

Our OmniOS fileservers have now been out of service for about six months, which makes it somewhat past time for a retrospective on them. Our OmniOS fileservers followed on our Solaris fileservers, which I wrote a two part retrospective on (part 1, part 2), and have now been replaced by our Linux fileservers. To be honest, I have been sitting on my hands about writing this retrospective because we have mixed feelings about our OmniOS fileservers.

I will put the summary up front. OmniOS worked reasonably well for us over its lifespan here and looking back I think it was almost certainly the right choice for us at the time we made that choice (which was 2013 and 2014). However it was not without issues that marred our experience with it in practice, although not enough to make me regret that we ran it (and ran it for as long as we did). Part of our issues are likely due to a design mistake in making our fileservers too big, although this design mistake was probably magnified when we were unable to use Intel 10G-T networking in OmniOS.

On the one hand, our OmniOS fileservers worked, almost always reliably. Like our Solaris fileservers before them, they ran quietly for years without needing much attention, delivering NFS fileservice to our Ubuntu servers; specifically, we ran them for about five years (2014 through 2019, although we started migrating away at the end of 2018). Over this time we had only minor hardware issues and not all that many disk failures, and we suffered no data loss (with ZFS checksums likely saving us several times, and certainly providing good reassurances). Our overall environment was easy to manage and was pretty much problem free in the face of things like failed disks. I'm pretty sure that our users saw a NFS environment that was solid, reliable, and performed well pretty much all of the time, which is the important thing. So OmniOS basically delivered the fileserver environment we wanted.

(Our Linux iSCSI backends ran so problem free that I almost forgot to mention them here; we basically got to ignore them the entire time we ran our OmniOS fileserver environment. I think that they routinely had multi-year uptimes; certainly they didn't go down outside of power shutdowns (scheduled or unscheduled).)

On the other hand, we ran into real limitations with OmniOS and our fileservers were always somewhat brittle under unusual conditions. The largest limitation was the lack of working 10G-T Ethernet (with Intel hardware); now that we have Linux fileservers with 10G-T, it's fairly obvious what we were missing and that it did really matter. Our OmniOS fileservers were also not fully reliable; they would lock up, reboot, or perform very badly under an array of fortunately exceptional conditions to a far greater degree than we liked (for example, filesystems that hit quota limits). We also had periodic issues from having two iSCSI networks, where OmniOS would decide to use only one of them for one or more iSCSI targets and we had to fiddle things in magic ways to restore our redundancy. It says something that our OmniOS fileservers were by far the most crash-prone systems we operated, even if they didn't crash very often. Some of the causes of these issues were identified, much like our 10G-T problems, but they were never addressed in the OmniOS and Illumos kernel to the best of my knowledge.

(To be clear here, I did not expect them to be; the Illumos community only has so many person-hours available, and some of what we uncovered are hard problems in things like the kernel memory management.)

Our OmniOS fileservers were also harder for us to manage for an array of reasons that I mostly covered when I wrote about how our new fileservers wouldn't be based on Illumos, and in general there are costs we paid for not using a mainstream OS (costs that would be higher today). With that said, there are some things that I currently do miss about OmniOS, such as DTrace and our collection of DTrace scripts. Ubuntu may someday have an equivalent through eBPF tools, but Ubuntu 18.04 doesn't today.

In the final summary I don't regret us running our OmniOS servers when we did and for as long as we did, but on the whole I'm glad that we're not running them any more and I think our current fileserver architecture is better overall. I'm thankful for OmniOS's (and thus Illumos') faithful service here without missing it.

PS: Some of our OmniOS issues may have been caused by using iSCSI instead of directly attached disks, and certainly using directly attached disks would have made for smaller fileservers, but I suspect that we'd have found another set of problems with directly attached disks under OmniOS. And some of our problems, such as with filesystems that hit quota limits, are very likely to be independent of how disks were attached.

solaris/OmniOSFileserverRetrospective written at 22:14:24

The history and background of us using Prometheus

On Prometheus and Grafana after a year, a commenter asked some good questions:

Is there a reason why you went with a "metrics-based" (?) monitoring solution like Prometheus-Grafana, and not a "service-based" system like Zabbix (or Nagios)? What (if anything) was being used before the current P-G system?

I'll start with the short answer, which is that we wanted metrics as well as alerting and operating one system is simpler than operating two, even if Prometheus's alerting is not necessarily as straightforward as something intended primarily for that. The longer answer is in the history of how we got here.

Before the current Prometheus system, what we had was based on Xymon and had been in place sufficiently long that portions of it talked about 'Hobbit' (the pre-2009 name of Xymon, cf). Xymon as we were operating it was almost entirely a monitoring and alerting system, with very little to nothing in the way of metrics and metrics dashboards. We've understood for a long time that having metrics is important and we wanted to gather and use them, but we had never managed to turn this desire into actually doing anything (at one point I sort of reached a decision on what to build, but then I never actually built anything for various reasons).

In the fall of 2018 (last year), our existing Xymon setup reached a critical point where we couldn't just let it be, because it was hosted on an Ubuntu 14.04 machine. For somewhat unrelated reasons I wound up looking at Prometheus, and its quick-start demonstration sold me on the idea that it could easily generate useful metrics in our environment (and then let us see them in Grafana). My initial thoughts were to split metrics apart from alerting and to start by setting up Prometheus as our metrics system, then figure out alerting later. I set up a testing Prometheus and Grafana for metrics on a scratch server around the start of October.

Since we were going to run Prometheus and it had some alerting capabilities, I explored whether it could more or less cover our alerting needs. It turned out that it could, although perhaps not in an ideal way. However, running one system and gathering information once (more or less) is less work than also trying to pick a modern alerting system, set it up, and set up monitoring for it, especially if we wanted to do it on a deadline (with the end of Ubuntu's support for 14.04 looming). We decided that we would at least get Prometheus in place now to replace Xymon, even if it wasn't ideal, and then possibly implement another alerting system later at more leisure if we decided that we needed to. So far we haven't felt a need to go that far; our alerts work well enough in Prometheus, and we don't have all that many custom 'metrics' that really exist only to trigger alerts.

(Things we want to alert on often turn out to also be things that we want to track over time, more often than I initially thought. We've also wound up doing more alerting on metrics than I expected us to.)

Given this history, it's not quite right for me to say that we chose Prometheus over other alternative metrics systems. Although we did do some evaluation of other options after I tried Prometheus's demo and started exploring it, what it basically boiled down to was that we had decent confidence Prometheus could work (for metrics) and none of the other options seemed clearly better to the point where we should spend the time exploring them as well. Prometheus was not necessarily the best; it just sold us on the idea that it was good enough.

(Some of the evaluation criteria I used turned out to be incorrect, too, such as 'is it available as an Ubuntu package'. In the beginning that seemed like an advantage for Prometheus and anything that was, but then we wound up abandoning the Ubuntu Prometheus packages as being too out of date.)

sysadmin/PrometheusWhyHistory written at 00:29:44

