Wandering Thoughts

2019-09-17

Finding metrics that are missing labels in Prometheus (for alert metrics)

One of the things you can abuse metrics for in Prometheus is to configure different alert levels, alert destinations, and so on for different labels within the same metric, as I wrote about back in my entry on using group_* vector matching for database lookups. The example in that entry used two metrics for filesystems, our_zfs_avail_gb and our_zfs_minfree_gb, the former showing the current available space and the latter describing the alert levels and so on we want. Once we're using metrics this way, one of the interesting questions we could ask is what filesystems don't have a space alert set. As it turns out, we can answer this relatively easily.

The first step is to be precise about what we want. Here, we want to know what 'fs' labels are missing from our_zfs_minfree_gb. A fs label is missing if it's not present in our_zfs_minfree_gb but is present in our_zfs_avail_gb. Since we're talking about sets of labels, answering this requires some sort of set operation.

If our_zfs_minfree_gb only has unique values for the fs label (ie, we only ever set one alert per filesystem), then this is relatively straightforward:

our_zfs_avail_gb UNLESS ON(fs) our_zfs_minfree_gb

The our_zfs_avail_gb metric generates our initial set of known fs labels. Then we use UNLESS to subtract the set of all fs labels that are present in our_zfs_minfree_gb. We have to use 'ON(fs)' because the only label we want to match on between the two metrics is the fs label itself.

However, this only works if our_zfs_minfree_gb has no duplicate fs labels. If it does (eg if different people can set their own alerts for the same filesystem), we'd get a 'duplicate series' error from this expression. The usual fix is to use a one to many match, but those can't be combined with set operators like 'unless'. Instead we must get creative. Since all we care about is the labels and not the values, we can use an aggregation operation to give us a single series for each label on the right side of the expression:

our_zfs_avail_gb UNLESS ON(fs)
   count(our_zfs_minfree_gb) by (fs)

As a side effect of what they do, all aggregation operators condense multiple instances of a label value this way. It's very convenient if you just want one instance of it; if you care about the resulting value being one that exists in your underlying metrics you can use max() or min().

You can obviously invert this operation to determine 'phantom' alerts, alerts that have fs labels that don't exist in your underlying metric. That expression is:

count(our_zfs_minfree_gb) by (fs) UNLESS ON(fs)
   our_zfs_avail_gb

(Here I'm assuming our_zfs_minfree_gb has duplicate fs labels; if it doesn't, you get a simpler expression.)

Such phantom alerts might come about from typos, filesystems that haven't been created yet but you've pre-set alert levels for, or filesystems that have been removed since alert levels were set for them.

This general approach can be applied to any two metrics where some label ought to be paired up across both. For instance, you could cross-check that every node_uname_info metric is matched by one or more custom per-host informational metrics that your own software is supposed to generate and expose through the node exporter's textfile collector.
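
For example, with a hypothetical custom metric called our_host_info that is supposed to have one time series per host (carrying the same 'nodename' label that node_uname_info has), the hosts that are missing it could be found with:

count(node_uname_info) by (nodename) UNLESS ON(nodename)
   count(our_host_info) by (nodename)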

(This entry was sparked by a prometheus-users mailing list thread that caused me to work out the specifics of how to do this.)

sysadmin/PrometheusFindUnpairedMetrics written at 00:12:27

2019-09-16

The problem of 'triangular' Network Address Translation

In my entry on our use of bidirectional NAT and split horizon DNS, I mentioned that we couldn't apply our bidirectional NAT translation to all of our internal traffic in the way that we can for external traffic for two reasons, an obvious one and a subtle one. The obvious reason is our current network topology, which I'm going to discuss in a sidebar below. The more interesting subtle reason is the general problem of what I'm going to call triangular NAT.

Normally when you NAT something in a firewall or a gateway, you're in a situation where the traffic in both directions passes through you. This allows you to do a straightforward NAT implementation where you only rewrite one of the pair of IP addresses involved; either you rewrite the destination address from you to the internal IP and then send the traffic to the internal IP, or you rewrite the source address from the internal IP to you and then send the traffic to the external IP.

However, this straightforward implementation breaks down if the return traffic will not flow through you when it has its original source IP. The obvious case of this is if a client machine is trying to contact a NAT'd server that is actually on its own network. It will send its initial packet to the public IP of the NAT'd machine and this packet will hit your firewall, get its destination address rewritten, and then be passed to the server. However, when it replies to the packet, the server will see a destination IP on its local network and just send it directly to the client machine. The client machine will then go 'who are you?', because it's expecting the reply to come from the server's nominal public IP, not its internal one.

(Asymmetric routing can also create this situation, for instance if the machine you're talking to has multiple interfaces and a route to you that doesn't go out the firewall-traversing one.)

In general the only way to handle triangular NAT situations is to force the return traffic to flow through your firewall by always rewriting both IP addresses. Unfortunately this has side effects, the most obvious one being that the server no longer gets the IP address of who it's really talking to; as far as it's concerned, all of the connections are coming from your firewall. This is often less than desirable.

(As an additional practical issue, not all NAT implementations are very enthusiastic about doing such two-sided rewriting.)
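
To make the two-sided rewriting concrete, here is roughly what it looks like in Linux iptables terms, purely as an illustration with made-up addresses (our own firewalls are OpenBSD machines, not Linux ones):

# rewrite the destination so traffic to the public IP reaches the internal server
iptables -t nat -A PREROUTING -d 192.0.2.10 -j DNAT --to-destination 10.1.1.10
# also rewrite the source for clients on the server's own subnet, so that
# replies have to come back through this firewall instead of going direct
iptables -t nat -A POSTROUTING -s 10.1.1.0/24 -d 10.1.1.10 -j SNAT --to-source 10.1.1.1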

Sidebar: Our obvious problem is network topology

At the moment, our network topology basically has three layers; there is the outside world, our perimeter firewall, our public IP subnets with various servers and firewalls, and then our internal RFC 1918 'sandbox' subnets (behind those firewalls). Our mostly virtual BINAT subnet with the public IPs of BINAT machines basically hangs off the side of our public subnets. This creates two topology problems. The first topology problem is that there's no firewall to do NAT translation between our public subnets and the BINAT subnet. The larger topology problem is that if we just put a firewall in, we'd be creating a version of the triangular NAT problem because the firewall would have to basically be a virtual one that rewrote incoming traffic out the same interface it came in on.

To make internal BINAT work, we would have to actually add a network layer. The sandbox subnet firewalls would have to live on a separate subnet from all of our other servers, and there would have to be an additional firewall between that subnet and our other public subnets that did the NAT translation for most incoming traffic. This would impose additional network hops and bottlenecks on all internal traffic that wasn't BINAT'd (right now our firewalls deliberately live on the same subnet as our main servers).

tech/TriangleNATProblem written at 00:31:27

2019-09-14

Some notes on the structure of Go binaries (primarily for ELF)

I'll start with the background. I keep around a bunch of third party programs written in Go, and one of the things that I do periodically is rebuild them, possibly because I've updated some of them to their latest versions. When doing this, it's useful to have a way to report the package that a Go binary was built from, ideally a fast way. I have traditionally used binstale for this, but it's not fast. Recently I tried out gobin, which is fast and looked like it had great promise, except that I discovered it didn't report about all of my binaries. My attempts to fix that resulted in various adventures but only partial success.

All of the following is mostly for ELF format binaries, which is the binary format used on most Unixes (except MacOS). Much of the general information applies to other binary formats that Go supports, but the specifics will be different. For a general introduction to ELF, you can see eg here. Also, all of the following assumes that you haven't stripped the Go binaries, for example by building with the '-s' or '-w' linker flags (via -ldflags).

All Go programs have a .note.go.buildid ELF section that has the build ID (also). If you read the ELF sections of a binary and it doesn't have that, you can give up; either this isn't a Go binary or something deeply weird is going on.
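
As a quick command line check, 'go tool buildid' will print this build ID for you (the binary path here is made up):

go tool buildid $HOME/go/bin/someprogram

If it can't find a build ID, you're probably in the 'give up' case.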

Programs built as Go modules contain an embedded chunk of information about the modules used in building them, including the main program; this can be printed with 'go version -m <program>'. There is no official interface to extract this information from other binaries (inside a program you can use runtime/debug.ReadBuildInfo()), but it's currently stored in the binary's data section as a chunk of plain text. See version.go for how Go itself finds and extracts this information, which is probably going to be reasonably stable (so that newer versions of Go can still run 'go version -m <program>' against programs built with older versions of Go). If you can extract this information from a binary, it's authoritative, and it should always be present even if the binary has been stripped.
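
For a program looking at itself, that looks something like this minimal sketch (it only reports anything useful for module-aware builds):

package main

import (
    "fmt"
    "runtime/debug"
)

func main() {
    info, ok := debug.ReadBuildInfo()
    if !ok {
        fmt.Println("no module build information")
        return
    }
    fmt.Println("main module:", info.Main.Path, info.Main.Version)
    for _, dep := range info.Deps {
        fmt.Println("dep:", dep.Path, dep.Version)
    }
}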

If you don't have module information (or don't want to copy version.go's code in order to extract it), the only approach I know to determine the package a binary was built from is to determine the full file path of the source code where main() is, and then reverse engineer that to create a package name (and possibly a module version). The general approach is:

  1. extract Go debug data from the binary and use debug/gosym to create a LineTable and a Table.
  2. look up the main.main function in the table to get its starting address, and then use Table.PCToLine() to get the file name for that starting address.
  3. convert the file name into a package name.
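
Here is a rough sketch of the first two steps for an unstripped, non-cgo ELF binary (the binary path is made up and the error handling is minimal; a real version would check that the sections and main.main actually exist before using them):

package main

import (
    "debug/elf"
    "debug/gosym"
    "fmt"
    "log"
)

func main() {
    f, err := elf.Open("/some/go/binary")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()

    pclntab, err := f.Section(".gopclntab").Data()
    if err != nil {
        log.Fatal(err)
    }
    symtab, err := f.Section(".gosymtab").Data()
    if err != nil {
        log.Fatal(err)
    }

    lt := gosym.NewLineTable(pclntab, f.Section(".text").Addr)
    tab, err := gosym.NewTable(symtab, lt)
    if err != nil {
        log.Fatal(err)
    }

    fn := tab.LookupFunc("main.main")
    file, line, _ := tab.PCToLine(fn.Entry)
    fmt.Println("main() is in", file, "at line", line)
}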

Binaries built from $GOPATH will have file names of the form $GOPATH/src/example.org/fred/cmd/barney/main.go. If you take the directory name of this and take off the $GOPATH/src part, you have the package name this was built from. This includes module-aware builds done in $GOPATH. Binaries built directly from modules with 'go get example.org/fred/cmd/barney@latest' will have a file path of the form $GOPATH/pkg/mod/example.org/fred@v.../cmd/barney/main.go. To convert this to a module name, you have to take off '$GOPATH/pkg/mod/' and move the version to the end if it's not already there. For binaries built outside some $GOPATH, with either module-aware builds or plain builds, you are unfortunately on your own; there is no general way to turn their file names into package names.

(There are a number of hacks if the source is present on your local system; for example, you can try to find out what module or VCS repository it's part of if there's a go.mod or VCS control directory somewhere in its directory tree.)

However, to do this you must first extract the Go debug data from your ELF binary. For ordinary unstripped Go binaries, this debugging information is in the .gopclntab and .gosymtab ELF sections of the binary, and can be read out with debug/elf/File.Section() and Section.Data(). Unfortunately, Go binaries that use cgo do not have these Go ELF sections. As mentioned in Building a better Go linker:

For “cgo” binaries, which may make arbitrary use of C libraries, the Go linker links all of the Go code into a single native object file and then invokes the system linker to produce the final binary.

This linkage obliterates .gopclntab and .gosymtab as separate ELF sections. I believe that their data is still there in the final binary, but I don't know how to extract them. The Go debugger Delve doesn't even try; instead, it uses the general DWARF .debug_line section (or its compressed version), which seems to be more complicated to deal with. Delve has its DWARF code as sub-packages, so perhaps you could reuse them to read and process the DWARF debug line information to do the same thing (as far as I know the file name information is present there too).

Since I have and use several third party cgo-based programs, this is where I gave up. My hacked branch of the which package can deal with most things short of "cgo" binaries, but unfortunately that's not enough to make it useful for me.

(Since I spent some time working through all of this, I want to write it down before I forget it.)

PS: I suspect that this situation will never improve for non-module builds, since the Go developers want everyone to move away from them. For Go module builds, there may someday be a relatively official and supported API for extracting module information from existing binaries, either in the official Go packages or in one of the golang.org/x/ additional packages.

programming/GoBinaryStructureNotes written at 18:35:21

2019-09-13

Bidirectional NAT and split horizon DNS in our networking setup

Like many other places, we have far too many machines to give them all public IPs (or at least public IPv4 IPs), especially since they're spread across multiple groups and each group should get its own isolated subnet. Our solution is the traditional one; we use RFC 1918 IPv4 address space behind firewalls, give groups subnets within it (these days generally /16s), and put each group in what we call a sandbox. Outgoing traffic from each sandbox subnet is NAT'd so that it comes out from a gateway IP for that sandbox, or sometimes a small range of them.

However, sometimes people quite reasonably want to have some of their sandbox machines reachable from the outside world for various reasons, and also sometimes they need their machines to have unique and stable public IPs for outgoing traffic. To handle both of these cases, we use OpenBSD's support for bidirectional NAT. We have a 'BINAT subnet' in our public IP address space and each BINAT'd machine gets assigned an IP on it; as external traffic goes through our perimeter firewall, it does the necessary translation between internal addresses and external ones. Although all public BINAT IPs are on a single subnet, the internal IPs are scattered all over all of our sandbox subnets. All of this is pretty standard.

(The public BINAT subnet is mostly virtual, although not entirely so; for various peculiar reasons there are a few real machines on it.)

However, this leaves us with a DNS problem for internal machines (machines behind our perimeter firewall) and internal traffic to these BINAT'd machines. People and machines on our networks want to be able to talk to these machines using their public DNS names, but the way our networks are set up, they must use the internal IP addresses to do so; the public BINAT IP addresses don't work. Fortunately we already have a split-horizon DNS setup, because we long ago made the decision to have a private top level domain for all of our sandbox networks, so we use our existing DNS infrastructure to give BINAT'd machines different IP addresses in the internal and external views. The external view gives you the public IP, which works (only) if you come in through our perimeter firewall; the internal view gives you the internal RFC 1918 IP address, which works only inside our networks.
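
In the abstract, split horizon DNS of this sort looks something like the following pair of BIND style views (the zone name, networks, and file names are all made up, and this is an illustration rather than our actual DNS software or configuration):

view "internal" {
    match-clients { 10.0.0.0/8; localhost; };
    zone "binat.example.org" {
        type master;
        file "internal/binat.example.org.db";   # internal RFC 1918 addresses
    };
};

view "external" {
    match-clients { any; };
    zone "binat.example.org" {
        type master;
        file "external/binat.example.org.db";   # public BINAT addresses
    };
};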

(In a world where new gTLDs are created like popcorn, having our own top level domain isn't necessarily a great idea, but we set this up many years before the profusion of gTLDs started. And I can hope that it will stop before someone decides to grab the one we use. Even if they do grab it, the available evidence suggests that we may not care if we can't resolve public names in it.)

Using split-horizon DNS this way does leave people (including us) with some additional problems. The first one is cached DNS answers, or in general not talking to the right DNS servers. If your machine moves between internal and external networks, it needs to somehow flush and re-resolve these names. Also, if you're on one of our internal networks and you do DNS queries to someone else's DNS server, you'll wind up with the public IPs and things won't work. This is a periodic source of problems for users, especially since one of the ways to move on or off our internal networks is to connect to our VPN or disconnect from it.

The other problem is that we need to have internal DNS for any public name that your BINAT'd machine has. This is no problem if you give your BINAT machine a name inside our subdomain, since we already run DNS for that, but if you go off to register your own domain for it (for instance, for a web site), things can get sticky, especially if you want your public DNS to be handled by someone else. We don't have any particularly great solutions for this, although there are decent ones that work in some situations.

(Also, you have to tell us what names your BINAT'd machine has. People don't always do this, probably partly because the need for it isn't necessarily obvious to them. We understand the implications of our BINAT system, but we can't expect that our users do.)

(There's both an obvious reason and a subtle reason why we can't apply BINAT translation to all internal traffic, but that's for another entry because the subtle reason is somewhat complicated.)

sysadmin/BinatAndSplitHorizonDNS written at 22:22:40

2019-09-12

The mystery of why my Fedora 30 office workstation was booting fine

The other day, I upgraded the kernel on my office workstation, much as I have any number of times before, and rebooted. Things did not go well:

So the latest Fedora 30 updates (including a kernel update) build an initramfs that refuses to bring up software RAID devices, including the one that my root filesystem is on. Things do not go well afterwards.

Then I said:

Fedora's systemd, Dracut and kernel parameters setup have now silently changed to require either rd.md.uuid for your root filesystem or rd.auto. The same kernel command line booted previous kernels with previous initramfs's.

The first part of this is wrong, and that leads to the mystery.

In Fedora 29, my kernel command line was specifying both the root filesystem device by name ('root=/dev/md20') and the software RAID arrays for the initramfs to bring up (as 'rd.md.uuid=...'). When I upgraded to Fedora 30 in mid-August, various things happened and I wound up removing both of those from the kernel command line, specifying the root filesystem device only by UUID ('root=UUID=...'). This kernel command line booted a series of Fedora 30 kernels, most recently 5.2.11 on September 4th, right up until yesterday.

However, it shouldn't have. As the dracut.cmdline manpage says, the default since Dracut 024 has been to not auto-assemble software RAID arrays in the absence of either rd.auto or rd.md.uuid. And the initramfs for older kernels (at least 5.2.11) was theoretically enforcing that; the journal for that September 4th boot contains a report of:

dracut-pre-trigger[492]: rd.md=0: removing MD RAID activation

But then a few lines later, md/raid1:md20 is activated:

kernel: md/raid1:md20: active with 2 out of 2 mirrors

(The boot log for the new kernel for a failed boot also had the dracut-pre-trigger line, but obviously no mention of the RAID being activated.)

I unpacked the initramfs for both kernels and as far as I can tell they're identical in terms of the kernel modules included and the configuration files and scripts (there are differences in some binaries, which is expected since systemd and some other things got upgraded between September 4th and now). Nor has the kernel configuration changed between the two kernels according to the config-* files in /boot.

So by all evidence, the old kernel and initramfs should not auto-assemble my root filesystem's software RAID and thus shouldn't boot. But, they do. In fact they did yesterday, because when the new kernel failed to boot the first thing I did was boot with the old one. I just don't know why, and that's the mystery.

My fix for my boot issue is straightforward; I've updated my kernel command line to have the 'rd.md.uuid=...' that it should have had all along. This works fine.
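
Roughly, finding the right UUID and adding it to the kernel command line looks something like this on Fedora (a sketch, not necessarily exactly what I did):

# find the array's UUID
mdadm --detail /dev/md20 | grep UUID
# add it to the kernel command line for all installed kernels
grubby --update-kernel=ALL --args="rd.md.uuid=<UUID from above>"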

(My initial recovery from the boot failure was to use 'rd.auto', but I've decided that I don't want to auto-assemble anything and everything that the initramfs needs. I'll have the initramfs only assemble the bare minimum, just in case. While my swap is also on software RAID, I specifically decided to not assemble it in the initramfs; I don't really need it until later.)

linux/Fedora30BootMystery written at 23:02:06

2019-09-11

Making your own changes to things that use Go modules

Suppose, not hypothetically, that you have found a useful Go program but when you test it you discover that it has a bug that's a problem for you, and that after you dig into the bug you discover that the problem is actually in a separate package that the program uses. You would like to try to diagnose and fix the bug, at least for your own uses, which requires hacking around in that second package.

In a non-module environment, how you do this is relatively straightforward, although not necessarily elegant. Since building programs just uses what's found in $GOPATH/src, you can cd directly into your local clone of the second package and start hacking away. If you need to make a pull request, you can create a branch, fork the repo on Github or whatever, add your new fork as an additional remote, and then push your branch to it. If you didn't want to contaminate your main $GOPATH with your changes to the upstream (since they'd be visible to everything you built that used that package), you could work in a separate directory hierarchy and set your $GOPATH when you were working on it.

If the program has been migrated to Go modules, things are not quite as straightforward. You probably don't have a clone of the second package in your $GOPATH, and even if you do, any changes to it will be ignored when you rebuild the program (if you do it in a module-aware way). Instead, you make local changes by using the 'replace' directive of the program's go.mod, and in some ways it's better than the non-module approach.

First you need local clones of both packages. These clones can be a direct clone of the upstream or they can be clones of Github (or Gitlab or etc) forks that you've made. Then, in the program's module, you want to change go.mod to point the second package to your local copy of its repo:

replace github.com/rjeczalik/which => /u/cks/src/scratch/which

You can put this in by editing go.mod directly by hand (as I did when I was working on this) or you can use 'go mod edit'.
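
For this particular replacement, the 'go mod edit' version would be:

go mod edit -replace=github.com/rjeczalik/which=/u/cks/src/scratch/which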

If the second package has not been migrated to Go modules, you need to create a go.mod in your local clone (the Go documentation will tell you this if you read all of it). Contrary to what I initially thought, this new go.mod does not need to have the module name of the package you're replacing, but it will probably be most convenient if it does claim to be, eg, github.com/rjeczalik/which, because this means that any commands or tests it has that import the module will use your hacks, instead of quietly building against the unchanged official version (again, assuming that you build them in a module-aware way).
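
If you wind up writing this go.mod by hand, it can be as minimal as:

module github.com/rjeczalik/which

go 1.13

(The 'go 1.13' line here is just an example; a bare module line also works.)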

(You don't need a replace line in the second package's go.mod; Go's module handling is smart enough to get this right.)

As an important note, as of Go 1.13 you must run 'go get' from inside this source tree to build and install its commands, even if the tree is under $GOPATH. If the tree is under $GOPATH and you instead do 'go get <blah>/cmd/gobin', Go does a non-module 'go get' even though the directory tree has a go.mod file, and this will use the official version of the second package, not your replacement. This is documented but perhaps surprising.

When you're replacing with a local directory this way, you don't need to commit your changes in the VCS before building the program; in fact, I don't think you even need the directory tree to be a VCS repository. For better or worse, building the program will use the current state of your directory tree (well, both trees), whatever that is.

If you want to see what your module-based binaries were actually built with in order to verify that they're actually using your modified local version, the best tool for this is 'go version -m'. This will show you something like:

go/bin/gobin go1.13
  path github.com/rjeczalik/bin/cmd/gobin
  mod  github.com/rjeczalik/bin    (devel)
  dep  github.com/rjeczalik/which  v0.0.0-2014[...]
  =>    /u/cks/go/src/github.com/siebenmann/which

I believe that the '(devel)' appears if the binary was built directly from inside a source tree, and the '=>' is showing a 'replace' in action. If you build one of the second package's commands (from inside its source tree), 'go version -m' doesn't report the replacement, just that it's a '(devel)' of the module.

(Note that this output doesn't tell us anything about the version of the second package that was actually used to build the binary, except that it was the current state of the filesystem as of the build. The 'v0.0.0-2014[...]' version stamp is for the original version, not our replacement, and comes from the first package's go.mod.)

PS: If 'go version -m' merely reports the 'go1.13' bit, you managed to build the program in a non module-aware way.

Sidebar: Replacing with another repo instead of a directory tree

The syntax for this uses your alternate repository, and I believe it must have some form of version identifier. This version identifier can be a branch, or at least it can start out as a branch in your go.mod, so it looks like this:

replace github.com/rjeczalik/which => github.com/siebenmann/which reliable-find

After you run 'go build' or the like, the go command will quietly rewrite this to refer to the specific current commit on that branch. If you push up a new version of your changes, you need to re-edit your go.mod to say 'reliable-find' or 'master' or the like again.

Your upstream repository doesn't have to have a go.mod file, unlike the case with a local directory tree. If it does have a go.mod, I think that the claimed package name can be relatively liberal (for instance, I think it can be the module that you're replacing). However, some experimentation with sticking in random upstreams suggests that you want the final component of the module name to match (eg, '<something>/which' in my case).

programming/GoHackingWithModules written at 20:49:02

2019-09-10

Catching Control-C and a gotcha with shell scripts

Suppose, not entirely hypothetically, that you have some sort of spiffy program that wants to use Control-C as a key binding to get it to take some action. In Unix, there are two ways of catching Control-C for this sort of thing. First, you can put the terminal into raw mode, where Control-C becomes just another character that you read from the terminal and you can react to it in any way you like. This is very general but it has various drawbacks, like you have to manage the terminal state and you have to be actively reading from the terminal so you can notice when the key is typed. The simpler alternative way of catching Control-C is to set a signal handler for SIGINT and then react when it's invoked. With a signal handler, the kernel's standard tty input handling does all of that hard work for you and you just get the end result in the form of an asynchronous SIGINT signal. It's quite convenient and leaves you with a lot less code and complexity in your spiffy Control-C catching program.
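
To make the second approach concrete, here is a minimal sketch in Go (just as an example language; the generic pattern is the same in anything with signal handling):

package main

import (
    "fmt"
    "os"
    "os/signal"
)

func main() {
    ch := make(chan os.Signal, 1)
    // os.Interrupt is SIGINT, which the kernel's tty handling sends on Control-C
    signal.Notify(ch, os.Interrupt)
    for range ch {
        fmt.Println("got Control-C, taking our spiffy action")
    }
}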

Then some day you run your spiffy program from inside a shell script (perhaps you wanted to add some locking), hit Control-C to signal your program, and suddenly you have a mess (what sort of a mess depends on whether or not your shell does job control). The problem is that when you let the kernel handle Control-C by delivering a SIGINT signal, it doesn't just deliver it to your program; it delivers it to the shell script and in fact any other programs that the shell script is also running (such as a flock command used to add locking). The shell script and these other programs are not expecting to receive SIGINT signals and haven't set up anything special to handle them, so they will get killed.

(Specifically, the kernel will send the SIGINT to all processes in the foreground process group.)

Since your shell was running the shell script as your command and the shell script exited, many shells will decide that your command has finished. This means they'll show you the shell prompt and start interacting with you again. This can leave your spiffy program and your shell fighting over terminal output and perhaps terminal input as well. Even if your shell and your spiffy program don't fight for input and write their output and shell prompt all over each other, generally things don't go well; for example, the rest of your shell script isn't getting run, because the shell script died.

Unfortunately there isn't a good general way around this problem. If you can arrange it, the ideal is for the wrapper shell script to wind up directly exec'ing your spiffy program so there's nothing else a SIGINT will be sent to (and kill). Failing that, you might have to make the wrapper script trap and ignore SIGINT while it's running your program (and to make your program unconditionally install its SIGINT signal handler, even if SIGINT is ignored when the program starts).
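
A sketch of the second option, with a made-up program name and lock file:

#!/bin/sh
# if we didn't need the lock, 'exec spiffy-prog "$@"' would be the ideal fix
trap '' INT    # ignore SIGINT in the wrapper so Control-C doesn't kill it
# flock inherits the ignore; spiffy-prog must unconditionally install its
# own SIGINT handler, since SIGINT starts out ignored for it too
flock -n /var/lock/spiffy.lock spiffy-prog "$@"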

Speaking from painful personal experience, this is an easy issue to overlook (and a mysterious one to diagnose). And of course everything works when you test your spiffy program by running it directly, because then the only process getting a SIGINT is the one that's prepared for it.

unix/CatchingCtrlCAndScripts written at 20:54:47

2019-09-09

A safety note about using (or having) go.mod inside $GOPATH in Go 1.13

One of the things in the Go 1.13 release notes is a little note about improved support for go.mod. This is worth quoting in more or less full:

The GO111MODULE environment variable continues to default to auto, but the auto setting now activates the module-aware mode of the go command whenever the current working directory contains, or is below a directory containing, a go.mod file — even if the current directory is within GOPATH/src.

The important safety note is that this potentially creates a confusing situation, and also it may be easy for other people to misunderstand what this actually says in the same way that I did.

Suppose that there is a Go program that is part of a module, example.org/fred/cmd/bar (with the module being example.org/fred). If you do 'go get example.org/fred/cmd/bar', you're fetching and building things in non-module mode, and you will wind up with a $GOPATH/src/example.org/fred VCS clone, which will have a go.mod file at its root, ie $GOPATH/src/example.org/fred/go.mod. Despite the fact that there is a go.mod file right there on disk, re-running 'go get example.org/fred/cmd/bar' while you're in (say) your home directory will not do a module-aware build. This is because, as the note says, module-aware builds only happen if your current directory or its parents contain a go.mod file, not just if there happens to be a go.mod file in the package (and module) tree being built. So the only way to do a proper module aware build is to actually be in the command's subdirectory:

cd $GOPATH/src/example.org/fred/cmd/bar
go get

(You can get very odd results if you cd to $GOPATH/src/example.org and then attempt to 'go get example.org/fred/cmd/bar'. The result is sort of module-aware but weird.)

This makes it rather more awkward to build or rebuild Go programs through scripts, especially if they involve various programs that introspect your existing Go binaries. It's also easy to slip up and de-modularize a Go binary; one absent-minded 'go get example.org/...' will do it.

In a way, Go modules don't exist on disk unless you're in their directory tree. If that tree is inside $GOPATH and you're not in it, you have a plain Go package, not a module.

(If the directory tree is outside $GOPATH, well, you're not doing much with it without cd'ing into it, at which point you have a module.)

The easiest way to see whether a binary was built module-aware or not is 'goversion -m PROGRAM'. If the program was built module-aware, you will get a list of all of the modules involved. If it wasn't, you'll just get a report of what Go version it was built with. Also, it turns out that you can build a program with modules without it having a go.mod:

GO111MODULE=on go get rsc.io/goversion@latest

The repository has tags but no go.mod. This also works on repositories with no tags at all. If the program uses outside packages, they too can be non-modular, and 'goversion -m PROGRAM' will (still) produce a report of what tags, dates, and hashes they were at.

Update: in Go 1.13, 'go version -m PROGRAM' also reports the module build information, with module hashes included as well.

This does mean that in theory you could switch over to building all third party Go programs you use this way. If the program hasn't converted to modules you get more or less the same results as today, and if the program has converted, you get their hopefully stable go.mod settings. You'd lose having a local copy of everything in your $GOPATH, though, which opens up some issues.

programming/Go113AndGoModInGOPATH written at 23:55:53

2019-09-08

Jumping backward and forward in GNU Emacs

In my recent entry on writing Go with Emacs's lsp-mode, I noted that lsp-mode or more accurately lsp-ui has a 'peek' feature that winds up letting you jump to a definition or a reference of a thing, but I didn't know how to jump back to where you were before. The straightforward but limited answer to my question is that jumping back from a LSP peek is done with the M-, keybinding (which is surprisingly awkward to write about in text). This is not a special LSP key binding and function; instead it is a standard binding that runs xref-pop-marker-stack, which is part of GNU Emacs' standard xref package. This M-, binding is right next to the standard M-. and M-? xref bindings for jumping to definitions and references. It also works with go-mode's godef-jump function and its C-c C-j key binding.

(Lsp-ui doesn't set up any bindings for its 'peek' functions, but if you like what the 'peek' feature does in general you probably want to bind them to M-. and M-? in the lsp-ui-mode-map keybindings so that they take over from the xref versions. The xref versions still work in lsp-mode, it's just that they aren't as spiffy. This is convenient because it means that the standard xref binding 'C-x 4 .' can be used to immediately jump to a definition in another Emacs-level 'window'.)
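
One way to do that rebinding (a sketch using lsp-ui's peek functions) is:

(define-key lsp-ui-mode-map [remap xref-find-definitions] #'lsp-ui-peek-find-definitions)
(define-key lsp-ui-mode-map [remap xref-find-references] #'lsp-ui-peek-find-references)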

I call this the limited answer for a couple of reasons. First, this only works in one direction; once you've jumped back, there is no general way to go forward again. You get to remember yourself what you did to jump forward and then do it again, which is easy if you jumped to a definition but not so straightforward if you jumped to a reference. Second, this isn't a general feature; it's specific to the xref package and to things that deliberately go out of their way to hook into it, which includes lsp-ui and go-mode. Because Emacs is ultimately a big ball of mud, any particular 'jump to thing' operation from any particular package may or may not hook into the xref marker stack.

(A core Emacs concept is the mark, but core mark(s) are not directly tied to the xref marker stack. It's usually the case that things that use the xref marker stack will also push an entry onto the plain mark ring, but this is up to the whims of the package author. The plain mark ring is also context dependent on just what happened, with no universal 'jump back to where I was' operation. If you moved within a file you can return with C-u C-space, but if you moved to a different file you need to use C-x C-space instead. Using the wrong one gets bad results. M-, is universal in that it doesn't matter whether you moved within your current file or moved to another one, you always jump backward with the same key.)

The closest thing I've found in GNU Emacs to a browser style backwards and forwards navigation is a third party package called backward-forward (also gitlab). This specifically attempts to implement universal jumping in both directions, and it seems to work pretty well. Unfortunately its ring of navigation is global, not per (Emacs) window, but for my use this isn't fatal; I'm generally using Emacs within a single context anyway, rather than having several things at once the way I do in browsers.

Because I want browser style navigation, I've changed from the default backward-forward key bindings by removing its C-left and C-right bindings in favor of M-left and M-right (ie Alt-left and Alt-right, the standard browser key bindings for Back and Forward), and also added bindings for my mouse rocker buttons. How I have it set up so that it works on Fedora and Ubuntu 18.04 is as follows (using use-package, as everyone seems to these days):

(use-package backward-forward
  :demand
  :config
  (backward-forward-mode t)
  :bind (:map backward-forward-mode-map
              ("<C-left>" . nil)
              ("<C-right>" . nil)
              ("<M-left>" . backward-forward-previous-location)
              ("<M-right>" . backward-forward-next-location)
              ("<mouse-8>" . backward-forward-previous-location)
              ("<mouse-9>" . backward-forward-next-location)
              )
  )

(The use-package :demand is necessary on Ubuntu 18.04 to get the key bindings to work. I don't know enough about Emacs to understand why.)

PS: Normal Emacs and Lisp people would probably stack those stray )'s at the end of the last real line. One of my peculiarities in ELisp is that I don't; I would rather see a clear signal of where blocks end, rather than lose track of them in a stack of ')))'. Perhaps I will change this in time.

(In credit where credit is due, George Hartzell pointed out xref-pop-marker-stack to me in email in response to my first entry, which later led to me finding backward-forward.)

programming/EmacsBackForward written at 22:48:45

2019-09-07

CentOS 7 and Python 3

Over on Twitter, I said:

Today I was unpleasantly reminded that CentOS 7 (still) doesn't ship with any version of Python 3 available. You have to add the EPEL repositories to get Python 3.6.

This came up because of a combination of two things. The first is that we need to set up CentOS 7 to host a piece of commercial software, because CentOS 7 is the most recent Linux release it supports. The second is that an increasing number of our local management tools are now in Python 3 and for various reasons, this particular CentOS 7 machine needs to run them (or at least wants to) when our existing CentOS 7 machines haven't. The result was that when I set up various pieces of our standard environment on a newly installed CentOS 7 virtual machine, they failed to run because there was no /usr/bin/python3.

At one level this is easily fixed. Adding the EPEL repositories is a straightforward 'yum install epel-release', and after that installing Python 3.6 is 'yum install python36'. You don't get a pip3 with this and I'm not sure how to change that, but for our purposes pip3 isn't necessary; we don't install packages system-wide through PIP under anything except exceptional circumstances.

(The current exceptional circumstance is Tensorflow on our GPU compute servers. These run Ubuntu 18.04, where pip3 is available more or less standard. If we had general-use CentOS 7 machines it would be an issue, because pip3 is necessary for personal installs of things like the Python LSP server.)

Even having Python 3.6 instead of 3.7 isn't particularly bad right now; our Ubuntu 16.04 machines have Python 3.5.2 and even our 18.04 ones only have 3.6.8. Even not considering CentOS 7, it will be years before we can safely move any of our code past 3.6.8, since some of our 18.04 machines will not be upgraded to 20.04 next year and will probably stay on 18.04 until early 2023 when support starts to run out. This is surprisingly close to the CentOS 7 likely end of life in mid 2024 (which is much closer than I thought before I started writing this entry), so it seems like CentOS 7 only having Python 3.6 is not going to hold our code back very much, if at all.

(Hopefully by 2023 either EPEL will have a more recent version of Python 3 available on CentOS 7 or this commercial software will finally support CentOS 8. I can't blame them for not supporting RHEL 8 just yet, since it's only been out for a relatively short length of time.)

PS: I don't know what the difference is between the epel-release repositories you get by doing it this way and the epel-release-latest repositories you get from following the instructions in the EPEL wiki. The latter repos still don't seem to have Python 3.7, so I'm not worrying about it; I'm not very picky about the specific version of Python 3.6 I get, especially since our code has to run on 3.5 anyway.

python/Python3AndCentOS7 written at 23:24:39
