Converting a Go pointer to an integer doesn't quite do what it looks like
Over on r/golang, an interesting question was asked:
[Is it] possible to parse a struct or interface to get its pointer address as an integer? [...]
The practical answer today is yes, as noted in the answers to the
question. You can convert any Go pointer to
uintptr by going through
unsafe.Pointer, and then convert the
uintptr into some more conventional integer
type if you want. If you're going to convert to another integer
type, you should probably use
uint64 for safety, since that should be large enough to hold any
uintptr value on any current Go platform.
However, the theoretical answer is no, in that this conversion
doesn't quite get you what you might think it does. What this
conversion really gives you is the address that the addressable
value had at the moment the conversion to
uintptr was done. Go very carefully does not guarantee that this
past address is the same as the current address, although it always
will be today.
(I'm assuming here that there are other references to the addressable value that keep it from being garbage collected.)
Go's current garbage collector is a non-compacting garbage collector, where once things are allocated somewhere in memory, they never move for as long as they're alive. Since a non-compacting garbage collector has stable memory addresses for things, converting an address to an integer gives you something that is always the integer value of the current address of that thing. However, there are also compacting garbage collectors, which move live things around during garbage collection for various reasons. In these garbage collectors, the memory address of things is not stable.
Go is deliberately specified so that you could implement it using
a compacting GC, and at one point this was the long term plan. When it moved things as part
of garbage collection, such a Go implementation would update actual
pointers to them to hold the new address. However, it would not magically
update integer values derived from those pointers, whether they're
uintptrs or some other integer types. In a compacting GC world,
taking the uintptr of the address of something twice at different
times could give you two different values. Each value was accurate
at the moment you got it, but it's not guaranteed to be accurate
one instant past that; a GC pass could happen at any time and thus
the thing could be moved at any time.
Leaving the door open for a compacting GC is one of the reasons
that the rules surrounding the use of
uintptr are so
carefully and narrowly specified, as we've seen before. In fact the unsafe.Pointer documentation points this out:
A uintptr is an integer, not a reference. Converting a Pointer to a uintptr creates an integer value with no pointer semantics. Even if a uintptr holds the address of some object, the garbage collector will not update that uintptr's value if the object moves, nor will that uintptr keep the object from being reclaimed.
(The emphasis is mine.)
The Go garbage collector never moves things today, which leads to the practical answer for today of 'yes, you can do this'. But the theoretical answer is that the address of things could be constantly changing, and maybe someday in the future they sometimes will.
Update: As pointed out in the r/golang comments on my entry, I'm wrong here. In Go today, stacks are movable as stack usage grows and shrinks, and you can take the address of a value that is on the stack and that subsequently gets moved with the stack.
Some notes on the structure of Go binaries (primarily for ELF)
I'll start with the background. I keep around a bunch of third party
programs written in Go, and one of the things that I do periodically
is rebuild them, possibly because I've updated some of them to
their latest versions. When doing this,
it's useful to have a way to report the package that a Go binary was
built from, ideally a fast way. I have traditionally used
binstale for this, but it's not
fast. Recently I tried out gobin,
which is fast and looked like it had great promise, except that I
discovered it didn't report about all of my binaries. My attempts to
fix that resulted in various adventures but only partial success.
All of the following is mostly for ELF
format binaries, which is the binary format used on most Unixes
(except MacOS). Much of the general information applies to other
binary formats that Go supports, but the specifics will be different.
For a general introduction to ELF, you can see eg here.
Also, all of the following assumes that you haven't stripped the
Go binaries, for example by building with the linker flags
'-w' or '-s'.
All Go programs have a
.note.go.buildid ELF section that holds the
Go build ID.
If you read the ELF sections of a binary and it doesn't have that,
you can give up; either this isn't a Go binary or something deeply
weird is going on.
Programs built as Go modules contain an
embedded chunk of information about the modules used in building
them, including the main program; this can be printed with 'go
version -m <program>'. There is no official interface to extract
this information from other binaries (inside a program you can use
runtime/debug.ReadBuildInfo()), but it's
currently stored in the binary's data section as a chunk of plain
text. See version.go
for how Go itself finds and extracts this information, which is
probably going to be reasonably stable (so that newer versions of
Go can still run '
go version -m <program>' against programs built
with older versions of Go). If you can extract this information
from a binary, it's authoritative, and it should always be present
even if the binary has been stripped.
If you don't have module information (or don't want to copy
version.go's code in order to extract it), the only approach I know
to determine the package a binary was built from is to determine
the full file path of the source code where
main() is, and then
reverse engineer that to create a package name (and possibly a
module version). The general approach is:
- extract the Go debug data from the binary and use debug/gosym to create a Table.
- look up the main.main function in the table to get its starting address, and then use Table.PCToLine() to get the file name for that starting address.
- convert the file name into a package name.
Binaries built from
$GOPATH will have file names of the form
$GOPATH/src/example.org/fred/cmd/barney/main.go. If you take the
directory name of this and take off the
$GOPATH/src part, you
have the package name this was built from. This includes module-aware
builds done in
$GOPATH. Binaries built directly from modules with
'go get example.org/fred/cmd/barney@latest' will have a file path
of the form $GOPATH/pkg/mod/example.org/fred@v1.2.3/cmd/barney/main.go.
To convert this to a module name, you have to take off the '$GOPATH/pkg/mod/' part
and move the version to the end if it's not already there. For
binaries built outside some
$GOPATH, with either module-aware
builds or plain builds, you are unfortunately on your own; there
is no general way to turn their file names into package names.
(There are a number of hacks if the source is present on your local
system; for example, you can try to find out what module or VCS
repository it's part of if there's a
go.mod or VCS control directory
somewhere in its directory tree.)
However, to do this you must first extract the Go debug data from
your ELF binary. For ordinary unstripped Go binaries, this debugging
information is in the
.gosymtab and .gopclntab ELF sections of
the binary, and can be read out with the debug/elf package.
Go binaries that use cgo do not have these Go ELF sections. As
mentioned in Building a better Go linker:
For “cgo” binaries, which may make arbitrary use of C libraries, the Go linker links all of the Go code into a single native object file and then invokes the system linker to produce the final binary.
This linkage obliterates
.gosymtab and .gopclntab as separate
ELF sections. I believe that their data is still there in the final
binary, but I don't know how to extract them. The Go debugger Delve doesn't even try; instead, it
uses the general DWARF
.debug_line section (or its compressed version), which seems
to be more complicated to deal with. Delve has its DWARF code as
sub-packages, so perhaps you could reuse them to read and process the
DWARF debug line information to do the same thing (as far as I know
the file name information is present there too).
Since I have and use several third party cgo-based programs, this
is where I gave up. My hacked branch of the
which package can deal
with most things short of "cgo" binaries, but unfortunately that's
not enough to make it useful for me.
(Since I spent some time working through all of this, I want to write it down before I forget it.)
PS: I suspect that this situation will never improve for non-module builds, since the Go developers want everyone to move away from them. For Go module builds, there may someday be a relatively official and supported API for extracting module information from existing binaries, either in the official Go packages or in one of the golang.org/x/ additional packages.
Making your own changes to things that use Go modules
Suppose, not hypothetically, that you have found a useful Go program but when you test it you discover that it has a bug that's a problem for you, and that after you dig into the bug you discover that the problem is actually in a separate package that the program uses. You would like to try to diagnose and fix the bug, at least for your own uses, which requires hacking around in that second package.
In a non-module environment, how you do this is relatively
straightforward, although not necessarily elegant. Since building
programs just uses what's found in
$GOPATH/src, you can cd
directly into your local clone of the second package and start
hacking away. If you need to make a pull request, you can create a
branch, fork the repo on Github or whatever, add your new fork as
an additional remote, and then push your branch to it. If you didn't
want to contaminate your main
$GOPATH with your changes to the
upstream (since they'd be visible to everything you built that used
that package), you could work in a separate directory hierarchy and
make it your $GOPATH when you were working on it.
If the program has been migrated to Go modules, things are not
quite as straightforward. You probably don't have a clone of the
second package in your
$GOPATH, and even if you do, any changes
to it will be ignored when you rebuild the program (if you do it
in a module-aware way). Instead, you make
local changes by using the '
replace' directive of the program's
go.mod, and in some ways it's better than the non-module approach.
First you need local clones of both packages. These clones can be
a direct clone of the upstream or they can be clones of Github (or
Gitlab or etc) forks that you've made. Then, in the program's module,
you want to change
go.mod to point the second package to your
local copy of its repo:
replace github.com/rjeczalik/which => /u/cks/src/scratch/which
You can edit this in directly (as I did when I was working on this)
or you can use '
go mod edit'.
If the second package has
not been migrated to Go modules, you need to create a go.mod in
your local clone (the Go documentation will tell you this if you
read all of it).
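Such a go.mod can be minimal. A sketch, assuming you make the local clone claim the upstream's module name:

```
module github.com/rjeczalik/which

go 1.13
```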
Contrary to what I initially thought, this new
go.mod does not
need to have the module name of the package you're replacing, but
it will probably be most convenient if it does claim to be, eg,
github.com/rjeczalik/which, because this means that any commands
or tests it has that import the module will use your hacks, instead
of quietly building against the unchanged official version (again,
assuming that you build them in a module-aware way).
(You don't need a
replace line in the second package's go.mod;
Go's module handling is smart enough to get this right.)
As an important note, as of Go 1.13 you must do 'go get' from
inside this source tree to build and install commands, even if
the tree is under $GOPATH. If it's under
$GOPATH and you do 'go get
<blah>/cmd/gobin' from outside the tree, Go does a non-module 'go get' even though the
directory tree has a
go.mod file, and this will use the official
version of the second package, not your replacement. This is
documented but perhaps surprising.
When you're replacing with a local directory this way, you don't need to commit your changes in the VCS before building the program; in fact, I don't think you even need the directory tree to be a VCS repository. For better or worse, building the program will use the current state of your directory tree (well, both trees), whatever that is.
If you want to see what your module-based binaries were actually
built with in order to verify that they're actually using your
modified local version, the best tool for this is '
go version -m'.
This will show you something like:
go/bin/gobin go1.13
	path	github.com/rjeczalik/bin/cmd/gobin
	mod	github.com/rjeczalik/bin	(devel)
	dep	github.com/rjeczalik/which	v0.0.0-2014[...] => /u/cks/go/src/github.com/siebenmann/which
I believe that the '(devel)' appears if the binary was built directly
from inside a source tree, and the '=>' is showing a 'replace'
in action. If you build one of the second package's commands (from
inside its source tree), '
go version -m' doesn't report the
replacement, just that it's a '(devel)' of the module.
(Note that this output doesn't tell us anything about the version
of the second package that was actually used to build the binary,
except that it was the current state of the filesystem as of the
build. The 'v0.0.0-2014[...]' version stamp is for the original
version, not our replacement, and comes from the first package's go.mod.)
PS: If '
go version -m' merely reports the 'go1.13' bit, you managed
to build the program in a non module-aware way.
Sidebar: Replacing with another repo instead of a directory tree
The syntax for this uses your alternate repository, and I believe it
must have some form of version identifier. This version identifier
can be a branch, or at least it can start out as a branch in your
go.mod, so it looks like this:
replace github.com/rjeczalik/which => github.com/siebenmann/which reliable-find
After you run '
go build' or the like, the
go command will quietly
rewrite this to refer to the specific current commit on that branch.
If you push up a new version of your changes, you need to re-edit
go.mod to say '
reliable-find' or '
master' or the like again, so the go command can re-resolve it to your new commit.
Your upstream repository doesn't have to have a go.mod,
unlike the case with a local directory tree. If it does have a
go.mod, I think that the claimed package name can be relatively
liberal (for instance, I think it can be the module that you're
replacing). However, some experimentation with sticking in random
upstreams suggests that you want the final component of the module
name to match (eg, '<something>/which' in my case).
A safety note about using (or having)
$GOPATH in Go 1.13
One of the things in the Go 1.13 release notes is a little note
about improved support for
go.mod. This is worth quoting in
more or less full:
The GO111MODULE environment variable continues to default to
auto, but the
auto setting now activates the module-aware mode of the go command whenever the current working directory contains, or is below a directory containing, a
go.mod file — even if the current directory is within GOPATH/src.
The important safety note is that this potentially creates a confusing situation, and also it may be easy for other people to misunderstand what this actually says in the same way that I did.
Suppose that there is a Go program that is part of a module,
example.org/fred/cmd/bar (with the module being example.org/fred).
If you do '
go get example.org/fred/cmd/bar', you're fetching and
building things in non-module mode, and you will wind up with a
$GOPATH/src/example.org/fred VCS clone, which will have a
go.mod file at its root, ie
$GOPATH/src/example.org/fred/go.mod. Despite
the fact that there is a
go.mod file right there on disk, re-running
'go get example.org/fred/cmd/bar' while you're in (say) your home
directory will not do a module-aware build. This is because, as the
note says, module-aware builds only happen if your current directory
or its parents contain a
go.mod file, not just if there happens
to be a
go.mod file in the package (and module) tree being built.
So the only way to do a proper module aware build is to actually
be in the command's subdirectory:
cd $GOPATH/src/example.org/fred/cmd/bar
go get
(You can get very odd results if you cd to
$GOPATH/src/example.org/fred
and then attempt to '
go get example.org/fred/cmd/bar'. The result
is sort of module-aware but weird.)
This makes it rather more awkward to build or rebuild Go programs
through scripts, especially if they involve various programs that introspect your existing
Go binaries. It's also easy to slip up and de-modularize a Go binary;
one absent-minded '
go get example.org/...' will do it.
In a way, Go modules don't exist on disk unless you're in their
directory tree. If that tree is inside
$GOPATH and you're not in
it, you have a plain Go package, not a module.
(If the directory tree is outside
$GOPATH, well, you're not doing
much with it without
cd'ing into it, at which point you have a module.)
The easiest way to see whether a binary was built module-aware or
not is '
goversion -m PROGRAM'. If
the program was built module-aware, you will get a list of all of
the modules involved. If it wasn't, you'll just get a report of
what Go version it was built with. Also, it turns out that you can
build a program with modules without it having a
go.mod file:
GO111MODULE=on go get rsc.io/goversion@latest
The repository has tags but no
go.mod. This also works on
repositories with no tags at all. If the program uses outside
packages, they too can be non-modular, and '
goversion -m PROGRAM'
will (still) produce a report of what tags, dates, and hashes they
were built from.
Update: in Go 1.13, '
go version -m PROGRAM' also reports the
module build information, with module hashes included as well.
This does mean that in theory you could switch over to building all
third party Go programs you use this way. If the program hasn't
converted to modules you get more or less the same results as today,
and if the program has converted, you get their hopefully stable
go.mod settings. You'd lose having a local copy of everything in
$GOPATH, though, which opens up some issues.
Jumping backward and forward in GNU Emacs
In my recent entry on writing Go with Emacs's lsp-mode, I noted that lsp-mode or more accurately lsp-ui has a 'peek' feature that winds up letting you jump to a definition or a reference of a thing, but I didn't know how to jump back to where you were before. The straightforward but limited answer to my question is that jumping back from a LSP peek is done with the M-, keybinding (which is surprisingly awkward to write about in text). This is not a special LSP key binding and function; instead it is a standard binding that runs xref-pop-marker-stack, which is part of GNU Emacs' standard xref package. This M-, binding is right next to the standard M-. and M-? xref bindings for jumping to definitions and references. It also works with go-mode's godef-jump function and its C-c C-j key binding.
(Lsp-ui doesn't set up any bindings for its 'peek' functions, but if you like what the 'peek' feature does in general you probably want to bind them to M-. and M-? in the lsp-ui-mode-map keybindings so that they take over from the xref versions. The xref versions still work in lsp-mode, it's just that they aren't as spiffy. This is convenient because it means that the standard xref binding 'C-x 4 .' can be used to immediately jump to a definition in another Emacs-level 'window'.)
I call this the limited answer for a couple of reasons. First, this only works in one direction; once you've jumped back, there is no general way to go forward again. You get to remember for yourself what you did to jump forward and then do it again, which is easy if you jumped to a definition but not so straightforward if you jumped to a reference. Second, this isn't a general feature; it's specific to the xref package and to things that deliberately go out of their way to hook into it, which includes lsp-ui and go-mode. Because Emacs is ultimately a big ball of mud, any particular 'jump to thing' operation from any particular package may or may not hook into the xref marker stack.
(A core Emacs concept is the mark, but core mark(s) are not directly tied to the xref marker stack. It's usually the case that things that use the xref marker stack will also push an entry onto the plain mark ring, but this is up to the whims of the package author. The plain mark ring is also context dependent on just what happened, with no universal 'jump back to where I was' operation. If you moved within a file you can return with C-u C-space, but if you moved to a different file you need to use C-x C-space instead. Using the wrong one gets bad results. M-, is universal in that it doesn't matter whether you moved within your current file or moved to another one, you always jump backward with the same key.)
The closest thing I've found in GNU Emacs to a browser style backwards and forwards navigation is a third party package called backward-forward (also gitlab). This specifically attempts to implement universal jumping in both directions, and it seems to work pretty well. Unfortunately its ring of navigation is global, not per (Emacs) window, but for my use this isn't fatal; I'm generally using Emacs within a single context anyway, rather than having several things at once the way I do in browsers.
Because I want browser style navigation, I've changed from the default backward-forward key bindings by removing its C-left and C-right bindings in favor of M-left and M-right (ie Alt-left and Alt-right, the standard browser key bindings for Back and Forward), and also added bindings for my mouse rocker buttons. How I have it set up so that it works on Fedora and Ubuntu 18.04 is as follows (using use-package, as everyone seems to these days):
(use-package backward-forward
  :demand
  :config
  (backward-forward-mode t)
  :bind (:map backward-forward-mode-map
              ("<C-left>" . nil)
              ("<C-right>" . nil)
              ("<M-left>" . backward-forward-previous-location)
              ("<M-right>" . backward-forward-next-location)
              ("<mouse-8>" . backward-forward-previous-location)
              ("<mouse-9>" . backward-forward-next-location)
              )
  )
:demand is necessary on Ubuntu 18.04 to get
the key bindings to work. I don't know enough about Emacs to say why.
PS: Normal Emacs and Lisp people would probably stack those stray )'s at the end of the last real line. One of my peculiarities in ELisp is that I don't; I would rather see a clear signal of where blocks end, rather than lose track of them in a stack of ')))'. Perhaps I will change this in time.
Go modules and the problem of noticing updates to dependencies
Now that Go 1.13 has been released, we're moving that much closer to a module-based Go world. I've become cautiously but broadly positive towards Go 1.13 and this shift (somewhat in contrast to what I expected earlier), and I'm probably going to switch over to Go 1.13 everywhere and move toward modules in my own work. Thinking about working in this environment has left me with some questions.
Let's suppose that you have some programs or code that uses third party
packages, and these are generally stable programs that don't really
need any development or change. In the Go module world, the version
of those packages that you use is locked down by your
and won't change unless you manually update, even if new versions
are released. In theory you can keep on using your current versions
forever, but in practice as a matter of good maintenance hygiene
you probably want to update every so often to pick up bug fixes,
improvements, and perhaps security updates. As always, updating
regularly also makes the changes smaller and easier to deal with
if there are problems.
In the pre-module world, how I found out about such updates was
that I ran Go-Package-Store every so often and
looked at what it reported (I could also use gostatus). I also had (and have) tools
like binstale and gobin, which I could use with scripting
to basically '
go get -u' everything I currently had a Go binary
for (which makes some of the problems from my old entry on using
Go-Package-Store not applicable any more).
I'm not sure how to do this in a world of Go modules. Go-Package-Store
works by scanning your
$GOPATH, but with modules the only things
there are (perhaps) your actual programs, not their dependencies.
You can see updates for the dependencies of any particular program
or module with '
go list -u -m all' (in a cryptic format; anything
with a '[...]' after it has an update to that version available),
but I don't think anyone has built anything to do a large scale
scan, try to find out what the changes are, and show them to you.
(The current module behavior of '
go get', '
go list', and company
also seems surprising to me in some areas that complicate interpreting
'go list -u -m all' output, although perhaps it's working as intended.)
Relying on Go modules also brings up a related issue of what to do
if the upstream source just goes away. In the pre-module world, you
have a full VCS clone of the upstream in your
$GOPATH/src, so you
can turn around and re-publish it somewhere yourself (or someone
else can and you know you can trust their version because it's the
same as your local copy). In the module world you only have a
snapshot of a specific version (or versions) in your $GOPATH/pkg/mod
tree or in the Go module proxy you're using. Even if you vendor
things as well, you're not going to have the transparent and full
version history of the original package that you do today, and the
lack of that history will make it harder to do various things to
recover from a disappearing or abruptly changed package.
(I'm a cautious sysadmin who has been around for a while. I've seen all sorts of repositories just disappear one day for all sorts of different reasons.)
(Perhaps someday there will be a Go module proxy that deliberately makes a full VCS clone when you request a module.)