2021-01-13
What you can and can't build in Go's module mode
Go modules are soon going to be our only option for building Go
programs, which means that it's useful to
understand what we can and can't build with Go in 'module mode',
and how to do certain customary things as part of this. There are
a lot of articles on using Go modules as a developer, but my major
use of 'go get
' and friends is to build and consume other people's
programs, so that's what I'm focusing on here.
(Today in Go 1.15, you're in module mode if you use 'GO111MODULE=on'
or are inside a directory tree with a go.mod. When Go 1.16 is
released soon, you will be in module mode all of the time by default,
and in Go 1.17 you won't even have the option to be in GOPATH mode,
as covered before.)
To just install a binary for a Go program in $HOME/go/bin directly
from the upstream source, you do 'go get <thing>' (provided that
you're not in any source repository for a Go module). This works
both for programs that are Go modules (or part of ones) and non-modular
programs, even if the non-modular Go programs use third party
dependencies. If there are any version tags in the source repository,
you get what Go considers to be the most recent version (I believe
including v2.x and so on versions). Otherwise, you get the most
recent commit on the main branch. I don't believe there's any way
to use this form of 'go get' to deliberately build an in-progress
development version instead of a tagged release once the latter
exists. If a new version is released, I believe that re-running
'go get <thing>' will update you to the new latest version. (There
is a 'go get <thing>@latest' syntax, but it doesn't appear to do
anything different than plain 'go get <thing>' in this case.)
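As a concrete sketch (the import path here is made up, not a real program):

go get example.org/cmd/someprogram

This leaves the newly built binary in $HOME/go/bin (or in $GOBIN, if you've set that).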
You can also use 'go install <thing>@latest' to do this in current
(and future) Go versions, which has the advantage today that it
always works in module mode (or fails outright). In the future, the
Go developers plan to remove 'go get's support for actually getting
and building Go programs, leaving 'go install <thing>@latest' as
the only option. This means people can look forward to years of
annoyance from trying to follow the READMEs on old and perfectly
useful Go programs (including some modularized ones).
If you have a local source repository of a Go program that's part
of a Go module, you can build the current version in the repo by
'cd /where/ever; go install -mod readonly'. It's important to
use '-mod readonly' unless you're actually developing the package,
because otherwise Go's tooling can make changes that will cause
conflicts in future VCS updates. The local source repository doesn't
have to be under $HOME/go/src.
If you have a local source repository of a Go program that hasn't
been modularized, it's likely that you can't build from this source
repository in Go 1.16 without special steps, and you won't be able
to build from it in Go 1.17 at all (if the Go developers stick to
their plan). In Go 1.16, because GOPATH mode remains supported, you
can do non-modular builds in $HOME/go/src with 'GO111MODULE=off
go get <thing>' (or 'GO111MODULE=off go install' in the right
directory). If you think that the upstream will not update the
program to make it a Go module, you can do 'go mod init <thing>'
to create your own local go.mod and modularize the program.
Modularizing the program yourself will be the only option once
GOPATH mode is no longer supported at all.
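A sketch of doing this modularization yourself (the module name is arbitrary, since it's purely local; as I understand it, the 'go mod tidy' step is needed in Go 1.16, because builds there no longer add missing dependencies to go.mod for you):

cd /where/ever
go mod init whatever/someprogram
go mod tidy
go install -mod readonly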
(This means that it will be easier to build old un-modularized
Go programs directly from Github et al than from a local copy, since
'go install ...@latest' will still work in the former case. I
sure hope those upstream repositories never get removed.)
In module mode, there's no way to use 'go get' to clone the source
repository of a program, whether or not it's modular. While Go still
supports non-modular mode, you can force Go to clone repos into
$HOME/go/src with 'GO111MODULE=off go get -d <thing>'. As far as
I know there's no standard Go tool that will tell you the actual
source repository and VCS used for a given Go package path, so that
you can readily deal with custom import paths (a 'vanity import
path', also). Perhaps there will be someday. Similarly, if you want
to fetch the latest updates you must directly use an appropriate
VCS command from within the source repository. This is usually
'git pull --ff-only', but there are still some people using Mercurial
and other alternatives, so it's up to you to keep track of it.
(If you have the source repo in the right spot under $HOME/go/src, in
Go 1.16 you can force an update with 'GO111MODULE=off go get -u -d
<thing>'. This will also update your local copy of any dependencies,
but you probably don't care about that.)
If you were previously tracking the development of an upstream
program by doing 'go get -u <thing>' periodically, you now need
to do a multi-step process to update and build the latest development
state:

cd /where/ever || exit 0
git pull --ff-only    # (probably)
go install -mod readonly
You'll also need to manually clone the repository the first time around. Although Go downloads the source code for modules, it's just the source code for whatever version you're using, not the full source repository (and Go doesn't normally expose it to you anyway).
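That first-time setup is a plain VCS operation, for example (the repository URL is made up):

git clone https://github.com/someone/someprogram /where/ever
cd /where/ever
go install -mod readonly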
(If you put your cloned source repositories in the right place under
$HOME/go/src, you can use gostatus to check if they're out of
date. Otherwise, there's 'git fetch --dry-run', although that's
pretty verbose if there are updates. Perhaps someone will write or
has already written a Git remote status checking program like
gostatus that works on arbitrary directories.)
If you just want to periodically update to the latest released
version of a program, if any, and perhaps rebuild it with your
current version of Go, I believe that 'go install <thing>@latest'
will always do what you want. Further, given the module information
that's embedded in binaries compiled in module mode, you can recover
the necessary '<thing>' from the binary itself.
A Go binary that was built in module mode carries module information
in it that can be reported by 'go version -m', which will give
you the module and package of the main program. This includes
non-modularized programs fetched and built directly in module mode
with 'go install <thing>@latest' (or 'go get <thing>' while
that still works). However, the reported information does not include
the local path of the source code. If you need to get such paths
once in a while, probably the simplest way today is to use Delve:
$ dlv exec someprogram
Type 'help' for list of commands.
(dlv) ls main.main
Showing <path>/<something>.go: [....]
(As I found out when I looked into it, there's a lot of complexity in determining this information from a Go binary.)
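For illustration, here's roughly what the 'go version -m' output looks like (the program and module here are made up, and the details vary between Go versions):

$ go version -m $HOME/go/bin/someprogram
.../go/bin/someprogram: go1.15.6
        path    github.com/someone/someprogram
        mod     github.com/someone/someprogram  v1.2.3  h1:[...]

The 'path' line is the '<thing>' that you would give to 'go install <thing>@latest'.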
2020-12-28
A little puzzle with printf() and C argument passing
In The Easy Ones – Three Bugs Hiding in the Open, Bruce Dawson gave us a little C puzzle in passing:
The variable arguments in printf formatting means that it is easy to get type mismatches. The practical results vary considerably:
- printf(“0x%08lx”, p); // Printing a pointer as an int – truncation or worse on 64-bit
- printf(“%d, %f”, f, i); // Swapping float and int – could print nonsense, or might actually work (!)
- printf(“%s %d”, i, s); // Swapping the order of string and int – will probably crash
[...] (aside: understanding why #2 often prints the desired result is a good ABI puzzle)
I had to think about this for a bit, and then I realized why and how it can work (and why similar integer versus float argument confusion can also work for other functions, even ones with fixed argument lists). What it comes down to is that in some ABIs, arguments are passed in registers (at least early arguments, before you run out of registers), and floating point arguments are passed in different registers than integers (and pointers). This is true even for functions that take variable arguments and will walk through them using stdarg macros (or at least it can be, depending on the ABI).
Because floating point and non floating point arguments are passed in different sets of registers, what matters isn't the total order of arguments but the order of floating point or non-fp arguments. So here, regardless of where '%f' is in the printf format, it always causes printf() to get the first floating point argument, which can never be confused with an integer argument. Similarly, the first '%d' causes printf() to look for the second non-fp argument, regardless of where it was in the argument order; it could be at the end of several floating point arguments and still work.
(The '%d' makes printf() look for the second non-fp argument because the first one was the format string. In an ABI that passed pointers in a separate place than integers, it would still work out, since now the first '%d' would be looking for the first integer argument.)
Using the excellent services of godbolt.org, we can see this in
action on 64-bit x86 in a very small example (I used a decent
optimization level to get clear, minimal assembly code). The
floating point argument is passed in xmm0, while the format string
and the integer argument are passed in edi and esi respectively
(I don't know what eax is doing, but it probably has something
to do with the ABI). A similar thing happens on 64-bit ARM v8 (aka
Aarch64), as we can see on godbolt with the same example on
Aarch64.
(Based on this page, the Aarch64 x0 and w1 are in the same set of
registers. Apparently d0 is a 64-bit version of the first floating
point register, from here [pdf]. I wound up looking up all of this
to be sure I understood what was going on in the Aarch64 call, so
I might as well write it down here.)
Since pointers and integers are normally passed in the same set of
registers (at least on 64-bit x86 and Aarch64), we can also see why
the third example is very likely to fail. Because the same set of
registers is used for both argument types, it's possible to use an
integer argument as a pointer argument, with a segmentation fault
as the likely result. Similarly, we can predict that 'printf("%s
%f", f, s);' might well work.
PS: This confusion can happen in any language that follows the C ABI on a platform with this sort of split usage of registers (although many languages may prevent this sort of argument type confusion). Not all languages do; famously, Go currently passes all arguments on the stack (as of Go 1.15 and soon Go 1.16).
2020-12-20
Go modules are soon going to be the only future
One of the things that the Go project talked about in Eleven Years of Go, posted in November, is the plans for Go 1.16 and Go 1.17, and along with them the plans for Go modules. A short summary of all of that information is that Go modules are soon going to be our only option. This is said straight up in the article:
We will also finally wind down support for GOPATH-based development: any programs using dependencies other than the standard library will need a go.mod.
The current specifics are set out in the GOPATH wiki page. Go 1.16,
expected in February 2021, will change the default to GO111MODULE=on
(which requires a go.mod even if you are under $GOPATH/src). Go 1.17
will entirely remove support for anything other than Go module mode.
Given how the Go authors usually work, I expect this to remove the
actual code that supports 'GOPATH development mode', not just remove
access to it (for instance by removing the check of $GO111MODULE and
hard-wiring the value).
(It's possible that this timeline will turn out to be too aggressive and the Go authors will get enough pushback to postpone the removal past Go 1.17. The upgrade to Go 1.16 will be when people who are currently quietly working in GOPATH development mode will be forced to confront this, so this may uncover more people with more practical problems than expected.)
Given this aggressive timeline, I don't think there's any point in
explicitly setting $GO111MODULE to 'auto' or 'off' (contrary to
what I thought a year or so ago, advice that I admit I didn't follow
myself). If anything, you should set GO111MODULE=on now to buy
yourself a bit of extra time to find any problems and figure out
workarounds and new ways of working if you need them. I'm not going
to go that far, but I'm lazy.
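If you do set GO111MODULE=on, note that Go 1.13 and later can record it persistently in your Go environment file, instead of you putting it in your shell dotfiles; 'go env -u GO111MODULE' will undo this later.

go env -w GO111MODULE=on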
I have mixed feelings about the shift overall. I think Go modules
are clearly better for development (including for enabling easy
self-contained source code), but as someone who mostly consumes Go
programs that I get with 'go get ...', it's probably going to make
my life more complicated.
(I'd say that it'll make it harder for me to keep up with what
programs have updated source code (cf), but in practice I no longer
even look; I just do 'go get -u' on everything every so often.)
2020-12-18
On Go, release timing, and new machines
One of the early articles I read about the Apple ARM Macs was On the Apple Silicon M1 MacBook Pro (via). In it, the author (a developer) observed that Go wouldn't officially support these new machines until February 2021, when Go 1.16 is released (cf). I read this (in context) as being unhappy with the time it will take for Go to add this support and have it out in the world, and I had some feelings about that.
Here is the thing: Go, and open source projects in general, are under no obligation to make (or rush out) out-of-cycle releases just because a company has released a new product. Go's release cycle is very predictable; there's a release every six months, with a stabilization period of a couple of months beforehand (Go 1.16 Beta has just been released). This means that when Apple announced these machines in June and offered to let people buy development kits, it was already far too late for the Go 1.15 release in August (Go 1.15 Beta 1 came at the start of June).
It also seems that rapidly landing support for macOS ARM64 would have had its own quality problems. I watch the development version of Go, so I've been able to see a stream of changes being made to add support for macOS ARM64, some of which work around kernel bugs or deal with issues only experienced on production Apple Silicon machines. Some changes did land before production hardware was available (for example), but a version of Go released before or just as the M1 machines themselves were available would not necessarily have been a good one.
(The earliest clear macOS ARM64 change I can find in the git logs is this one, from September 2nd.)
More broadly, Go has a variety of priorities for Go 1.16, as the development team does for every release. Supporting macOS ARM64 is only one of them, and probably not the largest one given what they wrote about the upcoming release in Eleven Years of Go. You cannot expect an open source project to put its other priorities on hold to rush out support for new machines.
This is not unique to Go and Apple M1s. It applies to all open source projects and all new things that need work to support; you always may need to wait a while. People who've used Linux for long enough and buy new machines are very familiar with this; it used to be that you should never buy a new machine (or new components) shortly after it was released if you wanted to immediately use Linux on it (this is still true for some sorts of hardware).
(Since Go already supports macOS and ARM64, you might think that this is a small change that's easy to do. Unfortunately there's no guarantee that the combination of the two works the way the code expects, and the Go code also apparently contained a number of assumptions about what various things implied, required, and so on. This is not surprising in any code; as people say repeatedly, if it's not tested, it doesn't work.)
2020-11-27
Setting up self-contained Go program source that uses packages
Suppose, not entirely hypothetically, that you're writing a Go
program in an environment that normally doesn't use Go. You're
completely familiar with Go, with a $GOPATH and a custom Go
environment and so on, so you can easily build your program. But
your coworkers aren't, and you would like to give them source code
that is as close to completely self-contained as possible, where
they can rebuild your program with, say, 'cd /some/where;
some-command' and they don't need to follow a ten-step procedure.
At the same time, you'd like to use Go packages to modularize your
own code so that you don't have to have everything in package main.
(You might also want to use some external packages, like golang.org/x/crypto/ssh.)
When I started thinking about this in 2018, doing this was a bit complicated. On modern versions of Go, ones with support for modules, it's gotten much simpler, at least for single programs (as opposed to a collection of them). On anything from Go 1.11 onward (I believe), what you want to do is as follows:
- If you haven't already done so, set up a go.mod for your program
and add all of the dependencies. This more or less follows Using
go modules, but assumes that you already have a working program
that you haven't modularized:

  go mod init cslab/ssh-validation
  go mod tidy

  If you don't publish your program anywhere, it's fine to give it some internal name. Otherwise you should use the official published name.
- Vendor everything that you use:
go mod vendor
- Do modular builds using the vendored version of the packages. Not
using the vendored version should work (assuming that all external
packages are still there), but it will download things and clutter
up your $GOPATH/pkg directory (wherever that is):

  go build -mod vendor

  You may want to create a Makefile that does this so that people (including you in the future) can just run 'make' instead of having to remember the extra arguments to 'go build'; a minimal sketch of such a Makefile follows this list. (Since I haven't kept track of Go module support very well, I had to look up that 'go build -mod vendor' has been supported since Go 1.11, which is also the first version of Go to support modules.)
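Such a Makefile can be minimal. Here's a sketch, using the program name from the 'go mod init' example above (the command line must be indented with a tab, as usual for make; a real Makefile might add a 'clean' target and so on):

all:
	go build -mod vendor -o ssh-validation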
On modern versions of Go, this will automatically work right even
if you have the source inside $GOPATH/src. On older versions you
may need to force GO111MODULE=on (and so you may want to put this
in your Makefile). On very old versions of Go you'll have problems,
because they have either no Go module support or very limited
support.
Unfortunately one of those old versions of Go is what's on Ubuntu
18.04 LTS, which ships with Go 1.10.4 and has never been updated.
If you're in this situation, things are much more complicated.
Increasingly my view is that old versions of Go without good module
support are now not very usable, and you're going to need to persuade
people to use updated ones. The easiest way to do this is probably
to set up a tree of a suitable Go version (you can use the official
binaries if you want) and then change your program's Makefile to
explicitly use that local copy of Go.
PS: Use of an explicit '-mod vendor' argument may not be necessary
under some circumstances; see the footnote here. I've seen somewhat
inconsistent results with this, though.
2020-11-09
Getting the git tags that are before and after a commit (in simple cases)
When I investigate something in a code base, I often wind up wanting
to know when a particular change became available in a release, or
in general to know when it was made in terms not of time but of
releases. Using release dates is both not reliable (since a change
can land early in a side branch and then be merged into mainline
only much later) and a certain amount of pain (you have to look up
release dates somewhere). For git-based projects, my general approach
so far has been to fire up gitk, put in the SHA1 of the commit I
care about, and see what gitk says it was Before and After.
As it turns out, there is a better way to do this with Git command
line tools, with 'git describe' (as I found out when I bothered to
do some Internet searches). For my future reference, the two commands
I need to get the tag before and after a commit are:

; git describe f3a7f6610f
zfs-0.6.3-49-gf3a7f6610
; git describe --contains f3a7f6610f
zfs-0.6.4~197
I was already sort of familiar with the first form of 'git describe', because various projects I build use it to automatically generate identifiers for what I'm building; its output format is '<tag>-<count>-g<abbreviated hash>', meaning the commit is <count> commits on from <tag>. The second form is new to me, as is the '~<count>' that means 'so many commits before tag <X>'.
I imagine that there are all sorts of complex git tree states that can make this question harder to answer. Fortunately I don't think I deal with any projects that might give me that sort of heartburn; the ones where I care about this question like relatively linear git histories.
PS: I don't know how git is establishing what tag is the most recent before a commit or the first after one, but I trust it to basically work. This is where I start having to look up the difference between plain tags and annotated tags, and other stuff like that (cf).
2020-10-16
Go is gaining the ability to trace init calls on program startup
Go packages can have init() initialization functions, which are
called when a Go program starts as part of package initialization.
One of the practical issues with init functions in Go so far is
that their performance and even their existence is relatively
opaque, so that it's hard to tell how much of an impact they have
on the startup time of your programs.
The good news is that the Go team is moving to change this lack of visibility, as tracked through this issue and recently landed in the development version of Go (what will become Go 1.16) in this change. To quote the change:
runtime: implement GODEBUG=inittrace=1 support
Setting inittrace=1 causes the runtime to emit a single line to standard error for each package with init work, summarizing the execution time and memory allocation.
The emitted debug information for init functions can be used to find bottlenecks or regressions in Go startup performance.
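Using this is just a matter of setting the environment variable when you run a Go program ('someprogram' here is a placeholder for any Go binary):

GODEBUG=inittrace=1 ./someprogram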
Somewhat to my surprise, this starts acting early enough that it
reports on the init functions even in the runtime package. For me,
the consistent first two lines for program startup, present even
with a program that does nothing, are:

init internal/bytealg @0 ms, 0 ms clock, 0 bytes, 0 allocs
init runtime @0.062 ms, 0.069 ms clock, 0 bytes, 0 allocs
On the one hand, I think that making init functions more visible
is a good thing in general, and will definitely encourage people
to make them minimal. On the other hand, I wonder if people seeing
a long list of init functions, even in typical programs, will
lead to discouraging their use entirely even if the replacement
isn't as good (for instance, doing the same work with sync.Once).
It's certainly a bit startling to see how many init functions
there are in typical Go programs.

(One rule of thumb is that you get what you measure, and reporting
init functions is now implicitly measuring them.)
2020-10-15
Go packages can have more than one init() function
Go has some surprisingly complex rules for how packages are
initialized, partly because package level variables can be initialized
based on the value returned from function and method calls (and
then other variables can be initialized from them). As part of
package initialization, you can have an initialization function,
called init(), that will be called.
Or at least that's what I would have told you before I actually had
a reason to read that section of the Go language specification
today. In fact, the specification is very clear that you can have
more than one init() function in a single package:
Variables may also be initialized using functions named init declared in the package block, with no arguments and no result parameters.

func init() { … }

Multiple such functions may be defined per package, even within a single source file. [...]
(Emphasis mine. Package initialization then has details on what
order these init functions are run in.)
At first this surprised me, but once I thought more it makes sense. On a practical engineering level, it means that you don't have to jam all initialization in a package into a single function in a single file that everyone has to touch; you can spread it out in small pieces wherever is logical and clear.
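For example, this is a complete Go program with two init() functions in the same file; per the specification, they run in the order they appear in the source, before main():

package main

import "fmt"

func init() {
	fmt.Println("first init")
}

func init() {
	fmt.Println("second init")
}

func main() {
	fmt.Println("main")
}

This prints 'first init', then 'second init', then 'main'.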
(You do have to keep track of it all, and the order that functions
in different files get run in depends on how they're built and
linked. The Package initialization section has some suggestions
about that down at the bottom, which you probably don't have to
worry about if you build things with plain usage of go, since it
should do it right for you.)
Because I was curious, I scanned the Go source tree itself to see
if anything used multiple init functions, especially in the same
file. There is definitely a decent amount of usage of this within
the same package, and even a few cases in the same file (for example,
in cmd/go/main.go). Unsurprisingly, the runtime package is a big
user of this, since it covers a lot of functionality; a lot of
files in src/runtime have their own init functions to cover their
specific concerns.

(However, the champion user of init functions is
cmd/compile/internal/ssa/gen.)
2020-09-29
Where (and how) you limit your concurrency in Go can matter
At the start of September, I wrote about how concurrency is still not easy even in Go, using a section of real code with a deadlock as the example. In that entry, I proposed three fixes to remove the deadlock. Since Hillel Wayne's Finding Goroutine Bugs with TLA+ has now formally demonstrated that all three of my proposed fixes work, I can talk about the practical differences between them.
For convenience, here's the original code from the first entry:
func FindAll() []P {
    pss, err := ps.Processes()
    [...]
    found := make(chan P)
    limitCh := make(chan struct{}, concurrencyProcesses)

    for _, pr := range pss {
        // deadlocks here:
        limitCh <- struct{}{}
        pr := pr
        go func() {
            defer func() { <-limitCh }()
            [... get a P with some error checking ...]
            // and deadlocks here:
            found <- P
        }()
    }
    [...]

    var results []P
    for p := range found {
        results = append(results, p)
    }
    return results
}
The buffered limitCh channel is used to implement a limited supply
of tokens, to hold down the number of goroutines that are getting
P's at once. The bug in this code is that the goroutines only receive
from limitCh to release their token after sending their result
to the unbuffered found channel, while the main code only starts
receiving from found after running through the entire for loop,
and the main code takes the token in the loop and blocks if no
tokens are available. (For more, see the original entry.)
There are at least three fixes possible: the goroutines can send
to limitCh instead of the main function doing it, the goroutines
can receive from limitCh before sending to found, or the entire
for loop can be in an additional goroutine so that it doesn't
block the main function from starting to receive from found. All
three of these fixes work, as proven by Hillel Wayne, but they have
different effects on the number of goroutines that this code will
run if pss is large and what the state of those goroutines is.
If our goal is to minimize resource usage, the worst fix is for
goroutines to receive from limitCh before sending to found.
This fix will cause almost all goroutines to stall in the send to
found, because all but a few of them must be started and run
almost to completion before the main code can finish the for loop
and start receiving from found to unblock all of those sends and
let the goroutines exit. These waiting-to-send goroutines keep
their fully expanded goroutine stacks in use, and possibly other
resources that will only be released when they exit and things
become unused so the garbage collector can collect them (or when
additional defer statements release things).
The middling fix is for goroutines to receive from limitCh instead
of the for loop doing it. We will probably immediately create and
start almost all of the full pss worth of goroutines, which could
be bad if pss is very large, but at least they all block immediately
with almost no resources used and with very small goroutine stacks.
Still, this is a bunch of memory and a bunch of (Go) scheduler churn
to start all of those goroutines only to have most of them immediately
block receiving from limitCh. There's also going to be a lot of
contention on internal runtime locks associated with limitCh,
since a lot of goroutines are queueing up on it.
The best fix for resource usage is to push the for loop into its
own goroutine but to otherwise keep things the same. Because the
for loop is still receiving from limitCh before it creates a
new goroutine, the number of simultaneous goroutines we ever have
will generally be limited to around our desired concurrency level
(there will be some extra that have received from limitCh but not
yet finished completely exiting).
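To make this third fix concrete, here's a self-contained toy version of the pattern (this is my sketch, not the original FindAll() code; the real work is replaced with a trivial stand-in, and the sync.WaitGroup plus the close() of found are details that the original elides):

package main

import (
    "fmt"
    "sync"
)

const concurrencyLimit = 4

func main() {
    jobs := make([]int, 100)
    found := make(chan int)
    limitCh := make(chan struct{}, concurrencyLimit)

    // The entire for loop runs in its own goroutine, so main() can
    // start receiving from found immediately.
    go func() {
        var wg sync.WaitGroup
        for i := range jobs {
            // Take a token first; this blocks once concurrencyLimit
            // workers are in flight, capping our goroutine count.
            limitCh <- struct{}{}
            wg.Add(1)
            go func(i int) {
                defer wg.Done()
                defer func() { <-limitCh }() // release the token
                found <- i * i               // stand-in for the real work
            }(i)
        }
        wg.Wait()
        close(found) // lets the range loop below finish
    }()

    var results []int
    for r := range found {
        results = append(results, r)
    }
    fmt.Println(len(results), "results")
}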
It's likely that none of this matters if the for loop only has to
deal with a few hundred entries, and that's probably the case for
this code (at least most of the time). But it makes for a useful
illustration. When you're writing code with enforced limited
concurrency, it's probably worthwhile to think about where you want
to limit the concurrency and what effects that has on things overall.
As we can see here, small implementation choices can have potentially
large impacts.

(Also, sometimes I think too much about this sort of thing.)
2020-09-16
Why I write recursive descent parsers (despite their issues)
Today I read Laurence Tratt's Which Parsing Approach? (via), which has a decent overview of how parsing computer languages (including little domain specific languages) is not quite the well solved problem we'd like it to be. As part of the article, Tratt discusses how recursive descent parsers have a number of issues in practice and recommends using other things, such as a LR parser generator.
I have a long standing interest in parsing, I'm reasonably well aware of the annoyances of recursive descent parsers (although some of the issues Tratt raised hadn't occurred to me before now), and I've been exposed to parser generators like Yacc. Despite that, my normal approach to parsing any new little language for real is to write a recursive descent parser in whatever language I'm using, and Tratt's article is not going to change that. My choice here is for entirely pragmatic reasons, because to me recursive descent parsers generally have two significant advantages over all other real parsers.
The first advantage is that almost always, a recursive descent parser is the only or at least easiest form of parser you can readily create using only the language's standard library and tooling. In particular, parsing LR, LALR, and similar formal grammars generally requires you to find, select, and install a parser generator tool (or more rarely, an additional package). Very few languages ship their standard environment with a parser generator (or a lexer, which is often required in some form by the parser).
(The closest I know of is C on Unix, where you will almost always find some version of lex and yacc. Not entirely coincidentally, I've used lex and yacc to write a parser in C, although a long time ago.)
By contrast, a recursive descent parser is just code in the language. You can obviously write that in any language, and you can build a little lexer to go along with it that's custom fitted to your particular recursive descent parser and your language's needs. This also leads to the second significant advantage, which is that if you write a recursive descent parser, you don't need to learn a new language, the language of the parser generator, and also learn how to hook that new language to the language of your program, and then debug the result. Your entire recursive descent parser (and your entire lexer) are written in one language, the language you're already working in.
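To make 'just code' concrete, here's a minimal sketch in Go of a recursive descent parser (with a trivial lexer) for the toy grammar 'expr = num { "+" num }'. This is only an illustration of the general shape of such parsers, not anyone's production code:

package main

import (
    "fmt"
    "strconv"
    "strings"
)

// parser holds the token stream and the current position in it.
type parser struct {
    toks []string
    pos  int
}

// peek returns the current token without consuming it ("" at end of input).
func (p *parser) peek() string {
    if p.pos < len(p.toks) {
        return p.toks[p.pos]
    }
    return ""
}

// next consumes and returns the current token.
func (p *parser) next() string {
    t := p.peek()
    p.pos++
    return t
}

// expr parses the rule: expr = num { "+" num }, evaluating as it goes.
func (p *parser) expr() (int, error) {
    v, err := p.num()
    if err != nil {
        return 0, err
    }
    for p.peek() == "+" {
        p.next()
        n, err := p.num()
        if err != nil {
            return 0, err
        }
        v += n
    }
    return v, nil
}

// num parses a single integer token, with a decent error message.
func (p *parser) num() (int, error) {
    t := p.next()
    n, err := strconv.Atoi(t)
    if err != nil {
        return 0, fmt.Errorf("expected a number, got %q", t)
    }
    return n, nil
}

func main() {
    // The "lexer" here is just strings.Fields; a real one is bigger
    // but is still ordinary code in the same language.
    p := &parser{toks: strings.Fields("1 + 2 + 40")}
    v, err := p.expr()
    fmt.Println(v, err) // prints: 43 <nil>
}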
If I was routinely working in a language that had a well respected de facto standard parser generator and lexer, and regularly building parsers for little languages for my programs, it would probably be worth mastering these tools. The time and effort required to do so would be more than paid back in the end, and I would probably have a higher quality grammar too (Tratt points out how recursive descent parsers hide ambiguity, for example). But in practice I bounce back and forth between two languages right now (Go and Python, neither of which have such a standard parser ecology), and I don't need to write even a half-baked parser all that often. So writing another recursive descent parser using my standard process for this has been the easiest way to do it every time I needed one.
(I've developed a standard process for writing recursive descent parsers that makes the whole thing pretty mechanical, but that's a discussion for another entry or really a series of them.)
PS: I can't comment about how easy it is to generate good error messages in modern parser generators, because I haven't used any of them. My experience with my own recursive descent parsers is that it's generally straightforward to get decent error messages for the style of languages that I create, and usually simple to tweak the result to give clearer errors in some specific situations (eg, also).