Wandering Thoughts

2021-01-13

What you can and can't build in Go's module mode

Go modules are soon going to be our only option for building Go programs, which means that it's useful to understand what we can and can't build with Go in 'module mode', and how to do certain customary things as part of this. There are a lot of articles on using Go modules as a developer, but my major use of 'go get' and friends is to build and consume other people's programs, so that's what I'm focusing on here.

(Today in Go 1.15, you're in module mode if you use 'GO111MODULE=on' or are inside a directory tree with a go.mod. When Go 1.16 is released soon, you will be in module mode all of the time by default, and in Go 1.17 you won't even have the option to be in GOPATH mode, as covered before.)

To just install a binary for a Go program in $HOME/go/bin directly from the upstream source, you do 'go get <thing>' (provided that you're not in any source repository for a Go module). This works both for programs that are Go modules (or part of ones) and non-modular programs, even if the non-modular Go programs use third party dependencies. If there are any version tags in the source repository, you get what Go considers to be the most recent version (I believe including v2.x and so on versions). Otherwise, you get the most recent commit on the main branch. I don't believe there's any way to use this form of 'go get' to deliberately build an in-progress development version instead of a tagged release once the latter exists. If a new version is released, I believe that re-running 'go get <thing>' will update you to the new latest version.

(There is a 'go get <thing>@latest' syntax, but it doesn't appear to do anything different than plain 'go get <thing>' in this case.)

You can also use 'go install <thing>@latest' to do this in current (and future) Go versions, which has the advantage today that it always works in module mode (or fails outright). In the future, the Go developers plan to remove 'go get's support for actually getting and building Go programs, leaving 'go install <thing>@latest' as the only option. This means people can look forward to years of annoyance from trying to follow the READMEs on old and perfectly useful Go programs (including some modularized ones).
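
As a concrete illustration, the two forms look like this ('golang.org/x/tools/cmd/goimports' is just an example program, not anything special):

go get golang.org/x/tools/cmd/goimports
go install golang.org/x/tools/cmd/goimports@latest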

If you have a local source repository of a Go program that's part of a Go module, you can build the current version in the repo by 'cd /where/ever; go install -mod readonly'. It's important to use '-mod readonly' unless you're actually developing the package, because otherwise Go's tooling can make changes that will cause conflicts in future VCS updates. The local source repository doesn't have to be under $HOME/go/src.

If you have a local source repository of a Go program that hasn't been modularized, it's likely that you can't build from this source repository in Go 1.16 without special steps and you won't be able to build from it in Go 1.17 at all (if the Go developers stick to their plan). In Go 1.16, because GOPATH mode remains supported, you can do non-modular builds in $HOME/go/src with 'GO111MODULE=off go get <thing>' (or 'GO111MODULE=off go install' in the right directory). If you think that the upstream will not update the program to make it a Go module, you can do 'go mod init <thing>' to create your own local go.mod and modularize the program. Modularizing the program yourself will be the only option once GOPATH mode is no longer supported at all.
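
As a sketch of modularizing such a program yourself (the module name is whatever you pick; 'example.org/thing' here is made up):

cd /path/to/the/repo
go mod init example.org/thing
go mod tidy
go install -mod readonly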

(This means that it will be easier to build old un-modularized Go programs directly from Github et al than from a local copy, since 'go install ...@latest' will still work in the former case. I sure hope those upstream repositories never get removed.)

In module mode, there's no way to use 'go get' to clone the source repository of a program, whether or not it's modular. While Go still supports non-modular mode, you can force Go to clone repos into $HOME/go/src with 'GO111MODULE=off go get -d <thing>'. As far as I know there's no standard Go tool that will tell you the actual source repository and VCS used for a given Go package path, so that you can readily deal with custom import paths (also known as 'vanity import paths'). Perhaps there will be someday. Similarly, if you want to fetch the latest updates you must directly use an appropriate VCS command from within the source repository. This is usually 'git pull --ff-only', but there are still some people using Mercurial and other alternatives, so it's up to you to keep track of which VCS each repository uses.

(If you have the source repo in the right spot under $HOME/go/src, in Go 1.16 you can force an update with 'GO111MODULE=off go get -u -d <thing>'. This will also update your local copy of any dependencies, but you probably don't care about that.)

If you were previously tracking the development of an upstream program by doing 'go get -u <thing>' periodically, you now need a multi-step process to update and build the latest development state:

cd /where/ever || exit 1
git pull --ff-only # (probably)
go install -mod readonly

You'll also need to manually clone the repository the first time around. Although Go downloads the source code for modules, it's just the source code for whatever version you're using, not the full source repository (and Go doesn't normally expose it to you anyway).

(If you put your cloned source repositories in the right place under $HOME/go/src, you can use gostatus to check if they're out of date. Otherwise, there's 'git fetch --dry-run', although that's pretty verbose if there are updates. Perhaps someone will write or has already written a Git remote status checking program like gostatus that works on arbitrary directories.)

If you just want to periodically update to the latest released version of a program, if any, and perhaps rebuild the version with your current version of Go, I believe that 'go install <thing>@latest' will always do what you want. Further, given the module information that's embedded in binaries compiled in module mode, you can recover the necessary '<thing>' from the binary itself.

A Go binary that was built in module mode carries module information in it that can be reported by 'go version -m', which will give you the module and package of the main program. This includes non-modularized programs fetched and built directly in module mode with 'go install <thing>@latest' (or 'go get <thing>' while that still works). However, the reported information does not include the local path of the source code. If you need to get such paths once in a while, probably the simplest way today is to use Delve:

$ dlv exec someprogram
Type 'help' for list of commands.
(dlv) ls main.main
Showing <path>/<something>.go:
[....]
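
For the module and package information (as opposed to the local path), a 'go version -m' report looks something like this (the program name, module path, and version here are invented, and I've elided the checksum):

$ go version -m someprogram
someprogram: go1.15.6
	path	example.org/someprogram/cmd/someprogram
	mod	example.org/someprogram	v1.2.3	h1:[...]

The 'path' line is the '<thing>' you'd give to 'go install <thing>@latest' to rebuild the program.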

(As I found out when I looked into it, there's a lot of complexity in determining this information from a Go binary.)

GoModuleBuildsWhatPossible written at 01:19:34

2020-12-28

A little puzzle with printf() and C argument passing

In The Easy Ones – Three Bugs Hiding in the Open, Bruce Dawson gave us a little C puzzle in passing:

The variable arguments in printf formatting means that it is easy to get type mismatches. The practical results vary considerably:

  1. printf("0x%08lx", p); // Printing a pointer as an int – truncation or worse on 64-bit
  2. printf("%d, %f", f, i); // Swapping float and int – could print nonsense, or might actually work (!)
  3. printf("%s %d", i, s); // Swapping the order of string and int – will probably crash

[...] (aside: understanding why #2 often prints the desired result is a good ABI puzzle)

I had to think about this for a bit, and then I realized why and how it can work (and why similar integer versus float argument confusion can also work for other functions, even ones with fixed argument lists). What it comes down to is that in some ABIs, arguments are passed in registers (at least early arguments, before you run out of registers), and floating point arguments are passed in different registers than integers (and pointers). This is true even for functions that take variable arguments and will walk through them using stdarg macros (or at least it can be, depending on the ABI).

Because floating point and non floating point arguments are passed in different sets of registers, what matters isn't the total order of arguments but the order of floating point or non-fp arguments. So here, regardless of where '%f' is in the printf format, it always causes printf() to get the first floating point argument, which can never be confused with an integer argument. Similarly, the first '%d' causes printf() to look for the second non-fp argument, regardless of where it was in the argument order; it could be at the end of several floating point arguments and still work.

(The '%d' makes printf() look for the second non-fp argument because the first one was the format string. In an ABI that passed pointers in a separate place than integers, it would still work out, since now the first '%d' would be looking for the first integer argument.)
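
As a concrete sketch of example #2 (not the exact code behind the godbolt links below, but the same shape):

#include <stdio.h>

/* The arguments are swapped relative to the format string, which is
   undefined behavior in C. On the x86-64 System V ABI it still works:
   the double travels in xmm0 and the int in esi regardless of their
   order in the call, so '%d' finds 42 and '%f' finds 1.5. */
void show(double f, int i)
{
    printf("%d, %f\n", f, i);
}

int main(void)
{
    show(1.5, 42);
    return 0;
}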

Using the excellent services of godbolt.org, we can see this in action on 64-bit x86 in a very small example (I used a decent optimization level to get clear, minimal assembly code). The floating point argument is passed in xmm0, while the format string and the integer argument are passed in edi and esi respectively (the eax setting is also part of the ABI; for calls to variadic functions, %al holds the number of vector registers used in the call). A similar thing happens on 64-bit ARM v8 (aka Aarch64), as we can also see on godbolt with the same example on Aarch64.

(Based on this page, the Aarch64 x0 and w1 are in the same set of registers. Apparently d0 is a 64-bit version of the first floating point register, from here [pdf]. I wound up looking up all of this to be sure I understood what was going on in the Aarch64 call, so I might as well write it down here.)

Since pointers and integers are normally passed in the same set of registers (at least on 64-bit x86 and Aarch64), we can also see why the third example is very likely to fail. Since the same set of registers is used for both argument types, it's possible to use an integer argument as a pointer argument, with a segmentation fault as the likely result. Similarly, we can predict that 'printf("%s %f", f, s);' might well work.

PS: This confusion can happen in any language that follows the C ABI on a platform with this sort of split usage of registers (although many languages may prevent this sort of argument type confusion). Not all languages do; famously, Go currently passes all arguments on the stack (as of Go 1.15 and soon Go 1.16).

PrintfAndArgumentPassing written at 00:13:49

2020-12-20

Go modules are soon going to be the only future

One of the things that the Go project talked about in Eleven Years of Go, posted in November, is the plans for Go 1.16 and Go 1.17, and along with them the plans for Go modules. A short summary of all of that information is that Go modules are soon going to be our only option. This is said straight up in the article:

We will also finally wind down support for GOPATH-based development: any programs using dependencies other than the standard library will need a go.mod.

The current specifics are set out in the GOPATH wiki page. Go 1.16, expected in February 2021, will change the default to GO111MODULE=on (which requires go.mod even if you are under $GOPATH/src). Go 1.17 will entirely remove support for anything other than Go module mode. Given how the Go authors usually work, I expect this to remove the actual code that supports 'GOPATH development mode', not just remove access to it (for instance, by no longer checking $GO111MODULE and instead hard-wiring its value).

(It's possible that this timeline will turn out to be too aggressive and the Go authors will get enough pushback to postpone the removal past Go 1.17. The upgrade to Go 1.16 will be when people who are currently quietly working in GOPATH development mode will be forced to confront this, so this may uncover more people with more practical problems than expected.)

Given this aggressive timeline, I don't think there's any point in explicitly setting $GO111MODULE to 'auto' or 'off' (contrary to what I thought a year or so ago, advice that I admit I didn't follow myself). If anything, you should set GO111MODULE=on now to buy yourself a bit of extra time to find any problems and figure out workarounds and new ways of working if you need them. I'm not going to go that far myself, but then I'm lazy.

I have mixed feelings about the shift overall. I think Go modules are clearly better for development (including for enabling easy self-contained source code), but as someone who mostly consumes Go programs that I get with 'go get ...', it's probably going to make my life more complicated.

(I'd say that it'll make it harder for me to keep up with what programs have updated source code (cf), but in practice I no longer even look; I just do 'go get -u' on everything every so often.)

GoModulesOnlyFuture written at 00:52:36

2020-12-18

On Go, release timing, and new machines

One of the early articles I read about the Apple ARM Macs was On the Apple Silicon M1 MacBook Pro (via). In it, the author (a developer) observed that Go wouldn't officially support these new machines until February 2021, when Go 1.16 is released (cf). I read this (in context) as being unhappy with the time it will take for Go to add this support and have it out in the world, and I had some feelings about that.

Here is the thing. Go, and open source projects in general, are under no obligation to do (or rush out) out-of-cycle releases just because a company has released a new product. Go's release cycle is very predictable; a new version is released every six months, with a stabilization period of a couple of months beforehand (Go 1.16 Beta 1 has just been released). This means that when Apple announced these machines in June and offered to let people buy development kits, it was already far too late for the Go 1.15 release in August (Go 1.15 Beta 1 came at the start of June).

It also seems that rapidly landing support for macOS ARM64 would have had its own quality problems. I watch the development version of Go, so I've been able to see a stream of changes being made to add support for macOS ARM64, some of which work around kernel bugs or deal with issues only experienced on production Apple Silicon machines. Some changes did land before production hardware was available (for example), but a version of Go released before or just as the M1 machines themselves were available would not necessarily have been a good one.

(The earliest clear macOS ARM64 change I can find in the git logs is this one, from September 2nd.)

More broadly, Go has a variety of priorities for Go 1.16, as the development team does for every release. Supporting macOS ARM64 is only one of them, and probably not the largest one given what they wrote about the upcoming release in Eleven Years of Go. You cannot expect an open source project to put its other priorities on hold to rush out support for new machines.

This is not unique to Go and Apple M1s. It applies to all open source projects and all new things that need work to support; you always may need to wait a while. People who've used Linux for long enough and buy new machines are very familiar with this; it used to be that you should never buy a new machine (or new components) shortly after it was released if you wanted to immediately use Linux on it (this is still true for some sorts of hardware).

(Since Go already supports macOS and ARM64, you might think that this is a small change that's easy to do. Unfortunately there's no guarantee that the combination of the two works the way the code expects, and the Go code also apparently contained a number of assumptions about what various things implied, required, and so on. This is not surprising in any code; as people say repeatedly, if it's not tested, it doesn't work.)

GoTimingAndNewMachines written at 23:35:54

2020-11-27

Setting up self-contained Go program source that uses packages

Suppose, not entirely hypothetically, that you're writing a Go program in an environment that normally doesn't use Go. You're completely familiar with Go, with a $GOPATH and a custom Go environment and so on, so you can easily build your program. But your coworkers aren't, and you would like to give them source code that is as close to completely self-contained as possible, where they can rebuild your program with, say, 'cd /some/where; some-command' and they don't need to follow a ten-step procedure. At the same time, you'd like to use Go packages to modularize your own code so that you don't have to have everything in package main.

(You might also want to use some external packages, like golang.org/x/crypto/ssh.)

When I started thinking about this in 2018, doing this was a bit complicated. On modern versions of Go, ones with support for modules, it's gotten much simpler, at least for single programs (as opposed to a collection of them). On anything from Go 1.11 onward (I believe), what you want to do is as follows:

  • If you haven't already done so, set up a go.mod for your program and add all of the dependencies. This more or less follows Using go modules, but assumes that you already have a working program that you haven't modularized.

    go mod init cslab/ssh-validation
    go mod tidy
    

    If you don't publish your program anywhere, it's fine to give it some internal name. Otherwise you should use the official published name.
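
    After these two commands, your go.mod will look something like the following sketch (the require line and its version are filled in by 'go mod tidy'; the version here is elided):

    module cslab/ssh-validation

    go 1.15

    require golang.org/x/crypto v0.0.0-[...]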

  • Vendor everything that you use:

    go mod vendor
    

  • Do modular builds using the vendored version of the packages. Not using the vendored version should work (assuming that all external packages are still there), but it will download things and clutter up your $GOPATH/pkg directory (wherever that is).

    go build -mod vendor
    

    You may want to create a Makefile that does this so that people (including you in the future) can just run 'make' instead of having to remember the extra arguments to 'go build'; there's a sketch of one below.

    (Since I haven't kept track of Go module support very well, I had to look up that 'go build -mod vendor' has been supported since Go 1.11, which is also the first version of Go to support modules.)
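
    A minimal sketch of such a Makefile:

    all:
    	go build -mod vendor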

On modern versions of Go, this will automatically work right even if you have the source inside $GOPATH/src. On older versions you may need to explicitly set GO111MODULE=on (and so you may want to put this in your Makefile). On very old versions of Go you'll have problems, because they have either no Go module support or very limited support.

Unfortunately one of those old versions of Go is what is on Ubuntu 18.04 LTS, which ships with go 1.10.4 and has never been updated. If you're in this situation, things are much more complicated. Increasingly my view is that old versions of Go without good module support are now not very usable and you're going to need to persuade people to use updated ones. The easiest way to do this is probably to set up a tree of a suitable Go version (you can use the official binaries if you want) and then change your program's Makefile to explicitly use that local copy of Go.

PS: Use of an explicit '-mod vendor' argument may not be necessary under some circumstances; see the footnote here. I've seen somewhat inconsistent results with this, though.

GoSelfContainedSource written at 20:04:45

2020-11-09

Getting the git tags that are before and after a commit (in simple cases)

When I investigate something in a code base, I often wind up wanting to know when a particular change became available in a release, or in general to know when it was made in terms not of time but of releases. Using release dates is both not reliable (since a change can land early in a side branch and then be merged into mainline only much later) and a certain amount of pain (you have to look up release dates somewhere). For git-based projects, my general approach so far has been to fire up gitk, put in the SHA1 of the commit I care about, and see what gitk says it was Before and After.

As it turns out, there is a better way to do this with Git command line tools, using 'git describe' (as I found out when I bothered to do some Internet searches). For my future reference, the two commands I want in order to get the tag before and the tag after a commit are:

; git describe f3a7f6610f
zfs-0.6.3-49-gf3a7f6610
; git describe --contains f3a7f6610f
zfs-0.6.4~197

I was already sort of familiar with the first form of 'git describe', because various projects I build use it to automatically generate identifiers for what I'm building. The second form is new to me, as is the '~<count>' that means 'so many commits before tag <X>'.

I imagine that there are all sorts of complex git tree states that can make this question harder to answer. Fortunately I don't think I deal with any projects that might give me that sort of heartburn; the ones where I care about this question have relatively linear git histories.

PS: I don't know how git is establishing what tag is the most recent before a commit or the first after one, but I trust it to basically work. This is where I start having to look up the difference between plain tags and annotated tags, and other stuff like that (cf).

GitTagsBeforeAfterCommit written at 15:49:16

2020-10-16

Go is gaining the ability to trace init calls on program startup

Go packages can have init() initialization functions, which are called when a Go program starts as part of package initialization. One of the practical issues with init functions in Go so far is that their performance and even their existence is relatively opaque, so that it's hard to tell how much of an impact they have on the startup time of your programs.

The good news is that the Go team is moving to change this lack of visibility, as tracked through this issue and recently landed in the development version of Go (what will become Go 1.16) in this change. To quote the change:

runtime: implement GODEBUG=inittrace=1 support

Setting inittrace=1 causes the runtime to emit a single line to standard error for each package with init work, summarizing the execution time and memory allocation.

The emitted debug information for init functions can be used to find bottlenecks or regressions in Go startup performance.

Somewhat to my surprise, this starts acting early enough that it reports on the init functions even in the runtime package. For me, the consistent first two lines for program startup, present even with a program that does nothing, are:

init internal/bytealg @0 ms, 0 ms clock, 0 bytes, 0 allocs
init runtime @0.062 ms, 0.069 ms clock, 0 bytes, 0 allocs
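
To see this for one of your own programs, you just run it with the environment variable set; the program name here is whatever binary you care about:

GODEBUG=inittrace=1 ./yourprogram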

On the one hand, I think that making init functions more visible is a good thing in general, and will definitely encourage people to make them minimal. On the other hand, I wonder if people seeing a long list of init functions, even in typical programs, will lead to discouraging their use entirely even if the replacement isn't as good (for instance, doing the same work with sync.Once). It's certainly a bit startling to see how many init functions there are in typical Go programs.

(One rule of thumb is that you get what you measure, and reporting init functions is now implicitly measuring them.)

GoTracingInitCalls written at 00:24:46

2020-10-15

Go packages can have more than one init() function

Go has some surprisingly complex rules for how packages are initialized, partly because package level variables can be initialized based on the values returned from function and method calls (and then other variables can be initialized from them). As part of package initialization, you can have an initialization function, init(), that will be called.

Or at least that's what I would have told you before I actually had a reason to read that section of the Go language specification today. In fact, the specification is very clear that you can have more than one init() function in a single package:

Variables may also be initialized using functions named init declared in the package block, with no arguments and no result parameters.

func init() { … }

Multiple such functions may be defined per package, even within a single source file. [...]

(Emphasis mine. Package initialization then has details on what order these init functions are run in.)

At first this surprised me, but once I thought more it makes sense. On a practical engineering level, it means that you don't have to jam all initialization in a package into a single function in a single file that everyone has to touch; you can spread it out in small pieces wherever is logical and clear.
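
As a minimal sketch of what this looks like in practice (the package and what the init functions do here are made up):

package config

import "os"

var configPath string

// Set a built-in default first; init functions in a single file
// run in the order they appear.
func init() {
   configPath = "/etc/someprogram.conf"
}

// Then allow an environment variable to override it.
func init() {
   if p := os.Getenv("SOMEPROGRAM_CONF"); p != "" {
      configPath = p
   }
}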

(You do have to keep track of it all, and the order that functions in different files get run in depends on how they're built and linked. The Package initialization section has some suggestions about that down at the bottom, which you probably don't have to worry about if you build things with plain usage of go since it should do it right for you.)

Because I was curious, I scanned the Go source tree itself to see if anything used multiple init functions, especially in the same file. There is definitely a decent amount of usage of this within the same package, and even a few cases in the same file (for example, in cmd/go/main.go). Unsurprisingly, the runtime package is a big user of this, since it covers a lot of functionality; a lot of files in src/runtime have their own init functions to cover their specific concerns.

(However the champion user of init functions is cmd/compile/internal/ssa/gen.)

GoMultipleInitFunctions written at 00:07:32

2020-09-29

Where (and how) you limit your concurrency in Go can matter

At the start of September, I wrote about how concurrency is still not easy even in Go, using a section of real code with a deadlock as the example. In that entry, I proposed three fixes to remove the deadlock. Since Hillel Wayne's Finding Goroutine Bugs with TLA+ has now formally demonstrated that all three of my proposed fixes work, I can talk about the practical differences between them.

For convenience, here's the original code from the first entry:

func FindAll() []P {
   pss, err := ps.Processes()
   [...]
   found := make(chan P)
   limitCh := make(chan struct{}, concurrencyProcesses)

   for _, pr := range pss {
      // deadlocks here:
      limitCh <- struct{}{}
      pr := pr
      go func() {
         defer func() { <-limitCh }()
         [... get a P with some error checking ...]
         // and deadlocks here:
         found <- P
      }()
   }
   [...]

   var results []P
   for p := range found {
      results = append(results, p)
   }
   return results
}

The buffered limitCh channel is used to implement a limited supply of tokens, to hold down the number of goroutines that are getting P's at once. The bug in this code is that the goroutines only receive from limitCh to release their token after sending their result to the unbuffered found channel, while the main code only starts receiving from found after running through the entire for loop, and the main code takes the token in the loop and blocks if no tokens are available. (For more, see the original entry.)

There are at least three fixes possible: the goroutines can send to limitCh instead of the main function doing it, the goroutines can receive from limitCh before sending to found, or the entire for loop can be in an additional goroutine so that it doesn't block the main function from starting to receive from found. All three of these fixes work, as proven by Hillel Wayne, but they have different effects on the number of goroutines that this code will run if pss is large and what the state of those goroutines is.

If our goal is to minimize resource usage, the worst fix is for goroutines to receive from limitCh before sending to found. This fix will cause almost all goroutines to stall in the send to found, because all but a few of them must be started and run almost to completion before the main code can finish the for loop and start receiving from found to unblock all of those sends and let the goroutines exit. These waiting-to-send goroutines keep their fully expanded goroutine stacks in use, and possibly other resources that won't be released until they exit and things become unused so the garbage collector can collect them (or until additional defer statements release things).

The middling fix is for goroutines to receive from limitCh instead of the for loop doing it. We will probably immediately create and start almost all of the full pss worth of goroutines, which could be bad if pss is very large, but at least they all block immediately with almost no resources used and with very small goroutine stacks. Still, this is a bunch of memory and a bunch of (Go) scheduler churn to start all of those goroutines only to have most of them immediately block receiving from limitCh. There's also going to be a lot of contention on internal runtime locks associated with limitCh, since a lot of goroutines are queueing up on it.

The best fix for resource usage is to push the for loop into its own goroutine but to otherwise keep things the same. Because the for loop is still receiving from limitCh before it creates a new goroutine, the number of simultaneous goroutines we ever have will generally be limited to around our desired concurrency level (there will be some extra that have received from limitCh but not yet finished completely exiting).
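
In code form, this third fix might look something like the following sketch. Here getProcess() is a hypothetical stand-in for the elided 'get a P' work, and since the original code also elides how found gets closed, I've assumed a sync.WaitGroup handles that:

go func() {
   var wg sync.WaitGroup
   for _, pr := range pss {
      limitCh <- struct{}{}
      pr := pr
      wg.Add(1)
      go func() {
         defer wg.Done()
         defer func() { <-limitCh }()
         p, err := getProcess(pr) // hypothetical; stands in for the elided work
         if err != nil {
            return
         }
         found <- p
      }()
   }
   wg.Wait()
   close(found) // lets 'for p := range found' in the main code finish
}()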

It's likely that none of this matters if the for loop only has to deal with a few hundred entries, and that's probably the case for this code (at least most of the time). But it makes for a useful illustration. When you're writing code with enforced limited concurrency it's probably worthwhile to think about where you want to limit the concurrency and what effects that has on things overall. As we can see here, small implementation choices can have potentially large impacts.

(Also, sometimes I think too much about this sort of thing.)

GoConcurrencyLimitsWhere written at 00:57:04

2020-09-16

Why I write recursive descent parsers (despite their issues)

Today I read Laurence Tratt's Which Parsing Approach? (via), which has a decent overview of how parsing computer languages (including little domain specific languages) is not quite the well solved problem we'd like it to be. As part of the article, Tratt discusses how recursive descent parsers have a number of issues in practice and recommends using other things, such as a LR parser generator.

I have a long standing interest in parsing, I'm reasonably well aware of the annoyances of recursive descent parsers (although some of the issues Tratt raised hadn't occurred to me before now), and I've been exposed to parser generators like Yacc. Despite that, my normal approach to parsing any new little language for real is to write a recursive descent parser in whatever language I'm using, and Tratt's article is not going to change that. My choice here is for entirely pragmatic reasons, because to me recursive descent parsers generally have two significant advantages over all other real parsers.

The first advantage is that almost always, a recursive descent parser is the only or at least easiest form of parser you can readily create using only the language's standard library and tooling. In particular, building LR, LALR, and similar parsers generally requires you to find, select, and install a parser generator tool (or more rarely, an additional package). Very few languages ship their standard environment with a parser generator (or a lexer, which is often required in some form by the parser).

(The closest I know of is C on Unix, where you will almost always find some version of lex and yacc. Not entirely coincidentally, I've used lex and yacc to write a parser in C, although a long time ago.)

By contrast, a recursive descent parser is just code in the language. You can obviously write that in any language, and you can build a little lexer to go along with it that's custom fitted to your particular recursive descent parser and your language's needs. This also leads to the second significant advantage, which is that if you write a recursive descent parser, you don't need to learn a new language, the language of the parser generator, and also learn how to hook that new language to the language of your program, and then debug the result. Your entire recursive descent parser (and your entire lexer) are written in one language, the language you're already working in.
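
As an illustration, here's a minimal sketch of a recursive descent parser in Go for a tiny expression grammar, with single-digit numbers and no error handling (which a real parser would need); each grammar rule becomes one ordinary function:

package main

import "fmt"

// Grammar:
//   expr   := term { '+' term }
//   term   := factor { '*' factor }
//   factor := digit | '(' expr ')'

type parser struct {
   in  string
   pos int
}

// peek returns the current character, or 0 at end of input.
func (p *parser) peek() byte {
   if p.pos < len(p.in) {
      return p.in[p.pos]
   }
   return 0
}

func (p *parser) expr() int {
   v := p.term()
   for p.peek() == '+' {
      p.pos++
      v += p.term()
   }
   return v
}

func (p *parser) term() int {
   v := p.factor()
   for p.peek() == '*' {
      p.pos++
      v *= p.factor()
   }
   return v
}

func (p *parser) factor() int {
   if p.peek() == '(' {
      p.pos++ // consume '('
      v := p.expr()
      p.pos++ // consume ')'
      return v
   }
   v := int(p.peek() - '0')
   p.pos++
   return v
}

func main() {
   p := &parser{in: "2*(3+4)"}
   fmt.Println(p.expr()) // prints 14
}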

If I was routinely working in a language that had a well respected de facto standard parser generator and lexer, and regularly building parsers for little languages for my programs, it would probably be worth mastering these tools. The time and effort required to do so would be more than paid back in the end, and I would probably have a higher quality grammar too (Tratt points out how recursive descent parsers hide ambiguity, for example). But in practice I bounce back and forth between two languages right now (Go and Python, neither of which have such a standard parser ecology), and I don't need to write even a half-baked parser all that often. So writing another recursive descent parser using my standard process for this has been the easiest way to do it every time I needed one.

(I've developed a standard process for writing recursive descent parsers that makes the whole thing pretty mechanical, but that's a discussion for another entry or really a series of them.)

PS: I can't comment about how easy it is to generate good error messages in modern parser generators, because I haven't used any of them. My experience with my own recursive descent parsers is that it's generally straightforward to get decent error messages for the style of languages that I create, and usually simple to tweak the result to give clearer errors in some specific situations (eg, also).

WhyRDParsersForMe written at 00:33:22
