Wandering Thoughts

2025-03-20

Go's choice of multiple return values was the simpler option

Yesterday I wrote about Go's use of multiple return values and Go types, in reaction to Mond's Were multiple return values Go's biggest mistake?. One of the things that I forgot to mention in that entry is that I think Go's choice to have multiple values for function returns and a few other things was the simpler and more conservative approach in its overall language design.

In a statically typed language that expects to routinely use multiple return values, as Go was designed to with the 'result, error' pattern, returning multiple values as a typed tuple means that tuple-based types are pervasive. This creates pressures on both the language design and the API of the standard library, especially if you start out (as Go did) being a fairly strongly nominally typed language, where different names for the same concrete type can't be casually interchanged. Or to put it another way, having a frequently used tuple container (meta-)type significantly interacts with and affects the rest of the language.

(For example, if Go had handled multiple values through tuples as explicit typed entities, it might have had to start out with something like type aliases (added only in Go 1.9) and it might have been pushed toward some degree of structural typing, because that probably makes it easier to interact with all of the return value tuples flying around.)

Having multiple values as a special case for function returns, range, and so on doesn't create anywhere near this additional influence and pressure on the rest of the language. There are a whole bunch of questions and issues you don't face because multiple values aren't types and can't be stored or manipulated as single entities. Of course you have to be careful in the language specification and it's not trivial, but it's simpler and more contained than going the tuple type route. I also feel it's the more conservative approach, since it doesn't affect the rest of the language as much as a widely used tuple container type would.

(As Mond criticizes, it does create special cases. But Go is a pragmatic language that's willing to live with special cases.)

GoMultipleReturnValuesSimpler written at 22:56:14;

2025-03-19

Go's multiple return values and (Go) types

Recently I read Were multiple return values Go's biggest mistake? (via), which wishes that Go had full blown tuple types (to put my spin on it). One of the things that struck me about Go's situation when I read the article is exactly the inverse of what the article is complaining about, which is that because Go allows multiple values for function return types (and in a few other places), it doesn't have to have tuple types.

One problem with tuple types in a statically typed language is that they must exist as types, whether declared explicitly or implicitly. In a language like Go, where type definitions create new distinct types even if the structure is the same, it isn't particularly difficult to wind up with an ergonomics problem. Suppose that you want to return a tuple that is a net.Conn and an error, a common pair of return values in the net package today. If that tuple is given a named type, everyone must use that type in various places; merely returning or storing an implicitly declared type that's structurally the same is not acceptable under Go's current type rules. Conversely, if that tuple is not given a type name in the net package, everyone is forced to stick to an anonymous tuple type. In addition, this up front choice is now an API; it's not API compatible to give your previously anonymous tuple type a name or vice versa, even if the types are structurally compatible.

(Since returning something and error is so common an idiom in Go, we're also looking at either a lot of anonymous types or a lot more named types. Consider how many different combinations of multiple return values you find in the net package alone.)

One advantage of multiple return values (and the other forms of tuple assignment, and for range clauses) is that they don't require actual formal types. Functions have a 'result type', which doesn't exist as an actual type, but you also needed to handle the same sort of 'not an actual type' thing for their 'parameter type'. My guess is that this let Go's designers skip a certain amount of complexity in Go's type system, because they didn't have to define an actual tuple (meta-)type or alternately expand how structs worked to cover the tuple usage case,

(Looked at from the right angle, structs are tuples with named fields, although then you get into questions of nested structs act in tuple-like contexts.)

A dynamically typed language like Python doesn't have this problem because there are no explicit types, so there's no need to have different types for different combinations of (return) values. There's simply a general tuple container type that can be any shape you want or need, and can be created and destructured on demand.

(I assume that some statically typed languages have worked out how to handle tuples as a data type within their type system. Rust has tuples, for example; I haven't looked into how they work in Rust's type system, for reasons.)

GoMultipleReturnValuesAndTypes written at 23:31:30;

2025-03-17

I don't think error handling is a solved problem in language design

There are certain things about programming language design that are more or less solved problems, where we generally know what the good and bad approaches are. For example, over time we've wound up agreeing on various common control structures like for and while loops, if statements, and multi-option switch/case/etc statements. The syntax may vary (sometimes very much, as for example in Lisp), but the approach is more or less the same because we've come up with good approaches.

I don't believe this is the case with handling errors. One way to see this is to look at the wide variety of approaches and patterns that languages today take to error handling. There is at least 'errors as exceptions' (for example, Python), 'errors as values' (Go and C), and 'errors instead of results and you have to check' combined with 'if errors happen, panic' (both Rust). Even in Rust there are multiple idioms for dealing with errors; some Rust code will explicitly check its Result types, while other Rust code sprinkles '?' around and accepts that if the program sails off the happy path, it simply dies.

If you were creating a new programming language from scratch, there's no clear agreed answer to what error handling approach you should pick, not the way we have more or less agreed on how for, while, and so on should work. You'd be left to evaluate trade offs in language design and language ergonomics and to make (and justify) your choices, and there probably would always be people who think you should have chosen differently. The same is true of changing or evolving existing languages, where there's no generally agreed on 'good error handling' to move toward.

(The obvious corollary of this is that there's no generally agreed on keywords or other syntax for error handling, the way 'for' and 'while' are widely accepted as keywords as well as concepts. The closest we've come is that some forms of error handling have generally accepted keywords, such as try/catch for exception handling.)

I like to think that this will change at some point in the future. Surely there actually is a good pattern for error handling out there and at some point we will find it (if it hasn't already been found) and then converge on it, as we've converged on programming language things before. But I feel it's clear that we're not there yet today.

ErrorHandlingNotSolvedProblem written at 22:53:22;

2025-03-02

Updating local commits with more changes in Git (the harder way)

One of the things I do with Git is maintain personal changes locally on top of the upstream version, with my changes updated via rebasing every time I pull upstream to update it. In the simple case, I have only a single local change and commit, but in more complex cases I split my changes into multiple local commits; my local version of Firefox currently carries 12 separate personal commits. Every so often, upstream changes something that causes one of those personal changes to need an update, without actually breaking the rebase of that change. When this happens I need to update my local commit with more changes, and often it's not the 'top' local commit (which can be updated simply).

In theory, the third party tool git-absorb should be ideal for this, and I believe I've used it successfully for this purpose in the past. In my most recent instance, though, git-absorb frustratingly refused to do anything in a situation where it felt it should work fine. I had an additional change to a file that was changed in exactly one of my local commits, which feels like an easy case.

(Reading the git-absorb readme carefully suggests that I may be running into a situation where my new change doesn't clash with any existing change. This makes git-absorb more limited than I'd like, but so it goes.)

In Git, what I want is called a 'fixup commit', and how to use it is covered in this Stackoverflow answer. The sequence of commands is basically:

# modify some/file with new changes, then
git add some/file

# Use this to find your existing commit ID
git log some/file

# with the existing commid ID
git commit --fixup=<commit ID>
git rebase --interactive --autosquash <commit ID>^

This will open an editor buffer with what 'git rebase' is about to do, which I can immediately exit out of because the defaults are exactly what I want (assuming I don't want to shuffle around the order of my local commits, which I probably don't, especially as part of a fixup).

I can probably also use 'origin/main' instead of '<commit ID>^', but that will rebase more things than is strictly necessary. And I need the commit ID for the 'git commit --fixup' invocation anyway.

(Sufficiently experienced Git people can probably put together a script that would do this automatically. It would get all of the files staged in the index, find the most recent commit that modified each of them, abort if they're not all the same commit, make a fixup commit to that most recent commit, and then potentially run the 'git rebase' for you.)

GitUpdatingLocalChangesWithMore written at 22:34:31;

2025-02-24

Go's behavior for zero value channels and maps is partly a choice

How Go behaves if you have a zero value channel or map (a 'nil' channel or map) is somewhat confusing (cf, via). When we talk about it, it's worth remembering that this behavior is a somewhat arbitrary choice on Go's part, not a fundamental set of requirements that stems from, for example, other language semantics. Go has reasons to have channels and maps behave as they do, but some those reasons have to do with how channel and map values are implemented and some are about what's convenient for programming.

As hinted at by how their zero value is called a 'nil' value, channel and map values are both implemented as pointers to runtime data structures. A nil channel or map has no such runtime data structure allocated for it (and the pointer value is nil); these structures are allocated by make(). However, this doesn't entirely allow us to predict what happens when you use nil values of either type. It's not unreasonable for an attempt to assign an element to a nil map to panic, since the nil map has no runtime data structure allocated to hold anything we try to put in it. But you don't have to say that a nil map is empty and looking up elements in it gives you a zero value; I think you could have this panic instead, just as assigning an element does. However, this would probably result in less safe code that paniced more (and probably had more checks for nil maps, too).

Then there's nil channels, which don't behave like nil maps. It would make sense for receiving from a nil channel to yield the zero value, much like looking up an element in a nil map, and for sending to a nil channel to panic, again like assigning to an element in a nil map (although in the channel case it would be because there's no runtime data structure where your goroutine could metaphorically hang its hat waiting for a receiver). Instead Go chooses to make both operations (permanently) block your goroutine, with panicing on send reserved for sending to a non-nil but closed channel.

The current semantics of sending on a closed channel combined with select statements (and to a lesser extent receiving from a closed channel) means that Go needs a channel zero value that is never ready to send or receive. However, I believe that Go could readily make actual sends or receives on nil channels panic without any language problems. As a practical matter, sending or receiving on a nil channel is a bug that will leak your goroutine even if your program doesn't deadlock.

Similarly, Go could choose to allocate an empty map runtime data structure for zero value maps, and then let you assign to elements in the resulting map rather than panicing. If desired, I think you could preserve a distinction between empty maps and nil maps. There would be some drawbacks to this that cut against Go's general philosophy of being relatively explicit about (heap) allocations and you'd want a clever compiler that didn't bother creating those zero value runtime map data structures when they'd just be overwritten by 'make()' or a return value from a function call or the like.

(I can certainly imagine a quite Go like language where maps don't have to be explicitly set up any more than slices do, although you might still use 'make()' if you wanted to provide size hints to the runtime.)

Sidebar: why you need something like nil channels

We all know that sometimes you want to stop sending or receiving on a channel in a select statement. On first impression it looks like closing a channel (instead of setting the channel to nil) could be made to work for this (it doesn't currently). The problem is that closing a channel is a global thing, while you may only want a local effect; you want to remove the channel from your select, but not close down other uses of it by other goroutines.

This need for a local effect pretty much requires a special, distinct channel value that is never ready for sending or receiving, so you can overwrite the old channel value with this special value, which we might as well call a 'nil channel'. Without a channel value that serves this purpose you'd have to complicate select statements with some other way to disable specific channels.

(I had to work this out in my head as part of writing this entry so I might as well write it down for my future self.)

GoNilChannelsMapsAreAChoice written at 23:30:28;

2025-02-02

Build systems and their effects on versioning and API changes

In a comment on my entry on modern languages and bad packaging outcomes at scale, sapphirepaw said (about backward and forward compatibility within language ecologies), well, I'm going to quote from it because it's good (but go read the whole comment):

I think there’s a social contract that has broken down somewhere.

[...]

If a library version did break things, it was generally considered a bug, and developers assumed it would be fixed in short order. Then, for the most part, only distributions had to worry about specific package/library-version incompatibilities.

This all falls apart if a developer, or the ecosystem of libraries/language they depend on, ends up discarding that compatibility-across-time. That was the part that made it feasible to build a distribution from a collection of projects that were, themselves, released across time.

I have a somewhat different view. I think that the way it was in the old days was less a social contract and more an effect of the environment that software was released into and built in, and now that the environment has changed, the effects have too.

C famously has a terrible story around its (lack of a) build system and dependency management, and for much of its life you couldn't assume pervasive and inexpensive Internet connectivity (well, you still can't assume the latter globally, but people have stopped caring about such places). This gave authors of open source software a strong incentive to be both backward and forward compatible. If you released a program that required the features of a very recent version of a library, you reduced your audience to people who already had the recent version (or better) or who were willing to go through the significant manual effort to get and build that version of the library, and then perhaps make all of their other programs work with it, since C environments often more or less forced global installation of libraries. If you were a library author releasing a new minor version or patch level that had incompatibilities, people would be very slow to actually install and adopt that version because of those incompatibilities; most of their programs using your libraries wouldn't update on the spot, and there was no good mechanism to use the old version of the library for some programs.

(Technically you could make this work with static linking, but static linking was out of favour for a long time.)

All of this creates a quite strong practical and social push toward stability. If you wanted your program or its new version to be used widely (and you usually did), it had better work with the old versions of libraries that people already had; requiring new APIs or new library behavior was dangerous. If you wanted the new version of your library to be used widely, it had better be compatible with old programs using the old API, and if you wanted a brand new library to be used by people in programs, it had better demonstrate that it was going to be stable.

Much of this spilled over into other languages like Perl and Python. Although both of these developed central package repositories and dependency management schemes, for a long time these mostly worked globally, just like the C library and header ecology, and so they faced similar pressures. Python only added fully supported virtual environments in 2012, for example (in Python 3.3).

Modern languages like Go and Rust (and the Node.js/NPM ecosystem, and modern Python venv based operation) don't work like that. Modern languages mostly use static linking instead of shared libraries (or the equivalent of static linking for dynamic languages, such as Python venvs), and they have build systems that explicitly support automatically fetching and using specific versions of dependencies (or version ranges; most build systems are optimistic about forward compatibility). This has created an ecology where it's much easier to use a recent version of something than it was in C, and where API changes in dependencies often have much less effect because it's much easier (and sometimes even the default) to build old programs with old dependency versions.

(In some languages this has resulted in a lot of programs and packages implicitly requiring relatively recent versions of their dependencies, even if they don't say so and claim wide backward compatibility. This happens because people would have to take explicit steps to test with their stated minimum version requirements and often people don't, with predictable results. Go is an exception here because of its choice of 'minimum version selection' for dependencies over 'maximum version selection', but even then it's easy to drift into using new language features or new standard library APIs without specifically requiring that version of Go.)

One of the things about technology is that technology absolutely affects social issues, so different technology creates different social expectations. I think that's what's happened with social expectations around modern languages. Because they have standard build systems that make it easy to do it, people feel free to have their programs require specific version ranges of dependencies (modern as well as old), and package authors feel free to break things and then maybe fix them later, because programs can opt in or not and aren't stuck with the package's choices for a particular version. There are still forces pushing towards compatibility, but they're weaker than they used to be and more often violated.

Or to put it another way, there was a social contract of sorts for C libraries in the old days but the social contract was a consequence of the restrictions of the technology. When the technology changed, the 'social contract' also changed, with unfortunate effects at scale, which most developers don't care about (most developers aren't operating at scale, they're scratching their own itch). The new technology and the new social expectations are probably better for the developers of programs, who can now easily use new features of dependencies (or alternately not have to update their code to the latest upstream whims), and for the developers of libraries and packages, who can change things more easily and who generally see their new work being used faster than before.

(In one perspective, the entire 'semantic versioning' movement is a reaction to developers not following the expected compatibility that semver people want. If developers were already doing semver, there would be no need for a movement for it; the semver movement exists precisely because people weren't. We didn't have a 'semver' movement for C libraries in the 1990s because no one needed to ask for it, it simply happened.)

BuildSystemsAndAPIChanges written at 16:52:44;

2025-01-19

Sometimes print-based debugging is your only choice

Recently I had to investigate a mysterious issue in our Django based Python web application. This issue happened only when the application was actually running as part of the web server (using mod_wsgi, which effectively runs as an Apache process). The only particularly feasible way to dig into what was going on was everyone's stand-by, print based debugging (because I could print into Apache's error log; I could have used any form of logging that would surface the information). Even if I might have somehow been able to attach a debugger to things to debug a HTTP request in flight, using print based debugging was a lot easier and faster in practice.

I'm a long time fan of print based debugging. Sometimes this is because print based debugging is easier if you only dip into a language every so often, but that points to a deeper issue, which is that almost every environment can print or log. Print or log based 'debugging' is an almost universal way to extract information from a system, and sometimes you have no other practical way to do that.

(The low level programming people sometimes can't even print things out, but there are other very basic ways to communicate things.)

As in my example, one of the general cases where you have very little access other than logs is when your issue only shows up in some sort of isolated or encapsulated environment (a 'production' environment). We have a lot of ways of isolating things these days, things like daemon processes, containers, 'cattle' (virtual) servers, and so on, but they all share the common trait that they deliberately detach themselves away from you. There are good reasons for this (which often can be boiled down to wanting to run in a controlled and repeatable environment), but it has its downsides.

Should print based debugging be the first thing you reach for? Maybe not; some sorts of bugs cause me to reach for a debugger, and in general if you're a regular user of your chosen debugger you can probably get a lot of information with it quite easily, easier than sprinkling print statements all over. But I think that you probably should build up some print debugging capabilities, because sooner or later you'll probably need them.

PrintDebuggingOnlyChoice written at 23:20:09;

2025-01-09

Realizing why Go reflection restricts what struct fields can be modified

Recently I read Rust, reflection and access rules. Among other things, it describes how a hypothetical Rust reflection system couldn't safely allow access to private fields of things, and especially how it couldn't allow code to set them through reflection. My short paraphrase of the article's discussion is that in Rust, private fields can be in use as part of invariants that allow unsafe operations to be done safely through suitable public APIs. This brought into clarity what had previously been a somewhat odd seeming restriction in Go's reflect package.

Famously (for people who've dabbled in reflect), you can only set exported struct fields. This is covered in both the Value.CanSet() package documentation and The Laws of Reflection (in passing). Since one of the uses of reflection is for going between JSON and structs, encoding/json only works on exported struct fields and you'll find a lot of such fields in lots of code. This requirement can be a bit annoying. Wouldn't it be nice if you didn't have to make your fields public just to serialize them easily?

(You can use encoding/json and still serialize non-exported struct fields, but you have to write some custom methods instead of just marking struct fields the way you could if they were exported.)

Go has this reflect restriction, presumably, for the same reason that reflection in Rust wouldn't be able to modify private fields. Since private fields in a Go struct may be used by functions and methods in the package to properly manage the struct, modifying those fields yourself is unsafe (in the general sense). The reflect package will let you see the fields (and their values) but not change their values. You're allowed to change exported fields because (in theory) arbitrary Go code can already change the value of those fields, and so code in the struct's package can't count on them having any particular value. It can at least sort of count on private fields having approved values (or the zero value, I believe).

(I understand why the reflect documentation doesn't explain the logic of not being able to modify private fields, since package documentation isn't necessarily the right place for a rationale. Also, perhaps it was considered obvious.)

GoReflectWhyFieldRestriction written at 23:19:14;

2024-12-16

Some notes on "closed interfaces" in Go

One reaction to basic proposals for union types in Go is to note that "closed interfaces" provide a lot of these features (cf). When I saw this I had to refresh myself about what such a closed interface is, and then think about some of the potential issues and limitations involved, leaving me with some things I want to note down for my later reference.

What I've seen called a closed interface is an interface that requires an unexported method:

type Closed interface {
  isClosed()

  NormalMethod1(...) ...
  NormalMethod2(...) ...
  [...]
}

Although it's not spelled out exactly in the Go language specification, an interface with an unexported method like this can't be implemented by any type outside of its package, because such an external type can't have the right 'isClosed()' method on it (since it's not exported, the identifier for the method is unique to the package, or at least I believe that's how the logic flows). The 'isClosed()' method doesn't do anything and need never be called, it just has to be declared as an (empty) method function on everything that is going to be part of the Closed interface.

This means that there is a finite and known list of types that implement the Closed interface, instead of the potentially unknown list of them that could exist for an open interface. Go tooling can use this knowledge to see, for example, that a type switch on Closed is exhaustive, or that a given type assertion will never succeed. However, in the current state of godoc, I don't believe that generated package documentation will automatically tell you which types implement Closed; you'll have to remember to document that explicitly.

(I don't know if there's any Go tooling that does this today, especially when run in the context of other packages instead of the package that defines the Closed interface.)

Code in other packages can still construct a nil Closed interface value, or zero values of any exported types that implement Closed (and then turn such a zero value into a non-nil Closed interface value). If you want to be extra tricky, you can make all the types that implement Closed be unexported; at that point the only way people outside your package can easily create or manipulate those types is through an instance of Closed that you give them, and implicitly only through its methods. The Closed interface is not merely a closed interface, it has become an opaque interface (short of people getting out the chainsaw of reflect and perhaps unsafe).

However, this is also a limitation of the Closed interface, and of closed interfaces in general. For good reason, you can't add methods to types declared outside your package, so you can't make an outside type be a member of the Closed interface, even though you control the interface's definition in your package. In order to induct a useful outside type into the world of Closed, you must wrap it in a type from your package, and this type must be a concrete type, even if what you want to pull in is another interface. I believe that under at least some circumstances, this will cost you some extra memory. More broadly, I don't think you can really have two separate packages that cooperate so each defines some types that are part of Closed. One way or another you have to put all the types in one package.

In my view, this means that a closed interface isn't really useful to document inside your package that this particular interface will only ever contain a limited number of outside types (or a mixture of outside types and inside types), including outside types from a related package. You can use it for this but there's a chunk of bureaucracy involved for each outside type you want to pull in. If you go to this effort, presumably you have tooling that can deduce what's going on and take advantage of this knowledge.

(These days you could define a generic type that wraps another type and implements your Closed interface for it, making this sort of bureaucracy easier, at least.)

GoClosedInterfacesNotes written at 22:43:29;

2024-12-15

I think Go union type proposals should start with their objectives

At this point I've skimmed a number of relatively serious union type proposals for Go (which is to say, people who were serious enough to write something substantial in the Go issue tracker). One of the feelings I've wound up with as a result of this is that any such union type proposal should probably start out by describing what its objectives are, not what its proposed syntax is.

There are a number of different things union types in Go could do, ranging from creating an interface type that only allows a limited range of concrete types to (potentially) reducing the amount of space needed to store one of several types of values to requiring people to do specific things to extract an interior value from a union value, implicitly forcing them to check for errors (or panic). Some of these objectives will be more or less complete by themselves, but some others will really want (or at least benefit from) additional changes to things like 'go vet', or in some cases be incomplete without other language changes.

Proposals that bury their objectives (or don't clearly spell them out) invite people to guess at what they really want, and then argue about it. Such proposals are also harder to evaluate, since a reader can't judge whether the proposal would have problems achieving its objectives, or even to what extent the objectives are reasonable for Go. Without objectives, people are left to discuss what the proposal does and doesn't achieve without understanding which of those things are important (and need to be kept in any changes) and which aren't (and could be altered or removed).

Presumably the proposer had an objective in mind and so can write it down. If they haven't actually considered their objectives, they should definitely start with that, not with the syntax or language semantics. I won't say that objectives are the most important thing about a change to Go, because the syntax and semantics matter too, but a clearly expressed objective is (in my view) necessary. And if you convince people that the objective is a good one, you have a much better chance of having that objective achieved somehow, even if it happens in a different way than you thought of.

(One reason that people may be reluctant to write down their objective is for fear that others won't like it and will be against the proposal as a result. One of my views is that if you can't persuade people of the objectives, your proposal should fail, and trying to sneak it in is a bad look.)

PS: One of the reasons I'm writing this down is for myself, because I was briefly tempted to write up a straw-person union types idea here on Wandering Thoughts. Had I done so, it definitely wouldn't have started from objectives, but from semantics and some effects. It's easy to have a clever idea (or at least a clever looking idea, one that sounds good for long enough for me to write a blog entry about it), but mere clever ideas are at best a distraction.

GoUnionTypesStartWithGoals written at 23:24:35;

(Previous 10 or go back to December 2024 at 2024/12/02)

Page tools: See As Normal.
Search:
Login: Password:

This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.