Sorting out Go's 'for ... = range ..' and when it copies things

March 19, 2020

I recently read "Some tricks and tips for using for range in GoLang", which said, somewhat in passing:

[...] As explained earlier, when the loop begins, it will copy the original array to a new one and loop through the elements, hence when appending elements to the original array, the copied array actually doesn't change.

My eyebrows went up because I'd forgotten this little bit of Go, and I promptly scuttled off to the official specification to read and understand the details. So here are some notes, because the issues behind this turn out to be more interesting than I expected.

Let's start with the basic form, which is 'for ... := range a { ... }'. The expression to the right of the range is called the range expression. The specification says (emphasis mine):

The range expression x is evaluated once before beginning the loop, with one exception: if at most one iteration variable is present and len(x) is constant, the range expression is not evaluated.

Obviously if the range expression is a function call, the function call must be made (once) and then the return value used in the range expression. However, in Go even evaluating an expression that's a single variable produces a copy of the value of that variable (in the abstract; in the concrete the compiler may optimize this out). So when you write 'for a, b := range c', Go (nominally) evaluates c and uses the resulting copy of c's current value.

(Among other consequences, this means that assigning a different value to c itself inside the loop doesn't change what the loop does; c's value is frozen at the start, when it's evaluated.)

As the additional bit of the specification explains, this doesn't happen if you use at most one iteration variable and you're ranging over one of the small number of things where len(x) is a constant (essentially arrays, pointers to arrays, and constant strings; the rules for this are somewhat legalistic). If you use two iteration variables, Go always evaluates the range expression and makes a copy, which is another reason for Go to prefer the single variable version (to go with nudging you to not copy actual values unless necessary).

However, things get tricky if you use pointers. Here:

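package main

import "fmt"

func main() {
    a := [5]int{1, 2, 3, 4, 5}
    // ranging over the array itself iterates over a copy of it
    for _, v := range a {
        a[3] = 10
        fmt.Println("Pass 1:", v)
    }
    // reset our mutation
    a[3] = 4
    // loop via pointer: only the pointer is copied, not the array
    b := &a
    for _, v := range b {
        b[3] = 10
        fmt.Println("Pass 2:", v)
    }
}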

In the second loop, what gets copied when the range expression is evaluated is the pointer, not the array it points to (note that b is not a slice, it's a pointer to an array). Go's implicit dereferencing of pointers means that the code for the two loops looks almost exactly the same, although they behave differently. The first loop iterates over the copy made before the loop body mutated the array, so it prints the original values, including 4 for 'a[3]'. The second loop reads through the pointer, so it sees the mutation and prints 10 for 'a[3]'.

On the one hand, this may be confusing. On the other hand, this provides a way to effectively sidestep all sorts of range expression copying (if you don't want it); all you have to do is pointerize your range expression, and almost nothing will care about the difference. Fortunately, you often don't care about the copying to begin with, because making copies of strings, slices, and maps doesn't require copying the underlying data. The only thing that you can range over that's expensive to copy is an actual array, and directly using actual arrays in Go is relatively rare (especially when using real arrays can cause interesting errors).

If you do a 'copying' range over anything other than a real array (which is copied) or a string (which is immutable), code inside your range loop can still mutate the elements of what you're ranging over in a way that later iterations of the loop will, or at least may, see. Probably you don't want to do this.

(This is the consequence of ranging over slices and maps not making a copy of the underlying data. Because your range copies the slice itself, shrinking or enlarging the original slice won't change the number of iterations. You can potentially change the number of iterations of a map inside of the loop, though.)

Probably I don't need to care about this range copying, at least from an efficiency perspective (I had better remember its other consequences). My Go code (and Go in general) only very rarely uses fixed size arrays, which are the only expensive thing to copy. Copying slices and maps is pretty close to free, and those are usually what I range over (apart from channels, which I consider a special case).
