Jalex Chang

2023.11.16

Per-Iteration Loop Variable in Go

Jalex Chang

  • Staff Software Engineer @ Crescendo Lab
  • Gopher
  • Love software engineering, database systems, and distributed systems

 

Agenda

  • Introduction

  • New language specification

  • Compatibility

  • Discussions

  • Summary

Introduction

This tech talk introduces the semantic change to For Loop variables that takes effect in Go 1.22.

 

Topics covered in this talk:

  • Rationales behind the changes

  • New language specification

  • How Go handles compatibility for breaking changes

  • How the semantic change affects our daily work

For Loop variables at present

  • Before Go 1.22, For Loop variables are per-loop scoped: one variable is shared by every iteration.
  • This is error-prone when a loop variable is captured by a closure, has its address taken, or is used in a goroutine.
For Clause:

var prints []func()

for i := 0; i < 3; i++ {
    prints = append(prints, func() {
        println(i)
    })
}

for _, print := range prints {
    print()
}

// Output:
// 3
// 3
// 3

For Range:

var prints []func()

for _, s := range []string{"a", "b", "c"} {
    prints = append(prints, func() {
        println(s)
    })
}

for _, print := range prints {
    print()
}

// Output:
// c
// c
// c
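Closures are not the only way to hit this. The sketch below shows the pointer variant of the same pitfall. To reproduce the per-loop behavior on any Go version, it declares the variable outside the loop; before Go 1.22, writing `for _, v := range vals` behaves the same way. The helper name collectShared is illustrative and not from the talk.

```go
package main

import "fmt"

// collectShared mimics per-loop scoping on any Go version: v is declared
// once, outside the loop, so every iteration reuses the same variable.
func collectShared(vals []int) []*int {
	var ptrs []*int
	var v int
	for _, v = range vals {
		ptrs = append(ptrs, &v) // always the address of the same variable
	}
	return ptrs
}

func main() {
	for _, p := range collectShared([]int{1, 2, 3}) {
		fmt.Println(*p) // prints 3 on every line
	}
}
```

All three pointers are equal, so every dereference yields the value left behind by the last iteration.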

Workarounds

Redeclare the variable inside the loop body so that each iteration gets its own, per-iteration scoped copy.

For Clause:

var prints []func()

for i := 0; i < 3; i++ {
    i := i
    prints = append(prints, func() { println(i) })
}

for _, print := range prints {
    print()
}

// Output:
// 0
// 1
// 2

For Range:

var prints []func()

for _, s := range []string{"a", "b", "c"} {
    s := s
    prints = append(prints, func() { println(s) })
}

for _, print := range prints {
    print()
}

// Output:
// a
// b
// c
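Another common workaround, especially with goroutines, is to pass the loop values as function arguments, since parameters are fresh copies for every call. A minimal sketch, assuming a hypothetical helper named squares; it behaves the same under the old and the new semantics.

```go
package main

import (
	"fmt"
	"sync"
)

// squares starts one goroutine per input and hands each goroutine its own
// copy of the loop values through function parameters.
func squares(vals []int) []int {
	out := make([]int, len(vals))
	var wg sync.WaitGroup
	for i, v := range vals {
		wg.Add(1)
		go func(i, v int) { // parameters shadow the loop variables
			defer wg.Done()
			out[i] = v * v // each goroutine writes a distinct element
		}(i, v)
	}
	wg.Wait()
	return out
}

func main() {
	fmt.Println(squares([]int{1, 2, 3})) // [1 4 9]
}
```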

Why are semantics worth changing? And why now?

  1. Probably every Go programmer in the world has suffered from this before.
    • We run into the problem again and again.
  2. Current workarounds are unclear and sometimes unnecessary.
    • 12k of the top 14k Git repositories have used the workaround "x := x".
    • Half of the commits were unnecessary.
  3. Go modules (go.mod) enable fine-grained control over compilation.
    • They give us a way to guarantee that all old code is unaffected, even in a build containing new code.
    • Packages in a module get the new semantics only when we raise the required Go version in its go.mod.

Semantic Changes

New semantics since Go 1.22 - For Clause

The init statement may be a short variable declaration (:=), but the post statement must not. Each iteration has its own separate declared variable (or variables). The variable used by the first iteration is declared by the init statement. The variable used by each subsequent iteration is declared implicitly before executing the post statement and initialized to the value of the previous iteration's variable at that moment.

var prints []func()

for i := 0; i < 3; i++ {
    prints = append(prints, func() { println(i) })
}

Under the new semantics, the compiler treats the loop above as if it had been written like this:

var prints []func()

{
    i_outer := 0
    first := true
    for {
        i := i_outer
        if first {
            first = false
        } else {
            i++
        }
        if !(i < 3) {
            break
        }
        prints = append(prints, func() { println(i) })
        i_outer = i
    }
}
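Because each new variable is initialized from the previous iteration's variable just before the post statement runs, changes made to the variable inside the loop body still carry over to the next iteration. A small sketch (the function name visited is illustrative); it produces the same result under the old and the new semantics.

```go
package main

import "fmt"

// visited records the value of i seen at the top of each iteration.
func visited() []int {
	var seen []int
	for i := 0; i < 6; i++ {
		seen = append(seen, i)
		if i == 1 {
			i += 2 // modified in the body; the post statement then runs on 3
		}
	}
	return seen
}

func main() {
	fmt.Println(visited()) // [0 1 4 5]
}
```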

New semantics since Go 1.22 - For Range

The iteration variables may be declared by the “range” clause using a form of short variable declaration (:=). In this case their types are set to the types of the respective iteration values and their scope is the block of the “for” statement; each iteration has its own separate variables. If the iteration variables are declared outside the “for” statement, after execution their values will be those of the last iteration.

var prints []func()

for _, s := range []string{"a", "b", "c"} {
    prints = append(prints, func() { println(s) })
}

Under the new semantics, the compiler treats the loop above as if it had been written like this:

var prints []func()
{
    var s_outer string
    for _, s_outer = range []string{"a", "b", "c"} {
        s := s_outer
        prints = append(prints, func() { println(s) })
    }
}
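The last sentence of the specification text matters in practice: a variable declared outside the "for" statement is not affected by the change and stays shared across iterations. A minimal sketch, assuming a hypothetical helper named lastSeen:

```go
package main

import "fmt"

// lastSeen declares s before the loop, so the range clause assigns to that
// single variable instead of declaring a new one per iteration.
func lastSeen(items []string) (string, []string) {
	var s string
	var readers []func() string
	for _, s = range items {
		readers = append(readers, func() string { return s })
	}
	var got []string
	for _, read := range readers {
		got = append(got, read())
	}
	return s, got
}

func main() {
	last, got := lastSeen([]string{"a", "b", "c"})
	fmt.Println(last, got) // c [c c c]
}
```

Every closure reads the same variable, which holds the value of the last iteration once the loop has finished.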

Experimental demo in Go 1.21

Go's official demo: https://go.dev/play/p/lDFLrPOcdz3

Set the experiment flag GOEXPERIMENT=loopvar to make the Go 1.21 compiler apply the new per-iteration semantics to For Loops.

// GOEXPERIMENT=loopvar

package main

func main() {
    var prints []func()
    for i := range make([]int, 5) {
        prints = append(prints, func() { println(i) })
    }
    for _, p := range prints {
        p()
    }
}

// output:
// 0
// 1
// 2
// 3
// 4

Compatibility

User-controllable "breaking changes"

The change in the language specification will fix far more programs than it breaks, but it may break a very small number of programs, mostly code that is already buggy.

 

To make the potential breakage completely user-controlled, the rollout decides whether to use the new semantics based on the go line in the go.mod file that covers each package.

  • Enable the new semantics (per-iteration loop variables) only if both the Go compiler version and the Go version required by the program are ≥ 1.22.
  • Otherwise, keep the old semantics (per-loop loop variables).

 

Example - control semantics by Go modules
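
A minimal sketch of what this control looks like in a go.mod file; the module path is a placeholder, not taken from the talk. With the go line set to 1.22 or later, packages in this module are compiled with per-iteration loop variables. Lowering it (for example to go 1.21) keeps the per-loop behavior, even when building with a newer toolchain.

```text
module example.com/loopvar-demo

go 1.22
```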

Transition support tooling

To transition to the new semantics safely, two tools are provided:

  • Compiler flag loopvar: reports every loop that compiles differently under the new semantics.
    • go build (or test) -gcflags=-d=loopvar=2 ...
  • bisect: a new program that runs a test repeatedly with different sets of loops opted into the new semantics.
    • Using a binary-search-like algorithm, bisect pinpoints the exact loop or loops that cause a test failure when converted to the new semantics.
    • bisect -compile=loopvar go test ...

The Go team used bisect when converting Google's internal monorepo to the new loop semantics. The rate of test failures caused by the change was about 1 in 8,000.

Example - compiler flag

// loopvar/sum.go
package main

func Sum(list []int) int {
    m := make(map[*int]int)
    for _, x := range list {
        // Under the old semantics, &x is the same address
        // on every iteration, so m ends up with one entry.
        m[&x] += x
    }
    for _, sum := range m {
        return sum
    }
    return 0
}

func main() {
    list := []int{2, 4, 6}
    print(Sum(list))
}

// loopvar/sum_test.go
package main

import "testing"

func TestSum(t *testing.T) {
    list := []int{2, 4, 6}
    want := 12
    if got := Sum(list); got != want {
        t.Errorf("Sum(%v) = %v, want %v", list, got, want)
    }
}

Old semantics:

$ go test ./loopvar/...
ok      command-line-arguments  0.247s

New semantics:

$ go install golang.org/dl/gotip@latest
$ gotip download

$ GOEXPERIMENT=loopvar gotip test ./loopvar/...
--- FAIL: TestSum (0.00s)
    sum_test.go:9: Sum([2 4 6]) = 2, want 12
FAIL
FAIL    loopvar      0.244s
FAIL

Compiler flag report:

$ gotip build -gcflags=-d=loopvar=2 ./loopvar
loopvar/sum.go:5:9: loop variable x \
now per-iteration, heap-allocated
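
Once a loop like this is reported, the usual fix is to stop relying on the address of the loop variable. A possible rewrite of Sum is sketched below; it assumes the intended behavior is a plain total, which is what the test's expected value of 12 suggests.

```go
package main

import "fmt"

// Sum adds the elements of list with an ordinary accumulator, so the result
// does not depend on whether &x is shared across iterations.
func Sum(list []int) int {
	total := 0
	for _, x := range list {
		total += x
	}
	return total
}

func main() {
	fmt.Println(Sum([]int{2, 4, 6})) // 12
}
```

This version returns the same result with any Go version and any go line in go.mod.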

Example - compiler flag (real-world application)

$ GOEXPERIMENT=loopvar gotip test ./internal/...
ok      github.com/chatbotgang/cantata/internal/adapter/eventbroker     1.299s
ok      github.com/chatbotgang/cantata/internal/adapter/repository/es   1.267s
ok      github.com/chatbotgang/cantata/internal/adapter/repository/gcs  1.506s
ok      github.com/chatbotgang/cantata/internal/adapter/repository/local        1.793s
ok      github.com/chatbotgang/cantata/internal/adapter/repository/postgres     2.402s
ok      github.com/chatbotgang/cantata/internal/app/service/auth        1.317s
ok      github.com/chatbotgang/cantata/internal/app/service/cdp 1.250s
ok      github.com/chatbotgang/cantata/internal/app/service/chat        2.788s
ok      github.com/chatbotgang/cantata/internal/app/service/organization        1.504s
......
ok      github.com/chatbotgang/cantata/internal/app/service/utils       1.408s
ok      github.com/chatbotgang/cantata/internal/app/service/workertask  1.274s
ok      github.com/chatbotgang/cantata/internal/domain/chat     1.299s
ok      github.com/chatbotgang/cantata/internal/domain/common   1.449s
ok      github.com/chatbotgang/cantata/internal/domain/common/requestid 1.668s
ok      github.com/chatbotgang/cantata/internal/domain/organization     1.357s
ok      github.com/chatbotgang/cantata/internal/router  1.305s

$ gotip build -gcflags=-d=loopvar=2 -o bin/cantata ./cmd/cantata
$

Example - bisect

$ go install golang.org/x/tools/cmd/bisect@latest
$ bisect -compile=loopvar gotip test ./loopvar/...
bisect: checking target with all changes disabled
bisect: run: GOCOMPILEDEBUG=loopvarhash=n gotip test ./loopvar/...... ok (11 matches)
bisect: run: GOCOMPILEDEBUG=loopvarhash=n gotip test ./loopvar/...... ok (11 matches)
bisect: checking target with all changes enabled
bisect: run: GOCOMPILEDEBUG=loopvarhash=y gotip test ./loopvar/...... FAIL (11 matches)
bisect: run: GOCOMPILEDEBUG=loopvarhash=y gotip test ./loopvar/...... FAIL (11 matches)
bisect: target succeeds with no changes, fails with all changes
bisect: searching for minimal set of enabled changes causing failure
bisect: run: GOCOMPILEDEBUG=loopvarhash=+0 gotip test ./loopvar/...... FAIL (7 matches)
bisect: run: GOCOMPILEDEBUG=loopvarhash=+0 gotip test ./loopvar/...... FAIL (7 matches)
bisect: run: GOCOMPILEDEBUG=loopvarhash=+00 gotip test ./loopvar/...... FAIL (3 matches)
bisect: run: GOCOMPILEDEBUG=loopvarhash=+00 gotip test ./loopvar/...... FAIL (3 matches)
......
bisect: run: GOCOMPILEDEBUG=loopvarhash=v+x0b0 gotip test ./loopvar/...... FAIL (1 matches)
bisect: run: GOCOMPILEDEBUG=loopvarhash=v+x0b0 gotip test ./loopvar/...... FAIL (1 matches)
bisect: FOUND failing change set
--- change set #1 (enabling changes causes failure)
loopvar/sum.go:5:9: loop variable x now per-iteration
loopvar/sum.go:5:9: loop variable x now per-iteration (loop inlined into loopvar/sum.go:17)
loopvar/sum.go:5:9: loop variable x now per-iteration (loop inlined into loopvar/sum_test.go:8)
---

The run above uses the same example as the compiler flag demo (loopvar/sum.go and loopvar/sum_test.go).

Example - bisect (real-world application)

$ bisect -compile=loopvar gotip test ./internal/...

bisect: checking target with all changes disabled
bisect: run: GOCOMPILEDEBUG=loopvarhash=n gotip test ./internal/...... ok (102 matches)
bisect: run: GOCOMPILEDEBUG=loopvarhash=n gotip test ./internal/...... ok (102 matches)

bisect: checking target with all changes enabled
bisect: run: GOCOMPILEDEBUG=loopvarhash=y gotip test ./internal/...... ok (102 matches)
bisect: run: GOCOMPILEDEBUG=loopvarhash=y gotip test ./internal/...... ok (102 matches)

bisect: fatal error: target succeeds with no changes and all changes

Takeaways

In this talk, we introduced the semantic change to For Loop variables in Go 1.22.

  • The scope of loop variables changes from per-loop to per-iteration.
    • The Go compiler applies the new semantics automatically; no source changes are required.
  • Go modules (go.mod) provide fine-grained control over compilation.
    • 📌 Keep the required Go version < 1.22 if your applications aren't ready for the breaking change.
  • 📌 Use bisect if your application is well covered by tests.
  • 📌 Use the compiler flag -gcflags=-d=loopvar=2 when building to see which loops are affected.
  • Finally, let's pray 😇

We are hiring now! 

Thanks for listening

Go plans to change For Loop semantics in 1.22 (https://go.dev/blog/loopvar-preview), changing the scope of loop variables from per-loop to per-iteration. This talk covers (1) the rationale behind this “breaking change” and (2) how it affects our systems and daily work.