How the Go language detects goroutine leaks in testing

Foreword

Hello everyone, I'm asong.

As we all know, goroutines are at the core of Go's concurrency model. They are easy to use, but they also bring their own hard-to-diagnose problems, and goroutine leaks are among the most serious: tracking one down often takes a long time. Some people suggest using pprof. It does the job, but profiling tools like this mostly help you troubleshoot after a problem has already occurred. Is there a tool that can catch the problem before it happens? There is: goleak. Open sourced by the Uber team, it detects goroutine leaks and can be combined with unit tests to stop them early. In this article, we take a look at goleak.

goroutine leak

I don’t know whether you have run into goroutine leaks in your daily development. A goroutine leak is really a goroutine that blocks (or spins) and never exits. Such goroutines live until the process ends, and the stack memory they occupy cannot be released, so the memory available to the system keeps shrinking until the process crashes. A brief summary of several common causes:

  • The logic in the goroutine enters an infinite loop and keeps occupying resources
  • The goroutine is used with a channel or mutex and stays blocked forever because of improper use
  • The logic in the goroutine waits for a long time, so the number of goroutines keeps growing
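For the channel case in the list above, one common way out is to give the goroutine a second exit path. The following is only a minimal sketch using the standard library; the names worker and stopsOnCancel are hypothetical and chosen for illustration, and a context is just one of several possible remedies.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// worker waits for a value on in, but also watches ctx so that it can
// return when the caller gives up. Without the ctx.Done() case it would
// block forever whenever nothing is ever sent on in.
func worker(ctx context.Context, in <-chan int, done chan<- struct{}) {
	defer close(done) // signal that the goroutine has really exited
	select {
	case v := <-in:
		fmt.Println("received", v)
	case <-ctx.Done():
	}
}

// stopsOnCancel starts worker on a channel that never delivers anything,
// cancels the context, and reports whether worker exited within a second.
func stopsOnCancel() bool {
	ctx, cancel := context.WithCancel(context.Background())
	var in chan int // nil on purpose: a receive on it never completes
	done := make(chan struct{})
	go worker(ctx, in, done)

	cancel()
	select {
	case <-done:
		return true
	case <-time.After(time.Second):
		return false
	}
}

func main() {
	fmt.Println("worker exited:", stopsOnCancel())
}
```

The done channel is there only so the caller can observe that the goroutine has returned; in real code a sync.WaitGroup would serve the same purpose.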

Next, we use the classic goroutine + channel combination to demonstrate a goroutine leak:

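package main

import (
    "fmt"
    "runtime"
    "time"
)

func GetData() {
    var ch chan struct{} // nil channel: never initialized with make
    go func() {
        <-ch // a receive from a nil channel blocks forever
    }()
}

func main() {
    defer func() {
        // The blocked goroutine is still counted here.
        fmt.Println("goroutines: ", runtime.NumGoroutine())
    }()
    GetData()
    time.Sleep(2 * time.Second)
}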

In this example the channel is never initialized, so it is nil, and both sends and receives on a nil channel block forever. If we write an ordinary unit test for this function, it will not detect the problem:

func TestGetData(t *testing.T) {
    GetData()
}

Output:

=== RUN   TestGetData
--- PASS: TestGetData (0.00s)
PASS
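The test passes because it returns while the goroutine is still blocked, and nothing checks for goroutines left behind. A rough hand-rolled check, sketched here with only the standard library, compares runtime.NumGoroutine before and after the call. The helper name leaked and the settle delay are assumptions for illustration; this is not how goleak works, and a bare count says nothing about where the leak is.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// GetData is the leaking function from the example above.
func GetData() {
	var ch chan struct{}
	go func() {
		<-ch // blocks forever on the nil channel
	}()
}

// leaked runs f, waits settle so that short-lived goroutines can finish,
// and returns how many more goroutines exist than before f was called.
func leaked(f func(), settle time.Duration) int {
	before := runtime.NumGoroutine()
	f()
	time.Sleep(settle)
	return runtime.NumGoroutine() - before
}

func main() {
	fmt.Println("leaked by GetData:", leaked(GetData, 50*time.Millisecond))
	fmt.Println("leaked by no-op:  ", leaked(func() {}, 50*time.Millisecond))
}
```

A count like this is easily thrown off by unrelated background goroutines, which is one reason a tool that inspects the actual stacks is more useful.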

The built-in testing tools do not catch this. Next, we bring in goleak to test it.

goleak

GitHub address: https://github.com/uber-go/go...

Using goleak mostly comes down to two functions: VerifyNone and VerifyTestMain. VerifyNone checks for leaks within a single test case, while VerifyTestMain is added to TestMain and is therefore less intrusive to the test code. For example:

Use VerifyNone:

func TestGetDataWithGoleak(t *testing.T) {
    defer goleak.VerifyNone(t)
    GetData()
}

Output:

=== RUN   TestGetDataWithGoleak
    leaks.go:78: found unexpected goroutines:
        [Goroutine 35 in state chan receive (nil chan), with asong.cloud/Golang_Dream/code_demo/goroutine_oos_detector.GetData.func1 on top of the stack:
        goroutine 35 [chan receive (nil chan)]:
        asong.cloud/Golang_Dream/code_demo/goroutine_oos_detector.GetData.func1()
            /Users/go/src/asong.cloud/Golang_Dream/code_demo/goroutine_oos_detector/main.go:12 +0x1f
        created by asong.cloud/Golang_Dream/code_demo/goroutine_oos_detector.GetData
            /Users/go/src/asong.cloud/Golang_Dream/code_demo/goroutine_oos_detector/main.go:11 +0x3c
        ]
--- FAIL: TestGetDataWithGoleak (0.45s)

FAIL

Process finished with the exit code 1

The output pinpoints the exact code where the goroutine leak occurs. However, VerifyNone has to be added to each test, which intrudes on the test code. To integrate leak detection with less effort, use VerifyTestMain:

func TestMain(m *testing.M) {
    goleak.VerifyTestMain(m)
}

Output:

=== RUN   TestGetData
--- PASS: TestGetData (0.00s)
PASS
goleak: Errors on successful test run: found unexpected goroutines:
[Goroutine 5 in state chan receive (nil chan), with asong.cloud/Golang_Dream/code_demo/goroutine_oos_detector.GetData.func1 on top of the stack:
goroutine 5 [chan receive (nil chan)]:
asong.cloud/Golang_Dream/code_demo/goroutine_oos_detector.GetData.func1()
    /Users/go/src/asong.cloud/Golang_Dream/code_demo/goroutine_oos_detector/main.go:12 +0x1f
created by asong.cloud/Golang_Dream/code_demo/goroutine_oos_detector.GetData
    /Users/go/src/asong.cloud/Golang_Dream/code_demo/goroutine_oos_detector/main.go:11 +0x3c
]

Process finished with the exit code 1

The output of VerifyTestMain differs slightly from that of VerifyNone: it first reports the results of the test cases and then reports the leak analysis. Because the check runs once, after all the tests have finished, it cannot tell you which test leaked when the package contains several tests. To narrow it down, use the following script:

# Create a test binary which will be used to run each test individually
$ go test -c -o tests

# Run each test individually, printing "." for successful tests, or the test name
# for failing tests.
$ for test in $(go test -list . | grep -E "^(Test|Example)"); do ./tests -test.run "^$test\$" &>/dev/null && echo -n "." || echo -e "\n$test failed"; done

This will print out exactly which test case failed.

goleak implementation principle

Starting from VerifyNone, we read the source code and see that it calls the Find method:

// Find looks for extra goroutines, and returns a descriptive error if
// any are found.
func Find(options ...Option) error {
    // Get the ID of the current goroutine
    cur := stack.Current().ID()

    opts := buildOpts(options...)
    var stacks []stack.Stack
    retry := true
    for i := 0; retry; i++ {
        // Filter out goroutines that should not be reported
        stacks = filterStacks(stack.All(), cur, opts)

        if len(stacks) == 0 {
            return nil
        }
        retry = opts.retry(i)
    }

    return fmt.Errorf("found unexpected goroutines:\n%s", stacks)
}
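Find depends on stack.All() to snapshot every goroutine. The idea can be sketched with the standard library alone, assuming a dump taken with runtime.Stack(buf, true). The helpers allStacks and goroutineHeaders are hypothetical names for illustration, and goleak's own stack package parses the dump more thoroughly than this.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// allStacks returns the textual stack dump of every goroutine, growing
// the buffer until the whole dump fits.
func allStacks() string {
	buf := make([]byte, 64<<10)
	for {
		n := runtime.Stack(buf, true) // true: include all goroutines
		if n < len(buf) {
			return string(buf[:n])
		}
		buf = make([]byte, 2*len(buf))
	}
}

// goroutineHeaders extracts the header line of each goroutine in the
// dump, e.g. "goroutine 1 [running]:".
func goroutineHeaders(dump string) []string {
	var headers []string
	for _, line := range strings.Split(dump, "\n") {
		if strings.HasPrefix(line, "goroutine ") && strings.HasSuffix(line, ":") {
			headers = append(headers, line)
		}
	}
	return headers
}

func main() {
	block := make(chan struct{})
	go func() { <-block }() // a goroutine that stays blocked
	runtime.Gosched()       // give it a chance to start

	for _, h := range goroutineHeaders(allStacks()) {
		fmt.Println(h)
	}
}
```

Each header carries the goroutine ID and its state, which is the same information that appears in the goleak report shown earlier.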

Next, look at the filterStacks method:

// filterStacks will filter any stacks excluded by the given opts.
// filterStacks modifies the passed in stacks slice.
func filterStacks(stacks []stack.Stack, skipID int, opts *opts) []stack.Stack {
    filtered := stacks[:0]
    for _, stack := range stacks {
        // Always skip the running goroutine.
        if stack.ID() == skipID {
            continue
        }
        // Run any default or user-specified filters.
        if opts.filter(stack) {
            continue
        }
        filtered = append(filtered, stack)
    }
    return filtered
}
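The stacks[:0] expression in filterStacks is the usual Go idiom for filtering a slice without allocating: the result shares the backing array of the input, which is why the comment warns that the passed-in slice is modified. A small standalone sketch of the same idiom on strings follows; filterInPlace and the sample values are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// filterInPlace drops every element for which skip returns true. The
// result reuses the backing array of items, so items is overwritten.
func filterInPlace(items []string, skip func(string) bool) []string {
	kept := items[:0] // zero length, same backing array
	for _, it := range items {
		if skip(it) {
			continue
		}
		kept = append(kept, it)
	}
	return kept
}

func main() {
	tops := []string{"testing.tRunner", "main.GetData.func1", "runtime.goexit"}
	isRuntimeOrTesting := func(s string) bool {
		return strings.HasPrefix(s, "testing.") || strings.HasPrefix(s, "runtime.")
	}

	kept := filterInPlace(tops, isRuntimeOrTesting)
	fmt.Println("kept:", kept)
	fmt.Println("original slice is now:", tops) // first element was overwritten
}
```

Writing behind the read position is safe here because the kept slice can never be longer than the part of the input already visited.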

This filters out the goroutine stacks that should not take part in detection. The default filters are always registered, and any user-supplied options are applied on top of them:

func buildOpts(options ...Option) *opts {
    opts := &opts{
        maxRetries: _defaultRetries,
        maxSleep:   100 * time.Millisecond,
    }
    opts.filters = append(opts.filters,
        isTestStack,
        isSyscallStack,
        isStdLibStack,
        isTraceStack,
    )
    for _, option := range options {
        option.apply(opts)
    }
    return opts
}

As this shows, by default the check is retried up to 20 times (_defaultRetries), the wait between attempts is capped at 100ms (maxSleep), and the default filters are added.
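The opts.retry method is not shown above, so the schedule in the sketch below is an assumption: a wait that starts at 1µs, doubles on each retry, and is capped at maxSleep. The names backoff, totalWait and pollUntilZero are hypothetical. Under this assumed schedule, 20 retries add up to roughly 0.43s, which is at least consistent with the 0.45s duration of the failing test shown earlier.

```go
package main

import (
	"fmt"
	"time"
)

// backoff returns the assumed wait before retry i: 1µs doubled i times,
// capped at max.
func backoff(i int, max time.Duration) time.Duration {
	d := time.Microsecond << uint(i)
	if d > max || d <= 0 { // d <= 0 guards against shift overflow
		d = max
	}
	return d
}

// totalWait sums the waits of the first maxRetries retries.
func totalWait(maxRetries int, max time.Duration) time.Duration {
	var total time.Duration
	for i := 0; i < maxRetries; i++ {
		total += backoff(i, max)
	}
	return total
}

// pollUntilZero calls count, and while it reports leftovers, sleeps and
// calls it again, up to maxRetries times. It returns the last count.
func pollUntilZero(count func() int, maxRetries int, max time.Duration) int {
	remaining := count()
	for i := 0; remaining > 0 && i < maxRetries; i++ {
		time.Sleep(backoff(i, max))
		remaining = count()
	}
	return remaining
}

func main() {
	fmt.Println("total wait for 20 retries:", totalWait(20, 100*time.Millisecond))

	stubborn := func() int { return 1 } // stands in for a goroutine that never exits
	fmt.Println("remaining:", pollUntilZero(stubborn, 20, 100*time.Millisecond))
}
```

Retrying matters because a goroutine that is about to exit may still be visible at the moment a test returns; only goroutines that survive every retry are worth reporting.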

To summarize how goleak works:

It uses runtime.Stack() to obtain the stack information of all currently running goroutines, then applies the default filters to drop the ones that do not need to be checked. A default retry count and a maximum wait between retries are defined, and the check is repeated accordingly. If no goroutines remain after filtering, no leak has occurred; if some still remain after all the retries, they are reported as leaks.

Summary

In this article, we shared a tool that detects goroutine leaks during testing. It still depends on thorough test cases, which shows how important they are. Good tools help us find problems faster, but the quality of the code is still in our own hands. Keep it up!

Well, that's all for this article. I'm asong; see you next time.

You are welcome to follow the public account: Golang Dream Factory

Tags: Go Memory Leak

Posted by axon on Mon, 16 May 2022 18:59:13 +0300