A Practical Guide to Profiling in Go | The GoLand Blog
Dominika Stankiewicz
As is often the case with Go, the standard library comes with a great tool for profiling your programs – pprof. It samples your call stack, allowing you to later generate reports that help you analyze and visualize your software’s performance without installing any plugins. Everything you need is in the Go development kit.
The problem? It’s a bit of a hassle. In our discussions with Go developers, we’ve heard that some actually avoid it if they can. There could be a few reasons for this. For many developers, typical Go services perform well enough without optimizations, so when they do need to use profiling, it becomes a complex “rescue mission” tool they aren’t really experienced with. For some, the issue isn’t profiling in itself, but rather what to do with the results. Since pprof just shows developers a lot of low-level profiling data, it’s on them to make sense of it and find the root of the issue. On the other end of the spectrum, there are those who practice continuous profiling and use dedicated tools for it.
This article serves as a practical guide for those developers who would rather avoid dealing with Go’s confusing profiling tools. Profiling is incredibly useful – it helps you identify CPU bottlenecks, memory issues, and concurrency problems, all of which affect both your and your users’ experience with your product. So to help you make the best use of it, we will explain some of the main profiling types in Go (CPU, heap, allocs, mutex, block, and goroutine), as well as how to run and interpret them. And because you’re on the JetBrains blog, we’ll also show you how GoLand makes profiling as easy as pressing a single button. But first…
How does profiling work in Go?
Go profilers track program performance by sampling the call stack and additional data at either regular time intervals or upon specific runtime events, depending on the profile. They generate profile files that can then be analyzed using tools like the pprof CLI or its web interface, so you can see where your program spends time and memory. This helps you find functions that use unnecessary resources and slow down the program, without having to guess. For example, Go’s diagnostic documentation recommends profiling to identify expensive or frequently called code paths.
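As a minimal sketch of that workflow, the program below collects a CPU profile with the standard `runtime/pprof` package and writes it to a file. The workload `busyWork`, the helper `captureCPUProfile`, and the file name `cpu.pprof` are hypothetical names chosen for illustration, not something prescribed by Go.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
	"log"
	"os"
	"runtime/pprof"
)

// busyWork is a hypothetical CPU-bound workload: it re-hashes a digest many times.
func busyWork(rounds int) [32]byte {
	sum := sha256.Sum256([]byte("seed"))
	for i := 0; i < rounds; i++ {
		sum = sha256.Sum256(sum[:])
	}
	return sum
}

// captureCPUProfile runs work while the CPU profiler is sampling and
// returns the encoded profile data.
func captureCPUProfile(work func()) ([]byte, error) {
	var buf bytes.Buffer
	if err := pprof.StartCPUProfile(&buf); err != nil {
		return nil, err
	}
	work()
	// StopCPUProfile returns only after all profile data has been written.
	pprof.StopCPUProfile()
	return buf.Bytes(), nil
}

func main() {
	data, err := captureCPUProfile(func() { busyWork(2_000_000) })
	if err != nil {
		log.Fatal(err)
	}
	// The file name is an assumption; any path works.
	if err := os.WriteFile("cpu.pprof", data, 0o644); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("wrote %d bytes to cpu.pprof\n", len(data))
}
```

The resulting file can then be opened with `go tool pprof cpu.pprof` to explore where the samples landed.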
Types of profiles in Go
As mentioned in the intro, Go comes pre-equipped with a profiling tool called pprof, so you don’t need any external libraries. There are different things you can analyze with it, depending on your needs. The most popular profiles that we’ll be discussing in this article are:
CPU: Samples the call stack and tracks where CPU time was spent.
Memory (allocs / heap): Tracks allocations (total / currently in use) to show you where memory is being used.
Block: Tracks blocking events, showing you where goroutines were blocked.
Mutex: Captures which goroutines blocked other goroutines, revealing lock contention.
Goroutine: Takes snapshots of stack traces of goroutines to show you how many there are at the moment, and what they’re doing.
It’s perhaps worth mentioning here that Go also has the runtime/trace package – an execution tracer that records specific runtime events, capturing the timeline rather than snapshots. runtime/trace will not be covered in this article.
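Apart from CPU, the profiles listed above can be looked up by name through `runtime/pprof`. One detail worth knowing is that the block and mutex profiles record nothing until you enable them. The sketch below turns both on, creates some artificial lock contention, and prints how many entries each profile holds. The helper names (`enableContentionProfiles`, `contend`, `profileCounts`) and the sampling rate of 1 are assumptions made for this example, and a rate that aggressive is probably too costly for production use.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/pprof"
	"sync"
	"time"
)

// profileNames lists the named profiles discussed in this article.
var profileNames = []string{"goroutine", "heap", "allocs", "block", "mutex"}

// enableContentionProfiles switches on block and mutex profiling,
// which are disabled by default. A rate of 1 records every event.
func enableContentionProfiles() {
	runtime.SetBlockProfileRate(1)
	runtime.SetMutexProfileFraction(1)
}

// contend is a hypothetical workload in which goroutines fight over one mutex.
func contend() {
	var mu sync.Mutex
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 50; j++ {
				mu.Lock()
				time.Sleep(100 * time.Microsecond)
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
}

// profileCounts reports the number of entries currently held by each named profile.
func profileCounts() map[string]int {
	counts := make(map[string]int)
	for _, name := range profileNames {
		if p := pprof.Lookup(name); p != nil {
			counts[name] = p.Count()
		}
	}
	return counts
}

func main() {
	enableContentionProfiles()
	contend()
	counts := profileCounts()
	for _, name := range profileNames {
		fmt.Printf("%-10s %d entries\n", name, counts[name])
	}
}
```

The exact counts vary from run to run, so treat the output as a sanity check that the profiles are collecting data rather than as a measurement.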
CPU
CPU profiling is often the first step when diagnosing performance issues in Go programs. It records where your program spends CPU time by periodically sampling the stack of the goroutines that are being executed.
It’s good for things like finding hot paths in CPU-bound code (e.g. expensive parsing, serialization, hashing, or tight loops), understanding why a benchmark is slower than expected under realistic load, investigating the root cause of a Grafana alert, or generating input for profile-guided optimization.
What this profile does not tell you is where your program spends time waiting on locks or the network. CPU profiling samples only active execution, so a goroutine’s wall-clock running time can be much longer than the CPU time the profile attributes to it. To investigate blocking and contention, you need other profiles, like the block and mutex ones described below.
Memory profiles – heap and allocs
The memory profiles – heap and allocs – are perhaps the most confusing, even for seasoned Go developers. To clarify: heap and allocs are both types of memory profiling that give you insights into memory consumption, allocation patterns, and garbage collection (GC). Under the hood, both store the same data. The only difference is which sample type they present as the default.
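To make this concrete, here is a small sketch that writes both profiles from the same process using `pprof.Lookup`. The helpers `allocate` and `memProfileBytes`, as well as the allocation sizes, are hypothetical and exist only for this example. Calling `runtime.GC()` first is a common way to get up-to-date statistics before taking a memory profile.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
	"runtime"
	"runtime/pprof"
)

// retained keeps allocations alive so they still count as in-use memory.
var retained [][]byte

// allocate is a hypothetical workload that allocates n buffers of the given size.
func allocate(n, size int) {
	for i := 0; i < n; i++ {
		retained = append(retained, make([]byte, size))
	}
}

// memProfileBytes returns the encoded data of the named profile
// ("heap" or "allocs" in this example).
func memProfileBytes(name string) ([]byte, error) {
	p := pprof.Lookup(name)
	if p == nil {
		return nil, fmt.Errorf("unknown profile %q", name)
	}
	var buf bytes.Buffer
	// debug=0 selects the compressed binary format that go tool pprof reads.
	if err := p.WriteTo(&buf, 0); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	allocate(1000, 64*1024)
	runtime.GC() // refresh memory statistics before profiling

	for _, name := range []string{"heap", "allocs"} {
		data, err := memProfileBytes(name)
		if err != nil {
			log.Fatal(err)
		}
		fmt.Printf("%s profile: %d bytes\n", name, len(data))
	}
}
```

If you save either profile to a file, you can switch between sample types when viewing it, for instance with the `-sample_index` flag of `go tool pprof`, so the choice between heap and allocs mostly determines what you see first.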
The sampling types available in both profiles are:
inuse_space: The amount of memory (in bytes) that’s currently allocated and has not yet been garbage...