- <!--{
- "Title": "Diagnostics",
- "Template": true
- }-->
- <!--
- NOTE: In this document and others in this directory, the convention is to
- set fixed-width phrases with non-fixed-width spaces, as in
- <code>hello</code> <code>world</code>.
- Do not send CLs removing the interior tags from such phrases.
- -->
- <h2 id="introduction">Introduction</h2>
- <p>
- The Go ecosystem provides a large suite of APIs and tools to
- diagnose logic and performance problems in Go programs. This page
- summarizes the available tools and helps Go users pick the right one
- for their specific problem.
- </p>
- <p>
- Diagnostics solutions can be categorized into the following groups:
- </p>
- <ul>
- <li><strong>Profiling</strong>: Profiling tools analyze the complexity and costs of a
- Go program such as its memory usage and frequently called
- functions to identify the expensive sections of a Go program.</li>
- <li><strong>Tracing</strong>: Tracing is a way to instrument code to analyze latency
- throughout the lifecycle of a call or user request. Traces provide an
- overview of how much latency each component contributes to the overall
- latency in a system. Traces can span multiple Go processes.</li>
- <li><strong>Debugging</strong>: Debugging allows us to pause a Go program and examine
- its execution. Program state and flow can be verified with debugging.</li>
- <li><strong>Runtime statistics and events</strong>: Collection and analysis of runtime stats and events
- provides a high-level overview of the health of Go programs. Spikes and dips in metrics
- help us identify changes in throughput, utilization, and performance.</li>
- </ul>
- <p>
- Note: Some diagnostics tools may interfere with each other. For example, precise
- memory profiling skews CPU profiles, and goroutine blocking profiling affects the scheduler
- trace. Use tools in isolation to get more precise information.
- </p>
- <h2 id="profiling">Profiling</h2>
- <p>
- Profiling is useful for identifying expensive or frequently called sections
- of code. The Go runtime provides <a href="https://golang.org/pkg/runtime/pprof/">
- profiling data</a> in the format expected by the
- <a href="https://github.com/google/pprof/blob/master/doc/README.md">pprof visualization tool</a>.
- The profiling data can be collected during testing
- via <code>go</code> <code>test</code> or endpoints made available from the <a href="/pkg/net/http/pprof/">
- net/http/pprof</a> package. Users need to collect the profiling data and use pprof tools to filter
- and visualize the top code paths.
- </p>
- <p>Predefined profiles provided by the <a href="/pkg/runtime/pprof">runtime/pprof</a> package:</p>
- <ul>
- <li>
- <strong>cpu</strong>: CPU profile determines where a program spends
- its time while actively consuming CPU cycles (as opposed to while sleeping or waiting for I/O).
- </li>
- <li>
- <strong>heap</strong>: Heap profile reports memory allocation samples;
- used to monitor current and historical memory usage, and to check for memory leaks.
- </li>
- <li>
- <strong>threadcreate</strong>: Thread creation profile reports the sections
- of the program that lead to the creation of new OS threads.
- </li>
- <li>
- <strong>goroutine</strong>: Goroutine profile reports the stack traces of all current goroutines.
- </li>
- <li>
- <strong>block</strong>: Block profile shows where goroutines block waiting on synchronization
- primitives (including timer channels). Block profile is not enabled by default;
- use <code>runtime.SetBlockProfileRate</code> to enable it.
- </li>
- <li>
- <strong>mutex</strong>: Mutex profile reports lock contention. When you think your
- CPU is not fully utilized due to mutex contention, use this profile. Mutex profile
- is not enabled by default; use <code>runtime.SetMutexProfileFraction</code> to enable it.
- </li>
- </ul>
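- <p>
- The sketch below shows one way the predefined profiles above might be enabled
- and written out by name, assuming <code>pprof.Lookup</code> is used to find
- them. The helper <code>writeProfile</code> is a hypothetical name, and the
- rate of 1 is illustrative only: it records every event and is expensive.
- The CPU profile is not available through <code>pprof.Lookup</code>; it is
- started with <code>pprof.StartCPUProfile</code> instead.
- </p>

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"runtime/pprof"
)

// writeProfile looks up a predefined profile by name and returns its
// serialized form. debug=0 selects the compressed protobuf format that
// go tool pprof reads.
func writeProfile(name string) ([]byte, error) {
	p := pprof.Lookup(name)
	if p == nil {
		return nil, fmt.Errorf("unknown profile %q", name)
	}
	var buf bytes.Buffer
	if err := p.WriteTo(&buf, 0); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	// Block and mutex profiles are off by default. The value 1 is an
	// illustrative choice that records every event.
	runtime.SetBlockProfileRate(1)
	runtime.SetMutexProfileFraction(1)

	names := []string{"heap", "threadcreate", "goroutine", "block", "mutex"}
	for _, name := range names {
		data, err := writeProfile(name)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("%s: %d bytes\n", name, len(data))
	}
}
```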
- <p><strong>What other profilers can I use to profile Go programs?</strong></p>
- <p>
- On Linux, <a href="https://perf.wiki.kernel.org/index.php/Tutorial">perf tools</a>
- can be used for profiling Go programs. Perf can profile
- and unwind cgo/SWIG code and the kernel, so it can be useful for getting insights into
- native/kernel performance bottlenecks. On macOS, the
- <a href="https://developer.apple.com/library/content/documentation/DeveloperTools/Conceptual/InstrumentsUserGuide/">Instruments</a>
- suite can be used to profile Go programs.
- </p>
- <p><strong>Can I profile my production services?</strong></p>
- <p>Yes. It is safe to profile programs in production, but enabling
- some profiles (e.g. the CPU profile) adds cost. You should expect to
- see some performance degradation. The performance penalty can be estimated
- by measuring the overhead of the profiler before turning it on in
- production.
- </p>
- <p>
- You may want to periodically profile your production services.
- In a system with many replicas of a single process, periodically selecting
- a random replica is a safe option.
- Select a production process, profile it for
- X seconds every Y seconds, and save the results for visualization and
- analysis; then repeat periodically. Results may be reviewed manually and/or automatically
- to find problems.
- Collecting several profiles at once can cause them to interfere with each other,
- so it is recommended to collect only a single profile at a time.
- </p>
- <p>
- <strong>What are the best ways to visualize the profiling data?</strong>
- </p>
- <p>
- The Go tools provide text, graph, and <a href="http://valgrind.org/docs/manual/cl-manual.html">callgrind</a>
- visualization of the profile data using
- <code><a href="https://github.com/google/pprof/blob/master/doc/README.md">go tool pprof</a></code>.
- Read <a href="https://blog.golang.org/profiling-go-programs">Profiling Go programs</a>
- to see them in action.
- </p>
- <p>
- <img width="800" src="https://storage.googleapis.com/golangorg-assets/pprof-text.png">
- <br>
- <small>Listing of the most expensive calls as text.</small>
- </p>
- <p>
- <img width="800" src="https://storage.googleapis.com/golangorg-assets/pprof-dot.png">
- <br>
- <small>Visualization of the most expensive calls as a graph.</small>
- </p>
- <p>The weblist view displays the expensive parts of the source line by line in
- an HTML page. In the following example, 530ms is spent in
- <code>runtime.concatstrings</code>, and the cost of each line is presented
- in the listing.</p>
- <p>
- <img width="800" src="https://storage.googleapis.com/golangorg-assets/pprof-weblist.png">
- <br>
- <small>Visualization of the most expensive calls as weblist.</small>
- </p>
- <p>
- Another way to visualize profile data is a <a href="http://www.brendangregg.com/flamegraphs.html">flame graph</a>.
- Flame graphs allow you to move in a specific ancestry path, so you can zoom
- in/out of specific sections of code.
- The <a href="https://github.com/google/pprof">upstream pprof</a>
- has support for flame graphs.
- </p>
- <p>
- <img width="800" src="https://storage.googleapis.com/golangorg-assets/flame.png">
- <br>
- <small>Flame graphs offer a visualization for spotting the most expensive code paths.</small>
- </p>
- <p><strong>Am I restricted to the built-in profiles?</strong></p>
- <p>
- In addition to what is provided by the runtime, Go users can create
- their own custom profiles via <a href="/pkg/runtime/pprof/#Profile">pprof.Profile</a>
- and use the existing tools to examine them.
- </p>
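- <p>
- As a hedged illustration of a custom profile, the sketch below tracks live
- resources with <code>pprof.NewProfile</code>. The profile name
- <code>example.com/openconns</code>, the <code>conn</code> type, and the
- <code>open</code> helper are all hypothetical and exist only for this example.
- </p>

```go
package main

import (
	"fmt"
	"runtime/pprof"
)

// openConns is a hypothetical custom profile that tracks live connections.
// NewProfile panics if a profile with the same name already exists.
var openConns = pprof.NewProfile("example.com/openconns")

// conn stands in for a resource whose lifetime we want to track.
type conn struct{ id int }

// open creates a conn and records it, with its stack trace, in the profile.
func open(id int) *conn {
	c := &conn{id: id}
	// skip=1 begins the recorded stack trace at the caller of open.
	openConns.Add(c, 1)
	return c
}

// Close removes the conn from the profile.
func (c *conn) Close() {
	openConns.Remove(c)
}

func main() {
	a, b := open(1), open(2)
	fmt.Println("open connections:", openConns.Count())
	a.Close()
	fmt.Println("open connections:", openConns.Count())
	b.Close()
}
```

- <p>
- A profile created this way can be written with its <code>WriteTo</code> method
- and examined with the same tools as the predefined profiles.
- </p>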
- <p><strong>Can I serve the profiler handlers (/debug/pprof/...) on a different path and port?</strong></p>
- <p>
- Yes. The <code>net/http/pprof</code> package registers its handlers to the default
- mux by default, but you can also register them yourself by using the handlers
- exported from the package.
- </p>
- <p>
- For example, the following program serves the <code>pprof.Profile</code>
- handler on :7777 at /custom_debug_path/profile:
- </p>
- <p>
- <pre>
- package main
-
- import (
- 	"log"
- 	"net/http"
- 	"net/http/pprof"
- )
-
- func main() {
- 	// Register the CPU profile handler on a dedicated mux at a custom path.
- 	mux := http.NewServeMux()
- 	mux.HandleFunc("/custom_debug_path/profile", pprof.Profile)
- 	log.Fatal(http.ListenAndServe(":7777", mux))
- }
- </pre>
- </p>
- <h2 id="tracing">Tracing</h2>
- <p>
- Tracing is a way to instrument code to analyze latency throughout the
- lifecycle of a chain of calls. Go provides the
- <a href="https://godoc.org/golang.org/x/net/trace">golang.org/x/net/trace</a>
- package as a minimal tracing backend per Go node, along with a minimal
- instrumentation library and a simple dashboard. Go also provides
- an execution tracer to trace the runtime events within an interval.
- </p>
- <p>Tracing enables us to:</p>
- <ul>
- <li>Instrument and analyze application latency in a Go process.</li>
- <li>Measure the cost of specific calls in a long chain of calls.</li>
- <li>Identify utilization problems and opportunities for performance improvements.
- Bottlenecks are not always obvious without tracing data.</li>
- </ul>
- <p>
- In monolithic systems, it's relatively easy to collect diagnostic data
- from the building blocks of a program. All modules live within one
- process and share common resources to report logs, errors, and other
- diagnostic information. Once your system grows beyond a single process and
- starts to become distributed, it becomes harder to follow a call starting
- from the front-end web server to all of its back-ends until a response is
- returned to the user. This is where distributed tracing plays a big
- role in instrumenting and analyzing your production systems.
- </p>
- <p>
- Distributed tracing is a way to instrument code to analyze latency throughout
- the lifecycle of a user request. When a system is distributed and when
- conventional profiling and debugging tools don’t scale, you might want
- to use distributed tracing tools to analyze the performance of your user
- requests and RPCs.
- </p>
- <p>Distributed tracing enables us to:</p>
- <ul>
- <li>Instrument and profile application latency in a large system.</li>
- <li>Track all RPCs within the lifecycle of a user request and see integration issues
- that are only visible in production.</li>
- <li>Figure out performance improvements that can be applied to our systems.
- Many bottlenecks are not obvious before the collection of tracing data.</li>
- </ul>
- <p>The Go ecosystem provides various distributed tracing libraries, both ones specific
- to a particular tracing system and backend-agnostic ones.</p>
- <p><strong>Is there a way to automatically intercept each function call and create traces?</strong></p>
- <p>
- Go doesn’t provide a way to automatically intercept every function call and create
- trace spans. You need to manually instrument your code to create, end, and annotate spans.
- </p>
- <p><strong>How should I propagate trace headers in Go libraries?</strong></p>
- <p>
- You can propagate trace identifiers and tags in the
- <a href="/pkg/context#Context"><code>context.Context</code></a>.
- There is no canonical trace key or common representation of trace headers
- in the industry yet. Each tracing provider is responsible for providing propagation
- utilities in their Go libraries.
- </p>
- <p>
- <strong>What other low-level events from the standard library or
- runtime can be included in a trace?</strong>
- </p>
- <p>
- The standard library and runtime are working to expose several additional APIs
- to notify on low-level internal events. For example,
- <a href="/pkg/net/http/httptrace#ClientTrace"><code>httptrace.ClientTrace</code></a>
- provides APIs to follow low-level events in the life cycle of an outgoing request.
- There is an ongoing effort to retrieve low-level runtime events from
- the runtime execution tracer and allow users to define and record their user events.
- </p>
- <h2 id="debugging">Debugging</h2>
- <p>
- Debugging is the process of identifying why a program misbehaves.
- Debuggers allow us to understand a program’s execution flow and current state.
- There are several styles of debugging; this section will only focus on attaching
- a debugger to a program and core dump debugging.
- </p>
- <p>Go users mostly use the following debuggers:</p>
- <ul>
- <li>
- <a href="https://github.com/derekparker/delve">Delve</a>:
- Delve is a debugger for the Go programming language. It has
- support for Go’s runtime concepts and built-in types. Delve
- aims to be a full-featured, reliable debugger for Go programs.
- </li>
- <li>
- <a href="https://golang.org/doc/gdb">GDB</a>:
- Go provides GDB support via the standard Go compiler and Gccgo.
- The stack management, threading, and runtime contain aspects that differ
- enough from the execution model GDB expects that they can confuse the
- debugger, even when the program is compiled with gccgo. Even though
- GDB can be used to debug Go programs, it is not ideal and may
- create confusion.
- </li>
- </ul>
- <p><strong>How well do debuggers work with Go programs?</strong></p>
- <p>
- The <code>gc</code> compiler performs optimizations such as
- function inlining and variable registerization. These optimizations
- sometimes make debugging with debuggers harder. There is an ongoing
- effort to improve the quality of the DWARF information generated for
- optimized binaries. Until those improvements are available, we recommend
- disabling optimizations when building the code being debugged. The following
- command builds a package with no compiler optimizations:
- <p>
- <pre>
- $ go build -gcflags=all="-N -l"
- </pre>
- </p>
- As part of the improvement effort, Go 1.10 introduced a new compiler
- flag <code>-dwarflocationlists</code>. The flag causes the compiler to
- add location lists that help debuggers work with optimized binaries.
- The following command builds a package with optimizations but with
- the DWARF location lists:
- <p>
- <pre>
- $ go build -gcflags="-dwarflocationlists=true"
- </pre>
- </p>
- <p><strong>What’s the recommended debugger user interface?</strong></p>
- <p>
- Even though both delve and gdb provide CLIs, most editor integrations
- and IDEs provide debugging-specific user interfaces.
- </p>
- <p><strong>Is it possible to do postmortem debugging with Go programs?</strong></p>
- <p>
- A core dump file is a file that contains the memory dump of a running
- process and its process status. It is primarily used for post-mortem
- debugging of a program, and it can also be used to understand the state
- of a program while it is still running. These two cases make core dump
- debugging a good diagnostic aid for postmortem analysis and for analyzing production
- services. It is possible to obtain core files from Go programs and
- use delve or gdb to debug them; see the
- <a href="https://golang.org/wiki/CoreDumpDebugging">core dump debugging</a>
- page for a step-by-step guide.
- </p>
- <h2 id="runtime">Runtime statistics and events</h2>
- <p>
- The runtime provides stats and reporting of internal events for
- users to diagnose performance and utilization problems at the
- runtime level.
- </p>
- <p>
- Users can monitor these stats to better understand the overall
- health and performance of Go programs.
- Some frequently monitored stats and states:
- </p>
- <ul>
- <li><code><a href="/pkg/runtime/#ReadMemStats">runtime.ReadMemStats</a></code>
- reports the metrics related to heap
- allocation and garbage collection. Memory stats are useful for
- monitoring how much memory a process is consuming,
- whether the process can utilize memory well, and for catching
- memory leaks.</li>
- <li><code><a href="/pkg/runtime/debug/#ReadGCStats">debug.ReadGCStats</a></code>
- reads statistics about garbage collection.
- It is useful to see how much of the resources are spent on GC pauses.
- It also reports a timeline of garbage collector pauses and pause time percentiles.</li>
- <li><code><a href="/pkg/runtime/debug/#Stack">debug.Stack</a></code>
- returns a formatted stack trace of the goroutine that calls it. Stack traces
- are useful to see what a goroutine is doing and whether it is blocked or not.
- To see how many goroutines exist and what all of them are doing, use
- <code>runtime.Stack</code> or the goroutine profile.</li>
- <li><code><a href="/pkg/runtime/debug/#WriteHeapDump">debug.WriteHeapDump</a></code>
- suspends the execution of all goroutines
- and allows you to dump the heap to a file. A heap dump is a
- snapshot of a Go process' memory at a given time. It contains all
- allocated objects as well as goroutines, finalizers, and more.</li>
- <li><code><a href="/pkg/runtime#NumGoroutine">runtime.NumGoroutine</a></code>
- returns the number of current goroutines.
- The value can be monitored to see whether enough goroutines are
- utilized, or to detect goroutine leaks.</li>
- </ul>
- <h3 id="execution-tracer">Execution tracer</h3>
- <p>Go comes with a runtime execution tracer to capture a wide range
- of runtime events. Scheduling, syscall, garbage collection,
- heap size, and other events are collected by the runtime and are available
- for visualization with <code>go</code> <code>tool</code> <code>trace</code>. The execution tracer is a tool
- to detect latency and utilization problems. You can examine how well
- the CPU is utilized, and when networking or syscalls are a cause of
- preemption for the goroutines.</p>
- <p>Tracer is useful to:</p>
- <ul>
- <li>Understand how your goroutines execute.</li>
- <li>Understand some of the core runtime events such as GC runs.</li>
- <li>Identify poorly parallelized execution.</li>
- </ul>
- <p>However, it is not great for identifying hot spots, such as
- analyzing the cause of excessive memory or CPU usage.
- Use profiling tools first to address those.</p>
- <p>
- <img width="800" src="https://storage.googleapis.com/golangorg-assets/tracer-lock.png">
- </p>
- <p>Above, the go tool trace visualization shows the execution started
- fine, and then it became serialized. It suggests that there might
- be lock contention for a shared resource that creates a bottleneck.</p>
- <p>See <a href="https://golang.org/cmd/trace/"><code>go</code> <code>tool</code> <code>trace</code></a>
- to collect and analyze runtime traces.
- </p>
- <h3 id="godebug">GODEBUG</h3>
- <p>The runtime also emits events and information if the
- <a href="https://golang.org/pkg/runtime/#hdr-Environment_Variables">GODEBUG</a>
- environment variable is set accordingly.</p>
- <ul>
- <li>GODEBUG=gctrace=1 prints garbage collector events at
- each collection, summarizing the amount of memory collected
- and the length of the pause.</li>
- <li>GODEBUG=schedtrace=X prints scheduling events every X milliseconds.</li>
- </ul>
- <p>The GODEBUG environment variable can also be used to disable the use of
- instruction set extensions in the standard library and runtime.</p>
- <ul>
- <li>GODEBUG=cpu.all=off disables the use of all optional
- instruction set extensions.</li>
- <li>GODEBUG=cpu.<em>extension</em>=off disables use of instructions from the
- specified instruction set extension.<br>
- <em>extension</em> is the lower case name for the instruction set extension
- such as <em>sse41</em> or <em>avx</em>.</li>
- </ul>