mirror of
https://github.com/prometheus/node_exporter.git
synced 2025-01-22 03:01:30 -08:00
121 lines
4.3 KiB
Markdown
# Perf
[![GoDoc](https://godoc.org/github.com/hodgesds/perf-utils?status.svg)](https://godoc.org/github.com/hodgesds/perf-utils)

This package is a Go library for interacting with the `perf` subsystem in
Linux. It allows you to do things like see how many CPU instructions a function
takes, profile a process for various hardware events, and other interesting
things. The library is by no means finalized and should be considered pre-alpha
at best.

# Use Cases
Most of the utility methods in this package should only be used for
testing and/or debugging performance issues. Due to the nature of the Go
runtime, profiling at the goroutine level is extremely tricky, with the
exception of a long-running worker goroutine locked to an OS thread. Eventually
this library could be used to implement many of the features of `perf`, but
accessible directly from Go.

## Caveats
* Some utility functions will call
  [`runtime.LockOSThread`](https://golang.org/pkg/runtime/#LockOSThread) for
  you and will unlock the thread after profiling. ***Note:*** using these
  utility functions will incur significant overhead.
* Overflow handling is not implemented.

# Setup
Most likely you will need to tweak some system settings unless you are running as root. From `man perf_event_open`:

```
perf_event related configuration files
   Files in /proc/sys/kernel/

       /proc/sys/kernel/perf_event_paranoid
              The perf_event_paranoid file can be set to restrict access to the performance counters.

              2   allow only user-space measurements (default since Linux 4.6).
              1   allow both kernel and user measurements (default before Linux 4.6).
              0   allow access to CPU-specific data but not raw tracepoint samples.
              -1  no restrictions.

              The existence of the perf_event_paranoid file is the official method for determining if a kernel supports perf_event_open().

       /proc/sys/kernel/perf_event_max_sample_rate
              This sets the maximum sample rate. Setting this too high can allow users to sample at a rate that impacts overall machine performance and potentially lock up the machine. The default value is 100000 (samples per second).

       /proc/sys/kernel/perf_event_max_stack
              This file sets the maximum depth of stack frame entries reported when generating a call trace.

       /proc/sys/kernel/perf_event_mlock_kb
              Maximum number of pages an unprivileged user can mlock(2). The default is 516 (kB).

```

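Before profiling as a non-root user, it can help to check the current `perf_event_paranoid` level. The following is a minimal sketch using only the standard library; `describeParanoid` and `parseParanoid` are hypothetical helper names, not part of this package, and the descriptions simply restate the man page excerpt above.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

const paranoidPath = "/proc/sys/kernel/perf_event_paranoid"

// describeParanoid maps a perf_event_paranoid level to the meaning given in
// the man page excerpt. Levels not listed there are reported as unknown.
func describeParanoid(level int) string {
	switch level {
	case -1:
		return "no restrictions"
	case 0:
		return "CPU-specific data allowed, raw tracepoint samples not allowed"
	case 1:
		return "kernel and user measurements allowed"
	case 2:
		return "user-space measurements only"
	default:
		return "unknown level (not described in the man page excerpt)"
	}
}

// parseParanoid parses the raw file contents, e.g. "2\n".
func parseParanoid(raw string) (int, error) {
	return strconv.Atoi(strings.TrimSpace(raw))
}

func main() {
	raw, err := os.ReadFile(paranoidPath)
	if err != nil {
		// Per the man page, a missing file means no perf_event_open() support.
		fmt.Println("cannot read perf_event_paranoid:", err)
		return
	}
	level, err := parseParanoid(string(raw))
	if err != nil {
		fmt.Println("unexpected file contents:", err)
		return
	}
	fmt.Printf("perf_event_paranoid=%d: %s\n", level, describeParanoid(level))
}
```
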
# Example
Say you wanted to see how many CPU instructions a particular function took:

```go
package main

import (
	"fmt"
	"log"

	perf "github.com/hodgesds/perf-utils"
)

// foo is the function under measurement; it just runs a short loop.
func foo() error {
	var total int
	for i := 0; i < 1000; i++ {
		total++
	}
	return nil
}

func main() {
	// CPUInstructions runs foo and reports the instruction count for the call.
	profileValue, err := perf.CPUInstructions(foo)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("CPU instructions: %+v\n", profileValue)
}
```

# Benchmarks
Profiling a single function call adds an overhead of ~0.4ms.

```
$ go test -bench=BenchmarkCPUCycles .
goos: linux
goarch: amd64
pkg: github.com/hodgesds/perf-utils
BenchmarkCPUCycles-8        3000        397924 ns/op        32 B/op        1 allocs/op
PASS
ok      github.com/hodgesds/perf-utils  1.255s
```

The `Profiler` interface has low overhead and is suitable for many use cases:

```
$ go test -bench=BenchmarkProfiler .
goos: linux
goarch: amd64
pkg: github.com/hodgesds/perf-utils
BenchmarkProfiler-8      3000000           488 ns/op        32 B/op        1 allocs/op
PASS
ok      github.com/hodgesds/perf-utils  1.981s
```

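To put the two results side by side, the arithmetic below takes the ns/op figures from the benchmark output above and estimates what share of total time the profiling cost would be for a given workload. The 1 ms workload is an arbitrary illustrative value, not a measurement, and `overheadFraction` is a hypothetical helper.

```go
package main

import "fmt"

// overheadFraction returns the share of total time (overhead + work) that is
// spent on profiling overhead.
func overheadFraction(overheadNs, workNs float64) float64 {
	return overheadNs / (overheadNs + workNs)
}

func main() {
	const (
		cpuCyclesNs = 397924.0 // BenchmarkCPUCycles-8, ns/op
		profilerNs  = 488.0    // BenchmarkProfiler-8, ns/op
		workNs      = 1e6      // hypothetical 1 ms workload (assumption)
	)
	fmt.Printf("utility function vs Profiler: %.0fx\n", cpuCyclesNs/profilerNs)
	fmt.Printf("utility function share: %.1f%%\n", 100*overheadFraction(cpuCyclesNs, workNs))
	fmt.Printf("Profiler share: %.3f%%\n", 100*overheadFraction(profilerNs, workNs))
}
```

For short functions the utility-function cost dominates the measurement, which is why those helpers are best kept to testing and debugging.
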
# BPF Support
BPF is supported through the `BPFProfiler`, which is available via the
`ProfileTracepoint` function. To use BPF, create the BPF program and then call
`AttachBPF` with the program's file descriptor. This is not well tested, so
use at your own peril.

# Misc
Originally I set out to use `go generate` to build Go structs that were
compatible with perf, and I found a really good
[article](https://utcc.utoronto.ca/~cks/space/blog/programming/GoCGoCompatibleStructs)
on how to do so. Eventually, after digging through some of the `/x/sys/unix`
code, I found pretty much what I needed. However, if you are interested in
interacting with the kernel, the article is a worthwhile read.