# Node exporter

[![CircleCI](https://circleci.com/gh/prometheus/node_exporter/tree/master.svg?style=shield)][circleci]
[![Buildkite status](https://badge.buildkite.com/94a0c1fb00b1f46883219c256efe9ce01d63b6505f3a942f9b.svg)](https://buildkite.com/prometheus/node-exporter)
[![Docker Repository on Quay](https://quay.io/repository/prometheus/node-exporter/status)][quay]
[![Docker Pulls](https://img.shields.io/docker/pulls/prom/node-exporter.svg?maxAge=604800)][hub]
[![Go Report Card](https://goreportcard.com/badge/github.com/prometheus/node_exporter)][goreportcard]

Prometheus exporter for hardware and OS metrics exposed by \*NIX kernels, written
in Go with pluggable metric collectors.

The [WMI exporter](https://github.com/martinlindhe/wmi_exporter) is recommended for Windows users.
To expose NVIDIA GPU metrics, [prometheus-dcgm](https://github.com/NVIDIA/gpu-monitoring-tools/tree/master/exporters/prometheus-dcgm)
can be used.

## Collectors

There is varying support for collectors on each operating system. The tables
below list all existing collectors and the supported systems.

Collectors are enabled by providing a `--collector.<name>` flag.
Collectors that are enabled by default can be disabled by providing a `--no-collector.<name>` flag.

### Enabled by default

Name | Description | OS
---------|-------------|----
arp | Exposes ARP statistics from `/proc/net/arp`. | Linux
bcache | Exposes bcache statistics from `/sys/fs/bcache/`. | Linux
bonding | Exposes the number of configured and active slaves of Linux bonding interfaces. | Linux
boottime | Exposes system boot time derived from the `kern.boottime` sysctl. | Darwin, Dragonfly, FreeBSD, NetBSD, OpenBSD, Solaris
conntrack | Shows conntrack statistics (does nothing if no `/proc/sys/net/netfilter/` present). | Linux
cpu | Exposes CPU statistics. | Darwin, Dragonfly, FreeBSD, Linux, Solaris
cpufreq | Exposes CPU frequency statistics. | Linux, Solaris
diskstats | Exposes disk I/O statistics. | Darwin, Linux, OpenBSD
edac | Exposes error detection and correction statistics. | Linux
entropy | Exposes available entropy. | Linux
exec | Exposes execution statistics. | Dragonfly, FreeBSD
filefd | Exposes file descriptor statistics from `/proc/sys/fs/file-nr`. | Linux
filesystem | Exposes filesystem statistics, such as disk space used. | Darwin, Dragonfly, FreeBSD, Linux, OpenBSD
hwmon | Exposes hardware monitoring and sensor data from `/sys/class/hwmon/`. | Linux
infiniband | Exposes network statistics specific to InfiniBand and Intel OmniPath configurations. | Linux
ipvs | Exposes IPVS status from `/proc/net/ip_vs` and stats from `/proc/net/ip_vs_stats`. | Linux
loadavg | Exposes load average. | Darwin, Dragonfly, FreeBSD, Linux, NetBSD, OpenBSD, Solaris
mdadm | Exposes statistics about devices in `/proc/mdstat` (does nothing if no `/proc/mdstat` present). | Linux
meminfo | Exposes memory statistics. | Darwin, Dragonfly, FreeBSD, Linux, OpenBSD
netclass | Exposes network interface info from `/sys/class/net/`. | Linux
netdev | Exposes network interface statistics such as bytes transferred. | Darwin, Dragonfly, FreeBSD, Linux, OpenBSD
netstat | Exposes network statistics from `/proc/net/netstat`. This is the same information as `netstat -s`. | Linux
nfs | Exposes NFS client statistics from `/proc/net/rpc/nfs`. This is the same information as `nfsstat -c`. | Linux
nfsd | Exposes NFS kernel server statistics from `/proc/net/rpc/nfsd`. This is the same information as `nfsstat -s`. | Linux
pressure | Exposes pressure stall statistics from `/proc/pressure/`. | Linux (kernel 4.20+ and/or [CONFIG\_PSI](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/accounting/psi.txt))
rapl | Exposes various statistics from `/sys/class/powercap`. | Linux
schedstat | Exposes task scheduler statistics from `/proc/schedstat`. | Linux
sockstat | Exposes various statistics from `/proc/net/sockstat`. | Linux
softnet | Exposes statistics from `/proc/net/softnet_stat`. | Linux
stat | Exposes various statistics from `/proc/stat`. This includes boot time, forks and interrupts. | Linux
textfile | Exposes statistics read from local disk. The `--collector.textfile.directory` flag must be set. | _any_
thermal\_zone | Exposes thermal zone & cooling device statistics from `/sys/class/thermal`. | Linux
time | Exposes the current system time. | _any_
timex | Exposes selected adjtimex(2) system call stats. | Linux
udp\_queues | Exposes UDP total lengths of the rx\_queue and tx\_queue from `/proc/net/udp` and `/proc/net/udp6`. | Linux
uname | Exposes system information as provided by the uname system call. | Darwin, FreeBSD, Linux, OpenBSD
vmstat | Exposes statistics from `/proc/vmstat`. | Linux
xfs | Exposes XFS runtime statistics. | Linux (kernel 4.4+)
zfs | Exposes [ZFS](http://open-zfs.org/) performance statistics. | [Linux](http://zfsonlinux.org/), Solaris

### Disabled by default

The perf collector may not work by default on all Linux systems due to kernel
configuration and security settings. To allow access, set the following sysctl
parameter:

```
sysctl -w kernel.perf_event_paranoid=X
```

where `X` is one of:

- 2 allow only user-space measurements (default since Linux 4.6).
- 1 allow both kernel and user measurements (default before Linux 4.6).
- 0 allow access to CPU-specific data but not raw tracepoint samples.
- -1 no restrictions.

Depending on the configured value, different metrics will be available; in most
cases `0` will provide the most complete set. For more information see
[`man 2 perf_event_open`](http://man7.org/linux/man-pages/man2/perf_event_open.2.html).

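To check which level a host is currently using, the value can be read back from procfs. The following Go sketch is only an illustration: `describeParanoid` is a hypothetical helper, not part of `node_exporter`, and its descriptions simply restate the list above. Values outside that list (some distributions ship higher ones) are reported as not covered.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// describeParanoid is a hypothetical helper that restates the
// kernel.perf_event_paranoid levels listed in this README.
func describeParanoid(level int) string {
	switch level {
	case 2:
		return "only user-space measurements"
	case 1:
		return "kernel and user measurements"
	case 0:
		return "CPU-specific data, but no raw tracepoint samples"
	case -1:
		return "no restrictions"
	default:
		return "value not covered by the list above"
	}
}

func main() {
	// On Linux the sysctl is exposed as a file under /proc/sys.
	raw, err := os.ReadFile("/proc/sys/kernel/perf_event_paranoid")
	if err != nil {
		fmt.Println("cannot read perf_event_paranoid:", err)
		return
	}
	level, err := strconv.Atoi(strings.TrimSpace(string(raw)))
	if err != nil {
		fmt.Println("unexpected value:", err)
		return
	}
	fmt.Printf("perf_event_paranoid=%d: %s\n", level, describeParanoid(level))
}
```
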
By default, the perf collector will only collect metrics of the CPUs that
`node_exporter` is running on (i.e.
[`runtime.NumCPU`](https://golang.org/pkg/runtime/#NumCPU)). If this is
insufficient (e.g. if you run `node_exporter` with its CPU affinity set to
specific CPUs), you can specify a list of alternate CPUs by using the
`--collector.perf.cpus` flag. For example, to collect metrics on CPUs 2-6, you
would specify: `--collector.perf --collector.perf.cpus=2-6`. The CPU
configuration is zero indexed and can also take a stride value; e.g.
`--collector.perf --collector.perf.cpus=1-10:5` would collect on CPUs
1 and 6.

The perf collector is also able to collect
[tracepoint](https://www.kernel.org/doc/html/latest/core-api/tracepoint.html)
counts when using the `--collector.perf.tracepoint` flag. Tracepoints can be
found using [`perf list`](http://man7.org/linux/man-pages/man1/perf.1.html) or
from debugfs. An example usage of this would be
`--collector.perf.tracepoint="sched:sched_process_exec"`.

Name | Description | OS
---------|-------------|----
buddyinfo | Exposes statistics of memory fragments as reported by `/proc/buddyinfo`. | Linux
devstat | Exposes device statistics. | Dragonfly, FreeBSD
drbd | Exposes Distributed Replicated Block Device statistics (to version 8.4). | Linux
interrupts | Exposes detailed interrupts statistics. | Linux, OpenBSD
ksmd | Exposes kernel and system statistics from `/sys/kernel/mm/ksm`. | Linux
logind | Exposes session counts from [logind](http://www.freedesktop.org/wiki/Software/systemd/logind/). | Linux
meminfo\_numa | Exposes memory statistics from `/proc/meminfo_numa`. | Linux
mountstats | Exposes filesystem statistics from `/proc/self/mountstats`. Exposes detailed NFS client statistics. | Linux
ntp | Exposes local NTP daemon health to check [time](./docs/TIME.md). | _any_
processes | Exposes aggregate process statistics from `/proc`. | Linux
qdisc | Exposes [queuing discipline](https://en.wikipedia.org/wiki/Network_scheduler#Linux_kernel) statistics. | Linux
runit | Exposes service status from [runit](http://smarden.org/runit/). | _any_
supervisord | Exposes service status from [supervisord](http://supervisord.org/). | _any_
systemd | Exposes service and system status from [systemd](http://www.freedesktop.org/wiki/Software/systemd/). | Linux
tcpstat | Exposes TCP connection status information from `/proc/net/tcp` and `/proc/net/tcp6`. (Warning: the current version has potential performance issues in high load situations.) | Linux
wifi | Exposes WiFi device and station statistics. | Linux
perf | Exposes perf based metrics (Warning: Metrics are dependent on kernel configuration and settings). | Linux

### Textfile Collector

The textfile collector is similar to the [Pushgateway](https://github.com/prometheus/pushgateway),
in that it allows exporting of statistics from batch jobs. It can also be used
to export static metrics, such as what role a machine has. The Pushgateway
should be used for service-level metrics. The textfile module is for metrics
that are tied to a machine.

To use it, set the `--collector.textfile.directory` flag on the Node exporter. The
collector will parse all files in that directory matching the glob `*.prom`
using the [text
format](http://prometheus.io/docs/instrumenting/exposition_formats/). **Note:** Timestamps are not supported.

To atomically push completion time for a cron job:
```
echo my_batch_job_completion_time $(date +%s) > /path/to/directory/my_batch_job.prom.$$
mv /path/to/directory/my_batch_job.prom.$$ /path/to/directory/my_batch_job.prom
```

To statically set roles for a machine using labels:
```
echo 'role{role="application_server"} 1' > /path/to/directory/role.prom.$$
mv /path/to/directory/role.prom.$$ /path/to/directory/role.prom
```

### Filtering enabled collectors

The `node_exporter` will expose all metrics from enabled collectors by default. This is the recommended way to collect metrics to avoid errors when comparing metrics of different families.

For advanced use, the `node_exporter` can be passed an optional list of collectors to filter metrics. The `collect[]` parameter may be used multiple times. In the Prometheus configuration you can use this syntax under the [scrape config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#<scrape_config>).

```
params:
  collect[]:
    - foo
    - bar
```

This can be useful for having different Prometheus servers collect specific metrics from nodes.

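The same filter can be expressed directly in the scrape URL by repeating the `collect[]` query parameter. The Go sketch below shows how such a URL could be assembled; `scrapeURL` is a hypothetical helper, and the `http://localhost:9100` address and `/metrics` path are assumptions of the example rather than something stated in this section.

```go
package main

import (
	"fmt"
	"net/url"
)

// scrapeURL builds a metrics URL that repeats the collect[] parameter once
// per collector. The base address passed in is an assumption of the example.
func scrapeURL(base string, collectors ...string) string {
	q := url.Values{}
	for _, c := range collectors {
		q.Add("collect[]", c)
	}
	// Encode percent-escapes the brackets in the parameter name.
	return base + "/metrics?" + q.Encode()
}

func main() {
	fmt.Println(scrapeURL("http://localhost:9100", "cpu", "meminfo"))
}
```

The brackets are percent-encoded as `%5B%5D`, so the printed URL is `http://localhost:9100/metrics?collect%5B%5D=cpu&collect%5B%5D=meminfo`.
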
## Building and running

Prerequisites:

* [Go compiler](https://golang.org/dl/)
* RHEL/CentOS: `glibc-static` package.

Building:

    go get github.com/prometheus/node_exporter
    cd ${GOPATH-$HOME/go}/src/github.com/prometheus/node_exporter
    make
    ./node_exporter <flags>

To see all available configuration flags:

    ./node_exporter -h

## Running tests

    make test

## TLS endpoint

**EXPERIMENTAL**

The exporter supports TLS via a new web configuration file.

```console
./node_exporter --web.config=web-config.yml
```

See the [https package](https/README.md) for more details.

## Using Docker

The `node_exporter` is designed to monitor the host system. It's not recommended
to deploy it as a Docker container because it requires access to the host system.
Be aware that any non-root mount points you want to monitor will need to be bind-mounted
into the container.
If you start a container for host monitoring, specify the `path.rootfs` argument.
This argument must match the path in the bind-mount of the host root. The node\_exporter will use
`path.rootfs` as a prefix to access the host filesystem.

```bash
docker run -d \
  --net="host" \
  --pid="host" \
  -v "/:/host:ro,rslave" \
  quay.io/prometheus/node-exporter \
  --path.rootfs=/host
```

On some systems, the `timex` collector requires an additional Docker flag,
`--cap-add=SYS_TIME`, in order to access the required syscalls.

## Using a third-party repository for RHEL/CentOS/Fedora

There is a [community-supplied COPR repository](https://copr.fedorainfracloud.org/coprs/ibotty/prometheus-exporters/) which closely follows upstream releases.

[travis]: https://travis-ci.org/prometheus/node_exporter
[hub]: https://hub.docker.com/r/prom/node-exporter/
[circleci]: https://circleci.com/gh/prometheus/node_exporter
[quay]: https://quay.io/repository/prometheus/node-exporter
[goreportcard]: https://goreportcard.com/report/github.com/prometheus/node_exporter