Load average is a misleading measure of CPU saturation: on Linux it
also counts tasks in uninterruptible sleep, not only tasks waiting for
a CPU. Plain CPU utilization is a better representation of saturation.

Newer Linux kernels (4.20 and later) provide Pressure Stall
Information[0] (PSI), which represents CPU oversaturation more
accurately. It is also useful because it can make single-core
saturation more visible.
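
As a rough illustration of what PSI reports, the sketch below parses
text in the format of /proc/pressure/cpu as described in [0]. The
`parse_psi` helper and the sample values are hypothetical and are not
part of this change.

```python
def parse_psi(text):
    """Parse PSI text (the /proc/pressure/cpu format) into nested dicts."""
    result = {}
    for line in text.strip().splitlines():
        # Each line looks like: "some avg10=0.00 avg60=0.00 avg300=0.00 total=0"
        kind, *fields = line.split()
        values = {}
        for field in fields:
            key, _, value = field.partition("=")
            # "total" is cumulative stall time in microseconds;
            # the avg* fields are percentages over 10s, 60s and 300s windows.
            values[key] = int(value) if key == "total" else float(value)
        result[kind] = values
    return result


# Made-up sample line, for illustration only.
sample = "some avg10=1.50 avg60=0.75 avg300=0.20 total=123456\n"
psi = parse_psi(sample)
print(psi["some"]["avg10"])
```

On a kernel with PSI enabled, the same helper could be fed the contents
of /proc/pressure/cpu instead of the sample string.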
[0]: https://www.kernel.org/doc/html/latest/accounting/psi.html
Signed-off-by: Ben Kochie <superq@gmail.com>
The `instance:node_memory_swap_io_pages:rate1m` rule was intended to
measure how much memory pressure a system is under. Its name is
misleading, however, because it refers specifically to swap, and the
rate of `node_vmstat_pgmajfault` is a better metric for memory
pressure (see #1524).
This commit renames `instance:node_memory_swap_io_pages:rate1m` to
`instance:node_vmstat_pgmajfault:rate1m`, and defines it as
`rate(node_vmstat_pgmajfault{%(nodeExporterSelector)s}[1m])`. The
dashboards are updated accordingly.
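
The `%(nodeExporterSelector)s` placeholder is filled in from the mixin
configuration when the rules are generated. A minimal Python sketch of
that substitution follows; Python's `%` operator accepts the same
placeholder syntax. The selector value `job="node"` and the `config`
and `rule` names are assumptions for illustration only.

```python
# Hypothetical config value; the real selector comes from the mixin's config.
config = {"nodeExporterSelector": 'job="node"'}

# The renamed recording rule, with the selector placeholder expanded.
rule = {
    "record": "instance:node_vmstat_pgmajfault:rate1m",
    "expr": "rate(node_vmstat_pgmajfault{%(nodeExporterSelector)s}[1m])" % config,
}

print(rule["expr"])
```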
Signed-off-by: Benoît Knecht <benoit.knecht@fsfe.org>