node_exporter

mirror of https://github.com/prometheus/node_exporter.git synced 2025-03-05 21:00:12 -08:00

Author	SHA1	Message	Date
Alexander Soelberg Heidarsson	ae746c8b1d	bugfix: 🐛 remove invalid variable from cluster use dashboards Signed-off-by: Alexander Soelberg Heidarsson <89837986+alex5517@users.noreply.github.com>	2025-02-20 08:25:48 +01:00
v-zhuravlev	f252c4616a	Add NodeSystemdServiceCrashlooping alert to mixin (#3039) * Add NodeSystemdServiceCrashlooping alert --------- Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>	2025-02-16 11:05:46 +01:00
liyang	505363a67d	chore: add instance label in NodeHighNumberConntrackEntriesUsed alert description Signed-off-by: liyang <ly18846162402@163.com>	2024-12-23 11:25:05 +01:00
Duologic	2fccdf4e17	fix(docs): add node(Warning|Critical)WindowHours to node-mixin Signed-off-by: Duologic <jeroen@simplistic.be>	2024-12-23 11:24:50 +01:00
Tom	d0c1d00d18	Migrate dashboards to new grafonnet library (#3147) Migrated away from deprecated Grafonnet library. This replaces panels using Angular JS which are disabled by default in Grafana 11 and will be unsupported in Grafana 12. Fixes #3046 --------- Signed-off-by: Tom <12222103+critchtionary@users.noreply.github.com>	2024-12-19 16:49:22 +01:00
Jan Breitkopf	a38a5d7b48	alerts: exclude iowait from NodeCPUHighUsage alert (#3203) Signed-off-by: Jan Breitkopf <jan.breitkopf@prorocketeers.com>	2024-12-17 14:11:26 +01:00
Johannes Ziemke	92c10f9fd1	Add AIX dashboard Signed-off-by: Johannes Ziemke <github@5pi.de>	2024-09-28 15:58:02 +02:00
Stefan Andres	fe71568130	Add UIDs to dashboards (#3042) Automatically add a uid to each dashboard. This prevents changing URLs when restarting a grafana pod and re-importing the dashboards via ConfigMaps. Signed-off-by: Stefan Andres <sandres@anaconda.com>	2024-07-14 14:22:52 +02:00
looklose	7d4103c089	chore: fix typo in comment Signed-off-by: looklose <shishuaiqun@yeah.net>	2024-04-10 14:24:02 +02:00
Adrian Berger	cc49133321	Add multi-cluster support for Nodes dashboard (#2945) Signed-off-by: Adrian Berger <adria.berger94@gmail.com>	2024-03-08 14:41:36 +01:00
Taylor Sly	9f9473859b	Fix description for NodeDiskIOSaturation alert (#2929) NodeDiskIOSaturation description should say 30m per the "for" clause Signed-off-by: Taylor Sly <slyt@users.noreply.github.com>	2024-02-16 08:58:22 +01:00
Anton Lugovoi	81fc05c45f	Make filesystem space prediction window configurable (#2844) Signed-off-by: fitz123 <alugovoi@ordercapital.com>	2023-11-13 02:10:56 +01:00
Ayoub NASR	7333465abf	Add NodeBondingDegraded alert (#2843) Signed-off-by: Ayoub Nasr <ayoub.nasr@scality.com>	2023-11-13 00:36:30 +01:00
Vitaly Zhuravlev	e8d7f4e8b3	Revert alerts pending durations Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>	2023-06-29 23:26:52 +08:00
Vitaly	3e250a95a0	Update NodeSystemSaturation severity Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>	2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev	b7dfb32bfc	Set severity to NodeCPUHighUsage to info Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>	2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev	6bdc1d9c98	Add thresholds for memory, disk and system alerts Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>	2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev	77ae769179	Add thresholds for memory alerts Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>	2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev	2111e70ac7	Add comma after 'mounted on' Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>	2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev	e48e7909f4	Extend alert description Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>	2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev	da32f8de17	Decrease NodeSystemdServiceFailed severity to warning Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>	2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev	580c497261	Add NodeSystemSaturation and NodeMemoryMajorPagesFaults Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>	2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev	e15e7d6a7b	Fix NodeMemoryHighUtilization alert Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>	2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev	c3ec6e8af1	Add diskDevice selector Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>	2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev	962de6c921	Add %(nodeExporterSelector)s to Network and conntrack alerts Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>	2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev	94fc82e418	Add NodeDiskIOSaturation alert Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>	2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev	614030bb80	Set 'at' everywhere as preposition for instance Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>	2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev	3d8075da7d	Decrease NodeNetwork*Errs pending period Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>	2023-06-29 23:26:51 +08:00
Vitaly Zhuravlev	74794182a7	Add failed systemd service alert Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>	2023-06-29 23:26:51 +08:00
Vitaly Zhuravlev	fd2d62af63	Add CPU and memory alerts Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>	2023-06-29 23:26:51 +08:00
Vitaly Zhuravlev	0e0399d41e	Decrease NodeFilesystem pending time to 15m 30m is too long and there is a risk of running out of disk space/inodes completely if something is filling up disk very fast (like log file). Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>	2023-06-29 23:26:51 +08:00
Vitaly Zhuravlev	fc967aa992	Add mountpoint to NodeFilesystem alerts This helps to identify alerting filesystem. Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>	2023-06-29 23:26:51 +08:00
Will Bollock	0a17e17718	docs (node/mixin): fix annotation for Skew alert (#2671) This updates the annotation for the NodeClockSkewDetected mixin alert to match the new threshold set. Original discussion was in this PR: https://github.com/prometheus/node_exporter/pull/1480 I spent an embarrassingly large amount of time trying to figure out how the heck that alert would mean 300s of clock skew. Turns out the annotation was just left the same after the threshold change. Signed-off-by: Will Bollock <wbollock@linode.com>	2023-05-11 10:33:10 +02:00
Ryan J. Geyer	5e552bac02	Replace mistaken ) with }, resulting in parsable promql Signed-off-by: Ryan J. Geyer <me@ryangeyer.com>	2022-12-13 13:30:42 +01:00
Jan Fajerski	87b8e3790d	docs/node-mixin: add fsMountpointSelector to alerts and dashboards (#2446) * docs/node-mixin: add fsMountpointSelector This adds the option to add a `mountpoint` selector to filesystem related alerts. The default is `mountpoint!=""`. * docs/node-mixins: add fsMountpointSelector to dashboards Signed-off-by: Jan Fajerski <jfajersk@redhat.com>	2022-10-20 13:06:31 +02:00
Vitaly Zhuravlev	7519830a8a	Change io time units to %util When applying rate() to seconds we have 'seconds per second' or fractions of the second, so it can actually be from 0 to 1. Also update intervalFactor to 1 for better rates. Signed-off-by: Vitaly Zhuravlev <zhuravlev.vitaly@gmail.com>	2022-07-26 11:09:43 +02:00
Vitaly Zhuravlev	469600f4bf	Update units of network and disk graphs https://prometheus.io/docs/prometheus/latest/querying/functions/#rate rate() calculates per-second average rate, therefore Bps units should be used for disks. In networking bandwidth throughput is usually measured in bits/s so units are changed accordingly. Signed-off-by: Vitaly Zhuravlev <zhuravlev.vitaly@gmail.com>	2022-07-26 11:09:43 +02:00
Paweł Krupa (paulfantom)	8571536327	docs/node-mixin: add missing selectors Signed-off-by: Paweł Krupa (paulfantom) <pawel@krupa.net.pl>	2022-07-19 16:44:16 +02:00
Sven Kieske	d64766f43d	fix the following markdownlint issues (#2362) fix the following markdownlint errors (and some more): [..]mixins/node-exporter/README.md:13: MD031 Fenced code blocks should be surrounded by blank lines [..]mixins/node-exporter/README.md:21: MD031 Fenced code blocks should be surrounded by blank lines [..]mixins/node-exporter/README.md:27: MD031 Fenced code blocks should be surrounded by blank lines [..]mixins/node-exporter/README.md:33: MD031 Fenced code blocks should be surrounded by blank lines [..]mixins/node-exporter/README.md:41: MD034 Bare URL used A detailed description of the rules is available at https://github.com/markdownlint/markdownlint/blob/master/docs/RULES.md Signed-off-by: Sven Kieske <s.kieske@mittwald.de>	2022-06-28 05:50:06 +02:00
Björn Rabenstein	e5128e83f2	Merge pull request #2364 from grafana/vzhuravlev/fs_table mixin: Change disk graph to disk table	2022-06-08 20:46:47 +02:00
Jan Fajerski	cec414df78	node-mixins/config: Switch fsAvailable warning and critical thresholds Problem: In `0b50eb7294` the usage of the threshold variables was adjusted. The values had been switched as well resulting in reversed thresholds after the commit above. Warnings now have a smaller threshold than critical alerts. Solution: Adjust thresholds to reflect that warnings should be alerted on before critical alerts. Issues: https://github.com/prometheus/node_exporter/pull/2352 Signed-off-by: Jan Fajerski <jfajersk@redhat.com>	2022-06-07 12:10:48 +02:00
Björn Rabenstein	b5a2ad46e3	Merge pull request #2351 from grafana/vzhuravlev/macos Add darwin dashboard	2022-05-03 12:59:29 +02:00
Vitaly Zhuravlev	eef827006a	Change disk graph to disk table Signed-off-by: Vitaly Zhuravlev <zhuravlev.vitaly@gmail.com>	2022-04-27 19:15:50 +04:00
Daniel Lenar	0b50eb7294	Reverse fsSpaceAvailableCriticalThreshold and fsSpaceAvailableWarningThreshold Currently critical alert for space available alerts on warning and warning alert for space available alerts on critical. Signed-off-by: Daniel Lenar <dlenar@vailsys.com>	2022-04-21 11:34:54 -05:00
Gabriel Amaral Antunes	410e069471	Add darwin dashboard to mixin Signed-off-by: Vitaly Zhuravlev <zhuravlev.vitaly@gmail.com>	2022-04-20 15:18:43 +04:00
Vitaly Zhuravlev	8823605f12	Fix NodeFileDescriptorLimit alerts Signed-off-by: Vitaly Zhuravlev <zhuravlev.vitaly@gmail.com>	2022-04-07 16:25:17 +04:00
Severyn Lisovskyi	7b86b7cb29	[node-mixin] change current datasource to grafana's default Signed-off-by: Severyn Lisovskyi <993215+sev3ryn@users.noreply.github.com>	2022-02-02 14:45:26 +01:00
Julian Wiedmann	3e6f4ce627	mixin: exclude iowait and steal from CPU Utilisation (#2194) 'iowait' and 'steal' indicate specific idle/wait states, which shouldn't be counted into CPU Utilisation. Also see https://github.com/prometheus-operator/kube-prometheus/pull/796 and https://github.com/kubernetes-monitoring/kubernetes-mixin/pull/667. Per the iostat man page: %idle Show the percentage of time that the CPU or CPUs were idle and the system did not have an outstanding disk I/O request. %iowait Show the percentage of time that the CPU or CPUs were idle during which the system had an outstanding disk I/O request. %steal Show the percentage of time spent in involuntary wait by the virtual CPU or CPUs while the hypervisor was servicing another virtual processor. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>	2021-11-04 11:03:27 +01:00
Ben Kochie	421fc429f3	Replace deprecated linter (#2176) Upstream is replacing `golint` with `revive`. * Cleanup unused mixin go files. Signed-off-by: Ben Kochie <superq@gmail.com>	2021-10-27 11:01:15 +02:00
ngc104	4bc1c02000	fix bug in #2130 (#2170) Signed-off-by: Yves Mettier <yves.mettier@orange.com> Co-authored-by: Yves Mettier <yves.mettier@orange.com>	2021-10-21 12:07:38 +02:00

123 commits