31 commits

Author SHA1 Message Date
Paweł Krupa (paulfantom) 8571536327 docs/node-mixin: add missing selectors
Signed-off-by: Paweł Krupa (paulfantom) <pawel@krupa.net.pl>
2022-07-19 16:44:16 +02:00
Daniel Lenar 0b50eb7294 Reverse fsSpaceAvailableCriticalThreshold and fsSpaceAvailableWarningThreshold
Currently the critical alert for available space fires at the warning
threshold, and the warning alert fires at the critical threshold.

Signed-off-by: Daniel Lenar <dlenar@vailsys.com>
2022-04-21 11:34:54 -05:00
Vitaly Zhuravlev 8823605f12 Fix NodeFileDescriptorLimit alerts
Signed-off-by: Vitaly Zhuravlev <zhuravlev.vitaly@gmail.com>
2022-04-07 16:25:17 +04:00
paulfantom 832909dd25 docs/node-mixin/alerts: make NodeFilesystemAlmostOutOfSpace fire earlier
Signed-off-by: paulfantom <pawel@krupa.net.pl>
2021-08-16 16:35:58 +02:00
Loïc Blot 55ffe57cbc feat(rules): add NodeFileDescriptorLimit kernel exhaustion alert
Add a new alert when fs.file-nr is close to fs.file-max

Signed-off-by: Loic Blot <loic.blot@unix-experience.fr>
2021-04-30 12:40:09 +02:00
Ben Kochie eefb18db02 Merge pull request #1764 from dhoppe/patch-1
Use description instead of message as field for annotations
2021-01-24 14:56:03 +01:00
Ben Kochie 4b68aeb80a Merge pull request #1862 from fsschmitt/fix/alerts-label-naming
fix: node_md_disks state label from fail to failed
2021-01-24 14:53:22 +01:00
Björn Rabenstein 9c9c636305 Merge pull request #1861 from paulfantom/network-alerts
docs/node-mixin/alerts: use ratio for network alerts
2020-10-19 12:14:24 +02:00
paulfantom f81747e608 docs/node-mixin/alerts: add max error condition to alert about desynchronized clock
Signed-off-by: paulfantom <pawel@krupa.net.pl>
2020-10-08 11:15:16 +02:00
fsschmitt effa4da989 fix: node_md_disks state label as failed
Signed-off-by: fsschmitt <492108+fsschmitt@users.noreply.github.com>
2020-10-07 14:20:56 +01:00
paulfantom d7cbe85d22 docs/node-mixin/alerts: use a rate for network alerts
Signed-off-by: paulfantom <pawel@krupa.net.pl>
2020-10-07 13:04:51 +02:00
Nicolas Lamirault ff2ff3410f Configure 2 thresholds for NodeFilesystemAlmostOutOfSpace alert (#1835)
* Add: configure 2 thresholds for NodeFilesystemAlmostOutOfSpace alert

Signed-off-by: Nicolas Lamirault <nicolas.lamirault@gmail.com>
2020-09-18 11:28:32 +02:00
Rajat Vig 7dd8adf7ed Fix NodeRAIDDegraded to not use a string rule expression
Signed-off-by: Rajat Vig <rvig@etsy.com>
2020-08-28 10:43:39 +01:00
Simon Pasquier 02212dd2c6 Run jsonnetfmt
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-08-25 10:15:30 +02:00
Hao Ke 9b7a0d06a1 Fix syntax error
Signed-off-by: Hao Ke <hao.ke@auryc.com>

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-08-25 10:07:37 +02:00
paulfantom e4ec8e04c5 docs/node-mixin: add alerts about failing RAID array
Signed-off-by: paulfantom <pawel@krupa.net.pl>
2020-08-24 16:17:20 +02:00
Dennis Hoppe fc64b70386 Use description instead of message as field for annotations
Signed-off-by: Dennis Hoppe <github@debian-solutions.de>
2020-06-24 13:38:57 +02:00
Frederic Branczyk b42819b69d Merge pull request #1657 from povilasv/NodeTextFileCollectorScrapeError
Add NodeTextFileCollectorScrapeError alert to mixin
2020-04-30 17:54:06 +02:00
Povilas Versockas bd3e6d224c Add NodeTextFileCollectorScrapeError alert to mixin
Signed-off-by: Povilas Versockas <p.versockas@gmail.com>
2020-03-31 18:12:36 +03:00
beorn7 8b00b22904 Fix sign error in NodeClockSkewDetected
Signed-off-by: beorn7 <beorn@grafana.com>
2020-03-25 13:07:23 +01:00
paulfantom 820f8d595e docs/node-mixin: alert on desynchronised clock
Signed-off-by: paulfantom <pawel@krupa.net.pl>
2020-03-23 08:23:58 +01:00
Neraud 1006a2c4bb Add missing comma
Signed-off-by: Neraud <neraud.login@gmail.com>
2020-03-21 13:06:43 +01:00
Povilas Versockas 48bb6f670c Add NodeHighNumberConntrackEntriesUsed
Signed-off-by: Povilas Versockas <p.versockas@gmail.com>
2020-03-20 17:46:05 +01:00
iuri aranda 0107bc7942 Make FS space alerts thresholds configurable (#1624)
* Make FS space alerts thresholds configurable (#1)

This makes it possible to tweak the thresholds for
the NodeFilesystemSpaceFillingUp alerts, which
might be necessary in systems like Kubernetes,
where the image garbage collector runs at 85%,
so it is not a problem if disk usage reaches that level.

Signed-off-by: iuri aranda <iuri@skyscrapers.eu>
2020-03-02 16:24:51 +01:00
Leo dfeec07f2f Fix node-mixin prometheus alert rules to use percentage
Signed-off-by: Leo <leonardjonathanoh@live.com>
2019-09-11 08:47:24 +00:00
beorn7 97ef113762 Make the severity of "critical" alerts configurable
This addresses the blissful scenario where single-node failures are
unproblematic. No reason to wake somebody up if a node is about to
screw itself up by filling the disk.

Signed-off-by: beorn7 <beorn@grafana.com>
2019-08-14 22:24:24 +02:00
beorn7 3a770a0b1d Convert annotations from message to summary/description
Signed-off-by: beorn7 <beorn@grafana.com>
2019-07-16 21:40:57 +02:00
beorn7 a92d1d7889 Address review comments, batch 2
Signed-off-by: beorn7 <beorn@grafana.com>
2019-07-16 21:18:17 +02:00
beorn7 b3b47f2d07 Make selector naming consistent
Signed-off-by: beorn7 <beorn@grafana.com>
2019-07-10 20:09:01 +02:00
beorn7 dec5b5b053 Fix indentation
Signed-off-by: beorn7 <beorn@grafana.com>
2019-07-10 20:07:20 +02:00
beorn7 2df034c055 Move node-mixin into docs directory
Signed-off-by: beorn7 <beorn@grafana.com>
2019-07-05 19:38:03 +02:00