1490 commits

Author SHA1 Message Date
paulfantom d7cbe85d22
docs/node-mixin/alerts: use a rate for network alerts
Signed-off-by: paulfantom <pawel@krupa.net.pl>
2020-10-07 13:04:51 +02:00
Ben Kochie 306a365377 Downgrade CPU counter warnings
We've gathered enough evidence that the CPU counter bug workaround is
working as intended. Downgrade the message from Warning to Debug.

Signed-off-by: Ben Kochie <superq@gmail.com>
2020-10-01 12:41:15 +02:00
Christian Rohmann a3aaf63bb1
Add check state for mdadm arrays via node_md_state metric (#1810)
* Expose metric for state=check for node_md_state
* Added new e2e output fixture including md201, which is in checking state, and the new state=check labeled metric for all other md arrays

Signed-off-by: Christian Rohmann <github@frittentheke.de>
2020-09-27 13:44:45 +02:00
Ben Kochie 3b73912dd8
Update build (#1852)
* Bump Go modules to latest.
* Update to Go 1.15.
* Remove obsolete darwin/386 build.

Signed-off-by: Ben Kochie <superq@gmail.com>
2020-09-23 21:06:58 +02:00
Arthur Outhenin-Chalandre 6585e43eec Fix memory gauge in mixin with multiple pods
Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>
2020-09-23 15:36:43 +02:00
Nicolas Lamirault ff2ff3410f
Configure 2 thresholds for NodeFilesystemAlmostOutOfSpace alert (#1835)
* Add: configure 2 thresholds for NodeFilesystemAlmostOutOfSpace alert

Signed-off-by: Nicolas Lamirault <nicolas.lamirault@gmail.com>
2020-09-18 11:28:32 +02:00
Ben Kochie d8a1585f59
Merge pull request #1834 from prometheus/fix-cpu-spelling
Fix capitalization of CPU acronym throughout
2020-09-04 11:15:53 +02:00
Julius Volz d05aac43e4 Fix capitalization of CPU acronym throughout
Signed-off-by: Julius Volz <julius.volz@gmail.com>
2020-09-03 23:34:33 +02:00
Ben Kochie f07e982c77
Merge pull request #1827 from rajatvig/patch-1
Fix NodeRAIDDegraded to not use a string rule expression
2020-08-30 16:36:56 +02:00
Rajat Vig 7dd8adf7ed
Fix NodeRAIDDegraded to not use a string rule expression
Signed-off-by: Rajat Vig <rvig@etsy.com>
2020-08-28 10:43:39 +01:00
Ben Kochie e51b508428
Merge pull request #1823 from simonpasquier/add-mixin-to-ci
*: add mixin tests to CI
2020-08-25 12:00:55 +02:00
Simon Pasquier 02212dd2c6 Run jsonnetfmt
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-08-25 10:15:30 +02:00
Hao Ke 9b7a0d06a1 Fix syntax error
Signed-off-by: Hao Ke <hao.ke@auryc.com>

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-08-25 10:07:37 +02:00
Simon Pasquier 6d959e2e8c *: add mixin tests to CI
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-08-25 10:03:46 +02:00
Julian Kornberger 66fb6762bf
Netdev tweaks (#1558)
* Check interface name before loading interface data
* Reduce indentation
* Optimize nested netdev map

This change avoids conversion to strings and back.

Signed-off-by: Julian Kornberger <jk+github@digineo.de>
2020-08-24 17:43:27 +02:00
Aleksei Zakharov 0478ddef69
bcache: add writeback_rate_debug stats (#1658)
* bcache: add writeback_rate_debug export

Signed-off-by: Aleksei Zakharov <zaharov@selectel.ru>
2020-08-24 17:40:31 +02:00
paulfantom e4ec8e04c5 docs/node-mixin: add alerts about failing RAID array
Signed-off-by: paulfantom <pawel@krupa.net.pl>
2020-08-24 16:17:20 +02:00
Aleksei Zakharov 3b035c8fa1
bcache: add priorityStats flag (#1621)
* bcache: add priorityStats flag

Fixes #1593

Signed-off-by: Aleksei Zakharov <zaharov@selectel.ru>
2020-08-10 16:50:58 +02:00
domchan 503e4fc848
Expose cpu bugs and flags as info metrics. (#1788)
* Expose cpu bugs and flags as info metrics with a regexp filter.
* Automatically enable CPU info metrics when using flags or bugs feature.

Signed-off-by: domgoer <domdoumc@gmail.com>
2020-07-17 18:32:23 +02:00
Ben Kochie f4b89c79a2
Merge pull request #1787 from domgoer/master
better wording
2020-07-14 14:32:19 +02:00
domgoer 457b7cdc18 better wording
Signed-off-by: domgoer <domdoumc@gmail.com>
2020-07-14 20:13:48 +08:00
Ben Kochie a4f45e823f
Merge pull request #1782 from prometheus/superq/drop_vendor
Remove vendor directory
2020-07-14 13:57:22 +02:00
胡玮文 2c1d2a6efd Update the link to prometheus-dcgm
The original link is broken (404).

Signed-off-by: 胡玮文 <huww98@outlook.com>
2020-07-14 12:24:38 +02:00
Ben Kochie 7c659627da
Remove vendor directory
Dev summit 2020-07-10 consensus item: Remove vendor from repos.

Signed-off-by: Ben Kochie <superq@gmail.com>
2020-07-11 18:25:18 +02:00
Ben Kochie 4ef4548ad5
Merge pull request #1770 from prometheus/superq/fix_md_changelog
Fix up node_md_disks changelog entry
2020-06-30 04:20:29 +02:00
Ben Kochie 7ad86f7994
Merge pull request #1769 from knweiss/typos
udp_queues_linux.go: s/upd/udp/ in two error strings
2020-06-29 16:39:30 +02:00
Ben Kochie 1f46669916
Fix up node_md_disks changelog entry
Fixes: https://github.com/prometheus/node_exporter/issues/1759

Signed-off-by: Ben Kochie <superq@gmail.com>
2020-06-29 16:30:59 +02:00
Karsten Weiss b9b1d4e369 udp_queues_linux.go: s/upd/udp/ in two error strings
Signed-off-by: Karsten Weiss <knweiss@gmail.com>
2020-06-29 15:00:15 +02:00
Ben Kochie 8d436bedf5
Merge pull request #1761 from prometheus/repo_sync
Synchronize common files from prometheus/prometheus
2020-06-29 08:50:22 +02:00
prombot 10fe59b9b6 Update common Prometheus files
Signed-off-by: prombot <prometheus-team@googlegroups.com>
2020-06-23 00:10:30 +00:00
Ben Kochie 5d42d4d99f
Merge pull request #1732 from fach/master
Adding backlog/current queue length to qdisc collector
2020-06-22 20:15:04 +02:00
Ben Kochie 08ce3c6dd4
Merge pull request #1733 from prometheus/superq/OutRsts
Include TCP OutRsts in netstat metrics
2020-06-18 17:12:45 +02:00
Ben Kochie e96073cfd5
Merge pull request #1752 from prometheus/superq/error_verb
Use Go 1.13 error features
2020-06-18 17:08:48 +02:00
Ben Kochie dfa53f835a
Use Go 1.13 error features
* Use `errors.Is()` for unwrapping errors.
* Use `%w` error verb in internal error formatting.

Signed-off-by: Ben Kochie <superq@gmail.com>
2020-06-16 14:47:03 +02:00
Ben Kochie 3799895d41
Merge pull request #1750 from prometheus/superq/1.0.1
Update for 1.0.1 release
2020-06-15 18:32:09 +02:00
Ben Kochie a34630b8a2
Update for 1.0.1 release
Update changelog and version for 1.0.1 release.

Signed-off-by: Ben Kochie <superq@gmail.com>
2020-06-15 14:34:07 +02:00
Ben Kochie 64ba27e7d6 Fix up powersupplyclass error
Switch to the Go `%w` error verb and errors.Is().

Signed-off-by: Ben Kochie <superq@gmail.com>
2020-06-15 12:36:29 +02:00
Ben Kochie 35bfe455df
Merge pull request #1735 from prometheus/bjk/bump_procfs
Update prometheus/procfs
2020-06-15 07:44:34 +02:00
Ben Kochie c8c1618074
Merge pull request #1747 from prometheus/superq/fix_powersupplyclass
Handle no data from powersupplyclass
2020-06-14 15:45:12 +02:00
Ben Kochie baa7ab732f
Update prometheus/procfs
Fixes: https://github.com/prometheus/node_exporter/issues/1721

Signed-off-by: Ben Kochie <superq@gmail.com>
2020-06-14 14:18:13 +02:00
Ben Kochie 7790f96881
Merge pull request #1743 from prometheus/superq/flags
Improve filter flag names.
2020-06-14 10:39:43 +02:00
Ben Kochie 2cefe3d769
Merge pull request #1745 from jeffreystoke/master
Fix build tags for collectors
2020-06-14 08:44:02 +02:00
Ben Kochie 5fed4f01e9
Handle no data from powersupplyclass
Handle the case when /sys/class/power_supply doesn't exist. Fixes
logging error spam.

Requires https://github.com/prometheus/procfs/pull/308

Signed-off-by: Ben Kochie <superq@gmail.com>
2020-06-13 11:09:16 +02:00
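The general pattern behind this fix (treat a missing directory as "no data" instead of an error worth logging) can be sketched as follows. This is not the collector's code: the actual change depends on the prometheus/procfs pull request referenced above, and `listEntries` is a hypothetical helper that uses only the standard library.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// listEntries is a hypothetical helper: it returns the entry names in dir
// and treats a missing directory as an empty result rather than an error.
func listEntries(dir string) ([]string, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		if errors.Is(err, os.ErrNotExist) {
			// The directory does not exist on this host: nothing to report.
			return nil, nil
		}
		// Any other failure is wrapped with context and returned.
		return nil, fmt.Errorf("listing %s: %w", dir, err)
	}
	names := make([]string, 0, len(entries))
	for _, e := range entries {
		names = append(names, e.Name())
	}
	return names, nil
}

func main() {
	names, err := listEntries("/sys/class/power_supply")
	fmt.Println(names, err)
}
```

On a machine without `/sys/class/power_supply`, the call returns an empty result and a nil error, so a caller has nothing to log on every scrape.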
Ben Kochie 7e49b68d3a
Improve filter flag names.
Update netdev and systemd collectors to deprecate poorly chosen flag names.

Old flag names to be removed in 2.0.0.

https://github.com/prometheus/node_exporter/issues/1742

Add log messages for parsed flag values to help discover quoting issues in
supervisors.

https://github.com/prometheus/node_exporter/issues/1737

Signed-off-by: Ben Kochie <superq@gmail.com>
2020-06-12 12:46:31 +02:00
Jeffrey Stoke cb7ab5119a
Fix collectors' build tags
Signed-off-by: Jeffrey Stoke <me@arhat.dev>
2020-06-12 10:26:30 +02:00
fach 79ef305a19 Updating e2e test output
Signed-off-by: fach <shaw38@gmail.com>
2020-06-04 13:01:34 -04:00
fach 5fadcb1bac Updating mod version for github.com/ema/qdisc
Signed-off-by: fach <shaw38@gmail.com>
2020-06-04 12:29:23 -04:00
fach 0ea8978788 Adding backlog/current queue length to qdisc collector
Signed-off-by: fach <shaw38@gmail.com>
2020-06-04 12:13:07 -04:00
Julien Pivotto 594f417bdf
Adapt https/web-config.yml (#1734)
Currently web-config is not valid YAML and is an incomplete reference.

Keep the reference in README.md and create a minimalist web-config.yml
that acts as an example.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-06-04 17:49:37 +02:00
Ben Kochie 204164e4e4
Include TCP OutRsts in netstat metrics
TCP "OutRsts" is the number of TCP Resets sent by the node. This can be
useful for monitoring connection failures and flooding.

Signed-off-by: Ben Kochie <superq@gmail.com>
2020-06-04 08:51:39 +02:00
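For readers unfamiliar with where such a counter comes from: on Linux, TCP counters like OutRsts are listed in `/proc/net/snmp` as pairs of lines, a header line of field names followed by a line of values. The sketch below parses that layout from an in-memory sample. It is an illustration only, not the netstat collector's code; `parseSNMP` and `sampleSNMP` are hypothetical names, and the sample is abbreviated to a few fields with made-up values.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// sampleSNMP is an abbreviated, made-up excerpt in the /proc/net/snmp layout.
const sampleSNMP = `Ip: Forwarding DefaultTTL
Ip: 1 64
Tcp: ActiveOpens OutRsts
Tcp: 12 345
`

// parseSNMP is a hypothetical helper that returns one counter for one
// protocol from text in the header-line/value-line layout.
func parseSNMP(text, proto, field string) (uint64, error) {
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		header := strings.Fields(sc.Text())
		if len(header) == 0 || header[0] != proto+":" {
			continue
		}
		// The line after a matching header holds the values.
		if !sc.Scan() {
			break
		}
		values := strings.Fields(sc.Text())
		if len(values) != len(header) {
			return 0, fmt.Errorf("%s: header/value length mismatch", proto)
		}
		for i, name := range header {
			if i > 0 && name == field {
				return strconv.ParseUint(values[i], 10, 64)
			}
		}
	}
	return 0, fmt.Errorf("%s %s not found", proto, field)
}

func main() {
	v, err := parseSNMP(sampleSNMP, "Tcp", "OutRsts")
	fmt.Println(v, err)
}
```

Because OutRsts is a monotonically increasing counter, it is typically consumed as a rate over time rather than as an absolute value.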