1329 commits

Author SHA1 Message Date
John Belmonte 15e36e2230 fix typo in cpufreq metric names (#1510)
Signed-off-by: John Belmonte <john@neggie.net>
2019-10-11 02:12:20 +09:00
Ben Kochie 7a30219ca4
Merge pull request #1514 from ScottBrenner/patch-1
Two quick typo fixes
2019-10-10 08:23:23 +02:00
Scott Brenner 813a4bdf8b Two quick typo fixes
Signed-off-by: Scott Brenner <scott@scottbrenner.me>
2019-10-09 20:42:27 -07:00
Ben Kochie fb54f7f2e0
Merge pull request #1489 from pgier/cpuinfo
add node_cpu_info metric
2019-10-08 14:58:11 +02:00
Matt Layher eeeae46a87
Merge pull request #1506 from prometheus/mdl-drbd-cleanup
collector: clean up DRBD collector, less global state
2019-10-05 10:00:04 -04:00
Matt Layher ce693648d3
collector: clean up DRBD collector, less global state
Signed-off-by: Matt Layher <mdlayher@gmail.com>
2019-10-04 10:40:18 -04:00
Ben Kochie e6f795798f
Merge pull request #1484 from simonpasquier/bump-golang-1.13
Bump golang 1.13
2019-10-01 20:18:21 +02:00
Matt Layher 57b1e636a5
Merge pull request #1504 from prometheus/mdl-rm-import
collector: remove commented-out import from bcache collector
2019-10-01 12:05:14 -04:00
Matt Layher a1659da2e7
collector: remove commented-out import from bcache collector
Signed-off-by: Matt Layher <mdlayher@gmail.com>
2019-10-01 11:47:25 -04:00
Björn Rabenstein 855a1f1d18
Merge pull request #1482 from leojonathanoh/fix-node-mixin-prometheus-alert-rules-to-use-percentage
Fix node-mixin prometheus alert rules to use percentage
2019-09-26 20:01:18 +02:00
Paul Gier 9f5225456d fix order of items in CHANGELOG
Signed-off-by: Paul Gier <pgier@redhat.com>
2019-09-25 14:39:43 -05:00
Paul Gier 4d72cb8059 add node_cpu_info metric
Contains information gathered from /proc/cpuinfo

Signed-off-by: Paul Gier <pgier@redhat.com>
2019-09-25 14:38:57 -05:00
Benjamin Drung 27b8c93a5a Use InfiniBandClass from procfs library (#1396)
Parsing the sysfs files for InfiniBand was added to the procfs library
(see https://github.com/prometheus/procfs/pull/164).

Therefore use `InfiniBandClass` from the procfs library instead of
parsing sysfs itself.

If the port counter returns `N/A (no PMA)`, no metric will be returned
(instead of returning 0 for this metric).

Signed-off-by: Benjamin Drung <benjamin.drung@cloud.ionos.com>
2019-09-23 18:18:35 +02:00
Simon Pasquier cfc06075d1 Bump github.com/prometheus/common to v0.7.0
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-09-16 10:59:12 +02:00
Simon Pasquier a99ef58c4b Fix go.mod and vendor/
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-09-16 10:58:07 +02:00
Simon Pasquier e6f7dfaa50 *: bump Go version to 1.13
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-09-16 10:57:12 +02:00
Ben Kochie f3538e1fc6
Merge pull request #1488 from pgier/update-procfs-v0.0.5
update procfs to v0.0.5
2019-09-16 09:37:38 +02:00
Paul Gier cbfb496629 update procfs to v0.0.5
- Fixes (#1465) failure in netclass collector
- Adds parsing of CPU information

Signed-off-by: Paul Gier <pgier@redhat.com>
2019-09-15 16:57:37 -05:00
PrometheusBot eb19c5c20b makefile: update Makefile.common with newer version (#1481)
Signed-off-by: prombot <prometheus-team@googlegroups.com>
2019-09-13 12:55:06 +02:00
Björn Rabenstein e7c2dbed4e
Merge pull request #1483 from s-urbaniak/fix-selectors
node-mixin: fix configuration for unset fsSelector/diskDeviceSelector and dashboard query
2019-09-12 21:36:31 +02:00
Sergiusz Urbaniak f4417b209a node-mixin: fix configuration for unset fsSelector/diskDeviceSelector
As per https://github.com/prometheus/node_exporter/pull/1429#discussion_r304210103
we want to fetch all devices and all fs types.

Currently, this is done by setting an empty string, which breaks most queries that rely on it.

This fixes it by setting the appropriate selector instead of an empty string.

Signed-off-by: Sergiusz Urbaniak <sergiusz.urbaniak@gmail.com>
2019-09-12 14:02:56 +02:00
Sergiusz Urbaniak ed78237036 node-mixin: fix query in Disk Space Utilisation dashboard
Signed-off-by: Sergiusz Urbaniak <sergiusz.urbaniak@gmail.com>
2019-09-12 14:02:56 +02:00
Leo dfeec07f2f Fix node-mixin prometheus alert rules to use percentage
Signed-off-by: Leo <leonardjonathanoh@live.com>
2019-09-11 08:47:24 +00:00
Ben Kochie 7caedccd73
Merge pull request #1445 from davemcphee/coolingDevice
Scrape cooling_device state
2019-09-09 19:24:17 +02:00
Ben Kochie 82b7b1f732
Merge branch 'master' into coolingDevice
2019-09-09 17:44:03 +02:00
dt-rush 93fbb93a46 fix issue where rootfs path strips to the empty string (#1464)
Change-type: patch
Connects-to: #1463
Signed-off-by: dt-rush <nickp@balena.io>
2019-09-09 17:39:24 +02:00
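
The fix in 93fbb93a46 concerns stripping the rootfs prefix from a path. The diff itself is not shown in this log, so the following is only a minimal sketch of the idea, assuming a hypothetical helper `stripRootfsPrefix` and a rootfs mounted at a path such as `/host`. When the path equals the rootfs prefix, trimming leaves the empty string, and the sketch falls back to `/` in that case.

```go
package main

import "strings"

// stripRootfsPrefix is a hypothetical helper (the name is not taken from the
// commit). It removes the rootfs prefix from a path so that paths are
// reported as seen from the host's root.
func stripRootfsPrefix(path, rootfs string) string {
	// With the default rootfs of "/" there is nothing to strip.
	if rootfs == "/" {
		return path
	}
	stripped := strings.TrimPrefix(path, rootfs)
	// If the path was exactly the rootfs, trimming yields "", which is not a
	// valid path; report the root directory instead.
	if stripped == "" {
		return "/"
	}
	return stripped
}
```

For example, `stripRootfsPrefix("/host", "/host")` yields `/` rather than an empty string, while `stripRootfsPrefix("/host/var", "/host")` yields `/var`.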
Björn Rabenstein ab8cf1f718 Node mixin: Clarify dashboard dependency on rules (#1475)
Following @discordianfish's suggestion
[here](https://github.com/prometheus/node_exporter/issues/1454#issuecomment-524225222).

Signed-off-by: beorn7 <beorn@grafana.com>
2019-09-08 10:55:43 +02:00
Ben Kochie 0e77317955
Update netlink vendoring (#1471)
* github.com/ema/qdisc
* github.com/mdlayher/genetlink
* github.com/mdlayher/wifi

Signed-off-by: Ben Kochie <superq@gmail.com>
2019-09-05 15:35:13 +02:00
Paul Gier 8c3de12c22 systemd: check version for availability of properties (#1413)
The dbus property 'SystemState' and the timer property 'LastTriggerUSec'
were added in version 212 of systemd.
Check that the version of systemd is higher than 212 before attempting
to query these properties.

f755e3b74b
dedabea4b3

Resolves issue #291

Signed-off-by: Paul Gier <pgier@redhat.com>
2019-09-04 16:27:25 +02:00
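
The version gate described in 8c3de12c22 can be illustrated roughly as follows. This is a sketch under stated assumptions, not the collector's actual code: the helper names `parseSystemdVersion` and `supportsSystemState` are hypothetical, and the version string is assumed to start with a number possibly followed by a distribution suffix (for example `239-41.el8`). The message says the properties were added in version 212, so the sketch treats 212 itself as supported.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// minVersionSystemState is the systemd version in which, per the commit
// message, 'SystemState' and 'LastTriggerUSec' were added.
const minVersionSystemState = 212

// versionRE matches the first run of digits in a version string.
var versionRE = regexp.MustCompile(`[0-9]+`)

// parseSystemdVersion (hypothetical helper) extracts the numeric version
// from strings such as "241" or "239-41.el8".
func parseSystemdVersion(raw string) (int, error) {
	m := versionRE.FindString(raw)
	if m == "" {
		return 0, fmt.Errorf("no version number found in %q", raw)
	}
	return strconv.Atoi(m)
}

// supportsSystemState (hypothetical helper) reports whether the properties
// can be queried on the given systemd version.
func supportsSystemState(version int) bool {
	return version >= minVersionSystemState
}
```

A caller would parse the version once and skip the property queries when `supportsSystemState` returns false, instead of letting the query fail on older systemd releases.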
Alex Schmitz 664025d60c
Scrape cooling_device state
Signed-off-by: Alex Schmitz <alex.schmitz@gmail.com>
2019-08-30 08:58:47 -05:00
PrometheusBot d3478a207e makefile: update Makefile.common with newer version (#1466)
Signed-off-by: prombot <prometheus-team@googlegroups.com>
2019-08-30 13:32:17 +02:00
Boris Momčilović 93c12e03a1 Ipvs firewall mark (#1455)
* IPVS: include firewall mark label

Signed-off-by: Boris Momčilović <boris@firstbeatmedia.com>
2019-08-27 14:24:11 +02:00
Paul Gier 0b7ac85acb update procfs to v0.0.4 (#1457)
Signed-off-by: Paul Gier <pgier@redhat.com>
2019-08-27 09:26:19 +02:00
Björn Rabenstein 154d59dee7
Merge pull request #1452 from prometheus/beorn7/mixin
Update legendLink
2019-08-21 09:50:26 +02:00
beorn7 76ff263ca6 Update legendLink
This still had the 'k8s' in it, as it was copied and pasted from the
kubernetes-mixin.

Signed-off-by: beorn7 <beorn@grafana.com>
2019-08-20 18:49:12 +02:00
Björn Rabenstein 0f38d680b4
Merge pull request #1449 from prometheus/beorn7/mixin3
node-mixin: Make the severity of "critical" alerts configurable
2019-08-19 13:55:52 +02:00
Björn Rabenstein d208140290
Merge pull request #1450 from prometheus/beorn7/mixin
More improvements for the node dashboard
2019-08-19 11:08:18 +02:00
beorn7 44e5731de7 Add line for number of cores to load graph
Backported from the node dashboard in the kubernetes-mixin.

Signed-off-by: beorn7 <beorn@grafana.com>
2019-08-15 16:43:57 +02:00
beorn7 024d5ed55e Fix title of CPU panel to usage
We use the `mode="idle"` metric, but we are inverting it, so this is
usage, and that's intended.

Signed-off-by: beorn7 <beorn@grafana.com>
2019-08-15 16:36:10 +02:00
beorn7 a016d9cd6f node-mixin: Improve disk usage panel
- Use a stacked graph instead of a gauge as development over time is
  especially useful for disk space usage.

- By only taking one metric per device into account, we avoid
  double-counting for devices that are mounted multiple times.

Signed-off-by: beorn7 <beorn@grafana.com>
2019-08-15 16:32:54 +02:00
Björn Rabenstein 7ef6f2576d
node-mixin: Improve nodes dashboard (#1448)
* node-mixin: Improve nodes dashboard

- Use stacking where it makes sense.
- Normalize idle CPU so that stacking is more meaningful.
- Consistently fill where stacking is used but don't fill where not.
- Fix y axis max value for Idle CPU panel.
- Fix y axis min value for memory usage panel.
- Use `$__interval` for range where applicable (and set min step
  to 1m).
- Make the right Y axis for disk I/O actually work.

These are just incremental improvements. They don't touch the more
involved TODOs.

Signed-off-by: beorn7 <beorn@grafana.com>
2019-08-15 00:40:51 +02:00
Björn Rabenstein 0d3a2d3209
Merge pull request #1447 from prometheus/beorn7/mixin
node-mixin: Fix various straight-forward issues in the USE dashboards
2019-08-15 00:37:43 +02:00
beorn7 97ef113762 Make the severity of "critical" alerts configurable
This addresses the blissful scenario where single-node failures are
unproblematic. No reason to wake somebody up if a node is about to
screw itself up by filling the disk.

Signed-off-by: beorn7 <beorn@grafana.com>
2019-08-14 22:24:24 +02:00
beorn7 f350aaf87e node-mixin: Fix various straight-forward issues in the USE dashboards
- Normalize cluster memory utilisation.

- Fix missing `1m` in memory saturation.

- Place both disk-related rows next to each other instead of having the
  network row in between.

- Correctly render transmit network traffic as negative, using
  `seriesOverrides` and `min: null` for the y-axis.

- Make panel and row naming consistent.

- Remove legend where it would just display a single entry with
  exactly the title of the panel.

- Fix metric name in individual node CPU Saturation panel.

- Break up disk space utilisation by device in the panel for an
  individual node.

NB: All of that doesn't touch any more subtle issues captured in the
various TODOs.

Signed-off-by: beorn7 <beorn@grafana.com>
2019-08-13 21:54:28 +02:00
Sandro Jäckel 697c2deed5 Update rootfs syntax in Docker example (#1443)
Signed-off-by: Sandro Jäckel <sandro.jaeckel@gmail.com>
2019-08-07 09:19:20 +02:00
Phil Frost 26d4fbdf07 Fix seconds reported by schedstat (#1426)
Upstream bugfix: https://github.com/prometheus/procfs/pull/191

Signed-off-by: Phil Frost <phil@postmates.com>
2019-08-06 19:08:06 +02:00
Richard Kojedzinszky 75462bf4fe Scrape thermal_zone temperatures (#1425)
* Scrape thermal_zone temperatures

Signed-off-by: Richard Kojedzinszky <richard@kojedz.in>
2019-08-04 12:56:36 +02:00
Ben Kochie 10146109ec
Update CHANGELOG for #1433
Signed-off-by: Ben Kochie <superq@gmail.com>
2019-08-03 12:33:25 +02:00
Philip Gough 2d95ecaa96 Extends uname collector to export on Darwin OS (#1433)
Adds uname collector support for Darwin and OpenBSD

Signed-off-by: Philip Gough <philip.p.gough@gmail.com>
2019-08-03 12:32:43 +02:00
PrometheusBot 2f2392af3f makefile: update Makefile.common with newer version (#1434)
Signed-off-by: prombot <prometheus-team@googlegroups.com>
2019-08-03 12:15:24 +02:00