Commit graph

1276 commits

Author SHA1 Message Date
Ben Kochie c39f6749fc
Bugfix release 0.18.1 (#1366)
Cherry-pick two bug fixes into 0.18.1.

Signed-off-by: Ben Kochie <superq@gmail.com>
2019-06-04 14:29:33 +02:00
Brian Candler b3429e4a97 Make storcli.py compatible with python2 (#1365)
This is only a minor change to .format() arguments, and is useful on CentOS6
servers which have only python2.

Signed-off-by: Brian Candler <b.candler@pobox.com>
2019-06-03 11:46:02 +02:00
Ben Kochie 4a15edf0b6
Add changelog entry for #1364
Signed-off-by: Ben Kochie <superq@gmail.com>
2019-06-03 11:20:06 +02:00
Simon Pasquier a076cd3203 Use Circle CI's org context (#1362)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-06-03 11:17:59 +02:00
Ben Kochie fdf9846282 Fixup 0.17.0 changelog (#1354)
* Fix ordering of CHANGE items by PR number.
* Add missing CHANGE for #1003

Signed-off-by: Ben Kochie <superq@gmail.com>
2019-06-02 10:51:07 +01:00
Ben Kochie 8146998945
Fix rollover bug in mountstats collector (#1364)
* Update procfs vendor to pull in github.com/prometheus/procfs/pull/165
* Update mountstats collector to use new types.
* Rollover counter automatically to avoid float64 accuracy issues.
* Update e2e test.

Signed-off-by: Ben Kochie <superq@gmail.com>
2019-05-31 18:30:37 +02:00
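The rollover mentioned in #1364 can be illustrated with a minimal Go sketch. A float64 represents integers exactly only up to 2^53, so wrapping a uint64 counter at that boundary keeps the exported value exact. The names `float64Mantissa` and `wrapCounter` are illustrative assumptions and may not match the collector's code.

```go
package main

import "fmt"

// float64Mantissa is 2^53, the largest range in which every integer
// is exactly representable as a float64. (Name is an assumption.)
const float64Mantissa uint64 = 1 << 53

// wrapCounter reduces a uint64 counter modulo 2^53 so that the
// conversion to float64 never loses integer precision.
func wrapCounter(v uint64) float64 {
	return float64(v % float64Mantissa)
}

func main() {
	big := uint64(1<<53) + 1

	// Without wrapping, two distinct counter values collapse into one float64.
	fmt.Println(float64(big) == float64(big-1))

	// With wrapping, the counter rolls over and stays exact.
	fmt.Println(wrapCounter(big))
}
```

Because the value is exported as a counter, a consumer that already handles counter resets should treat the rollover like any other reset.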
Noam Meltzer 501ccf9fb4 Add --collector.netdev.device-whitelist flag (#1279)
* Add --collector.netdev.device-whitelist flag

Sometimes it is desired to monitor only one netdev. The golang regexp
does not support a negated regex, so the ignored-devices flag is too
cumbersome for this task.
This change introduces a new flag: accept-devices, which is mutually
exclusive to ignored-devices. This flag allows specifying ONLY the
netdev you'd like.

Signed-off-by: Noam Meltzer <noam@cynerio.co>
2019-05-31 17:55:50 +02:00
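A minimal sketch of the idea behind #1279, assuming a filter that takes either an ignore pattern or an accept pattern but not both. Go's `regexp` package (RE2 syntax) has no lookahead, which is why a positive "accept" pattern is simpler than negating an ignore pattern. The type and function names here are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// deviceFilter holds at most one of the two patterns (hypothetical type).
type deviceFilter struct {
	ignore *regexp.Regexp
	accept *regexp.Regexp
}

// newDeviceFilter rejects configurations that set both patterns.
func newDeviceFilter(ignorePattern, acceptPattern string) (*deviceFilter, error) {
	if ignorePattern != "" && acceptPattern != "" {
		return nil, errors.New("ignore and accept patterns are mutually exclusive")
	}
	f := &deviceFilter{}
	var err error
	if ignorePattern != "" {
		if f.ignore, err = regexp.Compile(ignorePattern); err != nil {
			return nil, err
		}
	}
	if acceptPattern != "" {
		if f.accept, err = regexp.Compile(acceptPattern); err != nil {
			return nil, err
		}
	}
	return f, nil
}

// ignored reports whether a device should be skipped.
func (f *deviceFilter) ignored(name string) bool {
	if f.accept != nil {
		return !f.accept.MatchString(name)
	}
	if f.ignore != nil {
		return f.ignore.MatchString(name)
	}
	return false
}

func main() {
	f, err := newDeviceFilter("", "^eth0$")
	if err != nil {
		panic(err)
	}
	for _, dev := range []string{"lo", "eth0", "eth1"} {
		fmt.Printf("%s ignored=%v\n", dev, f.ignored(dev))
	}
}
```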
Benjamin Drung fc02b5dfbc Make scripts in text_collector_examples executable (#1358)
Signed-off-by: Benjamin Drung <benjamin.drung@cloud.ionos.com>
2019-05-29 06:58:03 -05:00
Benjamin Drung dfb6002fad btrfs_stats: Upgrade to Python 3 (#1359)
Python 2.7 will not be maintained past 2020. Therefore upgrade
`text_collector_examples/btrfs_stats.py` to Python 3.

Signed-off-by: Benjamin Drung <benjamin.drung@cloud.ionos.com>
2019-05-29 06:57:23 -05:00
Paul Gier bd3fc09b30 fix or ignore codespell issues (#1351)
Signed-off-by: Paul Gier <pgier@redhat.com>
2019-05-20 13:05:39 -05:00
PrometheusBot 2a9939fcf3 Synchronize Makefile.common from prometheus/prometheus (#1346)
* makefile: update Makefile.common with newer version

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* Remove obsolete release tool.

Signed-off-by: Ben Kochie <superq@gmail.com>
2019-05-14 20:27:02 -05:00
David O'Rourke 814ef064c0 meminfo: Fix the size mismatch in the swapTotal check mib for BSD. (#1345)
Signed-off-by: David O'Rourke <david.orourke@gmail.com>
2019-05-14 17:42:36 -05:00
Ben Kochie f10c665d33
Cleanup uname Update call (#1342)
Make collector a pointer for consistency.

Fixes: https://github.com/prometheus/node_exporter/issues/1300

Signed-off-by: Ben Kochie <superq@gmail.com>
2019-05-13 11:44:12 -05:00
Paul Gier 8b13c130b7 log pid when there is a problem reading the process stats (#1341)
Signed-off-by: Paul Gier <pgier@redhat.com>
2019-05-10 13:04:26 -05:00
Paul Gier d0a66c4c40 use sys/unix package instead of syscall (#1340)
According to the golang docs, the syscall package is deprecated.
https://golang.org/pkg/syscall
This updates collectors to use the x/sys/unix package instead.
Also updates the vendored x/sys/unix module to latest.

Signed-off-by: Paul Gier <pgier@redhat.com>
2019-05-10 13:04:06 -05:00
Ben Kochie f97f01c46c
Update for 0.18.0 release (#1337)
* Update CHANGELOG for release.
* Bump VERSION.
* Update vendoring.

Signed-off-by: Ben Kochie <superq@gmail.com>
2019-05-09 13:19:12 -05:00
Daniel Hodges 7882009870 Add perf exporter (#1274)
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2019-05-07 13:21:41 +02:00
PrometheusBot 0c6b90be4e makefile: update Makefile.common with newer version (#1332)
Signed-off-by: prombot <prometheus-team@googlegroups.com>
2019-05-07 06:38:46 +02:00
Paul Gier 86f9079429 update procfs to latest (#1335)
Updates for procfs refactoring

Signed-off-by: Paul Gier <pgier@redhat.com>
2019-05-07 06:38:21 +02:00
Simon Pasquier c7abeae816 *: enable default linters (#1334)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-05-06 15:42:50 +02:00
Simon Pasquier c3ce1ea6d8 *: bump Go version to 1.12 (#1329)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-04-26 11:20:37 +02:00
PrometheusBot b5cab091dc Synchronize Makefile.common from prometheus/prometheus (#1328)
* makefile: update Makefile.common with newer version

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* Add .golangci.yml

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-04-25 10:53:48 +02:00
Ben Kochie 78b9eb9c2c Use 64-bit Darwin netstat counters (#1319)
Avoid 32-bit counter rollovers.

Signed-off-by: Ben Kochie <superq@gmail.com>
2019-04-25 10:07:56 +02:00
Christian Hoffmann 36e3b2a923 textfile: use opened file's mtime as timestamp (#1326)
Previously, the node_textfile_mtime_seconds metric was based on the
Fileinfo.ModTime() of the ioutil.ReadDir() return value. This is based
on lstat() and therefore has unintended consequences for symlinks
(modification time of the symlink instead of the symlink target is
returned). It is also racy as the lstat() is performed before reading
the file.

This commit changes the node_textfile_mtime_seconds metric to be based
on a fresh Stat() call on the open file.  This eliminates the race and
works as expected for symlinks. Fixes #1324.

Signed-off-by: Christian Hoffmann <mail@hoffmann-christian.info>
2019-04-18 17:47:04 +02:00
Daniele Sluijters 5b4140e0bd README: Move pressure to enabled table (#1325)
Follow-up from #1261.

Signed-off-by: Daniele Sluijters <daenney@users.noreply.github.com>
2019-04-18 13:52:14 +02:00
Daniele Sluijters cc2fd82008 Expose /proc/pressure (#1261)
This enables the collection of pressure stall information as exposed
by the `/proc/pressure` interface added in the 4.20 release of the
Linux kernel.

Closes #1174

Signed-off-by: Daniele Sluijters <daenney@users.noreply.github.com>
2019-04-18 12:19:20 +02:00
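For reference, each file under `/proc/pressure` contains lines of the form `some avg10=0.00 avg60=0.00 avg300=0.00 total=0`. The following is a minimal parsing sketch; the `psiLine` type and `parsePSILine` function are hypothetical names, and the collector itself may parse these files differently.

```go
package main

import "fmt"

// psiLine holds one line of pressure stall information (hypothetical type).
type psiLine struct {
	Kind   string  // "some" or "full"
	Avg10  float64 // percentage over the last 10 seconds
	Avg60  float64 // percentage over the last 60 seconds
	Avg300 float64 // percentage over the last 300 seconds
	Total  uint64  // total stall time in microseconds
}

// parsePSILine parses a single line from a /proc/pressure file.
func parsePSILine(line string) (psiLine, error) {
	var p psiLine
	_, err := fmt.Sscanf(line, "%s avg10=%f avg60=%f avg300=%f total=%d",
		&p.Kind, &p.Avg10, &p.Avg60, &p.Avg300, &p.Total)
	return p, err
}

func main() {
	p, err := parsePSILine("some avg10=0.12 avg60=0.05 avg300=0.01 total=123456")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", p)
}
```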
Johannes Würbach 4e5c4d464f Docker images for ARM32v7, ARM64v8 and ppc64le (#1207)
Build and publish ARM32v7, ARM64v8 and ppc64le docker images.

Signed-off-by: Johannes Würbach <johannes.wuerbach@googlemail.com>
2019-04-15 17:36:25 +02:00
Ben Kochie e71e9f5a2f
Update vendoring (#1304)
Update to current vendoring.

Signed-off-by: Ben Kochie <superq@gmail.com>
2019-04-15 14:00:19 +02:00
Paul Gier b1298677aa Early init of procfs (#1315)
Minor change to match naming convention in other collectors.

Initialize the proc or sys FS instance once while initializing
each collector instead of re-creating for each metric update.

Signed-off-by: Paul Gier <pgier@redhat.com>
2019-04-10 18:16:12 +02:00
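The pattern in #1315 can be sketched as below: the filesystem handle is created once in the collector's constructor and reused by every update. `procFS`, `newProcFS`, and `exampleCollector` are hypothetical stand-ins, not the real procfs API.

```go
package main

import "fmt"

// procFS is a hypothetical stand-in for a procfs handle.
type procFS struct{ mountPoint string }

// opens counts how many handles were created, for demonstration only.
var opens int

func newProcFS(mountPoint string) (procFS, error) {
	opens++
	return procFS{mountPoint: mountPoint}, nil
}

// exampleCollector keeps the handle created at construction time.
type exampleCollector struct{ fs procFS }

func newExampleCollector(mountPoint string) (*exampleCollector, error) {
	fs, err := newProcFS(mountPoint)
	if err != nil {
		return nil, fmt.Errorf("failed to open procfs: %w", err)
	}
	return &exampleCollector{fs: fs}, nil
}

// update reuses the stored handle instead of creating a new one.
func (c *exampleCollector) update() string {
	return c.fs.mountPoint + "/stat"
}

func main() {
	c, err := newExampleCollector("/proc")
	if err != nil {
		panic(err)
	}
	for i := 0; i < 3; i++ {
		fmt.Println(c.update())
	}
	fmt.Println("handles created:", opens)
}
```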
Henk fbe390709f Add nvme_metrics.sh text collector example (#1309)
* Add nvme_metrics.sh text collector example

Signed-off-by: Henk <henk@wearespindle.com>
2019-04-08 15:50:29 +02:00
Shawn Craver b8b0195d6d OpenBSD rc.d script (#1306)
* OpenBSD rc.d script

Signed-off-by: Shawn Craver <craversp@gmail.com>
2019-04-04 13:06:31 +02:00
Théo Brigitte 4d88761c13 update github.com/godbus/dbus to latest master (#1305)
* update github.com/godbus/dbus to 271e53dc4968a0f8862f20841cc63bb5f43d6c57

Signed-off-by: Theo Brigitte <theo.brigitte@gmail.com>
2019-04-03 12:32:48 +02:00
Simon Pasquier dbe7badc7c Update Makefile.common (#1288)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-03-25 12:47:57 +01:00
Edgaras Giedrė 2f87b7cba6 Update smartmon.py to widen self_assessment_passed test (#1293)
Signed-off-by: EdgarasG <edgaras.giedre@hostinger.com>
2019-03-20 09:38:41 +01:00
Johannes 'fish' Ziemke d2136aace0
Update README: Add note about ts in textfile
This closes #1284
2019-03-19 11:23:17 +01:00
Slawomir Gonet 19e5bb6abd yum.sh: yum update monitor (#1273)
Signed-off-by: Slawomir Gonet <slawek@otwiera.cz>
2019-02-28 00:12:47 +01:00
Julian Kornberger 5110efc1cd Translate smartmon.sh to Python (#1225)
* Add smartmon.py python port of the smartmon.sh bash script

Signed-off-by: Arthur Skowronek <ags@digineo.de>
2019-02-27 22:19:55 +01:00
Paul Gier 8ca1e5594b upgrade promu to v0.3.0 (#1272)
Signed-off-by: Paul Gier <pgier@redhat.com>
2019-02-27 20:30:09 +01:00
Saj Goonatilleke d546916c6b Add the inotify-instances text collector (#1186)
This is an alternative take on the embedded inotify collector:

https://github.com/prometheus/node_exporter/pull/988

The proposed embedded collector was not accepted for inclusion because
it was not possible for a single unprivileged node_exporter process to
detect inotify resource utilisation in other user domains.

This text collector works around the problem by giving the operator a
choice between the following:

  - Run only the text collector as root to gain visibility over all
    processes on the system.

  - Run one or more instances of the text collector as an unprivileged
    user to gain visibility over subsets of the system.

In either case, the data generated by this collector can be useful when
hunting down inotify instance leaks -- and when confirming the
resolution of such leaks.

Signed-off-by: Saj Goonatilleke <sg@redu.cx>
2019-02-27 01:03:25 +01:00
Cole White 83c9b11747 remove "-n" flag from /usr/bin/awk (#1269)
This flag causes no ipmi data to be emitted and an error log is generated on each invocation: "awk: not an option: -nf".

I was unable to locate a "-n" flag in the mawk or gawk man pages, so I tested it by manually changing the script on a running Debian buster system.  The issue was resolved and metrics were emitted.

Signed-off-by: Cole White <cwhite@wikimedia.org>
2019-02-23 18:37:06 +01:00
Nuno Tavares 0dc14762ef ADD Cachevault_Info.Temp, being a distinct phy component, I think it's worth monitoring (#1268)
Signed-off-by: Nuno Tavares <n.tavares@portavita.eu>
2019-02-21 14:12:45 +01:00
Paul Gier cc847f2f44 collector/cpu: split cpu freq metrics into separate collector (#1253)
The cpu frequency information is not always needed and/or available.
This change allows the cpu frequency metrics to be enabled/disabled
separately from the other cpu metrics, and also prevents a frequency
metric failure (such as a parse error) from failing the main cpu
collector.

Fixes #1241

Signed-off-by: Paul Gier <pgier@redhat.com>
2019-02-19 17:22:54 +01:00
Ben Kochie f028b81615
Update systemd blacklist (#1255)
Include additional unit types in the default systemd collector
blacklist.

Signed-off-by: Ben Kochie <superq@gmail.com>
2019-02-17 17:57:15 +01:00
Ben Kochie dc4c58671d
Update vendoring. (#1257)
* Update vendoring.

Update vendoring to latest upstream.

Signed-off-by: Ben Kochie <superq@gmail.com>
2019-02-13 14:12:12 +01:00
Paul Gier cb9e23c536 Systemd refactor (#1254)
This reduces the system metric collection time by using a wait group
and goroutines to allow the systemd metric calls to happen concurrently.

Also disables the start time, restarts, tasks_max, and tasks_current metrics by default
because these can be time consuming to gather.

Signed-off-by: Paul Gier <pgier@redhat.com>
2019-02-11 23:27:21 +01:00
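A minimal sketch of the wait group pattern described in #1254, assuming each metric call can run independently. `collectConcurrently` and the `fetch` callback are hypothetical; the real collector talks to systemd over D-Bus.

```go
package main

import (
	"fmt"
	"sync"
)

// collectConcurrently runs fetch for every unit in its own goroutine
// and waits for all of them to finish before returning.
func collectConcurrently(units []string, fetch func(string) float64) map[string]float64 {
	results := make(map[string]float64, len(units))
	var mu sync.Mutex // guards results
	var wg sync.WaitGroup

	for _, u := range units {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			v := fetch(name)
			mu.Lock()
			results[name] = v
			mu.Unlock()
		}(u)
	}

	wg.Wait()
	return results
}

func main() {
	units := []string{"ssh.service", "cron.service", "dbus.service"}
	// The fetch function is a placeholder for a slow per-unit call.
	results := collectConcurrently(units, func(name string) float64 {
		return float64(len(name))
	})
	fmt.Println("collected", len(results), "values")
}
```

The total collection time is then bounded by the slowest call rather than the sum of all calls.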
mpursley 1ba436e194 add md_info_detail.sh (#1204)
Signed-off-by: Matt Pursley <mpursley@gmail.com>
2019-02-10 15:20:42 +01:00
Sachi King 18fc512fc4 Bond: Monitor bond mii_status not link operstate (#1124)
With a bond interface the state of the slave interface from the bond's
point of view is reflected in `mii_status` and is independent of the
link's `operstate`.

When a bond is monitored with `miimon`, `mii_status` will reflect the
state of the physical link as configured via the operator.

When a bond is monitored via `arp_interval` the `mii_status` will
reflect the results of the bond ARP checking.  This means the link can
be down from the bond's point of view, but up from a physical
connection point of view.

If a bond is not monitored via miimon or arp, the `mii_status` should
likely be always `up`, however I have observed a case where this is not
true and the `operstate` is `up` while `mii_status` is `down`.  Kernel
bond documentation stresses that a bond should not be configured without
one of `miimon` or `arp_interval` configured however.

This change results in the metric 'node_bonding_active' matching the
up/down state of the bond's point of view rather than operstate.

Signed-off-by: Sachi King <nakato@nakato.io>
2019-02-10 11:00:04 +01:00
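The change in #1124 amounts to counting a slave as active when its `mii_status` is `up`, regardless of `operstate`. A minimal sketch, assuming the status strings have already been read from sysfs; the function names and sample data are illustrative only.

```go
package main

import (
	"fmt"
	"strings"
)

// slaveActive reports whether a slave is up from the bond's point of view.
func slaveActive(miiStatus string) bool {
	return strings.TrimSpace(miiStatus) == "up"
}

// countActive returns the number of active slaves and the total number of slaves.
func countActive(miiStatuses map[string]string) (active, total int) {
	for _, status := range miiStatuses {
		total++
		if slaveActive(status) {
			active++
		}
	}
	return active, total
}

func main() {
	// Sample contents as they might be read from each slave's mii_status file.
	statuses := map[string]string{
		"eth0": "up\n",
		"eth1": "down\n",
	}
	active, total := countActive(statuses)
	fmt.Printf("active=%d total=%d\n", active, total)
}
```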
Paul Gier e0d6d11859 netclass_linux: remove varying labels from the 'up' metric (#1243)
* netclass_linux: remove varying labels from the 'up' metric

This moves the variable label values such as 'operstate' out of
the 'network_up' metric and into a separate metric called '_info'.
This allows the 'up' metric to remain continuous over state changes.
Fixes #1236

Signed-off-by: Paul Gier <pgier@redhat.com>
2019-02-07 15:59:32 +01:00
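The split described in #1243 can be sketched by rendering the two series in text exposition form. The 'up' series carries only the stable `device` label, while the changing `operstate` value lives on a separate info series. The metric names `node_network_up` and `node_network_info` and the helper functions are assumptions for illustration.

```go
package main

import "fmt"

// upSample renders the 'up' series with only stable labels,
// so the series identity does not change when the state changes.
func upSample(device string, up bool) string {
	v := 0
	if up {
		v = 1
	}
	return fmt.Sprintf("node_network_up{device=%q} %d", device, v)
}

// infoSample renders the info series, which carries the varying labels
// and always has the value 1.
func infoSample(device, operstate string) string {
	return fmt.Sprintf("node_network_info{device=%q,operstate=%q} 1", device, operstate)
}

func main() {
	// A state change alters the value of the 'up' series and the labels
	// of the info series, but never the labels of the 'up' series.
	for _, state := range []string{"up", "down"} {
		fmt.Println(upSample("eth0", state == "up"))
		fmt.Println(infoSample("eth0", state))
	}
}
```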
Johannes 'fish' Ziemke 6ea0aa73e4 Rename interface to device in netclass collector (#1224)
* Rename interface to device in netclass collector

This makes it consistent with other networking metrics like node_network_receive_bytes_total

This closes #1223

Signed-off-by: Johannes 'fish' Ziemke <github@freigeist.org>
2019-02-06 20:02:48 +01:00
Ralf Horstmann 3867ad5ab0 Add diskstats collector for OpenBSD (#1250)
* Add diskstats collector for OpenBSD

Tested on i386 and amd64, OpenBSD 6.4 and -current.

* Refactor diskstats collectors

This moves common descriptors from Linux, Darwin, OpenBSD
diskstats collectors into diskstats_common.go

Signed-off-by: Ralf Horstmann <ralf+github@ackstorm.de>
2019-02-06 11:36:22 +01:00