Commit graph

1021 commits

Author SHA1 Message Date
Martin Kennelly 4065902fe5
Add TCPTimeouts to netstat default filter (#2189)
TCP timeouts count is a useful signal to show
abnormal network performance and is another
signal to aid debugging. This metric can be
used to generate proactive alerts for host
network namespace workloads.

Signed-off-by: Martin Kennelly <mkennell@redhat.com>
2021-11-18 09:34:55 +01:00
Benjamin Drung f5ae31a84c
Disable lnstat collector by default (#2188)
The new `lnstat` collector produces a high number of metrics, per-cpu,
and results in approximately double the number of metrics previously
scraped. For example, a typical server with 64 cores produces 3832
lnstat metrics compared to 4147 metrics for the remaining collectors.

Therefore disable the `lnstat` collector by default.

Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-11-18 09:33:34 +01:00
Park Beomsu c861ba93aa
Remove redundant nil check (#2206)
Signed-off-by: computerphilosopher <bspark@jam2in.com>
2021-11-15 11:23:49 +01:00
Benjamin Drung d85cbaa17c
ethtool: Prevent duplicate metric names (#2187)
Sanitizing the metric names can lead to duplicate metric names:

```
caller=level.go:63 level=error caller="error gathering metrics: [from Gatherer #2] collected metric \"node_ethtool_giant_hdr\" { label:<name:\"device\" value:\"ens192\" > untyped:<value:0" msg=" > } was collected before with the same name and label values"
```

Generate a map from the sanitized metric names to the metric names from
ethtool. In case of duplicate sanitized metric names drop both metrics,
because it is unknown which one to take.

Fixes: https://github.com/prometheus/node_exporter/issues/2185
Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-11-15 11:22:36 +01:00
Tobias Klauser 58ab0144af Use SysctlTimeval for boottime collector on BSD
Use SysctlTimeval from the golang.org/x/sys/unix package to
simplify the implementation of the boottime collector for the BSDs and
allows to build it without cgo.

Tested on macOS 11.6, FreeBSD 13 and OpenBSD 7.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
2021-11-15 10:50:03 +01:00
Johannes 'fish' Ziemke 85e20238e7
Add clocksource metrics to time collector (#2197)
* Add clocksource metrics to time collector

This closes #1336

Signed-off-by: Johannes 'fish' Ziemke <github@freigeist.org>
2021-11-12 11:45:31 +01:00
Ben Kochie fda358a1ec
Workaround LLVM/Clang 11.0 for Darwin builds (#2200)
LLVM/Clang 11.0 adds a `-Wundef-prefix=TARGET_OS_` build flag which
breaks this build flag.

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-11-09 17:52:49 +01:00
Benjamin Drung 2a28266852
ethtool: Add test case with leading spaces (#2186)
Add test case for ethtool metrics with leading spaces reported in #2185:

```
$ ethtool -S
NIC statistics:
     Tx Queue#: 0
       TSO pkts tx: 0
       TSO bytes tx: 0
       ucast pkts tx: 20487
       ucast bytes tx: 1908107
       mcast pkts tx: 83
       mcast bytes tx: 5906
       bcast pkts tx: 4
       bcast bytes tx: 168
       pkts tx err: 0
       pkts tx discard: 0
       drv dropped tx total: 0
          too many frags: 0
          giant hdr: 0
          hdr err: 0
          tso: 0
       ring full: 0
       pkts linearized: 0
       hdr cloned: 0
       giant hdr: 0
     Rx Queue#: 0
       LRO pkts rx: 0
       LRO byte rx: 0
       ucast pkts rx: 25086
       ucast bytes rx: 2404103
       mcast pkts rx: 0
       mcast bytes rx: 0
       bcast pkts rx: 0
       bcast bytes rx: 0
       pkts rx OOB: 0
       pkts rx err: 0
       drv dropped rx total: 0
          err: 0
          fcs: 0
       rx buf alloc fail: 0
     tx timeout count: 0
```

Bug: https://github.com/prometheus/node_exporter/issues/2185
Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-10-29 10:55:39 +02:00
Benjamin Drung 0dc82eac13
Correctly disable ZFS for test cases (#2182)
Disable `collector/zfs_linux_test.go` in case `!nozfs` is set to
completely disable ZFS.

Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-10-28 15:27:15 +02:00
Alessio Caiazza 6523fdfc4b
darwin powersupply collector (#1777)
* Extract powersupply linux code from collector common file.
* Add Darwin powersupply collector.

Signed-off-by: Alessio Caiazza <nolith@abisso.org>
2021-10-28 10:22:24 +02:00
Alessio Caiazza ee17ba0fc0
Fix imports when building on macos (#2180)
Signed-off-by: Alessio Caiazza <nolith@abisso.org>
2021-10-27 16:56:36 +02:00
STRRL df7ea981f7
feat: new collector about thermal conditions on macos (#2032)
* feat: new collector about thermal conditions on macos

Signed-off-by: STRRL <str_ruiling@outlook.com>
2021-10-27 14:05:57 +02:00
Benjamin Drung 9def2f9222
Add DMI collector (#2131)
Add a DMI collector to expose the Desktop Management Interface (DMI)
info from `/sys/class/dmi/id/`. This will expose information about the
BIOS, mainboard, chassis, and product.

Closes: https://github.com/prometheus/node_exporter/issues/303
Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-10-27 13:56:37 +02:00
ml 094ee24ad7
Ignore mountpoints under /run (#2157)
* Exclude mountpoints under /run/credentials

Signed-off-by: ml <ml@visu.li>
2021-10-27 13:53:26 +02:00
jordy1024 fbc23548b9
Fix timer GC delays in the Linux filesystem collector (#2169)
Use `time.NewTimer()` and explicit `Stop()` to avoid memory bloat / GC problems with `time.After()` in the Linux filesystem collector timeout handling.

Signed-off-by: bawenmao <bawenmao@sogou-inc.com>
2021-10-24 12:48:57 +02:00
Matt Layher 8b198ce168
collector: replace fmt.Sprintf with strconv.Itoa in perfCollector (#2174)
Signed-off-by: Matt Layher <mdlayher@gmail.com>
2021-10-22 21:03:10 +02:00
W. Andrew Denton d514d67d7f ethtool: convert updateSpeeds to a loop.
Signed-off-by: W. Andrew Denton <git@flying-snail.net>
2021-10-20 19:11:30 +02:00
W. Andrew Denton 56e6dcb1fa ethtool: add newline to settings fixture.
Signed-off-by: W. Andrew Denton <git@flying-snail.net>
2021-10-20 19:11:30 +02:00
W. Andrew Denton 0f6c84214c ethtool: minor changes to resolve review nits.
Signed-off-by: W. Andrew Denton <git@flying-snail.net>
2021-10-20 19:11:30 +02:00
W. Andrew Denton 77a3c9bc20 Gather additional link info from ethtool.
The ethtool_cmd	struct from the	linux kernel contains information about the speeds and features	supported by a
network device. This includes speeds and duplex	but also features like autonegotiate and 802.3x pause frames.

Closes #1444

Signed-off-by: W. Andrew Denton <git@flying-snail.net>
2021-10-20 19:11:30 +02:00
wenlxie 11bb2f8a95 support thread state
Signed-off-by: wenlxie <wenlxie@ebay.com>
2021-10-19 11:58:43 +02:00
Kiril Vladimirov 1721de0c38
collector: Unwrap glob textfile directories (#1985)
* collector: Unwrap glob textfile directories
* collector: Store full path in mtime's file label

The point is to avoid duplicated gauges from files with the same name in
different directories.

This introduces support for exporting from multiple directories matching
given pattern (e.g. `/home/*/metrics/`).

Signed-off-by: Kiril Vladimirov <kiril@vladimiroff.org>
2021-10-18 14:05:21 +02:00
Ben Kochie f67faf9d18
Fixup drm_linux.go build tag.
Signed-off-by: Ben Kochie <superq@gmail.com>
2021-10-11 15:36:44 +02:00
Siavash Safi 5f110dfeb8
Add initial support for monitoring GPUs on Linux (#1998)
Expose GPU metrics using `sysfs/drm`.
`amdgpu` is the only driver which exposes this information through DRM.

Signed-off-by: Siavash Safi <siavash.safi@gmail.com>
2021-10-11 15:26:21 +02:00
Ben Kochie f61be48d94
Use include/exclude flags for ethtool filtering (#2165)
Use the same flag pattern as netdev to make filtering methods the same.
* Move SanitizeMetricName to helper.go

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-10-11 15:12:25 +02:00
Julien Pivotto 68a6c78c0d
Update go to 1.17 (#2159)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2021-10-03 13:35:24 +02:00
Aleksei Zakharov 0e6b23c338
Lnstat: expose metrics from /proc/net/stat (#1771)
* Lnstat initial commit

Signed-off-by: Aleksei Zakharov <zaharov@selectel.ru>
Co-authored-by: Johannes 'fish' Ziemke <github@freigeist.org>
2021-09-28 10:24:18 +02:00
ventifus 0aec407666
Refactor diskstats (#2141)
* Refactor diskstats_linux to use procfs.
* Add `node_disk_info` metric.

Signed-off-by: W. Andrew Denton <git@flying-snail.net>
Co-authored-by: W. Andrew Denton <git@flying-snail.net>
2021-09-28 10:14:12 +02:00
Sergei Semenchuk 5de46c6bac
collect flag_info and bug_info only for one core (#2156)
Signed-off-by: binjip978 <binjip978@gmail.com>
2021-09-28 07:44:03 +02:00
Sergei Semenchuk 2b490d645e
add path label to rapl collector (#2146)
Signed-off-by: binjip978 <binjip978@gmail.com>
2021-09-27 22:57:03 +02:00
Sergei Semenchuk a0c20d48db
add node_network_address_info collector (#2105)
* add node_network_address_info collector

emit (interface, addr, netmask, scope) labels for every network
interface

Signed-off-by: binjip978 <binjip978@gmail.com>
2021-09-08 14:50:25 +02:00
Benjamin Drung b6215e649c Add os release collector
Currently Node Exporter has a metric called `node_uname_info` which of
course exposes uname info. While this is nice, it does not help if you
are running different OSes which could have similar uname info.

Therefore parse `/etc/os-release` or `/usr/lib/os-release` and expose a
`node_os_info` metric which provide information regarding the OS
release/version of the node. Also expose the major.minor part of the OS
release version as `node_os_version`.

Since the os-release files will not change often, cache the parsed
content and only refresh the cache if the modification time changes.

This `os` collector will read files outside of `/proc` and `/sys`, but
the os-release file is widely used and the format is standardized:
https://www.freedesktop.org/software/systemd/man/os-release.html

Bug: https://github.com/prometheus/node_exporter/issues/1574
Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-08-19 14:04:21 +02:00
Ben Kochie 84b36c4fd8
Add flag to disable guest CPU metrics
In high scale virtualized / cloud environments there are typically
no guest VMs. Add a boolean flag to allow disabling the Linux guest
CPU metrics.

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-08-17 13:04:46 +02:00
Benjamin Drung 26ca609183 ethtool: Expose node_ethtool_info metric
Add a `node_ethtool_info` metric to all ethtool devices to expose driver
information with following labels:

 * bus_info
 * driver
 * expansion_rom_version
 * firmware_version
 * version

This metric is useful to monitor the firmware version to be up-to-date.

Note: The version label might be malformed due to bug #39 in ethtool:
https://github.com/safchain/ethtool/issues/39

Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-08-16 16:09:35 +02:00
Benjamin Drung 6ac6ea2d13 ethtool: Sanitize metric names
OpenMetrics and the Prometheus exposition format require the metric name
to consist only of alphanumericals and "_", ":" and they must not start
with digits. The metric names from the ethtool stats might contain
spaces, brackets, and dots. Converting them directly to metric names
will produce invalid metric names.

Therefore sanitize the metric names and convert them to lower case.

Fixes: https://github.com/prometheus/node_exporter/issues/2083
Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-08-16 15:28:27 +02:00
Johannes 'fish' Ziemke e6b5aaaff4 Add collector.ethtool.metrics-include
This adds a new flag --collector.ethtool.metrics-include to the ethtool
collector. Only metrics matching this regexp will be collected.

Signed-off-by: Johannes 'fish' Ziemke <github@freigeist.org>
2021-08-10 18:57:36 +02:00
Benjamin Drung 4356c09ebd ethtool: Use prometheus.BuildFQName
Use `prometheus.BuildFQName` everywhere in `ethtool` instead of
hard-coding the metric names.

Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-08-10 18:20:01 +02:00
Benjamin Drung 3afd382e75 Add --collector.ethtool.ignored-devices
Other network related collectors allow to filter out unwanted devices.
Add this support to the new ethtool collector as well.

Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-08-10 18:09:26 +02:00
Ben Kochie 5d2a4cf7fb
Fix processes collector long int parsing
Update procfs library to include ignored fields ParseInt handling.

Wrap error returns so that the user can know more about what failed.
Returns from getAllocatedThreads() are errors anyway.

Fixes: https://github.com/prometheus/node_exporter/issues/2110

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-08-06 05:55:24 +02:00
Ben Kochie 747012c59a
Merge pull request #2092 from prometheus/superq/fix_energy_uj
Fix rapl collector log noise
2021-07-22 21:00:29 +02:00
Ben Kochie 97d4b01691
Bump prometheus/procfs library
Pull in bug fix for noisy logging.

Fixes: https://github.com/prometheus/node_exporter/issues/2086

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-07-21 21:40:21 +02:00
Ben Kochie 502f287c96
Fix rapl collector log noise
Capture permission denied error for "energy_uj" file.

Fixes: https://github.com/prometheus/node_exporter/issues/1892

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-07-21 19:28:54 +02:00
Ben Kochie 6ac7a53f45
Fix conntrack collector log noise
Fix un-handled file not found for conntrack stats.

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-07-15 13:45:07 +02:00
Ben Kochie 6793e0e5a8
Merge pull request #2019 from treydock/ib-counters
Add more Infiniband counters
2021-07-14 13:17:31 +02:00
Ben Kochie 40766fd3cc
Merge pull request #2015 from ston1th/openbsd_mem_cache_fix
Fix wrong value for OpenBSD memory buffer cache
2021-07-14 13:15:32 +02:00
Ben Kochie f17a85d63d
Merge branch 'master' into netclass-filter-before-parsing 2021-07-13 11:22:46 +02:00
Ben Kochie a6ebe10455
Merge branch 'master' into nvme 2021-07-12 17:09:51 +02:00
Luiz Angelo Daros de Luca 00aa2f34ce Add tapestats to collect tape devices statistics
It is based on diskstats to allow metrics reuse by simply
s/disk/tape/ the query.

Signed-off-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
2021-07-09 21:01:08 -03:00
Ben Kochie 2510378a56
Merge pull request #2067 from prometheus/superq/idle_jump
Handle small backwards jumps in CPU idle
2021-07-07 13:27:21 +02:00
Ben Kochie 73c9a10d37
Handle small backwards jumps in CPU idle
The Linux CPU idle stat can also jump backwards slightly in some cases.
Allow the jump back up to 3 seconds before we attempt to reset the CPU
counter cache.

Fixes: https://github.com/prometheus/node_exporter/issues/1903

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-07-07 12:24:46 +02:00
Trey Dockendorf f0b2449d94 Add more IB counters
Signed-off-by: Trey Dockendorf <tdockendorf@osc.edu>
2021-07-06 11:15:32 -04:00
Benjamin Drung b23146db3f Add nvme collector
Add a collector for NVMes to expose the firmware versions. This requires
procfs >= 0.7.0.

Fixes #1891
Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-07-06 13:38:15 +02:00
Ben Kochie 839c2d557f
Update go-kstat location
Move go-kstat to the new github.com/illumos/go-kstat location.

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-07-06 11:44:18 +02:00
Ben Kochie 13be860e25 Add time zone offset metric
Add the time zone and offset in seconds.

Closes: https://github.com/prometheus/node_exporter/issues/2052

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-07-01 11:25:53 +02:00
Jan Fajerski e656b79297 netclass: retrieve interface names and filter before parsing
We should filter excluded interfaces before parsing the interface
details.
This change is based on https://github.com/prometheus/procfs/pull/376

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
2021-06-28 10:53:51 +02:00
Ben Kochie 90d469805a
Fix Eof newline in collector/conntrack_linux.go
Signed-off-by: Ben Kochie <superq@gmail.com>
2021-06-23 11:53:57 +02:00
Kozlov Alexander 02ee897c03
Added conntrack statistics metrics (#1155)
* Added conntrack statistics metrics

Signed-off-by: Aleksandr Kozlov <avlkozlov@avito.ru>
Co-authored-by: Aleksandr Kozlov <avlkozlov@avito.ru>
Co-authored-by: Ben Kochie <superq@gmail.com>
2021-06-23 11:52:43 +02:00
Oliver Geiselhardt-Herms cc4f13b369 Fix build
Signed-off-by: Oliver Geiselhardt-Herms <oliver.geiselhardt-herms@sap.com>
2021-06-17 13:22:17 +02:00
Ben Kochie 27dc754aeb
Merge pull request #1832 from ventifus/master
Add a new ethtool stats collector
2021-06-16 10:07:50 +02:00
ventifus 76c0e1e5a1
Update collector/ethtool_linux.go
Signed-off-by: W. Andrew Denton <git@flying-snail.net>

Co-authored-by: Manuel Rüger <manuel@rueg.eu>
2021-06-11 09:02:08 -07:00
Julien Pivotto 99af1dbb44 Update logic
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2021-06-04 11:35:07 +02:00
Julien Pivotto 2e20d668f2 Only iniate collectors once
When /metrics is called for specific collectors, the collectors are
initialed every time. Which means that we spend a lot of time
re-initiating the same collectors again and again. Especially, some
collectors make the assumptions that that are initiated once - e.g.
systemd collector generates info message upon initiation.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2021-06-04 11:20:56 +02:00
Ben Kochie e1db354611
Merge pull request #1887 from prometheus/superq/promhttp_errorlog
Add ErrorLog plumbing to promhttp
2021-06-03 16:38:30 +02:00
Ben Kochie 3bc9a93c20
Add ErrorLog plumbing to promhttp
Fix the error logging of the promhttp handler by connecting it to the
promlog setup.
* Switch to go-kit/log.
* Cleanup CHANGELOG.

Fixes: https://github.com/prometheus/node_exporter/issues/1886

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-06-03 10:47:41 +02:00
kcx2366425574 9eff6761df fix the uncorrect word
Signed-off-by: kcx2366425574 <18279911430@163.com>
2021-05-26 09:58:37 +08:00
W. Andrew Denton 892893ff05 ethtool: Log stats collection errors.
Signed-off-by: W. Andrew Denton <git@flying-snail.net>
2021-05-14 10:07:30 -07:00
Hu Shuai 5ee20043a7 Fix golint issue caused by typo
Signed-off-by: Hu Shuai <hus.fnst@cn.fujitsu.com>
2021-05-11 09:55:52 +08:00
W. Andrew Denton 807f3c3af3 ethtool: Remove end-to-end testing.
Signed-off-by: W. Andrew Denton <git@flying-snail.net>
2021-05-03 09:35:49 -07:00
W. Andrew Denton 596ff45f8f ethtool: Add a new ethtool stats collector (metrics equivalent to "ethtool -S")
Signed-off-by: W. Andrew Denton <git@flying-snail.net>
2021-04-29 11:07:26 -07:00
Ben Kochie 7b5cc3e505
Add Darwin arm64 build
Add darwin/arm64 to the CGO crossbuilder list.
* Update Makefile.common to pick up new promu.
* Fix possible nil pointer caught by staticcheck.
* Update collector build tags.

https://github.com/prometheus/node_exporter/issues/1997

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-04-14 10:39:52 +02:00
ston1th 2b7aa4c303 Fix wrong value for OpenBSD memory buffer cache
Fixes #1972

Signed-off-by: ston1th <ston1th@giftfish.de>
2021-04-03 16:57:56 +02:00
Frederic Hemberger 39124626cd Rename collector.filesystem flags to match other collectors
Ref: #1743
Fixes: #1994

Signed-off-by: Frederic Hemberger <mail@frederic-hemberger.de>
2021-03-24 21:01:10 +01:00
Ben Kochie 81caeb6a1b
Merge pull request #2000 from prometheus/fixpanic-systemd-backwards-compat
Fix panix when using backwards compatible flags
2021-03-19 16:22:42 +01:00
Ben Kochie 9893fca77e
Add flag to ignore network speed if it is unknown
Some devices (ex virtual) don't have a speed and report `-1` as the
speed value. Add a flag to allow ignoring speed on these devices.

Fixes: https://github.com/prometheus/node_exporter/issues/1967

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-03-18 11:36:31 +01:00
Julien Pivotto e7649ba48e Fix panix when using backwards compatible flags
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2021-03-15 14:59:49 +01:00
Ben Kochie 3b3ef7357f
Silence missing netclass errors
* Handle no such file and permission denied errors.
* Reduce excessive error wrapping.

Fixes: https://github.com/prometheus/node_exporter/issues/1840

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-03-03 20:40:08 +01:00
Ben Kochie 23e5b245a4
Sanitize strings from /sys/class/power_supply
Avoid panic on invalid UTF-8 from /sys/class/power_supply by
sanitizing strings parsed from the kernel.
* Add a broken string to the test fixtures.

Fixes: https://github.com/prometheus/node_exporter/issues/1979

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-03-03 18:05:51 +01:00
Ben Kochie 46d0a0813f
Handle errors from disabled PSI subsystem
When CONFIG_PSI_DEFAULT_DISABLED=y, the pressure system returns
"operation not supported", rather than permission denied or not
exposing the /proc/pressure files.

Fixes: https://github.com/prometheus/node_exporter/issues/1961

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-03-03 11:02:28 +01:00
Ben Kochie 5a6551e8ae
Fix some noisy log lines
* Bump procfs to include some fixes to error messages.
* Lower zpoolStatePaths log from Warn to Debug.

Fixes: https://github.com/prometheus/node_exporter/issues/1961
Fixes: https://github.com/prometheus/node_exporter/issues/1960

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-02-10 16:16:54 +01:00
Hu Shuai 4109a5089f Fix ineffassign issue
Signed-off-by: Hu Shuai <hus.fnst@cn.fujitsu.com>
2021-02-08 10:53:12 +08:00
Ben Kochie a37d3f659c
Release 1.1.0
* Update Build
  - Update CircleCI orb.
  - Update CIrcleCI Machine image.
  - Use golang-builder 1.15.
* Update Go modules.
* Fixup fixtures for XFS bug.

NOTE: We have improved some of the flag naming conventions (PR #1743). The old names are
      deprecated and will be removed in 2.0. They will continue to work for backwards
      compatibility.

* [CHANGE] Improve filter flag names #1743
* [CHANGE] Add btrfs and powersupplyclass to list of exporters enabled by default #1897
* [FEATURE] Add fibre channel collector #1786
* [FEATURE] Expose cpu bugs and flags as info metrics. #1788
* [FEATURE] Add network_route collector #1811
* [FEATURE] Add zoneinfo collector #1922
* [ENHANCEMENT] Add more InfiniBand counters #1694
* [ENHANCEMENT] Add flag to aggr ipvs metrics to avoid high cardinality metrics #1709
* [ENHANCEMENT] Adding backlog/current queue length to qdisc collector #1732
* [ENHANCEMENT] Include TCP OutRsts in netstat metrics #1733
* [ENHANCEMENT] Add pool size to entropy collector #1753
* [ENHANCEMENT] Remove CGO dependencies for OpenBSD amd64 #1774
* [ENHANCEMENT] bcache: add writeback_rate_debug stats #1658
* [ENHANCEMENT] Add check state for mdadm arrays via node_md_state metric #1810
* [ENHANCEMENT] Expose XFS inode statistics #1870
* [ENHANCEMENT] Expose zfs zpool state #1878
* [ENHANCEMENT] Added an ability to pass collector.supervisord.url via SUPERVISORD_URL environment variable #1947
* [BUGFIX] filesystem_freebsd: Fix label values #1728
* [BUGFIX] Fix various procfs parsing errors #1735
* [BUGFIX] Handle no data from powersupplyclass #1747
* [BUGFIX] udp_queues_linux.go: change upd to udp in two error strings #1769
* [BUGFIX] Fix node_scrape_collector_success behaviour #1816
* [BUGFIX] Fix NodeRAIDDegraded to not use a string rule expressions #1827
* [BUGFIX] Fix node_md_disks state label from fail to failed #1862
* [BUGFIX] Handle EPERM for syscall in timex collector #1938
* [BUGFIX] bcache: fix typo in a metric name #1943
* [BUGFIX] Fix XFS read/write stats (https://github.com/prometheus/procfs/pull/343)

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-02-05 21:23:23 +01:00
Ben Kochie 43b91ac846
Merge pull request #1954 from prometheus/superq/noisy_rapl
Fix rapl collector log noise
2021-02-05 21:20:53 +01:00
Ben Kochie dc5a94c803
Fix rapl collector log noise
Catch permission denined errors in the rapl collector.

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-02-05 18:16:17 +01:00
Ben Kochie 0b0c5624e1
Fix network_route collector naming
* Use `device` label to match other `node_network_...` metics.
* Fix naming convention to match Promehteus best practices.

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-02-05 18:10:42 +01:00
Ben Kochie 1729558e11
Merge pull request #1922 from kwisniewski98/zone
Add zoneinfo collector
2021-02-05 13:57:54 +01:00
Ben Kochie 78682c80af
Merge pull request #1786 from deusnefum/master
Add fibre channel collector
2021-02-03 18:22:59 +01:00
mhiles 5a28930e2e change fc_host everywhere, update fixtures
Signed-off-by: mhiles <hiles@hpe.com>
2021-02-03 09:35:58 -05:00
Ben Kochie 22c5aeb0ef
Merge pull request #1943 from hs0210/work
bcache: fix typo
2021-02-03 09:57:23 +01:00
Ben Kochie 477a192803
Merge pull request #1947 from Oloremo/supervisord_env_vars
Added an ability to pass collector.supervisord.url via ENV vars
2021-02-03 08:35:43 +01:00
mhiles 56eba80306 add fibrechannel to default list in read me; host -> fc_host to avoid name collision
Signed-off-by: mhiles <hiles@hpe.com>
2021-02-02 18:05:24 -05:00
Matthew Hiles 3fcccd5b14 Update collector/fibrechannel_linux.go
Co-authored-by: Ben Kochie <superq@gmail.com>
Signed-off-by: mhiles <hiles@hpe.com>
2021-02-02 16:47:56 -05:00
mhiles 19790920b2 update fibrechannel fixture
Signed-off-by: mhiles <hiles@hpe.com>
2021-01-29 10:38:45 -05:00
mhiles 6e02d5bff1 invalid_tx_word_total -> invalid_tx_words_total
Signed-off-by: mhiles <hiles@hpe.com>
2021-01-29 10:28:22 -05:00
Dustin Hooten eecfbcd610 PR feedback
Signed-off-by: Dustin Hooten <dustinhooten@gmail.com>
2021-01-29 10:37:56 +01:00
Dustin Hooten b7626ecdbf ProcessStatCollector: continue if PID disappears between opening and reading file
Signed-off-by: Dustin Hooten <dustinhooten@gmail.com>
2021-01-29 10:37:56 +01:00
Proskurin Kirill 8d147327a6 feat: Added an ability to pass collector.supervisord.url via SUPERVISORD_URL environment variable
Signed-off-by: Proskurin Kirill <kirill.proskurin@behavox.com>
2021-01-27 18:09:52 +00:00
Wisniewski, Krzysztof2 997a8fbb7f Add zoneinfo collector
Signed-off-by: Wisniewski, Krzysztof2 <Krzysztof2.Wisniewski@intel.com>
2021-01-26 12:00:35 +01:00
Hu Shuai 0900cb5655 bcache: fix typo
Fix typos in collector/bcache_linux.go and collector/fixtures/e2e-output.txt

Signed-off-by: Hu Shuai <hus.fnst@cn.fujitsu.com>
2021-01-25 13:50:16 +08:00
Ben Kochie 7904ea4af2
Update netdev OpenBSD amd64 filter
Use new filtering method from #1826

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-01-24 16:37:02 +01:00
Ben Kochie 40ce993d5b
Merge pull request #1816 from liiling/master
Fix node_scrape_collector_success behaviour
2021-01-24 15:28:41 +01:00
Ben Kochie 9fab63825b
Merge pull request #1753 from binjip978/entropy
add pool size to entropy collector
2021-01-24 14:57:33 +01:00
Ben Kochie cb6509f1ac
Merge pull request #1826 from digineo/device-filter
Move ignore/accept to new netDevFilter struct
2021-01-24 14:55:29 +01:00
Ben Kochie 8536df5bbb
Merge branch 'master' into network_route 2021-01-24 14:46:00 +01:00
Ben Kochie e785e06e85
Merge pull request #1884 from xinau/patch-1
collector/filesystem: fixing logging message
2021-01-24 14:40:36 +01:00
Ben Kochie 1d03daf616
Handle EPERM for syscall in timex collector
Handle case where Adjtimex syscall gets a permission denined error.

Fixes: https://github.com/prometheus/node_exporter/issues/1934

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-01-23 14:22:46 +01:00
Ben Kochie 8a3cc9db05
Merge pull request #1694 from treydock/ib-counters
Add more IB counters
2020-12-14 01:07:58 +01:00
Ben Kochie 3b8a7f6ef3
Merge pull request #1774 from ston1th/openbsd_amd64
remove openbsd amd64 cgo dependecies
2020-12-14 01:06:03 +01:00
kamijin_fanta 67480c5e5b repalce to jsimonetti/rtnetlink
Signed-off-by: kamijin_fanta <kamijin@live.jp>
2020-12-04 11:37:02 +09:00
kamijin_fanta 1f5898999e add network_route collector
Signed-off-by: kamijin_fanta <kamijin@live.jp>
2020-12-04 10:32:34 +09:00
ston1th 91bec7c53c fixed build
new type: `netDevStats map[string]map[string]uint64`

Signed-off-by: ston1th <ston1th@giftfish.de>
2020-11-12 23:52:03 +01:00
ston1th 1450a92a68 move const values to iota plus code cleanup
Signed-off-by: ston1th <ston1th@giftfish.de>
2020-11-12 23:37:57 +01:00
ston1th bdba65df16 skip null bytes at the end of strings
Signed-off-by: ston1th <ston1th@giftfish.de>
2020-11-12 23:37:57 +01:00
ston1th 26bd6bf99d more build fixes
Signed-off-by: ston1th <ston1th@giftfish.de>
2020-11-12 23:37:56 +01:00
ston1th 386992e984 fixed build
Signed-off-by: ston1th <ston1th@giftfish.de>
2020-11-12 23:37:56 +01:00
ston1th f8609aeee2 remove openbsd amd64 cgo dependecies
I have rewritten all CGO dependencies for OpenBSD amd64
using pure go, be able to crosscompile node_exporter.

Signed-off-by: ston1th <ston1th@giftfish.de>
2020-11-12 23:37:48 +01:00
Trey Dockendorf 0692791457 Add more IB counters
Signed-off-by: Trey Dockendorf <tdockendorf@osc.edu>
2020-11-12 09:35:13 -05:00
xinau daa7303a82 collector/filesystem: fixing logging message
this change fixes the logging message for the filesystem ignored-fs-types flag to output the flag instead of the mountpoints flag.

Signed-off-by: xinau <felix.ehrenpfort@protonmail.com>
2020-11-12 15:27:33 +01:00
Artur Molchanov 519203e3d0 Expose zfs zpool state
Create a metric node_zfs_zpool_state.

Signed-off-by: Artur Molchanov <artur.molchanov@easybrain.com>
2020-10-27 17:39:13 +03:00
Ondrej Baudys ed10485073
Expose XFS inode statistics (#1869) (#1870)
* Expose XFS inode statistics (#1869)

Also fixes #1177

@SuperQ @discordianfish

Signed-off-by: Ondrej Baudys <obaudys@gmail.com>
Co-authored-by: obaudys@gmail.com <ondrej.baudys@nextgen.net>
2020-10-22 18:14:33 +02:00
Ben Kochie 306a365377 Downgrade CPU counter warnings
We've gathered enough evidence that the CPU counter bug workaround is
working as intended. Downgrade the message from Warning to Debug.

Signed-off-by: Ben Kochie <superq@gmail.com>
2020-10-01 12:41:15 +02:00
Christian Rohmann a3aaf63bb1
Add check state for mdadm arrays via node_md_state metric (#1810)
* Expose metric for state=check for node_md_state
* Added new e2e output fixture including md201 which is in checking state and a the new state=check labeled metric for all other md

Signed-off-by: Christian Rohmann <github@frittentheke.de>
2020-09-27 13:44:45 +02:00
Julius Volz d05aac43e4 Fix capitalization of CPU acronym throughout
Signed-off-by: Julius Volz <julius.volz@gmail.com>
2020-09-03 23:34:33 +02:00
Li Ling bf154d485b Differentiate ErrNoData and other errors (#1323, #1723)
Signed-off-by: Li Ling <liiling@google.com>
2020-09-01 16:21:36 +02:00
Li Ling 875f437fcc Conntrack update: differentiate ErrNoData and other errors (#1323, #1723)
Signed-off-by: Li Ling <liiling@google.com>
2020-08-26 14:35:39 +02:00
Julian Kornberger 2d412c6ce6 Move ignore/accept to new netDevFilter struct
Signed-off-by: Julian Kornberger <jk+github@digineo.de>
2020-08-26 11:33:05 +02:00
Julian Kornberger 66fb6762bf
Netdev tweaks (#1558)
* Check interface name before loading interface data
* Reduce indentation
* Optimize nested netdev map

This change avoids conversion to strings and back.

Signed-off-by: Julian Kornberger <jk+github@digineo.de>
2020-08-24 17:43:27 +02:00
Aleksei Zakharov 0478ddef69
bcache: add writeback_rate_debug stats (#1658)
* bcache: add writeback_rate_debug export

Signed-off-by: Aleksei Zakharov <zaharov@selectel.ru>
2020-08-24 17:40:31 +02:00
Li Ling a26ffaf4f8 Failed conntrack update should return success=0 (#1323, #1723)
Signed-off-by: Li Ling <liiling@google.com>
2020-08-20 14:18:57 +02:00
Li Ling f0332993ef Empty collection should return success=0 (#1323, #1723)
Signed-off-by: Li Ling <liiling@google.com>
2020-08-20 14:09:55 +02:00
Aleksei Zakharov 3b035c8fa1
bcache: add priorityStats flag (#1621)
* bcache: add priorityStats flag

Fixes #1593

Signed-off-by: Aleksei Zakharov <zaharov@selectel.ru>
2020-08-10 16:50:58 +02:00
domchan 503e4fc848
Expose cpu bugs and flags as info metrics. (#1788)
* Expose cpu bugs and flags as info metrics with a regexp filter.
* Automatically enable CPU info metrics when using flags or bugs feature.

Signed-off-by: domgoer <domdoumc@gmail.com>
2020-07-17 18:32:23 +02:00
mhiles 10f6841caf counter compliance again
Signed-off-by: mhiles <hiles@hpe.com>
2020-07-13 12:00:43 -04:00
mhiles d80ae383b7 conform to metric naming conventions
Signed-off-by: mhiles <hiles@hpe.com>
2020-07-13 10:30:52 -04:00
mhiles 076c953488 update fixtures / e2e test for fibre channel
Signed-off-by: mhiles <hiles@hpe.com>
2020-07-13 09:30:19 -04:00
mhiles 62403d6748 add fibre channel collector
Signed-off-by: mhiles <hiles@hpe.com>
2020-07-13 08:41:04 -04:00
Karsten Weiss b9b1d4e369 udp_queues_linux.go: s/upd/udp/ in two error strings
Signed-off-by: Karsten Weiss <knweiss@gmail.com>
2020-06-29 15:00:15 +02:00
Ben Kochie 5d42d4d99f
Merge pull request #1732 from fach/master
Adding backlog/current queue length to qdisc collector
2020-06-22 20:15:04 +02:00
binjip978 488d10ef48 Merge branch 'master' into entropy
Signed-off-by: binjip978 <binjip978@gmail.com>
2020-06-19 08:50:51 +03:00
Ben Kochie 08ce3c6dd4
Merge pull request #1733 from prometheus/superq/OutRsts
Include TCP OutRsts in netstat metrics
2020-06-18 17:12:45 +02:00
binjip978 3e1cb10f55 change metric description to correct units
Signed-off-by: binjip978 <binjip978@gmail.com>
2020-06-16 18:29:22 +03:00
Ben Kochie dfa53f835a
Use Go 1.13 error features
* Use `errors.Is()` for unwrapping errors.
* Use `%w` error verb in internal error formatting.

Signed-off-by: Ben Kochie <superq@gmail.com>
2020-06-16 14:47:03 +02:00
binjip978 6d1a4ddb24 add pool size to entropy collector
Signed-off-by: binjip978 <binjip978@gmail.com>
2020-06-16 08:06:40 +03:00
Ben Kochie 64ba27e7d6 Fix up powersupplyclass error
Switch to go `%w` error verb and errors.Is().

Signed-off-by: Ben Kochie <superq@gmail.com>
2020-06-15 12:36:29 +02:00
Ben Kochie 35bfe455df
Merge pull request #1735 from prometheus/bjk/bump_procfs
Update prometheus/procfs
2020-06-15 07:44:34 +02:00
Ben Kochie c8c1618074
Merge pull request #1747 from prometheus/superq/fix_powersupplyclass
Handle no data from powersupplyclass
2020-06-14 15:45:12 +02:00
Ben Kochie baa7ab732f
Update prometheus/procfs
Fixes: https://github.com/prometheus/node_exporter/issues/1721

Signed-off-by: Ben Kochie <superq@gmail.com>
2020-06-14 14:18:13 +02:00
Ben Kochie 7790f96881
Merge pull request #1743 from prometheus/superq/flags
Improve filter flag names.
2020-06-14 10:39:43 +02:00
Ben Kochie 5fed4f01e9
Handle no data from powersupplyclass
Handle the case when /sys/class/power_supply doesn't exist. Fixes
logging error spam.

Requires https://github.com/prometheus/procfs/pull/308

Signed-off-by: Ben Kochie <superq@gmail.com>
2020-06-13 11:09:16 +02:00
Ben Kochie 7e49b68d3a
Improve filter flag names.
Update netdev and systemd collectors to deprecate poorly chosen flag names.

Old flag names to be removed in 2.0.0.

https://github.com/prometheus/node_exporter/issues/1742

Add log messages for parsed flag values to help discover quoting isuses in
supervisors.

https://github.com/prometheus/node_exporter/issues/1737

Signed-off-by: Ben Kochie <superq@gmail.com>
2020-06-12 12:46:31 +02:00
Jeffrey Stoke cb7ab5119a
Fix collectors' build tags
Signed-off-by: Jeffrey Stoke <me@arhat.dev>
2020-06-12 10:26:30 +02:00
fach 79ef305a19 Updating e2e test output
Signed-off-by: fach <shaw38@gmail.com>
2020-06-04 13:01:34 -04:00
fach 0ea8978788 Adding backlog/current queue length to qdisc collector
Signed-off-by: fach <shaw38@gmail.com>
2020-06-04 12:13:07 -04:00
Ben Kochie 204164e4e4
Include TCP OutRsts in netstat metrics
TCP "OutRsts" is the number of TCP Resets sent by the node. This can be
useful for monitoring connection failures and flooding.

Signed-off-by: Ben Kochie <superq@gmail.com>
2020-06-04 08:51:39 +02:00
David O'Rourke 217bbdd6b9 helper_test: Fix copyright year
Signed-off-by: David O'Rourke <david.orourke@gmail.com>
2020-06-03 11:33:10 +02:00
David O'Rourke 03450a4d7d filesystem_freebsd: Use bytesToString to get label values
Signed-off-by: David O'Rourke <david.orourke@gmail.com>
2020-06-03 11:33:10 +02:00
David O'Rourke d6fbce1529 helper: Add new bytesToString function and tests
Signed-off-by: David O'Rourke <david.orourke@gmail.com>
2020-06-03 11:33:10 +02:00
David O'Rourke 4c06e33c23 filesystem_freebsd: Fix label values
We must know the length of the various filesystem C strings before
turning them from a byte array into a Go string, otherwise our Go
strings could contain null bytes, corrupting the label values.

Signed-off-by: David O'Rourke <david.orourke@gmail.com>
2020-06-03 11:33:10 +02:00
Wei He 0253277121
Add flag to aggr ipvs metrics to avoid high cardinality metrics (#1709)
Fixes #1708

Signed-off-by: Wing924 <weihe924stephen@gmail.com>
2020-06-02 10:52:00 +02:00
Ben Kochie 2aef188bc8
Merge pull request #1679 from alexnoz/ntp-usage-str-fix
Use clearer usage string for `collector.ntp.server-is-local` option
2020-05-25 13:58:01 +02:00
Ben Kochie f3073755a3
Merge pull request #1690 from shapor/patch-1
Move regexp to global in meminfo_linux.go
2020-05-25 13:57:24 +02:00
Ben Kochie 3565316d7e
Linux CPU: Cache CPU metrics
Cache CPU metrics to avoid counters (ie iowait) jumping backwards.

Fixes: https://github.com/prometheus/node_exporter/issues/1686

Signed-off-by: Ben Kochie <superq@gmail.com>
2020-05-24 16:31:26 +02:00
Ben Kochie b532c81da7
Update filesystem freebsd
Upstream x/sys/unix changed types.

Signed-off-by: Ben Kochie <superq@gmail.com>
2020-05-14 21:02:21 +02:00
Sudhar287 6807e5319b
read contents of objset file (#1632)
* added objread functionality

Signed-off-by: Sudharshann D <sudhar287@gmail.com>
2020-05-13 21:06:00 +02:00
Shapor Naghibzadeh a1a3633d89 Move regexp to global in meminfo_linux.go
Compile regexp outside of parsing function in meminfo_linux.go

Signed-off-by: Shapor Naghibzadeh <shapor@google.com>
2020-04-26 01:13:25 -07:00
Alex Nozdriukhin 744e334ef9 Use clearer usage string for collector.ntp.server-is-local option
Signed-off-by: Alex Nozdriukhin <alex-nozzz@mail.ru>
fixes #1662
2020-04-18 00:53:57 +03:00
alpaca fa4edd700e
Fix accidently empty lines in meminfo_linux (#1671)
* Fix accidently empty lines in meminfo_linux

Signed-off-by: qwertysun <qwertysun@tencent.com>
2020-04-17 12:07:35 +02:00
Daniel Hodges b14168cf6a
Add perf tracepoint collection flag (#1664)
* Add tracepoint collector option for perf collector

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2020-04-17 12:02:08 +02:00
Daniel Hodges 44357ed677
Fix initialization in perf collector when using multiple CPUs (#1665)
* Fix initialization in perf collector when using multiple CPUs

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2020-04-17 11:59:07 +02:00
Peter Bueschel da5972b539
Add gauges for allocated memory for queued UDP and TCP packages (#1503)
* Two new states will be added to the tcpstat collector called rx_queued_bytes and tx_queued_bytes.

For UDP datagrams an additional collector 'udp_queues' can be used to expose the total lengths of the tx_queue and rx_queue.
@SuperQ and @discordianfish this changes gives us the option to check for overloaded UDP + TCP processing.
The names of the new TCP states and the UDP metric can be discussed.
The current reasons are just:

I don't want to add another collector for the same exposed file, so I just added the new states to the tcpstat collector.
I chose the name 'udp_queue' instead of 'udpstat' as UDP has no state.


Signed-off-by: Peter Bueschel <peter.bueschel@logmein.com>
2020-03-31 10:46:32 +02:00
Paweł Krupa 1771fc87d9
collector/systemd: use regexp to extract systemd version (#1647)
Signed-off-by: paulfantom <pawel@krupa.net.pl>
2020-03-27 21:35:56 +01:00
Tom Wilkie 6496c24d61
Metrics for IO errors on Mac. (#1636)
* Metrics for IO errors and retries on Mac.

Signed-off-by: Tom Wilkie <tom@grafana.com>
2020-03-21 21:05:38 +01:00
Benjamin Drung 34d50e15d5 Add model_name and stepping to node_cpu_info metric
The `node_cpu_info` metric contains some information like the `model`
(which is an integer), but not the human readable model name. Also the
stepping of the processor might be interesting, since different stepping
of a processor might behave differently.

Signed-off-by: Benjamin Drung <benjamin.drung@cloud.ionos.com>
2020-03-20 17:27:11 +01:00
Ben Kochie e49a13d0cf
Catch missing schedstat file (#1641)
Suppres error log noise if schedstat file doesn't exist.

Signed-off-by: Ben Kochie <superq@gmail.com>
2020-03-19 19:50:36 +01:00
Ben Kochie c4183f9935
Minor cleanup in perf collector (#1616)
* Use `strconv.Itoa()` instead of `fmt.Sprintf()` for simple conversion.
* Eliminate copy-paste in collector setup.

Signed-off-by: Ben Kochie <superq@gmail.com>
2020-02-20 12:05:59 +01:00
Daniel Hodges ec62141388
Fix num cpu (#1561)
* add a map of profilers to CPUids

`runtime.NumCPU()` returns the number of CPUs that the process can run
on. This number does not necessarily correlate to CPU ids if the
affinity mask of the process is set.

This change maintains the current behavior as default, but also allows
the user to specify a range of CPUids to use instead.

The CPU id is stored as the value of a map keyed on the profiler
object's address.

Signed-off-by: Joe Damato <jdamato@fastly.com>
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
Signed-off-by: Daniel Hodges <hodges@uber.com>

Co-authored-by: jdamato-fsly <55214354+jdamato-fsly@users.noreply.github.com>
2020-02-20 11:36:33 +01:00
Paul Gier b40954dce5
new flag to disable all default collectors (#1460)
* new flag to disable all default collectors

Signed-off-by: Paul Gier <pgier@redhat.com>

Co-authored-by: Ben Kochie <superq@gmail.com>
2020-02-20 11:03:33 +01:00
Ben Kochie 3e1b0f1bee
Don't count empty collection as success (#1613)
Many collectors depend on underlying features to be enabled. This causes
confusion about what "success" means. This changes the behavior of the
`node_scrape_collector_success` metric.

* When a collector is unable to find data don't return success.
* Catch the no data error and send to Debug log level to avoid log spam.
* Update collectors to support this new functionality.
* Fix copy-pasta mistake in infiband debug message.

Closes: https://github.com/prometheus/node_exporter/issues/1323

Signed-off-by: Ben Kochie <superq@gmail.com>
2020-02-19 16:11:29 +01:00
Ben Kochie 1a75bc7b50
Fix up Darwin swap metrics
* Add a changelog entry.
* Remove redundant swap free metric.

Signed-off-by: Ben Kochie <superq@gmail.com>
2020-02-19 15:52:47 +01:00
jonas-lindmark 9828533697
Swap usage on darwin from sysctl vm.swapusage (#1608)
Signed-off-by: jonas <jonas.lindmark@denacode.se>
2020-02-19 15:51:29 +01:00
Silke Hofstra 8faa843fc4
Add Btrfs collector (#1512)
* Add procfs/btrfs to vendor folder
* Add Btrfs collector

Resolves #1100

Signed-off-by: Silke Hofstra <silke@slxh.eu>
2020-02-19 15:48:51 +01:00
Benjamin Drung ca1ac435ea
Collect non-numeric data from /sys/class/infiniband (#1563)
Let the node exporter collect the non-numeric data from
/sys/class/infiniband: board ID, firmware version, and HCA type.

Signed-off-by: Benjamin Drung <benjamin.drung@cloud.ionos.com>

Co-authored-by: Ben Kochie <superq@gmail.com>
2020-02-19 15:18:44 +01:00
Phil Porada 14eafab016
Adds metrics and tests for UDP receive and send buffer errors (#1534)
* Adds metrics for UDP receive and send buffer errors

Signed-off-by: Phil Porada <philporada@gmail.com>
2020-02-19 14:41:40 +01:00
Julian Kornberger cfcaeee145
Use strconv.Itoa() instead of fmt.Sprintf() (#1566)
Signed-off-by: Julian Kornberger <jk+github@digineo.de>
2020-02-19 14:34:05 +01:00
Tobias Klauser 6ad94ae4bc
Implement loadavg on all BSDs without cgo (#1584)
Reuse the Go-only implementation already in place for FreeBSD (#385) on
Darwin, DragonflyBSD, NetBSD and OpenBSD.

Tested on all affected platforms.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
2020-02-18 14:14:35 +01:00
Ben Kochie 1567cefdae
Bump all vendoring (#1612)
Update all vendoring to current releases.

Signed-off-by: Ben Kochie <superq@gmail.com>
2020-02-18 13:27:11 +01:00
Ben Kochie 14df2a1a1a
Update to latest procfs library (#1611)
Bump to v0.0.10 procfs library.

Signed-off-by: Ben Kochie <superq@gmail.com>
2020-02-18 11:33:46 +01:00
Johannes 'fish' Ziemke dcfd610433
systemd: Clarify private flag description (#1587)
This requires root, so it shouldn't be used.

This closes #1246

Signed-off-by: Johannes 'fish' Ziemke <github@freigeist.org>
2020-02-15 11:39:45 +01:00
Julien Pivotto 84c6446094 netdev: clean zero-value assignments
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-02-13 12:20:27 +01:00
Ben Kochie 92ea3c6a3f Fix inifiband collector log noise (#1599)
Handle non-existent infiniband results silent.

Fixes: https://github.com/prometheus/node_exporter/issues/1511

Signed-off-by: Ben Kochie <superq@gmail.com>
2020-02-08 17:18:17 +01:00
Ukri Niemimuukko eac3e30f7f rapl_linux collector
This exposes RAPL statistics from /sys/class/powercap.

Co-Authored-By: Ben Kochie <superq@gmail.com>
Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com>
2020-02-01 12:06:30 +01:00
Paul Cameron 9bb37873a8 Add unix socket support for supervisord collector (#1592)
* Add unix socket support for supervisord collector

For example:
  --collector.supervisord.url=unix:///var/run/supervisor.sock

Fixes prometheus/node_exporter#262

Signed-off-by: Paul Cameron <cameronpm@gmail.com>
2020-01-28 08:50:23 +01:00
Peter Tribble e7a27366a0 Fix Solaris build (typos in function names) (#1522)
Signed-off-by: Peter Tribble <peter.tribble@gmail.com>

Co-authored-by: Ben Kochie <superq@gmail.com>
2020-01-24 18:06:10 +01:00
Thomas Lin 3ddc82c2d8 Fixed inaccurate 'node_network_speed_bytes' when speeds are low (#1580)
Integer division and the order of operations when converting Mbps to Bps
results in a loss of accuracy if the interface speeds are set low.
e.g. 100 Mbps is reported as 12000000 Bps, should be 12500000
     10 Mbps is reported as 1000000 Bps, should be 1250000

Signed-off-by: Thomas Lin <t.lin@mail.utoronto.ca>
2020-01-01 13:10:53 +01:00
Ben Kochie f316099f87
Fix up softnet collector for go-kit change (#1581)
Add missing update for new go-kit logging change.

Signed-off-by: Ben Kochie <superq@gmail.com>
2019-12-31 19:36:39 +01:00
Ben Ye 2477c5c67d switch to go-kit/log (#1575)
Signed-off-by: yeya24 <yb532204897@gmail.com>
2019-12-31 17:19:37 +01:00
Peter Nicholson a80b7d0bc5 Add softnet collector (#1576)
Signed-off-by: Peter Nicholson <petergoods@hotmail.com>
2019-12-30 01:36:10 +01:00
Julian Kornberger cafb12dc59 Add cause to error message
Signed-off-by: Julian Kornberger <jk+github@digineo.de>
2019-12-19 15:26:55 +01:00
Julian Kornberger 043fecbfd8 Wrap errors in the Go 1.13 way
Signed-off-by: Julian Kornberger <jk+github@digineo.de>
2019-12-19 15:26:55 +01:00
Matthieu Guegan 2cae917bb7 fix OpenBSD cache memory information (#1542)
This will now use `bcstats.numbufpages` instead of `uvmexp.vnodepages`.

Inspired by OpenBSD's `src/usr.bin/top`

Signed-off-by: Matthieu Guegan <matthieu.guegan@deindeal.ch>
2019-12-02 13:41:58 +01:00
Matt Layher da6b66371f collector: reimplement sockstat collector with procfs (#1552)
* collector: reimplement sockstat collector with procfs
* collector: handle sockstat IPv4 disabled, debug logging

Signed-off-by: Matt Layher <mdlayher@gmail.com>
2019-11-25 13:41:38 -06:00
Holger Hoffstätte 3c2c4e7b3c Add new counters for flush requests in Linux 5.5 (#1548)
* Add diskstat flush request counters for Linux 5.5+
* Update tests for diskstat flush request counters with Linux 5.5+

Signed-off-by: Holger Hoffstätte <holger@applied-asynchrony.com>
2019-11-25 13:16:15 -06:00
Ben Kochie 67d3010a79
Add fixture update helper (#1551)
* Add makefile target to update sysfs fixtures.
* Use similar style for fixtures from procfs.
* Re-pack fixtures ttar file.

Signed-off-by: Ben Kochie <superq@gmail.com>
2019-11-23 07:52:47 -06:00
Benjamin Drung 04fbcfffa1 Collect InfiniBand port state and physical state (#1357)
Collect the InfiniBand port state, the physical state, and the maximum
signal transfer rate.

Signed-off-by: Benjamin Drung <benjamin.drung@cloud.ionos.com>
2019-11-22 13:52:17 -08:00
Matt Layher 177ac7f50f collector: refactor textfile collector to avoid looping defer
Signed-off-by: Matt Layher <mdlayher@gmail.com>
2019-11-22 13:22:01 -05:00
Sven Haardiek d089776e8b
Squashed commit of the following:
commit 5ef96388a978c54173e1b1ec8e7bcb41fc7d130d
Author: Sven Haardiek <sven@haardiek.de>
Date:   Wed Sep 18 20:45:23 2019 +0200

    block variables

    Signed-off-by: Sven Haardiek <sven@haardiek.de>

commit c1177382e241994618a8ab7dd9842027d597b0df
Author: Sven Haardiek <sven@haardiek.de>
Date:   Wed Sep 18 20:38:33 2019 +0200

    Use SI Units

    Signed-off-by: Sven Haardiek <sven@haardiek.de>

commit 04e4f99c423872d3094f21f89a8235b233a01941
Merge: 5417c98 f3538e1
Author: Sven Haardiek <sven@haardiek.de>
Date:   Wed Sep 18 19:20:17 2019 +0200

    Merge branch 'master' into power_supply_class

commit 5417c9820a40b37b490caedeaa3526883380b9bf
Author: Sven Haardiek <sven@haardiek.de>
Date:   Wed Sep 4 23:02:39 2019 +0200

    Drop averages

    Signed-off-by: Sven Haardiek <sven@haardiek.de>

commit 1f1447dbe7bbdcdabebf4c968beb14c67d89dd9f
Author: Sven Haardiek <sven@haardiek.de>
Date:   Wed Sep 4 22:56:00 2019 +0200

    Update Copyright

    Signed-off-by: Sven Haardiek <sven@haardiek.de>

commit 9677425059a3bf61cd7498cf7b5f05d5af7a626b
Merge: 0b51589 d3478a2
Author: Sven Haardiek <sven@haardiek.de>
Date:   Mon Sep 2 22:02:53 2019 +0200

    Merge branch 'master' into power_supply_class

commit 0b51589f390cc1b33ea4728d85fca3a3b231cf3f
Author: PrometheusBot <prometheus-team@googlegroups.com>
Date:   Fri Aug 30 13:32:17 2019 +0200

    makefile: update Makefile.common with newer version (#1466)

    Signed-off-by: prombot <prometheus-team@googlegroups.com>

commit af2b9e849c7b69237b7fa0e9a289c929ec7173a0
Author: Boris Momčilović <boris.momcilovic@gmail.com>
Date:   Tue Aug 27 14:24:11 2019 +0200

    Ipvs firewall mark (#1455)

    * IPVS: include firewall mark label

    Signed-off-by: Boris Momčilović <boris@firstbeatmedia.com>

commit 773f99de7f699900a00b4d35340e356fe7098ee7
Author: Paul Gier <pgier@redhat.com>
Date:   Tue Aug 27 02:26:19 2019 -0500

    update procfs to v0.0.4 (#1457)

    Signed-off-by: Paul Gier <pgier@redhat.com>

commit 6f8a4f4348f62700cbf7eeb2657851237e13c35d
Author: beorn7 <beorn@grafana.com>
Date:   Tue Aug 20 18:49:12 2019 +0200

    Update legendLink

    This still had the 'k8s' in as it was copied and pasted from the
    kubernetes-mixin.

    Signed-off-by: beorn7 <beorn@grafana.com>

commit d758cf394cfbed9e87e116a24d72050066cd039a
Author: beorn7 <beorn@grafana.com>
Date:   Wed Aug 14 22:24:24 2019 +0200

    Make the severity of "critical" alerts configurable

    This addresses the blissful scenario where single-node failures are
    unproblematic. No reason to wake somebody up if a node is about to
    screw itself up by filling the disk.

    Signed-off-by: beorn7 <beorn@grafana.com>

commit 041b9e1e785f5f43bbef97c0c76d205181d08890
Author: beorn7 <beorn@grafana.com>
Date:   Thu Aug 15 16:43:57 2019 +0200

    Add line for number of cores to load graph

    Backported from the node dashboard in the kubernetes-mixin.

    Signed-off-by: beorn7 <beorn@grafana.com>

commit 5552bb3a6b2be1e3dd1a93dbdb9650bd0363a922
Author: beorn7 <beorn@grafana.com>
Date:   Thu Aug 15 16:36:10 2019 +0200

    Fix title of CPU panel to usage

    We use the `mode="idle"` metric, but we are inverting it, so this is
    usage, and that's intended.

    Signed-off-by: beorn7 <beorn@grafana.com>

commit db0571b402233323ed7e222e53f7ef7738520f49
Author: beorn7 <beorn@grafana.com>
Date:   Thu Aug 15 16:32:54 2019 +0200

    node-mixin: Improve disk usage panel

    - Use a stacked graph instead of a gauge as development over time is
      especially useful for disk space usage.

    - By only taking one metric per device into account, we avoid
      double-counting for devices that are mounted multiple times.

    Signed-off-by: beorn7 <beorn@grafana.com>

commit 3822e096c5d27d06b9c9a68beff81ef23f12eb36
Author: Björn Rabenstein <beorn@grafana.com>
Date:   Thu Aug 15 00:40:51 2019 +0200

    node-mxin: Improve nodes dashboard (#1448)

    * node-mixin: Improve nodes dashboard

    - Use stacking where it makes sense.
    - Normalize idle CPU so that stacking is more meaningful.
    - Consistently fill where stacking is used but don't fill where not.
    - Fix y axis max value for Idle CPU panel.
    - Fix y axis min value for memory usage panel.
    - Use `$__interval` for range where applicable (and set min step
      to 1m).
    - Make the right Y axis for disk I/O actually work.

    This is just an incremental improvements. It doesn't touch the more
    involved TODOs.

    Signed-off-by: beorn7 <beorn@grafana.com>

commit fbced86b9835e1b196c15ddcac01ba3cfcf369cc
Author: beorn7 <beorn@grafana.com>
Date:   Tue Aug 13 21:54:28 2019 +0200

    node-mixin: Fix various straight-forward issues in the USE dashboards

    - Normalize cluster memory utilisation.

    - Fix missing `1m` in memory saturation.

    - Have both disk-related row next to each other instead with the
      network row in between.

    - Correctly render transmit network traffic as negative, using
      `seriesOverrides` and `min: null` for the y-axis.

    - Make panel and row naming consistent.

    - Remove legend where it would just display a single entry with
      exactly the title of the panel.

    - Fix metric name in individual node CPU Saturation panel.

    - Break up disk space utilisation by device in the panel for an
      individual node.

    NB: All of that doesn't touch any more subtle issues captured in the
    various TODOs.

    Signed-off-by: beorn7 <beorn@grafana.com>

commit 5bdf0625023cf7d05e0f65c6b6a1303637772ca6
Author: Sandro Jäckel <sandro.jaeckel@gmail.com>
Date:   Wed Aug 7 09:19:20 2019 +0200

    Update rootfs syntax in Docker example (#1443)

    Signed-off-by: Sandro Jäckel <sandro.jaeckel@gmail.com>

commit b59f081d45a3ca65957900ec33772dca25a3066f
Author: Phil Frost <phil@postmates.com>
Date:   Tue Aug 6 13:08:06 2019 -0400

    Fix seconds reported by schedstat (#1426)

    Upstream bugfix: https://github.com/prometheus/procfs/pull/191

    Signed-off-by: Phil Frost <phil@postmates.com>

commit ac9a059ae81fa31f9963614483af3b5e3bfd672c
Author: Sven Haardiek <sven@haardiek.de>
Date:   Sun Aug 4 20:15:36 2019 +0200

    Try to make it work for PowerPC

    Signed-off-by: Sven Haardiek <sven@haardiek.de>

commit c81acf3b009e8538783489d1468f33faf65d8b01
Merge: c064116 75462bf
Author: Sven Haardiek <sven@haardiek.de>
Date:   Sun Aug 4 20:14:16 2019 +0200

    Merge remote-tracking branch 'upstream/master' into power_supply_class

    Signed-off-by: Sven Haardiek <sven@haardiek.de>

commit c0641162c3a432f29df30c8d0632a7756d7d2bff
Merge: 06f6e3e 0b710bb
Author: Sven Haardiek <sven@haardiek.de>
Date:   Fri Aug 2 18:30:28 2019 +0200

    Merge branch 'master' into power_supply_class

    Signed-off-by: Sven Haardiek <sven@haardiek.de>

commit 06f6e3e8b2a9b2e3f345b6d312a777731bb4b403
Author: Sven Haardiek <sven.haardiek@iotec-gmbh.de>
Date:   Fri Mar 22 15:36:03 2019 +0100

    Fix Pull Request comments

    * concise metric conditions
    * combine info about power supply to one metric

    Signed-off-by: Sven Haardiek <sven.haardiek@iotec-gmbh.de>

commit 785c3735c4626de56f8341f800ab7bb5e2594d08
Author: Sven Haardiek <sven@haardiek.de>
Date:   Sat Mar 9 18:47:52 2019 +0100

    Use sys.ttar instead of uploading the files

    Signed-off-by: Sven Haardiek <sven@haardiek.de>

commit e07bff5d938457147b9009aef7d42d763018cd66
Author: Sven Haardiek <sven@haardiek.de>
Date:   Sat Mar 9 18:34:50 2019 +0100

    Add information about from /sys/class/power_supply

    Signed-off-by: Sven Haardiek <sven@haardiek.de>

commit 55b3e34840c9dfc6513ae8e69b6479d5842a3091
Author: Sven Haardiek <sven@haardiek.de>
Date:   Sat Mar 9 18:09:45 2019 +0100

    Use cyclecount instead of cycle_count since it is a gauge

    Signed-off-by: Sven Haardiek <sven@haardiek.de>

commit 602350b333cf9353d2cd0ffd40206c96ffe29941
Author: Sven Haardiek <sven@haardiek.de>
Date:   Sat Mar 9 18:09:25 2019 +0100

    other build options

    Signed-off-by: Sven Haardiek <sven@haardiek.de>

commit 5aa38f678451d5b63ffdc32336345a1ff6703725
Author: Sven Haardiek <sven@haardiek.de>
Date:   Sat Mar 9 18:08:56 2019 +0100

    Update fixtures

    Signed-off-by: Sven Haardiek <sven@haardiek.de>

commit c6acc474a4224b8d9f7b178d0d2e02636d8629ea
Author: Sven Haardiek <sven@haardiek.de>
Date:   Sat Mar 9 17:20:30 2019 +0100

    Update command line parameter flag

    Signed-off-by: Sven Haardiek <sven@haardiek.de>

commit f5a329e6ae5ed3b16aa866d67b944f1a73edfe42
Author: Sven Haardiek <sven@haardiek.de>
Date:   Sat Mar 9 17:20:06 2019 +0100

    Update procfs dependency

    Signed-off-by: Sven Haardiek <sven@haardiek.de>

commit 38d5fa5165643d6a44dc863b3a1696774259ac0d
Merge: 5a7ce69 28f3582
Author: Sven Haardiek <sven@haardiek.de>
Date:   Sat Mar 9 16:28:29 2019 +0100

    Merge branch 'power_supply_class' of github.com:shaardie/node_exporter into power_supply_class

commit 5a7ce69505079c9c090e44448cfbd7ffb2b04df7
Author: Sven Haardiek <sven@haardiek.de>
Date:   Sat Oct 20 18:55:49 2018 +0200

    Updated Metrics of Power Supply Class

    Signed-off-by: Sven Haardiek <sven@haardiek.de>

commit 690ab1b9c1f2e183b7088cf81c7f266d85ee6df6
Author: Sven Haardiek <sven@haardiek.de>
Date:   Fri Oct 19 20:03:42 2018 +0200

    Start work on Power Supply Collector

    Signed-off-by: Sven Haardiek <sven@haardiek.de>

commit 28f358222bbac4315fbf44d94da36d4b0ff2ed55
Author: Sven Haardiek <sven@haardiek.de>
Date:   Sat Oct 20 18:55:49 2018 +0200

    Updated Metrics of Power Supply Class

    Signed-off-by: Sven Haardiek <sven@haardiek.de>

commit 751d99b818503e9a4430b10c39760f180349b294
Author: Sven Haardiek <sven@haardiek.de>
Date:   Fri Oct 19 20:03:42 2018 +0200

    Start work on Power Supply Collector

    Signed-off-by: Sven Haardiek <sven@haardiek.de>

Signed-off-by: Sven Haardiek <sven@haardiek.de>
2019-10-27 16:03:35 +01:00
Ben Kochie 74a90e81c0
Merge pull request #1486 from mknapphrt/mount_timeout
Add a flag to adjust mount timeout
2019-10-23 13:34:18 +02:00
Mark Knapp c9603c6ea2 Add a flag to adjust mount timeout
Signed-off-by: Mark Knapp <mknapp@hudson-trading.com>
2019-10-22 14:47:59 -04:00
John Belmonte 15e36e2230 fix typo in cpufreq metric names (#1510)
Signed-off-by: John Belmonte <john@neggie.net>
2019-10-11 02:12:20 +09:00
Ben Kochie fb54f7f2e0
Merge pull request #1489 from pgier/cpuinfo
add node_cpu_info metric
2019-10-08 14:58:11 +02:00
Matt Layher ce693648d3
collector: clean up DRBD collector, less global state
Signed-off-by: Matt Layher <mdlayher@gmail.com>
2019-10-04 10:40:18 -04:00
Matt Layher a1659da2e7
collector: remove commented-out import from bcache collector
Signed-off-by: Matt Layher <mdlayher@gmail.com>
2019-10-01 11:47:25 -04:00
Paul Gier 4d72cb8059 add node_cpu_info metric
Contains information gathered from /proc/cpuinfo

Signed-off-by: Paul Gier <pgier@redhat.com>
2019-09-25 14:38:57 -05:00
Benjamin Drung 27b8c93a5a Use InfiniBandClass from procfs library (#1396)
Parsing the sysfs files for InfiniBand was added to the procfs library
(see https://github.com/prometheus/procfs/pull/164).

Therefore use `InfiniBandClass` from the procfs library instead of
parsing sysfs itself.

If the port counter return `N/A (no PMA)` no metric will be returned
(instead of returning 0 for this metric.

Signed-off-by: Benjamin Drung <benjamin.drung@cloud.ionos.com>
2019-09-23 18:18:35 +02:00
Ben Kochie 82b7b1f732
Merge branch 'master' into coolingDevice 2019-09-09 17:44:03 +02:00
dt-rush 93fbb93a46 fix issue where rootfs path strips to the empty string (#1464)
Change-type: patch
Connects-to: #1463
Signed-off-by: dt-rush <nickp@balena.io>
2019-09-09 17:39:24 +02:00
Paul Gier 8c3de12c22 systemd: check version for availability of properties (#1413)
The dbus property 'SystemState' and the timer property 'LastTriggerUSec'
were added in version 212 of systemd.
Check that the version of systemd is higher than 212 before attempting
to query these properties

f755e3b74b
dedabea4b3

Resolves issue #291

Signed-off-by: Paul Gier <pgier@redhat.com>
2019-09-04 16:27:25 +02:00
Alex Schmitz 664025d60c
Scrape cooling_device state
Signed-off-by: Alex Schmitz <alex.schmitz@gmail.com>
2019-08-30 08:58:47 -05:00
Boris Momčilović 93c12e03a1 Ipvs firewall mark (#1455)
* IPVS: include firewall mark label

Signed-off-by: Boris Momčilović <boris@firstbeatmedia.com>
2019-08-27 14:24:11 +02:00
Phil Frost 26d4fbdf07 Fix seconds reported by schedstat (#1426)
Upstream bugfix: https://github.com/prometheus/procfs/pull/191

Signed-off-by: Phil Frost <phil@postmates.com>
2019-08-06 19:08:06 +02:00
Richard Kojedzinszky 75462bf4fe Scrape thermal_zone temperatures (#1425)
* Scrape thermal_zone temperatures

Signed-off-by: Richard Kojedzinszky <richard@kojedz.in>
2019-08-04 12:56:36 +02:00
Philip Gough 2d95ecaa96 Extends uname collector to export on Darwin OS (#1433)
Adds uname collector support for Darwin and OpenBSD

Signed-off-by: Philip Gough <philip.p.gough@gmail.com>
2019-08-03 12:32:43 +02:00
Dipack P Panjabi a7452023db Added mountinfo changes to node_exporter (#1417)
Use the extra information gleaned from the mountinfo file to add
a 'mountaddr' field for NFS metrics. This helps prevent prometheus from
ignoring mounts that come from the same URL, but are actually from
different IP addresses.

This commit also rebases to current master

Signed-off-by: Dipack P Panjabi <dpanjabi@hudson-trading.com>
2019-07-28 11:32:40 +02:00
Matthias Rampke b133213c7a Report non-fatal collection errors in the exporter metric. (#1439)
As per prometheus/client_golang#543, pass the Registry for exporter
metrics when setting up the /metrics HTTP handler.

With this, the `promhttp_metric_handler_errors_total` metric will
increment on (possibly non-fatal) collection-time errors, such as
duplicate metrics from text files.

Signed-off-by: Matthias Rampke <mr@soundcloud.com>
2019-07-28 10:37:10 +02:00
dt-rush 5d3e2ce2ef properly strip path.rootfs from mountpoint labels (#1421)
Change-type: patch
Connects-to: #1418
Signed-off-by: dt-rush <nickp@balena.io>
2019-07-19 16:51:17 +02:00
Steven Kreuzer d8e47a9f9f Expose additional XFS runtime statistics (#1423)
Include directory operation, read/write system call, and vnode runtime
statistics for XFS filesystems.

Signed-off-by: Steven Kreuzer <skreuzer@FreeBSD.org>
2019-07-15 16:28:09 +02:00
detailyang 7fe9713edf bugfix: avoid nil reference when ignore is nil (#1414)
Signed-off-by: detailyang <detailyang@gmail.com>
2019-07-12 14:23:54 +02:00
Phil Frost f693a71c06 Scrape CPU latency stats from /proc/schedstat (#1389)
These are useful as a direct indication of CPU contention and task
scheduler latency.

Handy references:
 - https://github.com/torvalds/linux/blob/master/Documentation/scheduler/sched-stats.txt
 - https://doc.opensuse.org/documentation/leap/tuning/html/book.sle.tuning/cha.tuning.taskscheduler.html

procfs is updated to pull in the enabling change:
https://github.com/prometheus/procfs/pull/186

Signed-off-by: Phil Frost <phil@postmates.com>
2019-07-10 09:16:24 +02:00
秋葉 777b751f90 read /proc/net files with a single read syscall (#1380)
Signed-off-by: Hanaasagi <ambiguous404@gmail.com>
2019-07-08 15:53:14 +02:00
Derek Marcotte 3d504bc5cb Added FreeBSD zfs support per #1063. (#1394)
Based on the solaris implementation.  There's a lot of other sysctls
available on FreeBSD that aren't reported here.  It'll be easy to add,
if they're useful.  All of the sysctls are uint64.

Signed-off-by: Derek Marcotte <554b8425@razorfever.net>
2019-07-03 15:47:39 +02:00
Advait Bhatwadekar 3f49b31101 Closes issue #261 on node_exporter. (#1403)
* Closes issue #261 on node_exporter.

Delegated mdstat parsing to procfs project. mdadm_linux.go now only exports the metrics.
-> Added disk labels: "fail", "spare", "active" to indicate disk status
-> hanged metric node_md_disks_total ==> node_md_disks_required
-> Removed test cases for mdadm_linux.go, as the functionality they tested for has been moved to procfs project.

Signed-off-by: Advait Bhatwadekar <advait123@ymail.com>
2019-07-01 11:56:06 +02:00
Ben Kochie ccf27426ad
Fix 64k page e2e fixture (#1404)
Update for change in https://github.com/prometheus/node_exporter/pull/1224

Signed-off-by: Ben Kochie <superq@gmail.com>
2019-06-28 09:53:35 +02:00
mknapphrt 3108a50fb6 Fix systemd restart counter label from state to name (#1393)
Signed-off-by: Mark Knapp <mknapp@hudson-trading.com>
2019-06-25 09:37:48 +02:00
Leonid Evdokimov 22a7dbae08 Ignore iso9600 filesystem on Linux (#1355)
The filesystem is read-only and is often used for a virtual FS
with a configuration file for a virtual machine.

Signed-off-by: Leonid Evdokimov <leon@darkk.net.ru>
2019-06-18 17:47:05 +01:00
Paul Gier 2bc133cd48 update procfs to v0.0.2 (#1376)
Signed-off-by: Paul Gier <pgier@redhat.com>
2019-06-12 20:47:16 +02:00
Ben Kochie 8146998945
Fix rollover bug in mountstats collector (#1364)
* Update procfs vendor to pull in github.com/prometheus/procfs/pull/165
* Update mountstats collector to use new types.
* Rollover counter automatically to avoid float64 accuracy issues.
* Update e2e test.

Signed-off-by: Ben Kochie <superq@gmail.com>
2019-05-31 18:30:37 +02:00
Noam Meltzer 501ccf9fb4 Add --collector.netdev.device-whitelist flag (#1279)
* Add --collector.netdev.device-whitelist flag

Sometimes it is desired to monitor only one netdev. The golang regexp
does not support a negated regex, so the ignored-devices flag is too
cumbersome for this task.
This change introduces a new flag: accept-devices, which is mutually
exclusive to ignored-devices. This flag allows specifying ONLY the
netdev you'd like.

Signed-off-by: Noam Meltzer <noam@cynerio.co>
2019-05-31 17:55:50 +02:00
David O'Rourke 814ef064c0 meminfo: Fix the size mismatch in the swapTotal check mib for BSD. (#1345)
Signed-off-by: David O'Rourke <david.orourke@gmail.com>
2019-05-14 17:42:36 -05:00
Ben Kochie f10c665d33
Cleanup uname Update call (#1342)
Make collector a pointer for consistency.

Fixes: https://github.com/prometheus/node_exporter/issues/1300

Signed-off-by: Ben Kochie <superq@gmail.com>
2019-05-13 11:44:12 -05:00
Paul Gier 8b13c130b7 log pid when there is a problem reading the process stats (#1341)
Signed-off-by: Paul Gier <pgier@redhat.com>
2019-05-10 13:04:26 -05:00
Paul Gier d0a66c4c40 use sys/unix package instead of syscall (#1340)
According to the golang docs, the syscall package is deprecated.
https://golang.org/pkg/syscall
This updates collectors to use the x/sys/unix package instead.
Also updates the vendored x/sys/unix module to latest.

Signed-off-by: Paul Gier <pgier@redhat.com>
2019-05-10 13:04:06 -05:00
Daniel Hodges 7882009870 Add perf exporter (#1274)
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2019-05-07 13:21:41 +02:00
Paul Gier 86f9079429 update procfs to latest (#1335)
Updates for procfs refactoring

Signed-off-by: Paul Gier <pgier@redhat.com>
2019-05-07 06:38:21 +02:00
Ben Kochie 78b9eb9c2c Use 64-bit Darwin netstat counters (#1319)
Avoid 32-bit counter rollovers.

Signed-off-by: Ben Kochie <superq@gmail.com>
2019-04-25 10:07:56 +02:00
Christian Hoffmann 36e3b2a923 textfile: use opened file's mtime as timestamp (#1326)
Previously, the node_textfile_mtime_seconds metric was based on the
Fileinfo.ModTime() of the ioutil.ReadDir() return value. This is based
on lstat() and therefore has unintended consequences for symlinks
(modification time of the symlink instead of the symlink target is
returned). It is also racy as the lstat() is performed before reading
the file.

This commit changes the node_textfile_mtime_seconds metric to be based
on a fresh Stat() call on the open file.  This eliminates the race and
works as expected for symlinks. Fixes #1324.

Signed-off-by: Christian Hoffmann <mail@hoffmann-christian.info>
2019-04-18 17:47:04 +02:00
Daniele Sluijters cc2fd82008 Expose /proc/pressure (#1261)
This enables the collection of pressure stall information as exposed
by the `/proc/pressure` interface added in the 4.20 release of the
Linux kernel.

Closes #1174

Signed-off-by: Daniele Sluijters <daenney@users.noreply.github.com>
2019-04-18 12:19:20 +02:00
Paul Gier b1298677aa Early init of procfs (#1315)
Minor change to match naming convention in other collectors.

Initialize the proc or sys FS instance once while initializing
each collector instead of re-creating for each metric update.

Signed-off-by: Paul Gier <pgier@redhat.com>
2019-04-10 18:16:12 +02:00
Paul Gier cc847f2f44 collector/cpu: split cpu freq metrics into separate collector (#1253)
The cpu frequency information is not always needed and/or available.
This change allows the cpu frequency metrics to be enabled/disabled
separately from the other cpu metrics, and also prevents a frequency
metric failure (such as a parse error) from failing the main cpu
collector.

Fixes #1241

Signed-off-by: Paul Gier <pgier@redhat.com>
2019-02-19 17:22:54 +01:00
Ben Kochie f028b81615
Update systemd blacklist (#1255)
Include additional unit types in the default systemd collector
blacklist.

Signed-off-by: Ben Kochie <superq@gmail.com>
2019-02-17 17:57:15 +01:00
Paul Gier cb9e23c536 Systemd refactor (#1254)
This reduces the system metric collection time by using a wait group
and go routines to allow the systemd metric calls happen concurrently.

Also, makes the start time, restarts, tasks_max, and tasks_current metrics disabled by default
because these can be time consuming to gather.

Signed-off-by: Paul Gier <pgier@redhat.com>
2019-02-11 23:27:21 +01:00
Sachi King 18fc512fc4 Bond: Monitor bond mii_status not link operstate (#1124)
With a bond interface the state of the slave interface from the bond's
point of view is reflected in `mii_status` and is independent of the
link's `operstate`.

When a bond is monitored with `miimon`, `mii_status` will reflect the
state of the physical link as configured via the operator.

When a bond is monitored via `arp_interval` the `mii_status` will
reflect the results of the bond ARP checking.  This means the link can
be down from the bond's point of view, but up from a physical
connection point of view.

If a bond is not monitored via miimon or arp, the `mii_status` should
likely be always `up`, however I have observed a case where this is not
true and the `operstate` is `up` while `mii_status` is `down`.  Kernel
bond documentation stresses that a bond should not be configured without
one of `mii_mon` or `arp_interval` configured however.

This change results in the metric 'node_bonding_active' matching the
up/down state of the bond's point of view rather than operstate.

Signed-off-by: Sachi King <nakato@nakato.io>
2019-02-10 11:00:04 +01:00