Commit graph

969 commits

Author SHA1 Message Date
Benoît Knecht 296aa35dd2 end-to-end-test.sh: Use udev fixture and update output
Set the `--path.udev.data` flag to point to the udev fixture, and update the
output fixture with

```console
$ ./end-to-end-test.sh -u
```

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
2022-07-06 12:30:50 +02:00
Benoît Knecht 9b5d55e511 collector/diskstats: Add fixtures for udev data
Now that we read some data from `/run/udev/data`, add the corresponding
fixtures and update the expected test results accordingly.

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
2022-07-06 12:30:50 +02:00
Benoît Knecht 833216dc9e collector: Make udev data path optional
Instead of hard-coding the path to `/run/udev/data`, intoduce a
`--path.udev.data` flag that defaults to that value.

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
2022-07-06 12:30:50 +02:00
Benoît Knecht 75ceda8bb2 collector/diskstats: Don't use functions from Go 1.18
Since we need to support Go 1.17, don't use `strings.Cut()` which was
introduced in Go 1.18.

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
2022-07-06 12:30:50 +02:00
Benoît Knecht a997b6096d collector/diskstats: Add labels and metrics from udev
Add labels to the `node_disk_info` metric extracted from udev, such as `model`,
`path`, `revision`, `serial` and `wwn`.

Also add a few metrics related to filesystem and device mapper, which are also
extracted from udev information.

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
2022-07-06 12:30:50 +02:00
Nobuhiro MIKI 3ed95908d6 collector: add slab info
Co-authored-by: Ben Kochie <superq@gmail.com>
Signed-off-by: Nobuhiro MIKI <nmiki@yahoo-corp.jp>
2022-07-06 12:18:27 +02:00
Ben Kochie 02f5005ac8
Add diskstat include/exclude flag to all platforms
Refactor diskstats collector include/exclude to work on all platforms.
* Fix up default ignored devices.

Signed-off-by: Ben Kochie <superq@gmail.com>
2022-06-28 08:30:01 +02:00
rushilenekar20 8fcc6320a2
Add diskstats include and exclude device flags
Use standard include/exclude pattern for device include/exclude in the
diskstats collector.

Signed-off-by: Ben Kochie <superq@gmail.com>
Co-authored-by: rushilenekar20 <rushilenekar20@gmail.com>
2022-06-28 07:48:21 +02:00
Jonathan Davies 88f1811eb1
Add selinux collector (#2205)
Add selinux collector

Signed-off-by: Jonathan Davies <jpds@protonmail.com>
2022-06-28 05:54:05 +02:00
Ben Kochie d2b8ee8f20
Add rapl zone label option (#2401)
Add an optional flag to set the RAPL zone as a label, instead of as part
of the metric name.

Fixes: https://github.com/prometheus/node_exporter/issues/2299

Signed-off-by: Ben Kochie <superq@gmail.com>
2022-06-27 23:09:32 +02:00
dependabot[bot] b99f933713
Bump github.com/prometheus/client_golang from 1.12.1 to 1.12.2 (#2411)
* Bump github.com/prometheus/client_golang from 1.12.1 to 1.12.2

Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.12.1 to 1.12.2.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md)
- [Commits](https://github.com/prometheus/client_golang/compare/v1.12.1...v1.12.2)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* Update fixtures for client_golang 1.12.2.

Signed-off-by: Ben Kochie <superq@gmail.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Ben Kochie <superq@gmail.com>
2022-06-26 11:33:15 +02:00
Ben Kochie 59c146e57d
Update end-to-end test for aarch64 (#2415)
Fix up handling of CPU info collector on non-x86_64 systems due to
fixtures containing `/proc/cpuinfo` from x86_64.
* Update e2e 64k page test fixture from an arm64 system.
* Enable ARM testing in CircleCI.

Fixes: https://github.com/prometheus/node_exporter/issues/1959

Signed-off-by: Ben Kochie <superq@gmail.com>
2022-06-26 09:41:21 +02:00
Ben Kochie a516d4de4a
Cleanup cgroups collector (#2414)
* Correctly name collector file.
* Fix cgroup summary type as gauge.
* Use a boolean metric rather than a label for enabled.

Signed-off-by: Ben Kochie <superq@gmail.com>
2022-06-24 17:15:31 +02:00
Kobe Biello 45c75f1dbc
Add cgroup summary collector (#2408)
* add cgroups summary collector

Signed-off-by: biello <bellusa@qq.com>
Co-authored-by: bielu <bielu@zuoyebang.com>
2022-06-24 12:05:13 +02:00
Ben Kochie 3999866a36
Merge pull request #2368 from mrueg/update-go-systemd
go.mod: Update coreos/go-systemd
2022-06-05 11:20:46 +02:00
Ben Kochie ea85bfcc23
Merge pull request #2378 from prometheus/superq/devfilter
Rename netDevFilter helper
2022-06-05 10:01:50 +02:00
Ben Kochie e22382c5ec
Merge pull request #2372 from aneagoe/master
rapl_collector: fix issue with invalid metric name (#2299)
2022-05-31 21:42:50 +02:00
Tobias Klauser a8ebe3519e
collector: use ByteSliceToString from golang.org/x/sys/unix
Use unix.ByteSliceToString to convert Utsname []byte fields to strings.

This also allows to drop the bytesToString helper which serves the same
purpose and matches ByteSliceToString's implementation.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
2022-05-23 15:44:16 +02:00
Ben Kochie 1b6aaeb2e8
Rename netDevFilter helper
Rename the network device filter to a more generic device filter.

Signed-off-by: Ben Kochie <superq@gmail.com>
2022-05-19 10:36:50 +02:00
Andrei Neagoe 0e320e725b rapl_collector: fix issue with invalid metric name (#2299)
Signed-off-by: Andrei Neagoe <3854672+aneagoe@users.noreply.github.com>
2022-05-09 15:42:46 +02:00
Manuel Rüger 21f9ce2c49 go.mod: Update coreos/go-systemd
Signed-off-by: Manuel Rüger <manuel@rueg.eu>
2022-05-04 22:19:30 +02:00
Fionera 9ece38fca9 refactor: Use netlink for tcpstat collector
Signed-off-by: Tim Windelschmidt <t.windelschmidt@babiel.com>
2022-04-25 10:13:06 +02:00
binjip978 e5f384dfe6 Fix staticcheck warnings on linux
Signed-off-by: binjip978 <pdp.eleven11@gmail.com>
2022-04-09 05:36:59 +00:00
Ben Kochie b52bf958f8
Merge pull request #2327 from pjjw/pjjw/powersupplyclass-darwin-old-sdk-fixes
powersupplyclass_darwin: extra includes to build against older macOS SDK
2022-03-30 12:08:52 +02:00
Ben Kochie 9155971e07
Update Go modues
Update to latest releases.
* Fix up perf collector syntax.

Signed-off-by: Ben Kochie <superq@gmail.com>
2022-03-30 11:47:09 +02:00
Peter Woodman 2370cccc1f
powersupplyclass_darwin: enable builds against older macOS SDK
This is necessary to build on darwin using nix, as nix-darwin uses an
older macOS SDK, built from Apple's open source releases.

Signed-off-by: Peter Woodman <peter@shortbus.org>
2022-03-23 22:41:31 -04:00
Ben Kochie 9aae303a46
Merge pull request #2289 from tanguyfalconnet/ethtool-lock
ethtool_linux: add mutex around entries access
2022-03-22 04:34:19 -07:00
Ben Kochie 086fdfed24
Merge pull request #2267 from bison/netdev-lock
netdev_common: Add mutex around metricDescs access
2022-03-22 04:27:37 -07:00
W. Andrew Denton 402a00932d Add a reference to the Linux kernel's documentation for block stat.
Signed-off-by: W. Andrew Denton <git@flying-snail.net>
2022-03-22 11:36:00 +01:00
W. Andrew Denton 84ce3a0103 diskstats_linux: always scale reads and writes by 512 bytes, not by device units.
Signed-off-by: W. Andrew Denton <git@flying-snail.net>
2022-03-22 11:36:00 +01:00
Brad Ison cb7b5a755b
netdev_common: Add mutex around metricDescs access
In certain instances on heavily loaded nodes with many network
devices, there may be concurrent access to the netdev collector's
`metricDescs` map, resulting in a panic.  This adds a mutex to prevent
concurrent reads and writes to the map.

Signed-off-by: Brad Ison <bison@xvdf.io>
2022-03-16 11:46:24 +01:00
Ben Kochie 5c0e4d61c8
Add systemd version as label string.
Signed-off-by: Ben Kochie <superq@gmail.com>
2022-02-17 15:39:00 +01:00
t-falconnet 5c8407b772 ethtool-linux: fix entry function
Signed-off-by: t-falconnet <tfalconnet.externe@bedrockstreaming.com>
2022-02-11 17:06:53 +01:00
t-falconnet db87173be0 ethtool-linux: split between create and show entry
Signed-off-by: t-falconnet <tfalconnet.externe@bedrockstreaming.com>
2022-02-11 17:04:33 +01:00
t-falconnet b0708e4c47 ethtool-linux: add remaining unlocked access to entries
Signed-off-by: t-falconnet <tfalconnet.externe@bedrockstreaming.com>
2022-02-11 16:55:26 +01:00
t-falconnet 642f64b701 ethtool_linux: fix entry function
Signed-off-by: t-falconnet <tfalconnet.externe@bedrockstreaming.com>
2022-02-11 16:44:43 +01:00
t-falconnet 4426962ec8 ethtool_linux: add mutex around entries access
Signed-off-by: t-falconnet <tfalconnet.externe@bedrockstreaming.com>
2022-02-11 16:44:43 +01:00
Ben Kochie 5981bbe638
Refactor systemd version
Move the systemd version function to an exporter method. This way we can
update the Verison information at every scrape, in case the underlying
version changes.

Signed-off-by: Ben Kochie <superq@gmail.com>
2022-02-06 15:37:58 +01:00
Joe Groocock 64c4c39132
systemd: Expose systemd minor version
systemd patch versions are as important as the major version number;
they indicate security or bug fixes or other behavioural changes between
versions.

Use float64 over float32 as the rounding error with float32 rendered
250.3 as 250.3000030517578 in my testing.

Signed-off-by: Joe Groocock <jgroocock@cloudflare.com>
Signed-off-by: Joe Groocock <me@frebib.net>
2022-02-06 14:01:45 +00:00
Robbie Lankford 4f27a4fd8e add additional vm_stat memory metrics for darwin
Signed-off-by: Robbie Lankford <robert.lankford@grafana.com>
2022-01-27 11:34:07 +01:00
Lauri Tirkkonen 996563f972 filesystem_linux: exclude mounts under /var/lib/containers/storage
analogous to the /var/lib/docker exclude added in
https://github.com/prometheus/node_exporter/pull/814

podman rootful containers mount eg. shm filesystems at
/var/lib/containers/storage/*-containers/*/userdata/shm. these should be
treated like things under /var/lib/docker by default.

Signed-off-by: Lauri Tirkkonen <lauri@hacktheplanet.fi>
2022-01-03 16:32:37 +01:00
Ben Kochie eecc2b1dea
Add device filter flags to arp collector
Allow filtering APR entries based on device. Useful for ignoring
entries for network namespaces (containers).

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-12-16 15:41:10 +01:00
heyitao 7dbf358915 delete duplicate items
Signed-off-by: heyitao <linuxgcc@163.com>
2021-12-09 11:50:10 +01:00
Lapo Luchini 3136901a74
Ignore filesystems flagged as MNT_IGNORE. (#2227)
* Ignore filesystems flagges as MNT_IGNORE.
Closes #2152.

Signed-off-by: Lapo Luchini <lapo@lapo.it>
2021-12-01 11:21:31 +01:00
Ben Kochie 1d5afd05b5
Sanitize UTF-8 in dmi collector (#2229)
Replace invalid UTF-8 chars with "�" string.

Fixes: https://github.com/prometheus/node_exporter/issues/2228

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-12-01 11:13:43 +01:00
Jacob Vosmaer 5c8d162ca6
Add node_softirqs_total metric (#2221)
This adds a new Linux metric, node_softirqs_total, which corresponds
to the 'softirq' line in /proc/stat. This metric is disabled by
default and it can be enabled with '--collector.stat.softirq'.

Signed-off-by: Jacob Vosmaer <jacob@gitlab.com>
2021-12-01 09:55:13 +01:00
Matt Oshry 60a2668788
Handle nil CPU thermal power status on M1 (#2225)
Signed-off-by: Matt Oshry <matto@spatialinc.com>
2021-11-29 10:55:36 +01:00
Claudio Jeker 2cf3db8859 Also track the CPU Spin time for OpenBSD systems.
Use the non-cgo version for all openbsd architectures.
The old code only pulled some defines from header files. Just add them
as enumerations in native go. Also be careful at what the SysctlRaw returns.

Implement a way that supports both recent and old pre-6.4 OpenBSD systems.
With go-1.16 OpenBSD binaries will link to libc and because of this binaries
built on OpenBSD 6.9-current do not run on OpenBSD 6.3. OpenBSD 6.3 is also
not supported for more then 2 years. So maybe the compat code is not needed.
Still validation object length before doing an unsafe pointer conversion
is probably reasonable but I'm no golang expert.

Signed-off-by: Claudio Jeker <claudio@openbsd.org>
2021-11-26 12:15:45 +01:00
Martin Kennelly 4065902fe5
Add TCPTimeouts to netstat default filter (#2189)
TCP timeouts count is a useful signal to show
abnormal network performance and is another
signal to aid debugging. This metric can be
used to generate proactive alerts for host
network namespace workloads.

Signed-off-by: Martin Kennelly <mkennell@redhat.com>
2021-11-18 09:34:55 +01:00
Benjamin Drung f5ae31a84c
Disable lnstat collector by default (#2188)
The new `lnstat` collector produces a high number of metrics, per-cpu,
and results in approximately double the number of metrics previously
scraped. For example, a typical server with 64 cores produces 3832
lnstat metrics compared to 4147 metrics for the remaining collectors.

Therefore disable the `lnstat` collector by default.

Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-11-18 09:33:34 +01:00
Park Beomsu c861ba93aa
Remove redundant nil check (#2206)
Signed-off-by: computerphilosopher <bspark@jam2in.com>
2021-11-15 11:23:49 +01:00
Benjamin Drung d85cbaa17c
ethtool: Prevent duplicate metric names (#2187)
Sanitizing the metric names can lead to duplicate metric names:

```
caller=level.go:63 level=error caller="error gathering metrics: [from Gatherer #2] collected metric \"node_ethtool_giant_hdr\" { label:<name:\"device\" value:\"ens192\" > untyped:<value:0" msg=" > } was collected before with the same name and label values"
```

Generate a map from the sanitized metric names to the metric names from
ethtool. In case of duplicate sanitized metric names drop both metrics,
because it is unknown which one to take.

Fixes: https://github.com/prometheus/node_exporter/issues/2185
Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-11-15 11:22:36 +01:00
Tobias Klauser 58ab0144af Use SysctlTimeval for boottime collector on BSD
Use SysctlTimeval from the golang.org/x/sys/unix package to
simplify the implementation of the boottime collector for the BSDs and
allows to build it without cgo.

Tested on macOS 11.6, FreeBSD 13 and OpenBSD 7.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
2021-11-15 10:50:03 +01:00
Johannes 'fish' Ziemke 85e20238e7
Add clocksource metrics to time collector (#2197)
* Add clocksource metrics to time collector

This closes #1336

Signed-off-by: Johannes 'fish' Ziemke <github@freigeist.org>
2021-11-12 11:45:31 +01:00
Ben Kochie fda358a1ec
Workaround LLVM/Clang 11.0 for Darwin builds (#2200)
LLVM/Clang 11.0 adds a `-Wundef-prefix=TARGET_OS_` build flag which
breaks this build flag.

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-11-09 17:52:49 +01:00
Benjamin Drung 2a28266852
ethtool: Add test case with leading spaces (#2186)
Add test case for ethtool metrics with leading spaces reported in #2185:

```
$ ethtool -S
NIC statistics:
     Tx Queue#: 0
       TSO pkts tx: 0
       TSO bytes tx: 0
       ucast pkts tx: 20487
       ucast bytes tx: 1908107
       mcast pkts tx: 83
       mcast bytes tx: 5906
       bcast pkts tx: 4
       bcast bytes tx: 168
       pkts tx err: 0
       pkts tx discard: 0
       drv dropped tx total: 0
          too many frags: 0
          giant hdr: 0
          hdr err: 0
          tso: 0
       ring full: 0
       pkts linearized: 0
       hdr cloned: 0
       giant hdr: 0
     Rx Queue#: 0
       LRO pkts rx: 0
       LRO byte rx: 0
       ucast pkts rx: 25086
       ucast bytes rx: 2404103
       mcast pkts rx: 0
       mcast bytes rx: 0
       bcast pkts rx: 0
       bcast bytes rx: 0
       pkts rx OOB: 0
       pkts rx err: 0
       drv dropped rx total: 0
          err: 0
          fcs: 0
       rx buf alloc fail: 0
     tx timeout count: 0
```

Bug: https://github.com/prometheus/node_exporter/issues/2185
Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-10-29 10:55:39 +02:00
Benjamin Drung 0dc82eac13
Correctly disable ZFS for test cases (#2182)
Disable `collector/zfs_linux_test.go` in case `!nozfs` is set to
completely disable ZFS.

Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-10-28 15:27:15 +02:00
Alessio Caiazza 6523fdfc4b
darwin powersupply collector (#1777)
* Extract powersupply linux code from collector common file.
* Add Darwin powersupply collector.

Signed-off-by: Alessio Caiazza <nolith@abisso.org>
2021-10-28 10:22:24 +02:00
Alessio Caiazza ee17ba0fc0
Fix imports when building on macos (#2180)
Signed-off-by: Alessio Caiazza <nolith@abisso.org>
2021-10-27 16:56:36 +02:00
STRRL df7ea981f7
feat: new collector about thermal conditions on macos (#2032)
* feat: new collector about thermal conditions on macos

Signed-off-by: STRRL <str_ruiling@outlook.com>
2021-10-27 14:05:57 +02:00
Benjamin Drung 9def2f9222
Add DMI collector (#2131)
Add a DMI collector to expose the Desktop Management Interface (DMI)
info from `/sys/class/dmi/id/`. This will expose information about the
BIOS, mainboard, chassis, and product.

Closes: https://github.com/prometheus/node_exporter/issues/303
Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-10-27 13:56:37 +02:00
ml 094ee24ad7
Ignore mountpoints under /run (#2157)
* Exclude mountpoints under /run/credentials

Signed-off-by: ml <ml@visu.li>
2021-10-27 13:53:26 +02:00
jordy1024 fbc23548b9
Fix timer GC delays in the Linux filesystem collector (#2169)
Use `time.NewTimer()` and explicit `Stop()` to avoid memory bloat / GC problems with `time.After()` in the Linux filesystem collector timeout handling.

Signed-off-by: bawenmao <bawenmao@sogou-inc.com>
2021-10-24 12:48:57 +02:00
Matt Layher 8b198ce168
collector: replace fmt.Sprintf with strconv.Itoa in perfCollector (#2174)
Signed-off-by: Matt Layher <mdlayher@gmail.com>
2021-10-22 21:03:10 +02:00
W. Andrew Denton d514d67d7f ethtool: convert updateSpeeds to a loop.
Signed-off-by: W. Andrew Denton <git@flying-snail.net>
2021-10-20 19:11:30 +02:00
W. Andrew Denton 56e6dcb1fa ethtool: add newline to settings fixture.
Signed-off-by: W. Andrew Denton <git@flying-snail.net>
2021-10-20 19:11:30 +02:00
W. Andrew Denton 0f6c84214c ethtool: minor changes to resolve review nits.
Signed-off-by: W. Andrew Denton <git@flying-snail.net>
2021-10-20 19:11:30 +02:00
W. Andrew Denton 77a3c9bc20 Gather additional link info from ethtool.
The ethtool_cmd	struct from the	linux kernel contains information about the speeds and features	supported by a
network device. This includes speeds and duplex	but also features like autonegotiate and 802.3x pause frames.

Closes #1444

Signed-off-by: W. Andrew Denton <git@flying-snail.net>
2021-10-20 19:11:30 +02:00
wenlxie 11bb2f8a95 support thread state
Signed-off-by: wenlxie <wenlxie@ebay.com>
2021-10-19 11:58:43 +02:00
Kiril Vladimirov 1721de0c38
collector: Unwrap glob textfile directories (#1985)
* collector: Unwrap glob textfile directories
* collector: Store full path in mtime's file label

The point is to avoid duplicated gauges from files with the same name in
different directories.

This introduces support for exporting from multiple directories matching
given pattern (e.g. `/home/*/metrics/`).

Signed-off-by: Kiril Vladimirov <kiril@vladimiroff.org>
2021-10-18 14:05:21 +02:00
Ben Kochie f67faf9d18
Fixup drm_linux.go build tag.
Signed-off-by: Ben Kochie <superq@gmail.com>
2021-10-11 15:36:44 +02:00
Siavash Safi 5f110dfeb8
Add initial support for monitoring GPUs on Linux (#1998)
Expose GPU metrics using `sysfs/drm`.
`amdgpu` is the only driver which exposes this information through DRM.

Signed-off-by: Siavash Safi <siavash.safi@gmail.com>
2021-10-11 15:26:21 +02:00
Ben Kochie f61be48d94
Use include/exclude flags for ethtool filtering (#2165)
Use the same flag pattern as netdev to make filtering methods the same.
* Move SanitizeMetricName to helper.go

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-10-11 15:12:25 +02:00
Julien Pivotto 68a6c78c0d
Update go to 1.17 (#2159)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2021-10-03 13:35:24 +02:00
Aleksei Zakharov 0e6b23c338
Lnstat: expose metrics from /proc/net/stat (#1771)
* Lnstat initial commit

Signed-off-by: Aleksei Zakharov <zaharov@selectel.ru>
Co-authored-by: Johannes 'fish' Ziemke <github@freigeist.org>
2021-09-28 10:24:18 +02:00
ventifus 0aec407666
Refactor diskstats (#2141)
* Refactor diskstats_linux to use procfs.
* Add `node_disk_info` metric.

Signed-off-by: W. Andrew Denton <git@flying-snail.net>
Co-authored-by: W. Andrew Denton <git@flying-snail.net>
2021-09-28 10:14:12 +02:00
Sergei Semenchuk 5de46c6bac
collect flag_info and bug_info only for one core (#2156)
Signed-off-by: binjip978 <binjip978@gmail.com>
2021-09-28 07:44:03 +02:00
Sergei Semenchuk 2b490d645e
add path label to rapl collector (#2146)
Signed-off-by: binjip978 <binjip978@gmail.com>
2021-09-27 22:57:03 +02:00
Sergei Semenchuk a0c20d48db
add node_network_address_info collector (#2105)
* add node_network_address_info collector

emit (interface, addr, netmask, scope) labels for every network
interface

Signed-off-by: binjip978 <binjip978@gmail.com>
2021-09-08 14:50:25 +02:00
Benjamin Drung b6215e649c Add os release collector
Currently Node Exporter has a metric called `node_uname_info` which of
course exposes uname info. While this is nice, it does not help if you
are running different OSes which could have similar uname info.

Therefore parse `/etc/os-release` or `/usr/lib/os-release` and expose a
`node_os_info` metric which provide information regarding the OS
release/version of the node. Also expose the major.minor part of the OS
release version as `node_os_version`.

Since the os-release files will not change often, cache the parsed
content and only refresh the cache if the modification time changes.

This `os` collector will read files outside of `/proc` and `/sys`, but
the os-release file is widely used and the format is standardized:
https://www.freedesktop.org/software/systemd/man/os-release.html

Bug: https://github.com/prometheus/node_exporter/issues/1574
Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-08-19 14:04:21 +02:00
Ben Kochie 84b36c4fd8
Add flag to disable guest CPU metrics
In high scale virtualized / cloud environments there are typically
no guest VMs. Add a boolean flag to allow disabling the Linux guest
CPU metrics.

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-08-17 13:04:46 +02:00
Benjamin Drung 26ca609183 ethtool: Expose node_ethtool_info metric
Add a `node_ethtool_info` metric to all ethtool devices to expose driver
information with following labels:

 * bus_info
 * driver
 * expansion_rom_version
 * firmware_version
 * version

This metric is useful to monitor the firmware version to be up-to-date.

Note: The version label might be malformed due to bug #39 in ethtool:
https://github.com/safchain/ethtool/issues/39

Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-08-16 16:09:35 +02:00
Benjamin Drung 6ac6ea2d13 ethtool: Sanitize metric names
OpenMetrics and the Prometheus exposition format require the metric name
to consist only of alphanumericals and "_", ":" and they must not start
with digits. The metric names from the ethtool stats might contain
spaces, brackets, and dots. Converting them directly to metric names
will produce invalid metric names.

Therefore sanitize the metric names and convert them to lower case.

Fixes: https://github.com/prometheus/node_exporter/issues/2083
Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-08-16 15:28:27 +02:00
Johannes 'fish' Ziemke e6b5aaaff4 Add collector.ethtool.metrics-include
This adds a new flag --collector.ethtool.metrics-include to the ethtool
collector. Only metrics matching this regexp will be collected.

Signed-off-by: Johannes 'fish' Ziemke <github@freigeist.org>
2021-08-10 18:57:36 +02:00
Benjamin Drung 4356c09ebd ethtool: Use prometheus.BuildFQName
Use `prometheus.BuildFQName` everywhere in `ethtool` instead of
hard-coding the metric names.

Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-08-10 18:20:01 +02:00
Benjamin Drung 3afd382e75 Add --collector.ethtool.ignored-devices
Other network related collectors allow to filter out unwanted devices.
Add this support to the new ethtool collector as well.

Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-08-10 18:09:26 +02:00
Ben Kochie 5d2a4cf7fb
Fix processes collector long int parsing
Update procfs library to include ignored fields ParseInt handling.

Wrap error returns so that the user can know more about what failed.
Returns from getAllocatedThreads() are errors anyway.

Fixes: https://github.com/prometheus/node_exporter/issues/2110

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-08-06 05:55:24 +02:00
Ben Kochie 747012c59a
Merge pull request #2092 from prometheus/superq/fix_energy_uj
Fix rapl collector log noise
2021-07-22 21:00:29 +02:00
Ben Kochie 97d4b01691
Bump prometheus/procfs library
Pull in bug fix for noisy logging.

Fixes: https://github.com/prometheus/node_exporter/issues/2086

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-07-21 21:40:21 +02:00
Ben Kochie 502f287c96
Fix rapl collector log noise
Capture permission denied error for "energy_uj" file.

Fixes: https://github.com/prometheus/node_exporter/issues/1892

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-07-21 19:28:54 +02:00
Ben Kochie 6ac7a53f45
Fix conntrack collector log noise
Fix un-handled file not found for conntrack stats.

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-07-15 13:45:07 +02:00
Ben Kochie 6793e0e5a8
Merge pull request #2019 from treydock/ib-counters
Add more Infiniband counters
2021-07-14 13:17:31 +02:00
Ben Kochie 40766fd3cc
Merge pull request #2015 from ston1th/openbsd_mem_cache_fix
Fix wrong value for OpenBSD memory buffer cache
2021-07-14 13:15:32 +02:00
Ben Kochie f17a85d63d
Merge branch 'master' into netclass-filter-before-parsing 2021-07-13 11:22:46 +02:00
Ben Kochie a6ebe10455
Merge branch 'master' into nvme 2021-07-12 17:09:51 +02:00
Luiz Angelo Daros de Luca 00aa2f34ce Add tapestats to collect tape devices statistics
It is based on diskstats to allow metrics reuse by simply
s/disk/tape/ the query.

Signed-off-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
2021-07-09 21:01:08 -03:00
Ben Kochie 2510378a56
Merge pull request #2067 from prometheus/superq/idle_jump
Handle small backwards jumps in CPU idle
2021-07-07 13:27:21 +02:00
Ben Kochie 73c9a10d37
Handle small backwards jumps in CPU idle
The Linux CPU idle stat can also jump backwards slightly in some cases.
Allow the jump back up to 3 seconds before we attempt to reset the CPU
counter cache.

Fixes: https://github.com/prometheus/node_exporter/issues/1903

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-07-07 12:24:46 +02:00
Trey Dockendorf f0b2449d94 Add more IB counters
Signed-off-by: Trey Dockendorf <tdockendorf@osc.edu>
2021-07-06 11:15:32 -04:00
Benjamin Drung b23146db3f Add nvme collector
Add a collector for NVMes to expose the firmware versions. This requires
procfs >= 0.7.0.

Fixes #1891
Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-07-06 13:38:15 +02:00
Ben Kochie 839c2d557f
Update go-kstat location
Move go-kstat to the new github.com/illumos/go-kstat location.

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-07-06 11:44:18 +02:00
Ben Kochie 13be860e25 Add time zone offset metric
Add the time zone and offset in seconds.

Closes: https://github.com/prometheus/node_exporter/issues/2052

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-07-01 11:25:53 +02:00
Jan Fajerski e656b79297 netclass: retrieve interface names and filter before parsing
We should filter excluded interfaces before parsing the interface
details.
This change is based on https://github.com/prometheus/procfs/pull/376

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
2021-06-28 10:53:51 +02:00
Ben Kochie 90d469805a
Fix Eof newline in collector/conntrack_linux.go
Signed-off-by: Ben Kochie <superq@gmail.com>
2021-06-23 11:53:57 +02:00
Kozlov Alexander 02ee897c03
Added conntrack statistics metrics (#1155)
* Added conntrack statistics metrics

Signed-off-by: Aleksandr Kozlov <avlkozlov@avito.ru>
Co-authored-by: Aleksandr Kozlov <avlkozlov@avito.ru>
Co-authored-by: Ben Kochie <superq@gmail.com>
2021-06-23 11:52:43 +02:00
Oliver Geiselhardt-Herms cc4f13b369 Fix build
Signed-off-by: Oliver Geiselhardt-Herms <oliver.geiselhardt-herms@sap.com>
2021-06-17 13:22:17 +02:00
Ben Kochie 27dc754aeb
Merge pull request #1832 from ventifus/master
Add a new ethtool stats collector
2021-06-16 10:07:50 +02:00
ventifus 76c0e1e5a1
Update collector/ethtool_linux.go
Signed-off-by: W. Andrew Denton <git@flying-snail.net>

Co-authored-by: Manuel Rüger <manuel@rueg.eu>
2021-06-11 09:02:08 -07:00
Julien Pivotto 99af1dbb44 Update logic
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2021-06-04 11:35:07 +02:00
Julien Pivotto 2e20d668f2 Only iniate collectors once
When /metrics is called for specific collectors, the collectors are
initialed every time. Which means that we spend a lot of time
re-initiating the same collectors again and again. Especially, some
collectors make the assumptions that that are initiated once - e.g.
systemd collector generates info message upon initiation.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2021-06-04 11:20:56 +02:00
Ben Kochie e1db354611
Merge pull request #1887 from prometheus/superq/promhttp_errorlog
Add ErrorLog plumbing to promhttp
2021-06-03 16:38:30 +02:00
Ben Kochie 3bc9a93c20
Add ErrorLog plumbing to promhttp
Fix the error logging of the promhttp handler by connecting it to the
promlog setup.
* Switch to go-kit/log.
* Cleanup CHANGELOG.

Fixes: https://github.com/prometheus/node_exporter/issues/1886

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-06-03 10:47:41 +02:00
kcx2366425574 9eff6761df fix the uncorrect word
Signed-off-by: kcx2366425574 <18279911430@163.com>
2021-05-26 09:58:37 +08:00
W. Andrew Denton 892893ff05 ethtool: Log stats collection errors.
Signed-off-by: W. Andrew Denton <git@flying-snail.net>
2021-05-14 10:07:30 -07:00
Hu Shuai 5ee20043a7 Fix golint issue caused by typo
Signed-off-by: Hu Shuai <hus.fnst@cn.fujitsu.com>
2021-05-11 09:55:52 +08:00
W. Andrew Denton 807f3c3af3 ethtool: Remove end-to-end testing.
Signed-off-by: W. Andrew Denton <git@flying-snail.net>
2021-05-03 09:35:49 -07:00
W. Andrew Denton 596ff45f8f ethtool: Add a new ethtool stats collector (metrics equivalent to "ethtool -S")
Signed-off-by: W. Andrew Denton <git@flying-snail.net>
2021-04-29 11:07:26 -07:00
Ben Kochie 7b5cc3e505
Add Darwin arm64 build
Add darwin/arm64 to the CGO crossbuilder list.
* Update Makefile.common to pick up new promu.
* Fix possible nil pointer caught by staticcheck.
* Update collector build tags.

https://github.com/prometheus/node_exporter/issues/1997

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-04-14 10:39:52 +02:00
ston1th 2b7aa4c303 Fix wrong value for OpenBSD memory buffer cache
Fixes #1972

Signed-off-by: ston1th <ston1th@giftfish.de>
2021-04-03 16:57:56 +02:00
Frederic Hemberger 39124626cd Rename collector.filesystem flags to match other collectors
Ref: #1743
Fixes: #1994

Signed-off-by: Frederic Hemberger <mail@frederic-hemberger.de>
2021-03-24 21:01:10 +01:00
Ben Kochie 81caeb6a1b
Merge pull request #2000 from prometheus/fixpanic-systemd-backwards-compat
Fix panix when using backwards compatible flags
2021-03-19 16:22:42 +01:00
Ben Kochie 9893fca77e
Add flag to ignore network speed if it is unknown
Some devices (ex virtual) don't have a speed and report `-1` as the
speed value. Add a flag to allow ignoring speed on these devices.

Fixes: https://github.com/prometheus/node_exporter/issues/1967

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-03-18 11:36:31 +01:00
Julien Pivotto e7649ba48e Fix panix when using backwards compatible flags
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2021-03-15 14:59:49 +01:00
Ben Kochie 3b3ef7357f
Silence missing netclass errors
* Handle no such file and permission denied errors.
* Reduce excessive error wrapping.

Fixes: https://github.com/prometheus/node_exporter/issues/1840

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-03-03 20:40:08 +01:00
Ben Kochie 23e5b245a4
Sanitize strings from /sys/class/power_supply
Avoid panic on invalid UTF-8 from /sys/class/power_supply by
sanitizing strings parsed from the kernel.
* Add a broken string to the test fixtures.

Fixes: https://github.com/prometheus/node_exporter/issues/1979

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-03-03 18:05:51 +01:00
Ben Kochie 46d0a0813f
Handle errors from disabled PSI subsystem
When CONFIG_PSI_DEFAULT_DISABLED=y, the pressure system returns
"operation not supported", rather than permission denied or not
exposing the /proc/pressure files.

Fixes: https://github.com/prometheus/node_exporter/issues/1961

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-03-03 11:02:28 +01:00
Ben Kochie 5a6551e8ae
Fix some noisy log lines
* Bump procfs to include some fixes to error messages.
* Lower zpoolStatePaths log from Warn to Debug.

Fixes: https://github.com/prometheus/node_exporter/issues/1961
Fixes: https://github.com/prometheus/node_exporter/issues/1960

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-02-10 16:16:54 +01:00
Hu Shuai 4109a5089f Fix ineffassign issue
Signed-off-by: Hu Shuai <hus.fnst@cn.fujitsu.com>
2021-02-08 10:53:12 +08:00
Ben Kochie a37d3f659c
Release 1.1.0
* Update Build
  - Update CircleCI orb.
  - Update CIrcleCI Machine image.
  - Use golang-builder 1.15.
* Update Go modules.
* Fixup fixtures for XFS bug.

NOTE: We have improved some of the flag naming conventions (PR #1743). The old names are
      deprecated and will be removed in 2.0. They will continue to work for backwards
      compatibility.

* [CHANGE] Improve filter flag names #1743
* [CHANGE] Add btrfs and powersupplyclass to list of exporters enabled by default #1897
* [FEATURE] Add fibre channel collector #1786
* [FEATURE] Expose cpu bugs and flags as info metrics. #1788
* [FEATURE] Add network_route collector #1811
* [FEATURE] Add zoneinfo collector #1922
* [ENHANCEMENT] Add more InfiniBand counters #1694
* [ENHANCEMENT] Add flag to aggr ipvs metrics to avoid high cardinality metrics #1709
* [ENHANCEMENT] Adding backlog/current queue length to qdisc collector #1732
* [ENHANCEMENT] Include TCP OutRsts in netstat metrics #1733
* [ENHANCEMENT] Add pool size to entropy collector #1753
* [ENHANCEMENT] Remove CGO dependencies for OpenBSD amd64 #1774
* [ENHANCEMENT] bcache: add writeback_rate_debug stats #1658
* [ENHANCEMENT] Add check state for mdadm arrays via node_md_state metric #1810
* [ENHANCEMENT] Expose XFS inode statistics #1870
* [ENHANCEMENT] Expose zfs zpool state #1878
* [ENHANCEMENT] Added an ability to pass collector.supervisord.url via SUPERVISORD_URL environment variable #1947
* [BUGFIX] filesystem_freebsd: Fix label values #1728
* [BUGFIX] Fix various procfs parsing errors #1735
* [BUGFIX] Handle no data from powersupplyclass #1747
* [BUGFIX] udp_queues_linux.go: change upd to udp in two error strings #1769
* [BUGFIX] Fix node_scrape_collector_success behaviour #1816
* [BUGFIX] Fix NodeRAIDDegraded to not use a string rule expressions #1827
* [BUGFIX] Fix node_md_disks state label from fail to failed #1862
* [BUGFIX] Handle EPERM for syscall in timex collector #1938
* [BUGFIX] bcache: fix typo in a metric name #1943
* [BUGFIX] Fix XFS read/write stats (https://github.com/prometheus/procfs/pull/343)

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-02-05 21:23:23 +01:00
Ben Kochie 43b91ac846
Merge pull request #1954 from prometheus/superq/noisy_rapl
Fix rapl collector log noise
2021-02-05 21:20:53 +01:00
Ben Kochie dc5a94c803
Fix rapl collector log noise
Catch permission denined errors in the rapl collector.

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-02-05 18:16:17 +01:00
Ben Kochie 0b0c5624e1
Fix network_route collector naming
* Use `device` label to match other `node_network_...` metics.
* Fix naming convention to match Promehteus best practices.

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-02-05 18:10:42 +01:00
Ben Kochie 1729558e11
Merge pull request #1922 from kwisniewski98/zone
Add zoneinfo collector
2021-02-05 13:57:54 +01:00
Ben Kochie 78682c80af
Merge pull request #1786 from deusnefum/master
Add fibre channel collector
2021-02-03 18:22:59 +01:00
mhiles 5a28930e2e change fc_host everywhere, update fixtures
Signed-off-by: mhiles <hiles@hpe.com>
2021-02-03 09:35:58 -05:00
Ben Kochie 22c5aeb0ef
Merge pull request #1943 from hs0210/work
bcache: fix typo
2021-02-03 09:57:23 +01:00
Ben Kochie 477a192803
Merge pull request #1947 from Oloremo/supervisord_env_vars
Added an ability to pass collector.supervisord.url via ENV vars
2021-02-03 08:35:43 +01:00
mhiles 56eba80306 add fibrechannel to default list in read me; host -> fc_host to avoid name collision
Signed-off-by: mhiles <hiles@hpe.com>
2021-02-02 18:05:24 -05:00
Matthew Hiles 3fcccd5b14 Update collector/fibrechannel_linux.go
Co-authored-by: Ben Kochie <superq@gmail.com>
Signed-off-by: mhiles <hiles@hpe.com>
2021-02-02 16:47:56 -05:00
mhiles 19790920b2 update fibrechannel fixture
Signed-off-by: mhiles <hiles@hpe.com>
2021-01-29 10:38:45 -05:00
mhiles 6e02d5bff1 invalid_tx_word_total -> invalid_tx_words_total
Signed-off-by: mhiles <hiles@hpe.com>
2021-01-29 10:28:22 -05:00
Dustin Hooten eecfbcd610 PR feedback
Signed-off-by: Dustin Hooten <dustinhooten@gmail.com>
2021-01-29 10:37:56 +01:00
Dustin Hooten b7626ecdbf ProcessStatCollector: continue if PID disappears between opening and reading file
Signed-off-by: Dustin Hooten <dustinhooten@gmail.com>
2021-01-29 10:37:56 +01:00
Proskurin Kirill 8d147327a6 feat: Added an ability to pass collector.supervisord.url via SUPERVISORD_URL environment variable
Signed-off-by: Proskurin Kirill <kirill.proskurin@behavox.com>
2021-01-27 18:09:52 +00:00
Wisniewski, Krzysztof2 997a8fbb7f Add zoneinfo collector
Signed-off-by: Wisniewski, Krzysztof2 <Krzysztof2.Wisniewski@intel.com>
2021-01-26 12:00:35 +01:00
Hu Shuai 0900cb5655 bcache: fix typo
Fix typos in collector/bcache_linux.go and collector/fixtures/e2e-output.txt

Signed-off-by: Hu Shuai <hus.fnst@cn.fujitsu.com>
2021-01-25 13:50:16 +08:00
Ben Kochie 7904ea4af2
Update netdev OpenBSD amd64 filter
Use new filtering method from #1826

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-01-24 16:37:02 +01:00
Ben Kochie 40ce993d5b
Merge pull request #1816 from liiling/master
Fix node_scrape_collector_success behaviour
2021-01-24 15:28:41 +01:00
Ben Kochie 9fab63825b
Merge pull request #1753 from binjip978/entropy
add pool size to entropy collector
2021-01-24 14:57:33 +01:00
Ben Kochie cb6509f1ac
Merge pull request #1826 from digineo/device-filter
Move ignore/accept to new netDevFilter struct
2021-01-24 14:55:29 +01:00