Commit graph

790 commits

Author SHA1 Message Date
Benjamin Drung b6215e649c Add os release collector
Currently Node Exporter has a metric called `node_uname_info` which of
course exposes uname info. While this is nice, it does not help if you
are running different OSes which could have similar uname info.

Therefore parse `/etc/os-release` or `/usr/lib/os-release` and expose a
`node_os_info` metric which provides information regarding the OS
release/version of the node. Also expose the major.minor part of the OS
release version as `node_os_version`.

Since the os-release files will not change often, cache the parsed
content and only refresh the cache if the modification time changes.
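
The caching idea above could be sketched in Go roughly as follows. This is not the collector's actual code: `parseOSRelease`, `osReleaseCache`, and `load` are hypothetical names, and the parser is deliberately simplified (it strips surrounding quotes but does not handle shell escapes).

```go
package main

import (
	"fmt"
	"os"
	"strings"
	"sync"
	"time"
)

// parseOSRelease turns os-release content into a key/value map.
// Simplification: surrounding quotes are stripped, shell escapes are ignored.
func parseOSRelease(content string) map[string]string {
	fields := make(map[string]string)
	for _, line := range strings.Split(content, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue // skip blank lines and comments
		}
		key, value, found := strings.Cut(line, "=")
		if !found {
			continue // not a KEY=value assignment
		}
		fields[key] = strings.Trim(value, `"'`)
	}
	return fields
}

// osReleaseCache (hypothetical) keeps the parsed fields of one file together
// with the modification time they were read at.
type osReleaseCache struct {
	mu      sync.Mutex
	path    string
	modTime time.Time
	fields  map[string]string
}

// load returns the cached fields unless the file's mtime has changed.
func (c *osReleaseCache) load() (map[string]string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()

	info, err := os.Stat(c.path)
	if err != nil {
		return nil, err
	}
	if c.fields != nil && info.ModTime().Equal(c.modTime) {
		return c.fields, nil // unchanged: reuse the cache
	}
	data, err := os.ReadFile(c.path)
	if err != nil {
		return nil, err
	}
	c.fields, c.modTime = parseOSRelease(string(data)), info.ModTime()
	return c.fields, nil
}

func main() {
	// Try the two locations named in the commit message, in order.
	for _, path := range []string{"/etc/os-release", "/usr/lib/os-release"} {
		cache := &osReleaseCache{path: path}
		fields, err := cache.load()
		if err != nil {
			continue
		}
		fmt.Println(fields["ID"], fields["VERSION_ID"])
		return
	}
	fmt.Println("no os-release file found")
}
```

Comparing modification times costs one `stat` call per scrape, which is cheaper than re-reading and re-parsing the file each time.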

This `os` collector will read files outside of `/proc` and `/sys`, but
the os-release file is widely used and the format is standardized:
https://www.freedesktop.org/software/systemd/man/os-release.html

Bug: https://github.com/prometheus/node_exporter/issues/1574
Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-08-19 14:04:21 +02:00
Ben Kochie 84b36c4fd8
Add flag to disable guest CPU metrics
In high scale virtualized / cloud environments there are typically
no guest VMs. Add a boolean flag to allow disabling the Linux guest
CPU metrics.

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-08-17 13:04:46 +02:00
Benjamin Drung 26ca609183 ethtool: Expose node_ethtool_info metric
Add a `node_ethtool_info` metric to all ethtool devices to expose driver
information with the following labels:

 * bus_info
 * driver
 * expansion_rom_version
 * firmware_version
 * version

This metric is useful for monitoring whether the firmware version is up to date.

Note: The version label might be malformed due to bug #39 in ethtool:
https://github.com/safchain/ethtool/issues/39

Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-08-16 16:09:35 +02:00
Benjamin Drung 6ac6ea2d13 ethtool: Sanitize metric names
OpenMetrics and the Prometheus exposition format require the metric name
to consist only of alphanumeric characters, "_", and ":", and it must not
start with a digit. The metric names from the ethtool stats might contain
spaces, brackets, and dots. Converting them directly to metric names
will produce invalid metric names.

Therefore sanitize the metric names and convert them to lower case.
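
A minimal sketch of such a sanitizer in Go, assuming that each run of invalid characters collapses to a single "_" and that leading and trailing underscores are trimmed. Both choices, and the name `sanitizeMetricName`, are illustrative assumptions and not necessarily what the commit does.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// invalidRun matches one or more characters that are not allowed in a
// (lower-cased) metric name.
var invalidRun = regexp.MustCompile(`[^a-z0-9_:]+`)

// sanitizeMetricName lower-cases the name, replaces invalid runs with "_",
// trims surrounding underscores, and guards against a leading digit.
func sanitizeMetricName(name string) string {
	s := strings.ToLower(strings.TrimSpace(name))
	s = invalidRun.ReplaceAllString(s, "_")
	s = strings.Trim(s, "_")
	if s != "" && s[0] >= '0' && s[0] <= '9' {
		s = "_" + s // metric names must not start with a digit
	}
	return s
}

func main() {
	for _, raw := range []string{"rx_queue_0.packets", "Tx Bytes [phy]", "10g_errors"} {
		fmt.Printf("%q -> %q\n", raw, sanitizeMetricName(raw))
	}
}
```

In a collector the sanitized string would normally be appended to a fixed prefix, so the leading-digit guard only matters when the function is used on its own.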

Fixes: https://github.com/prometheus/node_exporter/issues/2083
Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-08-16 15:28:27 +02:00
Johannes 'fish' Ziemke e6b5aaaff4 Add collector.ethtool.metrics-include
This adds a new flag --collector.ethtool.metrics-include to the ethtool
collector. Only metrics matching this regexp will be collected.
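
As an illustration only, regexp-based inclusion could look like the Go sketch below. `filterMetrics` is a hypothetical helper, and whether the real flag anchors the pattern is not stated in the commit message; here the pattern is used unanchored, exactly as given.

```go
package main

import (
	"fmt"
	"regexp"
)

// filterMetrics keeps only the names that match the include pattern.
// An invalid pattern is reported as an error instead of panicking.
func filterMetrics(include string, names []string) ([]string, error) {
	re, err := regexp.Compile(include)
	if err != nil {
		return nil, err
	}
	var kept []string
	for _, name := range names {
		if re.MatchString(name) {
			kept = append(kept, name)
		}
	}
	return kept, nil
}

func main() {
	names := []string{"rx_bytes", "tx_bytes", "rx_errors"}
	kept, err := filterMetrics("^rx_", names)
	if err != nil {
		fmt.Println("invalid pattern:", err)
		return
	}
	fmt.Println(kept)
}
```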

Signed-off-by: Johannes 'fish' Ziemke <github@freigeist.org>
2021-08-10 18:57:36 +02:00
Benjamin Drung 4356c09ebd ethtool: Use prometheus.BuildFQName
Use `prometheus.BuildFQName` everywhere in `ethtool` instead of
hard-coding the metric names.

Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-08-10 18:20:01 +02:00
Benjamin Drung 3afd382e75 Add --collector.ethtool.ignored-devices
Other network-related collectors allow filtering out unwanted devices.
Add this support to the new ethtool collector as well.

Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-08-10 18:09:26 +02:00
Ben Kochie 5d2a4cf7fb
Fix processes collector long int parsing
Update procfs library to include ignored fields ParseInt handling.

Wrap error returns so that the user can know more about what failed.
Returns from getAllocatedThreads() are errors anyway.

Fixes: https://github.com/prometheus/node_exporter/issues/2110

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-08-06 05:55:24 +02:00
Ben Kochie 747012c59a
Merge pull request #2092 from prometheus/superq/fix_energy_uj
Fix rapl collector log noise
2021-07-22 21:00:29 +02:00
Ben Kochie 97d4b01691
Bump prometheus/procfs library
Pull in bug fix for noisy logging.

Fixes: https://github.com/prometheus/node_exporter/issues/2086

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-07-21 21:40:21 +02:00
Ben Kochie 502f287c96
Fix rapl collector log noise
Capture permission denied error for "energy_uj" file.

Fixes: https://github.com/prometheus/node_exporter/issues/1892

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-07-21 19:28:54 +02:00
Ben Kochie 6ac7a53f45
Fix conntrack collector log noise
Fix un-handled file not found for conntrack stats.

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-07-15 13:45:07 +02:00
Ben Kochie 6793e0e5a8
Merge pull request #2019 from treydock/ib-counters
Add more Infiniband counters
2021-07-14 13:17:31 +02:00
Ben Kochie 40766fd3cc
Merge pull request #2015 from ston1th/openbsd_mem_cache_fix
Fix wrong value for OpenBSD memory buffer cache
2021-07-14 13:15:32 +02:00
Ben Kochie f17a85d63d
Merge branch 'master' into netclass-filter-before-parsing
2021-07-13 11:22:46 +02:00
Ben Kochie a6ebe10455
Merge branch 'master' into nvme
2021-07-12 17:09:51 +02:00
Luiz Angelo Daros de Luca 00aa2f34ce Add tapestats to collect tape devices statistics
It is based on diskstats so that existing queries can be reused by
simply applying s/disk/tape/ to them.

Signed-off-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
2021-07-09 21:01:08 -03:00
Ben Kochie 2510378a56
Merge pull request #2067 from prometheus/superq/idle_jump
Handle small backwards jumps in CPU idle
2021-07-07 13:27:21 +02:00
Ben Kochie 73c9a10d37
Handle small backwards jumps in CPU idle
The Linux CPU idle stat can also jump backwards slightly in some cases.
Tolerate backwards jumps of up to 3 seconds before we attempt to reset
the CPU counter cache.
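
The tolerance logic can be sketched in Go as follows. The names `idleCache` and `update` are hypothetical, and treating a jump of exactly 3 seconds as a reset is an assumption; the commit message only says "up to 3 seconds".

```go
package main

import "fmt"

// jumpBackSeconds is the backwards jump at or beyond which the cache is reset.
const jumpBackSeconds = 3.0

// idleCache (hypothetical) remembers the highest idle value seen so far.
type idleCache struct {
	idle float64
}

// update feeds a new idle reading into the cache and returns the value to
// expose, plus whether the cache was reset.
func (c *idleCache) update(newIdle float64) (value float64, reset bool) {
	if c.idle-newIdle >= jumpBackSeconds {
		// Large backwards jump: assume the counter really restarted.
		c.idle = newIdle
		return c.idle, true
	}
	if newIdle >= c.idle {
		c.idle = newIdle
	}
	// Small backwards jump: keep the cached value so the counter stays monotonic.
	return c.idle, false
}

func main() {
	cache := &idleCache{idle: 100}
	for _, reading := range []float64{99.5, 101, 90} {
		value, reset := cache.update(reading)
		fmt.Printf("reading=%v exposed=%v reset=%v\n", reading, value, reset)
	}
}
```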

Fixes: https://github.com/prometheus/node_exporter/issues/1903

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-07-07 12:24:46 +02:00
Trey Dockendorf f0b2449d94 Add more IB counters
Signed-off-by: Trey Dockendorf <tdockendorf@osc.edu>
2021-07-06 11:15:32 -04:00
Benjamin Drung b23146db3f Add nvme collector
Add a collector for NVMes to expose the firmware versions. This requires
procfs >= 0.7.0.

Fixes #1891
Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-07-06 13:38:15 +02:00
Ben Kochie 839c2d557f
Update go-kstat location
Move go-kstat to the new github.com/illumos/go-kstat location.

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-07-06 11:44:18 +02:00
Ben Kochie 13be860e25 Add time zone offset metric
Add the time zone and offset in seconds.

Closes: https://github.com/prometheus/node_exporter/issues/2052

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-07-01 11:25:53 +02:00
Jan Fajerski e656b79297 netclass: retrieve interface names and filter before parsing
We should filter excluded interfaces before parsing the interface
details.
This change is based on https://github.com/prometheus/procfs/pull/376

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
2021-06-28 10:53:51 +02:00
Ben Kochie 90d469805a
Fix EOF newline in collector/conntrack_linux.go
Signed-off-by: Ben Kochie <superq@gmail.com>
2021-06-23 11:53:57 +02:00
Kozlov Alexander 02ee897c03
Added conntrack statistics metrics (#1155)
* Added conntrack statistics metrics

Signed-off-by: Aleksandr Kozlov <avlkozlov@avito.ru>
Co-authored-by: Aleksandr Kozlov <avlkozlov@avito.ru>
Co-authored-by: Ben Kochie <superq@gmail.com>
2021-06-23 11:52:43 +02:00
Oliver Geiselhardt-Herms cc4f13b369 Fix build
Signed-off-by: Oliver Geiselhardt-Herms <oliver.geiselhardt-herms@sap.com>
2021-06-17 13:22:17 +02:00
Ben Kochie 27dc754aeb
Merge pull request #1832 from ventifus/master
Add a new ethtool stats collector
2021-06-16 10:07:50 +02:00
ventifus 76c0e1e5a1
Update collector/ethtool_linux.go
Signed-off-by: W. Andrew Denton <git@flying-snail.net>

Co-authored-by: Manuel Rüger <manuel@rueg.eu>
2021-06-11 09:02:08 -07:00
Julien Pivotto 99af1dbb44 Update logic
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2021-06-04 11:35:07 +02:00
Julien Pivotto 2e20d668f2 Only initiate collectors once
When /metrics is called for specific collectors, the collectors are
initiated every time, which means that we spend a lot of time
re-initiating the same collectors again and again. In particular, some
collectors assume that they are initiated only once, e.g. the
systemd collector generates an info message upon initiation.
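
One way to initiate each collector only once is to cache instances by name, as in this Go sketch. The types `collector` and `collectorCache` are stand-ins invented for illustration and do not mirror node_exporter's real types.

```go
package main

import (
	"fmt"
	"sync"
)

// collector is a stand-in for a real collector instance.
type collector struct {
	name string
}

// collectorCache creates each collector at most once and reuses it afterwards.
type collectorCache struct {
	mu        sync.Mutex
	factories map[string]func() (*collector, error)
	initiated map[string]*collector
}

func newCollectorCache(factories map[string]func() (*collector, error)) *collectorCache {
	return &collectorCache{
		factories: factories,
		initiated: make(map[string]*collector),
	}
}

// get returns the cached collector, calling its factory only on first use.
func (c *collectorCache) get(name string) (*collector, error) {
	c.mu.Lock()
	defer c.mu.Unlock()

	if col, ok := c.initiated[name]; ok {
		return col, nil
	}
	factory, ok := c.factories[name]
	if !ok {
		return nil, fmt.Errorf("unknown collector %q", name)
	}
	col, err := factory()
	if err != nil {
		return nil, err
	}
	c.initiated[name] = col
	return col, nil
}

func main() {
	calls := 0
	cache := newCollectorCache(map[string]func() (*collector, error){
		"systemd": func() (*collector, error) {
			calls++
			return &collector{name: "systemd"}, nil
		},
	})
	// Simulate three scrapes that each ask for the same collector.
	for i := 0; i < 3; i++ {
		if _, err := cache.get("systemd"); err != nil {
			fmt.Println("error:", err)
		}
	}
	fmt.Println("factory calls:", calls)
}
```

The mutex matters because scrapes can arrive concurrently; without it two requests could both miss the cache and initiate the same collector twice.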

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2021-06-04 11:20:56 +02:00
Ben Kochie e1db354611
Merge pull request #1887 from prometheus/superq/promhttp_errorlog
Add ErrorLog plumbing to promhttp
2021-06-03 16:38:30 +02:00
Ben Kochie 3bc9a93c20
Add ErrorLog plumbing to promhttp
Fix the error logging of the promhttp handler by connecting it to the
promlog setup.
* Switch to go-kit/log.
* Cleanup CHANGELOG.

Fixes: https://github.com/prometheus/node_exporter/issues/1886

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-06-03 10:47:41 +02:00
kcx2366425574 9eff6761df Fix the incorrect word
Signed-off-by: kcx2366425574 <18279911430@163.com>
2021-05-26 09:58:37 +08:00
W. Andrew Denton 892893ff05 ethtool: Log stats collection errors.
Signed-off-by: W. Andrew Denton <git@flying-snail.net>
2021-05-14 10:07:30 -07:00
Hu Shuai 5ee20043a7 Fix golint issue caused by typo
Signed-off-by: Hu Shuai <hus.fnst@cn.fujitsu.com>
2021-05-11 09:55:52 +08:00
W. Andrew Denton 807f3c3af3 ethtool: Remove end-to-end testing.
Signed-off-by: W. Andrew Denton <git@flying-snail.net>
2021-05-03 09:35:49 -07:00
W. Andrew Denton 596ff45f8f ethtool: Add a new ethtool stats collector (metrics equivalent to "ethtool -S")
Signed-off-by: W. Andrew Denton <git@flying-snail.net>
2021-04-29 11:07:26 -07:00
Ben Kochie 7b5cc3e505
Add Darwin arm64 build
Add darwin/arm64 to the CGO crossbuilder list.
* Update Makefile.common to pick up new promu.
* Fix possible nil pointer caught by staticcheck.
* Update collector build tags.

https://github.com/prometheus/node_exporter/issues/1997

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-04-14 10:39:52 +02:00
ston1th 2b7aa4c303 Fix wrong value for OpenBSD memory buffer cache
Fixes #1972

Signed-off-by: ston1th <ston1th@giftfish.de>
2021-04-03 16:57:56 +02:00
Frederic Hemberger 39124626cd Rename collector.filesystem flags to match other collectors
Ref: #1743
Fixes: #1994

Signed-off-by: Frederic Hemberger <mail@frederic-hemberger.de>
2021-03-24 21:01:10 +01:00
Ben Kochie 81caeb6a1b
Merge pull request #2000 from prometheus/fixpanic-systemd-backwards-compat
Fix panic when using backwards compatible flags
2021-03-19 16:22:42 +01:00
Ben Kochie 9893fca77e
Add flag to ignore network speed if it is unknown
Some devices (e.g. virtual ones) don't have a speed and report `-1` as the
speed value. Add a flag to allow ignoring speed on these devices.
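
A hedged Go sketch of the check: sysfs reports link speed in Mbit/s, so a negative value cannot be a real speed. The function name `speedBytes` and the `ignoreInvalid` parameter are hypothetical and stand in for whatever the real flag is called.

```go
package main

import "fmt"

// speedBytes converts a link speed in Mbit/s to bytes per second.
// ok is false when the value is negative and should be skipped.
func speedBytes(speedMbit int64, ignoreInvalid bool) (bytesPerSecond float64, ok bool) {
	if speedMbit < 0 && ignoreInvalid {
		return 0, false // unknown speed: emit no sample
	}
	return float64(speedMbit) * 1000 * 1000 / 8, true
}

func main() {
	for _, speed := range []int64{1000, -1} {
		if value, ok := speedBytes(speed, true); ok {
			fmt.Printf("speed=%d Mbit/s -> %.0f bytes/s\n", speed, value)
		} else {
			fmt.Printf("speed=%d Mbit/s -> skipped\n", speed)
		}
	}
}
```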

Fixes: https://github.com/prometheus/node_exporter/issues/1967

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-03-18 11:36:31 +01:00
Julien Pivotto e7649ba48e Fix panic when using backwards compatible flags
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2021-03-15 14:59:49 +01:00
Ben Kochie 3b3ef7357f
Silence missing netclass errors
* Handle no such file and permission denied errors.
* Reduce excessive error wrapping.

Fixes: https://github.com/prometheus/node_exporter/issues/1840

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-03-03 20:40:08 +01:00
Ben Kochie 23e5b245a4
Sanitize strings from /sys/class/power_supply
Avoid panic on invalid UTF-8 from /sys/class/power_supply by
sanitizing strings parsed from the kernel.
* Add a broken string to the test fixtures.
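
For illustration, Go's standard library can drop invalid byte sequences with `strings.ToValidUTF8`. Whether the actual fix removes or replaces the bad bytes is not stated in the commit message; this sketch removes them, and `sanitizeSysfsString` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// sanitizeSysfsString removes byte sequences that are not valid UTF-8 so the
// result is safe to use as a label value.
func sanitizeSysfsString(raw string) string {
	return strings.ToValidUTF8(raw, "")
}

func main() {
	broken := "BAT0\xff\xfe model" // two stray bytes, as a buggy driver might report
	fmt.Printf("%q -> %q\n", broken, sanitizeSysfsString(broken))
}
```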

Fixes: https://github.com/prometheus/node_exporter/issues/1979

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-03-03 18:05:51 +01:00
Ben Kochie 46d0a0813f
Handle errors from disabled PSI subsystem
When CONFIG_PSI_DEFAULT_DISABLED=y, the pressure system returns
"operation not supported", rather than permission denied or not
exposing the /proc/pressure files.
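
A sketch of how the three cases might be told apart in Go, assuming the kernel error surfaces as `syscall.ENOTSUP` wrapped in an `*os.PathError`. `psiUnavailable` and `errPSIDisabled` are hypothetical names used only for this example.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"syscall"
)

// errPSIDisabled imitates the error from reading a pressure file while the
// PSI subsystem is disabled (assumed shape, for demonstration only).
var errPSIDisabled error = &os.PathError{
	Op:   "read",
	Path: "/proc/pressure/cpu",
	Err:  syscall.ENOTSUP,
}

// psiUnavailable reports whether err means pressure data is simply not
// available: files missing, access denied, or the subsystem disabled.
func psiUnavailable(err error) bool {
	return errors.Is(err, os.ErrNotExist) ||
		errors.Is(err, os.ErrPermission) ||
		errors.Is(err, syscall.ENOTSUP)
}

func main() {
	fmt.Println(psiUnavailable(errPSIDisabled))
	fmt.Println(psiUnavailable(errors.New("unexpected parse error")))
}
```

Errors in this class can then be logged at debug level or reported as "no data" instead of being treated as scrape failures.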

Fixes: https://github.com/prometheus/node_exporter/issues/1961

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-03-03 11:02:28 +01:00
Ben Kochie 5a6551e8ae
Fix some noisy log lines
* Bump procfs to include some fixes to error messages.
* Lower zpoolStatePaths log from Warn to Debug.

Fixes: https://github.com/prometheus/node_exporter/issues/1961
Fixes: https://github.com/prometheus/node_exporter/issues/1960

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-02-10 16:16:54 +01:00
Hu Shuai 4109a5089f Fix ineffassign issue
Signed-off-by: Hu Shuai <hus.fnst@cn.fujitsu.com>
2021-02-08 10:53:12 +08:00
Ben Kochie a37d3f659c
Release 1.1.0
* Update Build
  - Update CircleCI orb.
  - Update CircleCI Machine image.
  - Use golang-builder 1.15.
* Update Go modules.
* Fixup fixtures for XFS bug.

NOTE: We have improved some of the flag naming conventions (PR #1743). The old names are
      deprecated and will be removed in 2.0. They will continue to work for backwards
      compatibility.

* [CHANGE] Improve filter flag names #1743
* [CHANGE] Add btrfs and powersupplyclass to list of exporters enabled by default #1897
* [FEATURE] Add fibre channel collector #1786
* [FEATURE] Expose cpu bugs and flags as info metrics. #1788
* [FEATURE] Add network_route collector #1811
* [FEATURE] Add zoneinfo collector #1922
* [ENHANCEMENT] Add more InfiniBand counters #1694
* [ENHANCEMENT] Add flag to aggregate ipvs metrics to avoid high cardinality metrics #1709
* [ENHANCEMENT] Adding backlog/current queue length to qdisc collector #1732
* [ENHANCEMENT] Include TCP OutRsts in netstat metrics #1733
* [ENHANCEMENT] Add pool size to entropy collector #1753
* [ENHANCEMENT] Remove CGO dependencies for OpenBSD amd64 #1774
* [ENHANCEMENT] bcache: add writeback_rate_debug stats #1658
* [ENHANCEMENT] Add check state for mdadm arrays via node_md_state metric #1810
* [ENHANCEMENT] Expose XFS inode statistics #1870
* [ENHANCEMENT] Expose zfs zpool state #1878
* [ENHANCEMENT] Added an ability to pass collector.supervisord.url via SUPERVISORD_URL environment variable #1947
* [BUGFIX] filesystem_freebsd: Fix label values #1728
* [BUGFIX] Fix various procfs parsing errors #1735
* [BUGFIX] Handle no data from powersupplyclass #1747
* [BUGFIX] udp_queues_linux.go: change upd to udp in two error strings #1769
* [BUGFIX] Fix node_scrape_collector_success behaviour #1816
* [BUGFIX] Fix NodeRAIDDegraded to not use string rule expressions #1827
* [BUGFIX] Fix node_md_disks state label from fail to failed #1862
* [BUGFIX] Handle EPERM for syscall in timex collector #1938
* [BUGFIX] bcache: fix typo in a metric name #1943
* [BUGFIX] Fix XFS read/write stats (https://github.com/prometheus/procfs/pull/343)

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-02-05 21:23:23 +01:00