To reduce the cardinality of the interrupts collector, add
filtering options.
* Add include/exclude regexp filter flags.
* Add boolean flag to include zero values, enabled by default.
Signed-off-by: Ben Kochie <superq@gmail.com>
* Add include and exclude filter for sensors in hwmon collector
Fixes #2242
This commit adds two new flags (`collector.hwmon.sensor-include` and `collector.hwmon.sensor-exclude`) to the `hwmon` collector to allow inclusion or exclusion of specific sensors.
Some devices export nonsensical values for certain sensors. Here is an example:
```
node_hwmon_temp_celsius{chip="platform_nct6775_656",sensor="temp13"} 49.75
node_hwmon_temp_celsius{chip="platform_nct6775_656",sensor="temp15"} 3.892313987e+06
node_hwmon_temp_celsius{chip="platform_nct6775_656",sensor="temp16"} 3.892313987e+06
```
As a user, I would like to exclude only these sensors, not necessarily the complete device (as is currently possible with the `--collector.hwmon.chip-exclude` flag), as other sensor values might be sensible or desired.
The new option filters based both on device name and sensor name, separated by a semicolon. For example, to exclude the two sensors above, the following regex can be used:
~~~
--collector.hwmon.sensor-exclude="platform_nct6775_656;temp1[5,6]"
~~~
---------
Signed-off-by: Simon Krenger <skrenger@redhat.com>
Running node_exporter in containers is now a fairly well understood
problem. Replace the warnings with something less dire and more
prescriptive.
Signed-off-by: Ben Kochie <superq@gmail.com>
The DRM collector was missing from the README; this change adds it together with a short description.
Signed-off-by: L <3177243+LukeLR@users.noreply.github.com>
Pass the correct include value to the device filter function.
* Add new bogus hwmon fixture.
* Update end-to-end test to use hwmon chip include flag.
Signed-off-by: Ben Kochie <superq@gmail.com>
Mark the `supervisord` collector as deprecated. This process
supervisor, like `runit`, is out of scope for the node_exporter.
Signed-off-by: Ben Kochie <superq@gmail.com>
The ntp collector has always been a source of confusion and problems.
The data it produces is more of a blackbox probe against an NTP server.
The time sync / offset data produced is not what users expect.
Mark this collector as deprecated, to be removed in v2.0.0.
Signed-off-by: Ben Kochie <superq@gmail.com>
The --web.config flag was renamed to --web.config.file in
440a132c38, and the change was released in the recent
v1.5.0 release.
Signed-off-by: Joe Groocock <me@frebib.net>
The Copr community prometheus-exporters repository is obsolete.
Signed-off-by: Otto Sabart <seberm@seberm.com>
* Correctly name collector file.
* Fix cgroup summary type as gauge.
* Use a boolean metric rather than a label for enabled.
Signed-off-by: Ben Kochie <superq@gmail.com>
* Update build to Go 1.18.
* Update minimum version to 1.17.
* Update machine image to latest.
* Enable dependabot.
* Simplify build in readme.
Signed-off-by: Ben Kochie <superq@gmail.com>
The new `lnstat` collector produces a high number of metrics, per-cpu,
and results in approximately double the number of metrics previously
scraped. For example, a typical server with 64 cores produces 3832
lnstat metrics compared to 4147 metrics for the remaining collectors.
Therefore disable the `lnstat` collector by default.
Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
Add a DMI collector to expose the Desktop Management Interface (DMI)
info from `/sys/class/dmi/id/`. This will expose information about the
BIOS, mainboard, chassis, and product.
Closes: https://github.com/prometheus/node_exporter/issues/303
Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
The ethtool_cmd struct from the Linux kernel contains information about the speeds and features supported by a
network device. This includes speeds and duplex modes, but also features like autonegotiation and 802.3x pause frames.
Closes #1444
Signed-off-by: W. Andrew Denton <git@flying-snail.net>
Currently Node Exporter has a metric called `node_uname_info` which of
course exposes uname info. While this is nice, it does not help if you
are running different OSes which could have similar uname info.
Therefore parse `/etc/os-release` or `/usr/lib/os-release` and expose a
`node_os_info` metric which provides information regarding the OS
release/version of the node. Also expose the major.minor part of the OS
release version as `node_os_version`.
Since the os-release files will not change often, cache the parsed
content and only refresh the cache if the modification time changes.
This `os` collector will read files outside of `/proc` and `/sys`, but
the os-release file is widely used and the format is standardized:
https://www.freedesktop.org/software/systemd/man/os-release.html
Bug: https://github.com/prometheus/node_exporter/issues/1574
Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
Add a `node_ethtool_info` metric to all ethtool devices to expose driver
information with the following labels:
* bus_info
* driver
* expansion_rom_version
* firmware_version
* version
This metric is useful for checking that the firmware version is up to date.
Note: The version label might be malformed due to bug #39 in ethtool:
https://github.com/safchain/ethtool/issues/39
Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
NOTE: Ignoring invalid network speed will be the default in 2.x
NOTE: Filesystem collector flags have been renamed. `--collector.filesystem.ignored-mount-points` is now `--collector.filesystem.mount-points-exclude` and `--collector.filesystem.ignored-fs-types` is now `--collector.filesystem.fs-types-exclude`. The old flags will be removed in 2.x.
* [CHANGE] Rename filesystem collector flags to match other collectors #2012
* [CHANGE] Make node_exporter print usage to STDOUT #2039
* [FEATURE] Add conntrack statistics metrics #1155
* [FEATURE] Add ethtool stats collector #1832
* [FEATURE] Add flag to ignore network speed if it is unknown #1989
* [FEATURE] Add tapestats collector for Linux #2044
* [FEATURE] Add nvme collector #2062
* [ENHANCEMENT] Add ErrorLog plumbing to promhttp #1887
* [ENHANCEMENT] Add more Infiniband counters #2019
* [ENHANCEMENT] netclass: retrieve interface names and filter before parsing #2033
* [ENHANCEMENT] Add time zone offset metric #2060
* [BUGFIX] Handle errors from disabled PSI subsystem #1983
* [BUGFIX] Fix panic when using backwards compatible flags #2000
* [BUGFIX] Fix wrong value for OpenBSD memory buffer cache #2015
* [BUGFIX] Only initiate collectors once #2048
* [BUGFIX] Handle small backwards jumps in CPU idle #2067
Signed-off-by: Ben Kochie <superq@gmail.com>
Add a collector for NVMes to expose the firmware versions. This requires
procfs >= 0.7.0.
Fixes #1891
Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
* Update Build
- Update CircleCI orb.
- Update CircleCI Machine image.
- Use golang-builder 1.15.
* Update Go modules.
* Fixup fixtures for XFS bug.
NOTE: We have improved some of the flag naming conventions (PR #1743). The old names are
deprecated and will be removed in 2.0. They will continue to work for backwards
compatibility.
* [CHANGE] Improve filter flag names #1743
* [CHANGE] Add btrfs and powersupplyclass to list of exporters enabled by default #1897
* [FEATURE] Add fibre channel collector #1786
* [FEATURE] Expose cpu bugs and flags as info metrics. #1788
* [FEATURE] Add network_route collector #1811
* [FEATURE] Add zoneinfo collector #1922
* [ENHANCEMENT] Add more InfiniBand counters #1694
* [ENHANCEMENT] Add flag to aggregate ipvs metrics to avoid high cardinality metrics #1709
* [ENHANCEMENT] Adding backlog/current queue length to qdisc collector #1732
* [ENHANCEMENT] Include TCP OutRsts in netstat metrics #1733
* [ENHANCEMENT] Add pool size to entropy collector #1753
* [ENHANCEMENT] Remove CGO dependencies for OpenBSD amd64 #1774
* [ENHANCEMENT] bcache: add writeback_rate_debug stats #1658
* [ENHANCEMENT] Add check state for mdadm arrays via node_md_state metric #1810
* [ENHANCEMENT] Expose XFS inode statistics #1870
* [ENHANCEMENT] Expose zfs zpool state #1878
* [ENHANCEMENT] Added an ability to pass collector.supervisord.url via SUPERVISORD_URL environment variable #1947
* [BUGFIX] filesystem_freebsd: Fix label values #1728
* [BUGFIX] Fix various procfs parsing errors #1735
* [BUGFIX] Handle no data from powersupplyclass #1747
* [BUGFIX] udp_queues_linux.go: change upd to udp in two error strings #1769
* [BUGFIX] Fix node_scrape_collector_success behaviour #1816
* [BUGFIX] Fix NodeRAIDDegraded to not use string rule expressions #1827
* [BUGFIX] Fix node_md_disks state label from fail to failed #1862
* [BUGFIX] Handle EPERM for syscall in timex collector #1938
* [BUGFIX] bcache: fix typo in a metric name #1943
* [BUGFIX] Fix XFS read/write stats (https://github.com/prometheus/procfs/pull/343)
Signed-off-by: Ben Kochie <superq@gmail.com>