Commit graph

2052 commits

Author SHA1 Message Date
Benoît Knecht 3b9613cfae
collector/netdev_linux.go: Fallback to 32-bit stats (#2757)
On some platforms, `msg.Attributes.Stats64` is `nil` because the kernel doesn't
expose 64-bit stats. In that case, return `msg.Attributes.Stats` instead, which
are the 32-bit equivalent.

Note that `RXOtherhostDropped` isn't available in that case, so we hardcode it
to zero.

Fixes #2756.

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
2023-08-01 15:58:53 +02:00
L 5d1b96c936 Include drm collector in README
The DRM collector was missing in the README, this change includes it together with a short description.

Signed-off-by: L <3177243+LukeLR@users.noreply.github.com>
2023-07-31 13:14:13 +01:00
PrometheusBot 8fb4f78ce5
Update common Prometheus files (#2752)
Signed-off-by: prombot <prometheus-team@googlegroups.com>
2023-07-18 21:08:19 +02:00
PrometheusBot fa481315b5
Synchronize common files from prometheus/prometheus (#2736)
* Update common Prometheus files

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* Fixup linting issues

* Disbale unused-parameter check.
* Fixup minor linting issues.

Signed-off-by: Ben Kochie <superq@gmail.com>

---------

Signed-off-by: prombot <prometheus-team@googlegroups.com>
Signed-off-by: Ben Kochie <superq@gmail.com>
Co-authored-by: Ben Kochie <superq@gmail.com>
2023-07-18 10:46:59 +02:00
Ben Kochie 2d8069208c
Release v1.6.1 (#2747)
Rebuild with latest Go compiler bugfix release.

Signed-off-by: Ben Kochie <superq@gmail.com>
2023-07-17 13:58:44 +02:00
Ben Kochie 7c564bcbef
Fixup hwmon chip include (#2739)
Use the correct include value to the device filter function.
* Add new bogus hwmon fixture.
* Update end-to-end test to use hwmon chip include flag.

Signed-off-by: Ben Kochie <superq@gmail.com>
2023-07-10 12:46:30 +02:00
Conall O'Brien c241ecf8bd
Update all Include and Exclude variables to use the systemdUnit naming (#2740)
prefix.

Leave an annotation about using regexps instead of device_filter.go, so
@SuperQ doesn't need to remember everything.

Signed-off-by: Conall O'Brien <conall@conall.net>
2023-07-10 12:25:18 +02:00
Gabi Davar f4344579d5
Add missing ethtool flag documentation (#2743)
Signed-off-by: Gabi Davar <grizzly.nyo@gmail.com>
2023-07-08 09:36:36 +02:00
Conall O'Brien 8b4dc82488
Add include and exclude filter for hwmon collector (#2699)
* Add include and exclude flags chip name flags to hwmon collector, following example in systemd collector

---------

Signed-off-by: Conall O'Brien <conall@conall.net>
Co-authored-by: Ben Kochie <superq@gmail.com>
2023-07-07 10:30:24 +02:00
Ben Kochie ed57c15e2c
Merge pull request #2644 from v-zhuravlev/mixin_alerts
Mixin: Add and update alerts
2023-07-04 09:12:00 +02:00
dependabot[bot] a24344d4a8 build(deps): bump github.com/prometheus/client_golang
Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.15.1 to 1.16.0.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md)
- [Commits](https://github.com/prometheus/client_golang/compare/v1.15.1...v1.16.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-07-03 11:57:59 +02:00
dependabot[bot] 74da9f9d85 build(deps): bump github.com/beevik/ntp from 1.0.0 to 1.1.1
Bumps [github.com/beevik/ntp](https://github.com/beevik/ntp) from 1.0.0 to 1.1.1.
- [Release notes](https://github.com/beevik/ntp/releases)
- [Changelog](https://github.com/beevik/ntp/blob/main/RELEASE_NOTES.md)
- [Commits](https://github.com/beevik/ntp/compare/v1.0.0...v1.1.1)

---
updated-dependencies:
- dependency-name: github.com/beevik/ntp
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-07-03 11:57:04 +02:00
Michal c31ebb4359
Add cpu vulnerabilities reporting from sysfs (#2721)
* Add cpu vulnerabilities reporting from sysfs

---------

Signed-off-by: Michal Wasilewski <michal@mwasilewski.net>
2023-07-01 14:21:49 +02:00
prombot 3e3ab1778b Update common Prometheus files
Signed-off-by: prombot <prometheus-team@googlegroups.com>
2023-07-01 13:14:03 +02:00
Vitaly Zhuravlev e8d7f4e8b3 Revert alerts pending durtions
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:52 +08:00
Vitaly 3e250a95a0 Update NodeSystemSaturation severity
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev b7dfb32bfc Set severity to NodeCPUHighUsage to info
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev 6bdc1d9c98 Add thresholds for memory, disk and system alerts
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev 77ae769179 Add thresholds for memory alerts
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev 2111e70ac7 Add comma after 'mounted on'
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev e48e7909f4 Extend alert description
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev da32f8de17 Decrease NodeSystemdServiceFailed severity to warning
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev 580c497261 Add NodeSystemSaturation and NodeMemoryMajorPagesFaults
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev e15e7d6a7b Fix NodeMemoryHighUtilization alert
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev c3ec6e8af1 Add diskDevice selector
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev 962de6c921 Add %(nodeExporterSelector)s to Network and conntrack alerts
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev 94fc82e418 Add NodeDiskIOSaturation alert
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev 614030bb80 Set 'at' everywhere as preposition for instance
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev 3d8075da7d Decrease NodeNetwork*Errs pending period
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:51 +08:00
Vitaly Zhuravlev 74794182a7 Add failed systemd service alert
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:51 +08:00
Vitaly Zhuravlev fd2d62af63 Add CPU and memory alerts
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:51 +08:00
Vitaly Zhuravlev 0e0399d41e Decrease NodeFilesystem pending time to 15m
30m is too long and there is a risk of running out of disk space/inodes completely if something is filling up disk very fast (like log file).

Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:51 +08:00
Vitaly Zhuravlev fc967aa992 Add mountpoint to NodeFilesystem alerts
This helps to identify alerting filesystem.

Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:51 +08:00
Ben Kochie a11de2ede5
Update golangci-lint config (#2722)
* Migrate from Python codespell to golangci-lint misspell.
* Inline errcheck exclude list in the golangci-lint config.

Signed-off-by: Ben Kochie <superq@gmail.com>
2023-06-21 10:07:30 +02:00
PrometheusBot d1b634fb80
Update common Prometheus files (#2723)
Signed-off-by: prombot <prometheus-team@googlegroups.com>
2023-06-18 10:20:37 +02:00
Cam Cope 2346fd9b06
add missing linkspeeds (#2711)
Signed-off-by: Cam Cope <ccope@crusoeenergy.com>
2023-06-18 09:01:53 +02:00
Ben Kochie 742ec0970a
Bump ethtool library (#2720)
Update to latest release.

Signed-off-by: Ben Kochie <superq@gmail.com>
2023-06-14 16:23:01 +02:00
Ben Kochie 6fb78f70b3
Bump wifi Go module (#2719)
Update github.com/mdlayher/wifi to the latest commit.

Signed-off-by: Ben Kochie <superq@gmail.com>
2023-06-14 15:49:23 +02:00
Junqi Zhao 107ca037c4
fix misspel in CHANGELOG.md (#2717)
Signed-off-by: juzhao <juzhao@redhat.com>
2023-06-13 16:34:41 +02:00
Erica Mays bdc430af2b Parallelize stat calls in Linux filesystem collector.
This change adds the ability to process multiple stat calls in parallel.
Processing is rate-limited based on the new flag
`collector.filesystem.stat-workers` (default 4).

Caveat: filesystem stats information is no longer in the same order as
returned by `/proc/1/mounts`.  This should not be an issue.

Caveat: This change currently uses unbuffered channels to prove
correctness without reliance on buffers.  Buffered channels will yield
superior performance.

Signed-off-by: Erica Mays <erica@emays.dev>
2023-06-09 12:31:31 +02:00
dependabot[bot] 75d951d47a build(deps): bump github.com/jsimonetti/rtnetlink from 1.3.2 to 1.3.3
Bumps [github.com/jsimonetti/rtnetlink](https://github.com/jsimonetti/rtnetlink) from 1.3.2 to 1.3.3.
- [Release notes](https://github.com/jsimonetti/rtnetlink/releases)
- [Commits](https://github.com/jsimonetti/rtnetlink/compare/v1.3.2...v1.3.3)

---
updated-dependencies:
- dependency-name: github.com/jsimonetti/rtnetlink
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-06-02 15:50:48 +02:00
dependabot[bot] fc686da10a build(deps): bump github.com/prometheus/procfs from 0.10.0 to 0.10.1
Bumps [github.com/prometheus/procfs](https://github.com/prometheus/procfs) from 0.10.0 to 0.10.1.
- [Release notes](https://github.com/prometheus/procfs/releases)
- [Commits](https://github.com/prometheus/procfs/compare/v0.10.0...v0.10.1)

---
updated-dependencies:
- dependency-name: github.com/prometheus/procfs
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-06-02 15:50:34 +02:00
dependabot[bot] 386c4086dd build(deps): bump github.com/beevik/ntp from 0.3.0 to 1.0.0
Bumps [github.com/beevik/ntp](https://github.com/beevik/ntp) from 0.3.0 to 1.0.0.
- [Release notes](https://github.com/beevik/ntp/releases)
- [Changelog](https://github.com/beevik/ntp/blob/main/RELEASE_NOTES.md)
- [Commits](https://github.com/beevik/ntp/compare/v0.3.0...v1.0.0)

---
updated-dependencies:
- dependency-name: github.com/beevik/ntp
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-06-02 15:50:17 +02:00
Ben Kochie ff7f9d69b6
Release v1.6.0 (#2701)
* [CHANGE] Fix cpustat when some cpus are offline #2318
* [CHANGE] Remove metrics of offline CPUs in CPU collector #2605
* [CHANGE] Deprecate ntp collector #2603
* [CHANGE] Remove bcache `cache_readaheads_totals` metrics #2583
* [CHANGE] Deprecate supervisord collector #2685
* [FEATURE] Enable uname collector on NetBSD #2559
* [FEATURE] NetBSD support for the meminfo collector #2570
* [FEATURE] NetBSD support for CPU collector #2626
* [FEATURE] Add FreeBSD collector for netisr subsystem #2668
* [FEATURE] Add softirqs collector #2669
* [ENHANCEMENT] Add suspended as a `node_zfs_zpool_state` #2449
* [ENHANCEMENT] Add administrative state of Linux network interfaces #2515
* [ENHANCEMENT] Log current value of GOMAXPROCS #2537
* [ENHANCEMENT] Add profiler options for perf collector #2542
* [ENHANCEMENT] Allow root path as metrics path #2590
* [ENHANCEMENT] Add cpu frequency governor metrics #2569
* [ENHANCEMENT] Add new landing page #2622
* [ENHANCEMENT] Reduce privileges needed for btrfs device stats #2634
* [ENHANCEMENT] Add ZFS `memory_available_bytes` #2687
* [ENHANCEMENT] Use `SCSI_IDENT_SERIAL` as serial in diskstats #2612
* [ENHANCEMENT] Read missing from netlink netclass attributes from sysfs #2669
* [BUGFIX] perf: fixes for automatically detecting the correct tracefs mountpoints #2553
* [BUGFIX] Fix `thermal_zone` collector noise @2554
* [BUGFIX] Fix a problem fetching the user wire count on FreeBSD 2584
* [BUGFIX] interrupts: Fix fields on linux aarch64 #2631
* [BUGFIX] Remove metrics of offline CPUs in CPU collector #2605
* [BUGFIX] Fix OpenBSD filesystem collector string parsing #2637
* [BUGFIX] Fix bad reporting of `node_cpu_seconds_total` in OpenBSD #2663

Signed-off-by: Ben Kochie <superq@gmail.com>
2023-05-27 14:01:17 +02:00
Johannes Dilli a1ed08430b
Update ansible role in README.md (#2702)
https://github.com/cloudalchemy/ansible-node-exporter has been deprecated

Signed-off-by: Johannes Dilli <jd1@users.noreply.github.com>
2023-05-26 09:14:38 +02:00
Dan Williams 8c5847bd94
netlink: read missing attributes from sysfs (#2669)
Read missing dev_id, name_assign_type, and addr_assign_type
from sysfs, since they only take a device-specific lock and
not the whole RTNL lock. This means reading them is much less
impactful on other system processes than many of the other
attributes in sysfs that do take the RTNL lock.

Signed-off-by: Dan Williams <dcbw@redhat.com>
2023-05-25 15:10:39 +02:00
Abbey Woodyear eaacb2e3c7
exposing softirq metrics (#2294)
Signed-off-by: abbeywoodyear <abbey.woodyear@thehutgroup.com>
2023-05-25 15:09:32 +02:00
Remi Jouannet df1b53bee2
softnet: additionals metrics from softnet_data, (#2592)
* softnet: additionals metrics from softnet_data, https://github.com/prometheus/procfs/pull/473
---------

Signed-off-by: remi <remijouannet@gmail.com>
Signed-off-by: Rémi Jouannet <remijouannet@gmail.com>
2023-05-24 17:23:13 +02:00
Benoît Knecht c05b97ce32
collector/diskstats: Use SCSI_IDENT_SERIAL as serial (#2612)
On most hard drives, `ID_SERIAL_SHORT` and `SCSI_IDENT_SERIAL` are identical,
but on some SAS drives they do differ. In that case, `SCSI_IDENT_SERIAL`
corresponds to the serial number printed on the drive label, and to the value
returned by `smartctl -i`.

So use that value by default for the `serial` label on the `node_disk_info`
metric, and fallback to `ID_SERIAL_SHORT` only if it's undefined.

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
2023-05-24 10:19:18 +02:00
Ben Kochie da0b2ca3c2 Deprecate supervisord collector
Mark the `supervisord` as deprecated. This process
supevisor, like `runit`, is of scope for the node_exporter.

Signed-off-by: Ben Kochie <superq@gmail.com>
2023-05-23 18:10:42 +02:00