64 commits

Author SHA1 Message Date
Ben Kochie 59c146e57d
Update end-to-end test for aarch64 (#2415)
Fix up handling of CPU info collector on non-x86_64 systems due to
fixtures containing `/proc/cpuinfo` from x86_64.
* Update e2e 64k page test fixture from an arm64 system.
* Enable ARM testing in CircleCI.

Fixes: https://github.com/prometheus/node_exporter/issues/1959

Signed-off-by: Ben Kochie <superq@gmail.com>
2022-06-26 09:41:21 +02:00
Ben Kochie a516d4de4a
Cleanup cgroups collector (#2414)
* Correctly name collector file.
* Fix cgroup summary type as gauge.
* Use a boolean metric rather than a label for enabled.

Signed-off-by: Ben Kochie <superq@gmail.com>
2022-06-24 17:15:31 +02:00
Ben Kochie eecc2b1dea
Add device filter flags to arp collector
Allow filtering ARP entries based on device. Useful for ignoring
entries for network namespaces (containers).

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-12-16 15:41:10 +01:00
Ben Kochie 1d5afd05b5
Sanitize UTF-8 in dmi collector (#2229)
Replace invalid UTF-8 chars with "�" string.

Fixes: https://github.com/prometheus/node_exporter/issues/2228

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-12-01 11:13:43 +01:00
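The sanitization described in 1d5afd05b5 can be approximated with the standard library. The sketch below assumes `strings.ToValidUTF8` (Go 1.13 or later) is an acceptable way to do it; the helper name `sanitizeUTF8` is hypothetical and the collector's real code may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// sanitizeUTF8 replaces each run of invalid UTF-8 bytes with the Unicode
// replacement character U+FFFD. The function name is hypothetical.
func sanitizeUTF8(s string) string {
	return strings.ToValidUTF8(s, "\uFFFD")
}

func main() {
	raw := "Vendor\xffName" // the byte 0xff never appears in valid UTF-8
	fmt.Printf("%q\n", sanitizeUTF8(raw))
}
```

Valid input passes through unchanged, so the helper is safe to apply to every string read from DMI files.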
Jacob Vosmaer 5c8d162ca6
Add node_softirqs_total metric (#2221)
This adds a new Linux metric, node_softirqs_total, which corresponds
to the 'softirq' line in /proc/stat. This metric is disabled by
default and it can be enabled with '--collector.stat.softirq'.

Signed-off-by: Jacob Vosmaer <jacob@gitlab.com>
2021-12-01 09:55:13 +01:00
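For readers unfamiliar with the `softirq` line that 5c8d162ca6 refers to: per proc(5), its first column is the total across all softirq types and each following column is the count for one type. A minimal Go sketch of reading it, with a hypothetical `parseSoftIRQ` helper and made-up sample data, might look like this:

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseSoftIRQ finds the "softirq" line in /proc/stat-style content and
// returns the total followed by the per-type counters.
// Note: bufio.Scanner's default token size limit applies to each line.
func parseSoftIRQ(stat string) (uint64, []uint64, error) {
	sc := bufio.NewScanner(strings.NewReader(stat))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) < 2 || fields[0] != "softirq" {
			continue
		}
		values := make([]uint64, 0, len(fields)-1)
		for _, f := range fields[1:] {
			v, err := strconv.ParseUint(f, 10, 64)
			if err != nil {
				return 0, nil, fmt.Errorf("parsing softirq value %q: %w", f, err)
			}
			values = append(values, v)
		}
		return values[0], values[1:], nil
	}
	if err := sc.Err(); err != nil {
		return 0, nil, err
	}
	return 0, nil, fmt.Errorf("no softirq line found")
}

func main() {
	// Made-up sample; a real /proc/stat has many more lines and columns.
	sample := "cpu 10 20 30 40\nsoftirq 10 1 2 7\n"
	total, perType, err := parseSoftIRQ(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(total, perType)
}
```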
Johannes 'fish' Ziemke 85e20238e7
Add clocksource metrics to time collector (#2197)
* Add clocksource metrics to time collector

This closes #1336

Signed-off-by: Johannes 'fish' Ziemke <github@freigeist.org>
2021-11-12 11:45:31 +01:00
Aleksei Zakharov 0e6b23c338
Lnstat: expose metrics from /proc/net/stat (#1771)
* Lnstat initial commit

Signed-off-by: Aleksei Zakharov <zaharov@selectel.ru>
Co-authored-by: Johannes 'fish' Ziemke <github@freigeist.org>
2021-09-28 10:24:18 +02:00
Benjamin Drung b6215e649c Add os release collector
Currently Node Exporter has a metric called `node_uname_info` which of
course exposes uname info. While this is nice, it does not help if you
are running different OSes which could have similar uname info.

Therefore parse `/etc/os-release` or `/usr/lib/os-release` and expose a
`node_os_info` metric which provides information regarding the OS
release/version of the node. Also expose the major.minor part of the OS
release version as `node_os_version`.

Since the os-release files will not change often, cache the parsed
content and only refresh the cache if the modification time changes.

This `os` collector will read files outside of `/proc` and `/sys`, but
the os-release file is widely used and the format is standardized:
https://www.freedesktop.org/software/systemd/man/os-release.html

Bug: https://github.com/prometheus/node_exporter/issues/1574
Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-08-19 14:04:21 +02:00
Ben Kochie 9893fca77e
Add flag to ignore network speed if it is unknown
Some devices (e.g. virtual ones) don't have a speed and report `-1` as the
speed value. Add a flag to allow ignoring speed on these devices.

Fixes: https://github.com/prometheus/node_exporter/issues/1967

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-03-18 11:36:31 +01:00
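The behaviour introduced by 9893fca77e can be illustrated with a small conversion helper. The sketch assumes the sysfs `speed` value is in Mbit/s and that the metric is exposed in bytes per second; `speedBytes` and its `ignoreUnknown` parameter are hypothetical names, not the collector's actual flag or code.

```go
package main

import "fmt"

// speedBytes converts a link speed in Mbit/s to bytes per second.
// ok is false when the value is negative (unknown) and ignoreUnknown is set,
// meaning the caller should not emit a metric for this device.
func speedBytes(speedMbit int64, ignoreUnknown bool) (bytesPerSec float64, ok bool) {
	if speedMbit < 0 && ignoreUnknown {
		return 0, false
	}
	return float64(speedMbit) * 1000 * 1000 / 8, true
}

func main() {
	for _, speed := range []int64{1000, -1} {
		v, ok := speedBytes(speed, true)
		fmt.Printf("speed=%d emit=%t value=%.0f\n", speed, ok, v)
	}
}
```

Without the ignore option, a `-1` speed would be converted like any other value and produce a meaningless negative rate.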
Ben Kochie 1729558e11
Merge pull request #1922 from kwisniewski98/zone
Add zoneinfo collector
2021-02-05 13:57:54 +01:00
Ben Kochie 78682c80af
Merge pull request #1786 from deusnefum/master
Add fibre channel collector
2021-02-03 18:22:59 +01:00
Wisniewski, Krzysztof2 997a8fbb7f Add zoneinfo collector
Signed-off-by: Wisniewski, Krzysztof2 <Krzysztof2.Wisniewski@intel.com>
2021-01-26 12:00:35 +01:00
Aleksei Zakharov 3b035c8fa1
bcache: add priorityStats flag (#1621)
* bcache: add priorityStats flag

Fixes #1593

Signed-off-by: Aleksei Zakharov <zaharov@selectel.ru>
2020-08-10 16:50:58 +02:00
domchan 503e4fc848
Expose cpu bugs and flags as info metrics. (#1788)
* Expose cpu bugs and flags as info metrics with a regexp filter.
* Automatically enable CPU info metrics when using flags or bugs feature.

Signed-off-by: domgoer <domdoumc@gmail.com>
2020-07-17 18:32:23 +02:00
mhiles 076c953488 update fixtures / e2e test for fibre channel
Signed-off-by: mhiles <hiles@hpe.com>
2020-07-13 09:30:19 -04:00
Peter Bueschel da5972b539
Add gauges for allocated memory for queued UDP and TCP packets (#1503)
* Two new states will be added to the tcpstat collector called rx_queued_bytes and tx_queued_bytes.

For UDP datagrams an additional collector 'udp_queues' can be used to expose the total lengths of the tx_queue and rx_queue.
@SuperQ and @discordianfish this change gives us the option to check for overloaded UDP + TCP processing.
The names of the new TCP states and the UDP metric can be discussed.
The current reasons are just:

* I don't want to add another collector for the same exposed file, so I just added the new states to the tcpstat collector.
* I chose the name 'udp_queues' instead of 'udpstat' as UDP has no state.

Signed-off-by: Peter Bueschel <peter.bueschel@logmein.com>
2020-03-31 10:46:32 +02:00
Silke Hofstra 8faa843fc4
Add Btrfs collector (#1512)
* Add procfs/btrfs to vendor folder
* Add Btrfs collector

Resolves #1100

Signed-off-by: Silke Hofstra <silke@slxh.eu>
2020-02-19 15:48:51 +01:00
Ukri Niemimuukko eac3e30f7f rapl_linux collector
This exposes RAPL statistics from /sys/class/powercap.

Co-Authored-By: Ben Kochie <superq@gmail.com>
Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com>
2020-02-01 12:06:30 +01:00
Paul Gier 4d72cb8059 add node_cpu_info metric
Contains information gathered from /proc/cpuinfo

Signed-off-by: Paul Gier <pgier@redhat.com>
2019-09-25 14:38:57 -05:00
Richard Kojedzinszky 75462bf4fe Scrape thermal_zone temperatures (#1425)
* Scrape thermal_zone temperatures

Signed-off-by: Richard Kojedzinszky <richard@kojedz.in>
2019-08-04 12:56:36 +02:00
Phil Frost f693a71c06 Scrape CPU latency stats from /proc/schedstat (#1389)
These are useful as a direct indication of CPU contention and task
scheduler latency.

Handy references:
 - https://github.com/torvalds/linux/blob/master/Documentation/scheduler/sched-stats.txt
 - https://doc.opensuse.org/documentation/leap/tuning/html/book.sle.tuning/cha.tuning.taskscheduler.html

procfs is updated to pull in the enabling change:
https://github.com/prometheus/procfs/pull/186

Signed-off-by: Phil Frost <phil@postmates.com>
2019-07-10 09:16:24 +02:00
Daniele Sluijters cc2fd82008 Expose /proc/pressure (#1261)
This enables the collection of pressure stall information as exposed
by the `/proc/pressure` interface added in the 4.20 release of the
Linux kernel.

Closes #1174

Signed-off-by: Daniele Sluijters <daenney@users.noreply.github.com>
2019-04-18 12:19:20 +02:00
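Each file under `/proc/pressure` (as exposed by cc2fd82008) contains lines of the form `some avg10=0.00 avg60=0.00 avg300=0.00 total=0`. The Go sketch below parses one such line; the `psiLine` type and `parsePSILine` function are hypothetical names for illustration, and the actual collector relies on the procfs library instead.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// psiLine holds one line of pressure stall information.
type psiLine struct {
	Kind   string // "some" or "full"
	Avg10  float64
	Avg60  float64
	Avg300 float64
	Total  uint64 // total stall time in microseconds
}

// parsePSILine parses a line such as
// "some avg10=0.00 avg60=0.00 avg300=0.00 total=0".
func parsePSILine(line string) (psiLine, error) {
	fields := strings.Fields(line)
	if len(fields) != 5 {
		return psiLine{}, fmt.Errorf("expected 5 fields, got %d", len(fields))
	}
	p := psiLine{Kind: fields[0]}
	for _, f := range fields[1:] {
		key, val, ok := strings.Cut(f, "=")
		if !ok {
			return psiLine{}, fmt.Errorf("malformed field %q", f)
		}
		var err error
		switch key {
		case "avg10":
			p.Avg10, err = strconv.ParseFloat(val, 64)
		case "avg60":
			p.Avg60, err = strconv.ParseFloat(val, 64)
		case "avg300":
			p.Avg300, err = strconv.ParseFloat(val, 64)
		case "total":
			p.Total, err = strconv.ParseUint(val, 10, 64)
		default:
			err = fmt.Errorf("unknown key %q", key)
		}
		if err != nil {
			return psiLine{}, err
		}
	}
	return p, nil
}

func main() {
	p, err := parsePSILine("some avg10=1.50 avg60=0.25 avg300=0.00 total=12345")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", p)
}
```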
Paul Gier cc847f2f44 collector/cpu: split cpu freq metrics into separate collector (#1253)
The cpu frequency information is not always needed and/or available.
This change allows the cpu frequency metrics to be enabled/disabled
separately from the other cpu metrics, and also prevents a frequency
metric failure (such as a parse error) from failing the main cpu
collector.

Fixes #1241

Signed-off-by: Paul Gier <pgier@redhat.com>
2019-02-19 17:22:54 +01:00
Jan Klat c4102f1175 Add sys/class/net parsing from procfs and expose its metrics (#851)
* add sys/class/net parsing from procfs and expose its metrics

Signed-off-by: Jan Klat <jenik@klatys.cz>

* change code to use int pointers per procfs change, move netclass to separate collector, change metric naming

Signed-off-by: Jan Klat <jenik@klatys.cz>

* bump year in licence, remove redundant newline, correct fixtures

Signed-off-by: Jan Klat <jenik@klatys.cz>

* fix style

Signed-off-by: Jan Klat <jenik@klatys.cz>

* change carrier changes to counter type

Signed-off-by: Jan Klat <jenik@klatys.cz>

* fix e2e output

Signed-off-by: Jan Klat <jenik@klatys.cz>

* add fixtures

Signed-off-by: Jan Klat <jenik@klatys.cz>

* update vendor, use fixtures correctly

Signed-off-by: Jan Klat <jenik@klatys.cz>

* change fixtures (device in /sys/class/net should be symlinked)

Signed-off-by: Jan Klat <jenik@klatys.cz>

* correct fixtures for 64k page, updated readme

Signed-off-by: Jan Klat <jenik@klatys.cz>
2018-07-16 15:08:18 +02:00
Pavlo Kutishchev 456bf5094a Add processes exporter (#950)
* Add processes exporter

Signed-off-by: Pavel Kutishchev <pavel.kutishchev@olx.com>
Signed-off-by: Ben Kochie <superq@gmail.com>
2018-06-05 19:38:32 +02:00
Alexey Kopytov dd98a09bb2 A couple of ARM64-related fixes (#934)
* Do not rely on AArch64 CPUs to support 32-bit ARM for cross-testing.

Signed-off-by: Alexey Kopytov <akopytov@gmail.com>

* aarch64, like ppc64le, reports 64k node_sockstat_TCP_mem_bytes due to 64k pages.

Signed-off-by: Alexey Kopytov <akopytov@gmail.com>
2018-05-14 15:55:49 +02:00
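The page-size dependence noted in dd98a09bb2 comes from simple arithmetic: the kernel reports TCP socket memory in pages, so a byte value is the page count multiplied by the page size. A small Go sketch, with a hypothetical `tcpMemBytes` helper, shows why the same fixture yields different values on 4k and 64k systems:

```go
package main

import (
	"fmt"
	"os"
)

// tcpMemBytes converts a memory figure given in pages to bytes.
func tcpMemBytes(pages uint64, pageSize int) uint64 {
	return pages * uint64(pageSize)
}

func main() {
	const pages = 1 // made-up "mem" value for illustration
	fmt.Println("4k pages: ", tcpMemBytes(pages, 4096))
	fmt.Println("64k pages:", tcpMemBytes(pages, 65536))
	fmt.Println("this host:", tcpMemBytes(pages, os.Getpagesize()))
}
```

This is why per-architecture expected output is needed when a test compares byte values derived from page counts.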
Brian Brazil 499c342fed Greatly reduce the metrics vmstat returns by default.
Vmstat has over 100 fields, most of which are highly
detailed debug information. Trim this down to only
essential fields by default, configurable by flag.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2018-03-29 22:00:02 +01:00
Matt Layher dcb31670d6 Makefile: add checkmetrics target, use in CI (#797)
2018-02-13 18:04:03 +01:00
Ben Kochie 6a041692ed
Add NFS Server metrics collector. (#803)
* Add NFS Server metrics collector.

* Add File Handles metrics.

* Add nfsd IO stats.

* Add metrics for NFSd threads.

* Add metrics for NFSd read ahead cache.

* Add NFSd network traffic counters.

* Add RPC metrics.

* Add V2 requests metrics.

* Add NFSv3 metrics.

* Add NFSv4 metrics.

* Update reply cache comment.

* Update help text.
2018-02-12 17:56:05 +01:00
Ben Kochie 111e3af437
Remove obsolete megacli collector. (#798)
This collector has been replaced by the textfile collector tool
`storcli.py`.
2018-01-23 11:25:42 +01:00
Julius Volz 6cac74f0e0
Add unit suffix to textfile collector mtime metric (#796)
2018-01-22 14:02:19 +01:00
Ben Kochie b4d7ba119a
Add fixture for ppc64le (#785)
* Add support for per-architecture fixtures.
* Add output for ppc64le.
2018-01-11 13:56:19 +01:00
Ben Kochie ea250d73f4
Fix off by one in Linux interrupts collector (#721)
* Fix off by one in Linux interrupts collector

* Fix off by one in CPU column handler.
* Add test.

* Enable interrupts in end-to-end test.
2017-11-02 09:59:46 +01:00
Calle Pettersson 859a825bb8 Replace --collectors.enabled with per-collector flags (#640)
* Move NodeCollector into package collector

* Refactor collector enabling

* Update README with new collector enabled flags

* Fix out-of-date inline flag reference syntax

* Use new flags in end-to-end tests

* Add flag to disable all default collectors

* Track if a flag has been set explicitly

* Add --collectors.disable-defaults to README

* Revert disable-defaults flag

* Shorten flags

* Fixup timex collector registration

* Fix end-to-end tests

* Change procfs and sysfs path flags

* Fix review comments
2017-09-28 15:06:26 +02:00
Calle Pettersson dfe07eaae8 Switch to kingpin flags (#639)
* Switch to kingpin flags

* Fix logrus vendoring

* Fix flags in main tests

* Fix vendoring versions
2017-08-12 15:07:24 +02:00
Ben Kochie 46c31d8a7e Enable IPVS collector by default (#623)
* Silence error output when no IPVS present.
* Enable by default.
* Update end-to-end fixture.
* Update README.
2017-07-26 15:20:28 +02:00
ideaship 8d90276283 Add bcache collector (#597)
* Add bcache collector for Linux

This collector gathers metrics related to the Linux block cache
(bcache) from sysfs.

* Removed commented out code

* Use project comment style

* Add _sectors to metric name to indicate unit

* Really use project comment style

* Rename bcache.go to bcache_linux.go

* Keep collector namespace clean

Rename:
- metric -> bcacheMetric
- periodStatsToMetrics -> bcachePeriodStatsToMetric

* Shorten slice initialization

* Change label names to backing_device, cache_device

* Remove five minute metrics (keep only total)

* Include units in additional metric names

* Enable bcache collector by default

* Provide metrics in seconds, not nanoseconds

* remove metrics with label "all"

* Add fixtures, update end-to-end for bcache collector

* Move fixtures/sys into tar.gz

This changeset moves the collector/fixtures/sys directory into
collector/fixtures/sys.tar.gz and tweaks the Makefile to unpack the
tarball before tests are run.

The reason for this change is that Windows does not allow colons in a
path (colons are present in some of the bcache fixture files), nor can
it (out of the box) deal with pathnames longer than 260 characters
(which we would be increasingly likely to hit if we tried to replace
colons with longer codes that are guaranteed not the turn up in regular
file names).

* Add ttar: plain text archive, replacement for tar

This changeset adds ttar, a plain text replacement for tar, and uses it
for the sysfs fixture archive. The syntax is loosely based on tar(1).

Using a plain text archive makes it possible to review changes without
downloading and extracting the archive. Also, when working on the repo,
git diff and git log become useful again, allowing a committer to verify
and track changes over time.

The code is written in bash, because bash is available out of the box on
all major flavors of Linux and on macOS. The feature set used is
restricted to bash version 3.2 because that is what Apple is still
shipping.

The program also works on Windows if bash is installed. Obviously, it
does not solve the Windows limitations (path length limited to 260
characters, no symbolic links) that prompted the move to an archive
format in the first place.
2017-07-07 07:20:18 +02:00
Rene Treffer 2e9f1913b8 Move stat_linux to cpu_linux and add cpufreq stats (#548)
2017-06-13 11:21:53 +02:00
Emanuele Rocca 047003b6bb Add qdisc collector for Linux (#580)
* Add qdisc collector for Linux

This collector gathers basic queueing discipline metrics via netlink,
similarly to what `tc -s qdisc show` does.

* qdisc collector: nl-specific code moved, names fixed

- netlink-specific parts moved to github.com/ema/qdisc
- avoid using shortened names
- counters renamed into XXX_total

* Get rid of parseMessage error checking leftover

* Add github.com/ema/qdisc to vendored packages

* Update help texts and comments

* Add qdisc collector to README file

* qdisc collector end-to-end testing

* Update qdisc dependency to latest version

Update github.com/ema/qdisc dependency to revision 2c7e72d, which
includes unit testing.

* qdisc collector: rename "iface" label into "device"
2017-05-23 11:55:50 +02:00
Matt Layher 1feb091b36
Initial XFS collector
2017-04-22 11:53:07 -04:00
Sam Kottler 6eafa51fa8 Add ARP collector for Linux (#540)
* Implement commonalities and linux support for ARP collection

* Add ARP collector to fixtures and run as part of e2e tests

* Bubble up scanner errors

* Use single return values where it makes sense

* Add missing annotation

* Move arp_common into arp_linux

* Add license header to arp_linux.go

* Address initial feedback

* Use strings.Fields instead of strings.Split

* Deal with scanner.Err() rather than throwing away errors

* Check for scan errors in-line before interacting with the entries map

* Don't interact with potentially empty text from scan

* Check for scan errors outside the scan loop

* Add comment about moving procfs parsing

* Add more direct comment

* Update initialism style to match go style guide

* Put function args on the same line

* Add TODO in front of comment about procfs extraction

* Guard against strings.Fields returning an empty slice

* Be more defensive about ARP table format and use upcase more broadly

* Enable the ARP collector by default

* Add ARP collector to the README

* Remove 'entry'
2017-04-11 17:45:19 +02:00
Brian Brazil a02e469b07 Report collector success/failure and duration per scrape. (#516)
This is in line with best practices, and also saves us
63 timeseries on a default Linux setup.
2017-03-16 17:21:00 +00:00
Ben Kochie 38cd07ebb9 Merge pull request #450 from roclark/add-infiniband
infiniband: Add new collector for InfiniBand statistics
2017-02-16 14:33:19 +01:00
Thorhallur Sverrisson 5ab285e098 Adding buddyinfo to end to end test.
2017-02-15 10:15:44 -06:00
Robert Clark 4866adcb71 Add new collector for InfiniBand statistics
Add new metrics for the InfiniBand network protocol, including the number of packets sent and received, the number of times the link has gone down, and how many times the link has recovered from an error state.

Signed-Off-By: Robert Clark <robert.d.clark@hpe.com>
2017-02-07 11:09:08 -06:00
Matt Layher efa25665ec
Add initial wifi collector, bump netlink to fix 32-bit builds
2017-01-11 10:08:44 -05:00
Ben Kochie 38a4a36061 Update end-to-end test.
2017-01-10 10:23:16 +01:00
Joe Handzik e7442d6517 end-to-end-test.sh: Add zfs plugin
Enables fixture test and updates e2e-output.txt.

Signed-Off-By: Joe Handzik <joseph.t.handzik@hpe.com>
2017-01-08 11:13:35 -06:00
Johannes 'fish' Ziemke 2e47fcb8c5 Only store relevant e2e output
This makes commits lighter/more readable when updating the output.
2017-01-06 12:36:26 +01:00
Johannes 'fish' Ziemke 71ea37987f Merge pull request #365 from EdSchouten/drbd
A collector for DRBD
2016-12-25 11:04:43 +01:00