node_exporter

mirror of https://github.com/prometheus/node_exporter.git synced 2025-08-20 18:33:52 -07:00

Author	SHA1	Message	Date
Matt Layher	da6b66371f	collector: reimplement sockstat collector with procfs (#1552 ) * collector: reimplement sockstat collector with procfs * collector: handle sockstat IPv4 disabled, debug logging Signed-off-by: Matt Layher <mdlayher@gmail.com>	2019-11-25 13:41:38 -06:00
Holger Hoffstätte	3c2c4e7b3c	Add new counters for flush requests in Linux 5.5 (#1548 ) * Add diskstat flush request counters for Linux 5.5+ * Update tests for diskstat flush request counters with Linux 5.5+ Signed-off-by: Holger Hoffstätte <holger@applied-asynchrony.com>	2019-11-25 13:16:15 -06:00
Ben Kochie	67d3010a79	Add fixture update helper (#1551 ) * Add makefile target to update sysfs fixtures. * Use similar style for fixtures from procfs. * Re-pack fixtures ttar file. Signed-off-by: Ben Kochie <superq@gmail.com>	2019-11-23 07:52:47 -06:00
Benjamin Drung	04fbcfffa1	Collect InfiniBand port state and physical state (#1357 ) Collect the InfiniBand port state, the physical state, and the maximum signal transfer rate. Signed-off-by: Benjamin Drung <benjamin.drung@cloud.ionos.com>	2019-11-22 13:52:17 -08:00
Sven Haardiek	d089776e8b	Squashed commit of the following: commit 5ef96388a978c54173e1b1ec8e7bcb41fc7d130d Author: Sven Haardiek <sven@haardiek.de> Date: Wed Sep 18 20:45:23 2019 +0200 block variables Signed-off-by: Sven Haardiek <sven@haardiek.de> commit c1177382e241994618a8ab7dd9842027d597b0df Author: Sven Haardiek <sven@haardiek.de> Date: Wed Sep 18 20:38:33 2019 +0200 Use SI Units Signed-off-by: Sven Haardiek <sven@haardiek.de> commit 04e4f99c423872d3094f21f89a8235b233a01941 Merge: 5417c98 `f3538e1` Author: Sven Haardiek <sven@haardiek.de> Date: Wed Sep 18 19:20:17 2019 +0200 Merge branch 'master' into power_supply_class commit 5417c9820a40b37b490caedeaa3526883380b9bf Author: Sven Haardiek <sven@haardiek.de> Date: Wed Sep 4 23:02:39 2019 +0200 Drop averages Signed-off-by: Sven Haardiek <sven@haardiek.de> commit 1f1447dbe7bbdcdabebf4c968beb14c67d89dd9f Author: Sven Haardiek <sven@haardiek.de> Date: Wed Sep 4 22:56:00 2019 +0200 Update Copyright Signed-off-by: Sven Haardiek <sven@haardiek.de> commit 9677425059a3bf61cd7498cf7b5f05d5af7a626b Merge: 0b51589 `d3478a2` Author: Sven Haardiek <sven@haardiek.de> Date: Mon Sep 2 22:02:53 2019 +0200 Merge branch 'master' into power_supply_class commit 0b51589f390cc1b33ea4728d85fca3a3b231cf3f Author: PrometheusBot <prometheus-team@googlegroups.com> Date: Fri Aug 30 13:32:17 2019 +0200 makefile: update Makefile.common with newer version (#1466) Signed-off-by: prombot <prometheus-team@googlegroups.com> commit af2b9e849c7b69237b7fa0e9a289c929ec7173a0 Author: Boris Momčilović <boris.momcilovic@gmail.com> Date: Tue Aug 27 14:24:11 2019 +0200 Ipvs firewall mark (#1455) * IPVS: include firewall mark label Signed-off-by: Boris Momčilović <boris@firstbeatmedia.com> commit 773f99de7f699900a00b4d35340e356fe7098ee7 Author: Paul Gier <pgier@redhat.com> Date: Tue Aug 27 02:26:19 2019 -0500 update procfs to v0.0.4 (#1457) Signed-off-by: Paul Gier <pgier@redhat.com> commit 6f8a4f4348f62700cbf7eeb2657851237e13c35d Author: beorn7 
<beorn@grafana.com> Date: Tue Aug 20 18:49:12 2019 +0200 Update legendLink This still had the 'k8s' in as it was copied and pasted from the kubernetes-mixin. Signed-off-by: beorn7 <beorn@grafana.com> commit d758cf394cfbed9e87e116a24d72050066cd039a Author: beorn7 <beorn@grafana.com> Date: Wed Aug 14 22:24:24 2019 +0200 Make the severity of "critical" alerts configurable This addresses the blissful scenario where single-node failures are unproblematic. No reason to wake somebody up if a node is about to screw itself up by filling the disk. Signed-off-by: beorn7 <beorn@grafana.com> commit 041b9e1e785f5f43bbef97c0c76d205181d08890 Author: beorn7 <beorn@grafana.com> Date: Thu Aug 15 16:43:57 2019 +0200 Add line for number of cores to load graph Backported from the node dashboard in the kubernetes-mixin. Signed-off-by: beorn7 <beorn@grafana.com> commit 5552bb3a6b2be1e3dd1a93dbdb9650bd0363a922 Author: beorn7 <beorn@grafana.com> Date: Thu Aug 15 16:36:10 2019 +0200 Fix title of CPU panel to usage We use the `mode="idle"` metric, but we are inverting it, so this is usage, and that's intended. Signed-off-by: beorn7 <beorn@grafana.com> commit db0571b402233323ed7e222e53f7ef7738520f49 Author: beorn7 <beorn@grafana.com> Date: Thu Aug 15 16:32:54 2019 +0200 node-mixin: Improve disk usage panel - Use a stacked graph instead of a gauge as development over time is especially useful for disk space usage. - By only taking one metric per device into account, we avoid double-counting for devices that are mounted multiple times. Signed-off-by: beorn7 <beorn@grafana.com> commit 3822e096c5d27d06b9c9a68beff81ef23f12eb36 Author: Björn Rabenstein <beorn@grafana.com> Date: Thu Aug 15 00:40:51 2019 +0200 node-mxin: Improve nodes dashboard (#1448) * node-mixin: Improve nodes dashboard - Use stacking where it makes sense. - Normalize idle CPU so that stacking is more meaningful. - Consistently fill where stacking is used but don't fill where not. - Fix y axis max value for Idle CPU panel. 
- Fix y axis min value for memory usage panel. - Use `$__interval` for range where applicable (and set min step to 1m). - Make the right Y axis for disk I/O actually work. These are just incremental improvements. They don't touch the more involved TODOs. Signed-off-by: beorn7 <beorn@grafana.com> commit fbced86b9835e1b196c15ddcac01ba3cfcf369cc Author: beorn7 <beorn@grafana.com> Date: Tue Aug 13 21:54:28 2019 +0200 node-mixin: Fix various straight-forward issues in the USE dashboards - Normalize cluster memory utilisation. - Fix missing `1m` in memory saturation. - Have both disk-related rows next to each other instead of with the network row in between. - Correctly render transmit network traffic as negative, using `seriesOverrides` and `min: null` for the y-axis. - Make panel and row naming consistent. - Remove legend where it would just display a single entry with exactly the title of the panel. - Fix metric name in individual node CPU Saturation panel. - Break up disk space utilisation by device in the panel for an individual node. NB: All of that doesn't touch any more subtle issues captured in the various TODOs. 
Signed-off-by: beorn7 <beorn@grafana.com> commit 5bdf0625023cf7d05e0f65c6b6a1303637772ca6 Author: Sandro Jäckel <sandro.jaeckel@gmail.com> Date: Wed Aug 7 09:19:20 2019 +0200 Update rootfs syntax in Docker example (#1443) Signed-off-by: Sandro Jäckel <sandro.jaeckel@gmail.com> commit b59f081d45a3ca65957900ec33772dca25a3066f Author: Phil Frost <phil@postmates.com> Date: Tue Aug 6 13:08:06 2019 -0400 Fix seconds reported by schedstat (#1426) Upstream bugfix: https://github.com/prometheus/procfs/pull/191 Signed-off-by: Phil Frost <phil@postmates.com> commit ac9a059ae81fa31f9963614483af3b5e3bfd672c Author: Sven Haardiek <sven@haardiek.de> Date: Sun Aug 4 20:15:36 2019 +0200 Try to make it work for PowerPC Signed-off-by: Sven Haardiek <sven@haardiek.de> commit c81acf3b009e8538783489d1468f33faf65d8b01 Merge: c064116 `75462bf` Author: Sven Haardiek <sven@haardiek.de> Date: Sun Aug 4 20:14:16 2019 +0200 Merge remote-tracking branch 'upstream/master' into power_supply_class Signed-off-by: Sven Haardiek <sven@haardiek.de> commit c0641162c3a432f29df30c8d0632a7756d7d2bff Merge: 06f6e3e `0b710bb` Author: Sven Haardiek <sven@haardiek.de> Date: Fri Aug 2 18:30:28 2019 +0200 Merge branch 'master' into power_supply_class Signed-off-by: Sven Haardiek <sven@haardiek.de> commit 06f6e3e8b2a9b2e3f345b6d312a777731bb4b403 Author: Sven Haardiek <sven.haardiek@iotec-gmbh.de> Date: Fri Mar 22 15:36:03 2019 +0100 Fix Pull Request comments * concise metric conditions * combine info about power supply to one metric Signed-off-by: Sven Haardiek <sven.haardiek@iotec-gmbh.de> commit 785c3735c4626de56f8341f800ab7bb5e2594d08 Author: Sven Haardiek <sven@haardiek.de> Date: Sat Mar 9 18:47:52 2019 +0100 Use sys.ttar instead of uploading the files Signed-off-by: Sven Haardiek <sven@haardiek.de> commit e07bff5d938457147b9009aef7d42d763018cd66 Author: Sven Haardiek <sven@haardiek.de> Date: Sat Mar 9 18:34:50 2019 +0100 Add information about from /sys/class/power_supply Signed-off-by: Sven Haardiek 
<sven@haardiek.de> commit 55b3e34840c9dfc6513ae8e69b6479d5842a3091 Author: Sven Haardiek <sven@haardiek.de> Date: Sat Mar 9 18:09:45 2019 +0100 Use cyclecount instead of cycle_count since it is a gauge Signed-off-by: Sven Haardiek <sven@haardiek.de> commit 602350b333cf9353d2cd0ffd40206c96ffe29941 Author: Sven Haardiek <sven@haardiek.de> Date: Sat Mar 9 18:09:25 2019 +0100 other build options Signed-off-by: Sven Haardiek <sven@haardiek.de> commit 5aa38f678451d5b63ffdc32336345a1ff6703725 Author: Sven Haardiek <sven@haardiek.de> Date: Sat Mar 9 18:08:56 2019 +0100 Update fixtures Signed-off-by: Sven Haardiek <sven@haardiek.de> commit c6acc474a4224b8d9f7b178d0d2e02636d8629ea Author: Sven Haardiek <sven@haardiek.de> Date: Sat Mar 9 17:20:30 2019 +0100 Update command line parameter flag Signed-off-by: Sven Haardiek <sven@haardiek.de> commit f5a329e6ae5ed3b16aa866d67b944f1a73edfe42 Author: Sven Haardiek <sven@haardiek.de> Date: Sat Mar 9 17:20:06 2019 +0100 Update procfs dependency Signed-off-by: Sven Haardiek <sven@haardiek.de> commit 38d5fa5165643d6a44dc863b3a1696774259ac0d Merge: 5a7ce69 28f3582 Author: Sven Haardiek <sven@haardiek.de> Date: Sat Mar 9 16:28:29 2019 +0100 Merge branch 'power_supply_class' of github.com:shaardie/node_exporter into power_supply_class commit 5a7ce69505079c9c090e44448cfbd7ffb2b04df7 Author: Sven Haardiek <sven@haardiek.de> Date: Sat Oct 20 18:55:49 2018 +0200 Updated Metrics of Power Supply Class Signed-off-by: Sven Haardiek <sven@haardiek.de> commit 690ab1b9c1f2e183b7088cf81c7f266d85ee6df6 Author: Sven Haardiek <sven@haardiek.de> Date: Fri Oct 19 20:03:42 2018 +0200 Start work on Power Supply Collector Signed-off-by: Sven Haardiek <sven@haardiek.de> commit 28f358222bbac4315fbf44d94da36d4b0ff2ed55 Author: Sven Haardiek <sven@haardiek.de> Date: Sat Oct 20 18:55:49 2018 +0200 Updated Metrics of Power Supply Class Signed-off-by: Sven Haardiek <sven@haardiek.de> commit 751d99b818503e9a4430b10c39760f180349b294 Author: Sven Haardiek 
<sven@haardiek.de> Date: Fri Oct 19 20:03:42 2018 +0200 Start work on Power Supply Collector Signed-off-by: Sven Haardiek <sven@haardiek.de> Signed-off-by: Sven Haardiek <sven@haardiek.de>	2019-10-27 16:03:35 +01:00
John Belmonte	15e36e2230	fix typo in cpufreq metric names (#1510 ) Signed-off-by: John Belmonte <john@neggie.net>	2019-10-11 02:12:20 +09:00
Paul Gier	4d72cb8059	add node_cpu_info metric Contains information gathered from /proc/cpuinfo Signed-off-by: Paul Gier <pgier@redhat.com>	2019-09-25 14:38:57 -05:00
Benjamin Drung	27b8c93a5a	Use InfiniBandClass from procfs library (#1396 ) Parsing the sysfs files for InfiniBand was added to the procfs library (see https://github.com/prometheus/procfs/pull/164). Therefore use `InfiniBandClass` from the procfs library instead of parsing sysfs itself. If the port counter returns `N/A (no PMA)`, no metric will be returned (instead of returning 0 for this metric). Signed-off-by: Benjamin Drung <benjamin.drung@cloud.ionos.com>	2019-09-23 18:18:35 +02:00
Alex Schmitz	664025d60c	Scrape cooling_device state Signed-off-by: Alex Schmitz <alex.schmitz@gmail.com>	2019-08-30 08:58:47 -05:00
Boris Momčilović	93c12e03a1	Ipvs firewall mark (#1455 ) * IPVS: include firewall mark label Signed-off-by: Boris Momčilović <boris@firstbeatmedia.com>	2019-08-27 14:24:11 +02:00
Phil Frost	26d4fbdf07	Fix seconds reported by schedstat (#1426 ) Upstream bugfix: https://github.com/prometheus/procfs/pull/191 Signed-off-by: Phil Frost <phil@postmates.com>	2019-08-06 19:08:06 +02:00
Richard Kojedzinszky	75462bf4fe	Scrape thermal_zone temperatures (#1425 ) * Scrape thermal_zone temperatures Signed-off-by: Richard Kojedzinszky <richard@kojedz.in>	2019-08-04 12:56:36 +02:00
Dipack P Panjabi	a7452023db	Added mountinfo changes to node_exporter (#1417 ) Use the extra information gleaned from the mountinfo file to add a 'mountaddr' field for NFS metrics. This helps prevent prometheus from ignoring mounts that come from the same URL, but are actually from different IP addresses. This commit also rebases to current master Signed-off-by: Dipack P Panjabi <dpanjabi@hudson-trading.com>	2019-07-28 11:32:40 +02:00
Matthias Rampke	b133213c7a	Report non-fatal collection errors in the exporter metric. (#1439 ) As per prometheus/client_golang#543, pass the Registry for exporter metrics when setting up the /metrics HTTP handler. With this, the `promhttp_metric_handler_errors_total` metric will increment on (possibly non-fatal) collection-time errors, such as duplicate metrics from text files. Signed-off-by: Matthias Rampke <mr@soundcloud.com>	2019-07-28 10:37:10 +02:00
Steven Kreuzer	d8e47a9f9f	Expose additional XFS runtime statistics (#1423 ) Include directory operation, read/write system call, and vnode runtime statistics for XFS filesystems. Signed-off-by: Steven Kreuzer <skreuzer@FreeBSD.org>	2019-07-15 16:28:09 +02:00
Phil Frost	f693a71c06	Scrape CPU latency stats from /proc/schedstat (#1389 ) These are useful as a direct indication of CPU contention and task scheduler latency. Handy references: - https://github.com/torvalds/linux/blob/master/Documentation/scheduler/sched-stats.txt - https://doc.opensuse.org/documentation/leap/tuning/html/book.sle.tuning/cha.tuning.taskscheduler.html procfs is updated to pull in the enabling change: https://github.com/prometheus/procfs/pull/186 Signed-off-by: Phil Frost <phil@postmates.com>	2019-07-10 09:16:24 +02:00
Advait Bhatwadekar	3f49b31101	Closes issue #261 on node_exporter. (#1403 ) * Closes issue #261 on node_exporter. Delegated mdstat parsing to procfs project. mdadm_linux.go now only exports the metrics. -> Added disk labels: "fail", "spare", "active" to indicate disk status -> Changed metric node_md_disks_total ==> node_md_disks_required -> Removed test cases for mdadm_linux.go, as the functionality they tested for has been moved to procfs project. Signed-off-by: Advait Bhatwadekar <advait123@ymail.com>	2019-07-01 11:56:06 +02:00
Ben Kochie	ccf27426ad	Fix 64k page e2e fixture (#1404 ) Update for change in https://github.com/prometheus/node_exporter/pull/1224 Signed-off-by: Ben Kochie <superq@gmail.com>	2019-06-28 09:53:35 +02:00
Ben Kochie	8146998945	Fix rollover bug in mountstats collector (#1364 ) * Update procfs vendor to pull in github.com/prometheus/procfs/pull/165 * Update mountstats collector to use new types. * Rollover counter automatically to avoid float64 accuracy issues. * Update e2e test. Signed-off-by: Ben Kochie <superq@gmail.com>	2019-05-31 18:30:37 +02:00
Daniele Sluijters	cc2fd82008	Expose /proc/pressure (#1261 ) This enables the collection of pressure stall information as exposed by the `/proc/pressure` interface added in the 4.20 release of the Linux kernel. Closes #1174 Signed-off-by: Daniele Sluijters <daenney@users.noreply.github.com>	2019-04-18 12:19:20 +02:00
Paul Gier	cc847f2f44	collector/cpu: split cpu freq metrics into separate collector (#1253 ) The cpu frequency information is not always needed and/or available. This change allows the cpu frequency metrics to be enabled/disabled separately from the other cpu metrics, and also prevents a frequency metric failure (such as a parse error) from failing the main cpu collector. Fixes #1241 Signed-off-by: Paul Gier <pgier@redhat.com>	2019-02-19 17:22:54 +01:00
Sachi King	18fc512fc4	Bond: Monitor bond mii_status not link operstate (#1124 ) With a bond interface the state of the slave interface from the bond's point of view is reflected in `mii_status` and is independent of the link's `operstate`. When a bond is monitored with `miimon`, `mii_status` will reflect the state of the physical link as configured via the operator. When a bond is monitored via `arp_interval` the `mii_status` will reflect the results of the bond ARP checking. This means the link can be down from the bond's point of view, but up from a physical connection point of view. If a bond is not monitored via miimon or arp, the `mii_status` should likely be always `up`, however I have observed a case where this is not true and the `operstate` is `up` while `mii_status` is `down`. Kernel bond documentation stresses that a bond should not be configured without one of `mii_mon` or `arp_interval` configured however. This change results in the metric 'node_bonding_active' matching the up/down state of the bond's point of view rather than operstate. Signed-off-by: Sachi King <nakato@nakato.io>	2019-02-10 11:00:04 +01:00
Paul Gier	e0d6d11859	netclass_linux: remove varying labels from the 'up' metric (#1243 ) * netclass_linux: remove varying labels from the 'up' metric This moves the variable label values such as 'operstate' out of the 'network_up' metric and into a separate metric called '_info'. This allows the 'up' metric to remain continuous over state changes. Fixes #1236 Signed-off-by: Paul Gier <pgier@redhat.com>	2019-02-07 15:59:32 +01:00
Johannes 'fish' Ziemke	6ea0aa73e4	Rename interface to device in netclass collector (#1224 ) * Rename interface to device in netclass collector This makes it consistent with other networking metrics like node_network_receive_bytes_total This closes #1223 Signed-off-by: Johannes 'fish' Ziemke <github@freigeist.org>	2019-02-06 20:02:48 +01:00
mknapphrt	7fbdd0ae93	Update procfs vendor (#1248 ) Signed-off-by: Mark Knapp <mknapp@hudson-trading.com>	2019-02-04 16:54:41 +01:00
Ben Kochie	73ddf5f1f7	netstat: Add TCP In/Out Segs (#1185 ) * netstat: Add TCP In/Out Segs In order to get a better idea of TCP packet loss, we need to know how many `node_netstat_Tcp_OutSegs` there are so we can compare this to `node_netstat_Tcp_RetransSegs`. Signed-off-by: Ben Kochie <superq@gmail.com> * Update fixtures Signed-off-by: Ben Kochie <superq@gmail.com>	2018-12-08 12:16:02 +01:00
Nemikolh	62f99f95f0	Add receive/transmit bytes total metric (wifi collector). (#1150 ) Signed-off-by: Nemikolh <Nemikolh@users.noreply.github.com>	2018-11-19 19:15:54 +01:00
Patrick	bdc0e7e678	Collect additional common Infiniband counters (#1120 ) * Collect additional common Infiniband counters Signed-off-by: Patrick Freeman <will.pat.free@gmail.com>	2018-10-30 21:54:09 +01:00
Paul Gier	38163f234f	collector/diskstats: don't fail if there are extra stats, just ignore… (#1125 ) * collector/diskstats: don't fail if there are extra stats, just ignore them Signed-off-by: Paul Gier <pgier@redhat.com>	2018-10-30 18:45:00 +01:00
Ben Kochie	a0a164defb	Update cpufreq metrics collector (#1117 ) * Update Linux cpufreq collector to use new procfs library functions. * Split thermal throttle collection to a separate function. * Add new required fixtures and repack ttar file. Signed-off-by: Ben Kochie <superq@gmail.com>	2018-10-18 17:28:19 +02:00
Paul Gier	e8d8199072	Update diskstats for linux kernel 4.19 (#1109 ) The format of /proc/diskstats is changing in linux-4.19 to include some additional fields. See: https://www.kernel.org/doc/Documentation/iostats.txt * collector/diskstats: use constants for some hard coded strings * collector/diskstats: update diskstats for linux-4.19 * collector/diskstats: remove kernel doc url from individual metrics Signed-off-by: Paul Gier <pgier@redhat.com>	2018-10-15 17:24:28 +02:00
Ben Kochie	a1ce712e22	Cleanup unused /proc/mounts fixture. (#1097 ) * Cleanup unused /proc/mounts fixture. * Ignore Uint -> Unit in codespell. Signed-off-by: Ben Kochie <superq@gmail.com>	2018-10-04 18:07:12 +02:00
Mario Trangoni	3659260b66	infiniband: Handle iWARP* RDMA modules N/A (#974 ) * infiniband: Add not connected i40iw0/ports/1 fixtures * infiniband: Handle issue when iWARP* RDMA modules are not available This is related to #966, and handle this error, Jun 07 13:33:24 hostname node_exporter[81888]: time="2018-06-07T13:33:24+02:00" level=error msg="ERROR: infiniband collector failed after 0.000929s: strconv.ParseUint: parsing \"N/A (no PMA)\": invalid syntax" source="collector.go:132" Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>	2018-10-04 15:05:59 +02:00
Yecheng Fu	0f9842f20a	[continue 912] strip rootfs prefix for run in docker (#1058 ) * strip rootfs prefix for run in docker * Use `/` as default value of path.rootfs, and parse mounts from `/proc/1/mounts`. * No need to mount `/proc` and `/sys` because we share host's PID namespace, which allows processes within the container to see all of the processes on the system. Closes: #66 Signed-off-by: Ivan Mikheykin <ivan.mikheykin@flant.com> Signed-off-by: Yecheng Fu <cofyc.jackson@gmail.com>	2018-10-04 14:11:21 +02:00
Björn Rabenstein	1c9ea46cca	Update vendoring for client_golang and friends (#1076 ) Signed-off-by: beorn7 <beorn@soundcloud.com>	2018-09-17 17:09:52 +02:00
Marco Tulio R Braga	05e55bddad	Fix typo on description of read_time_seconds_total (#1057 ) Fix typo on unit description of metric `*read_time_seconds_total` from milliseconds to seconds. Signed-off-by: Marco Tulio R Braga <marco.tulio@mtulio.eng.br>	2018-09-02 09:46:45 +02:00
Ben Kochie	fe5a117831	Handle vanishing PIDs (#1043 ) PIDs can vanish (exit) from /proc/ between gathering the list of PIDs and getting all of their stats. * Ignore file not found errors. * Explicitly count the PIDs we find. * Cleanup some error style issues. Signed-off-by: Ben Kochie <superq@gmail.com>	2018-08-13 17:27:23 +02:00
Hannes Körber	14a4f0028e	Enable nfs protocol (#998 ) * vendor: Update prometheus/procfs Signed-off-by: Hannes Körber <hannes.koerber@haktec.de> * mountstats: Use new NFS protocol field In https://github.com/prometheus/procfs/pull/100, the NFSTransportStats struct was expanded by a field called protocol that specifies the NFS protocol in use, either "tcp" or "udp". This commit adds the protocol as a label to all NFS metrics exported via the mountstats collector. Signed-off-by: Hannes Körber <hannes.koerber@haktec.de> * Update fixtures for UDP mount Signed-off-by: Hannes Körber <hannes.koerber@haktec.de>	2018-07-24 00:47:12 +02:00
neiledgar	7e4d9bd150	Update wifi stats to support multiple stations (#977 ) (#980 ) Signed-off-by: neiledgar <neil.edgar@btinternet.com>	2018-07-16 16:02:25 +02:00
Jan Klat	c4102f1175	Add sys/class/net parsing from procfs and expose its metrics (#851 ) * add sys/class/net parsing from procfs and expose its metrics Signed-off-by: Jan Klat <jenik@klatys.cz> * change code to use int pointers per procfs change, move netclass to separate collector, change metric naming Signed-off-by: Jan Klat <jenik@klatys.cz> * bump year in licence, remove redundant newline, correct fixtures Signed-off-by: Jan Klat <jenik@klatys.cz> * fix style Signed-off-by: Jan Klat <jenik@klatys.cz> * change carrier changes to counter type Signed-off-by: Jan Klat <jenik@klatys.cz> * fix e2e output Signed-off-by: Jan Klat <jenik@klatys.cz> * add fixtures Signed-off-by: Jan Klat <jenik@klatys.cz> * update vendor, use fixtures correctly Signed-off-by: Jan Klat <jenik@klatys.cz> * change fixtures (device in /sys/class/net should be symlinked) Signed-off-by: Jan Klat <jenik@klatys.cz> * correct fixtures for 64k page, updated readme Signed-off-by: Jan Klat <jenik@klatys.cz>	2018-07-16 15:08:18 +02:00
Ben Kochie	107e5dfecc	Fix mdadm collector issues (#985 ) * Send "Personality unknown" to debug, not info, remove unnecessary newline. * Add support for "linear" personality. * Always set number of active disks to 0 when a device is inactive. * Add total disks calculation to unknown personalities. Signed-off-by: Ben Kochie <superq@gmail.com>	2018-07-02 12:38:20 +02:00
Brad Beam	e3cf1d5187	Adding support for evaluating octal characters in mountpoint (#954 ) Signed-off-by: Brad Beam <brad.beam@b-rad.info>	2018-06-06 16:49:19 +02:00
Pavlo Kutishchev	456bf5094a	Add processes exporter (#950 ) * Add processes exporter Signed-off-by: Pavel Kutishchev <pavel.kutishchev@olx.com> Signed-off-by: Ben Kochie <superq@gmail.com>	2018-06-05 19:38:32 +02:00
Alexey Kopytov	dd98a09bb2	A couple of ARM64-related fixes (#934 ) * Do not rely on AArch64 CPUs to support 32-bit ARM for cross-testing. Signed-off-by: Alexey Kopytov <akopytov@gmail.com> * aarch64 like ppc64le reports 64k node_sockstat_TCP_mem_bytes due to 64k pages. Signed-off-by: Alexey Kopytov <akopytov@gmail.com>	2018-05-14 15:55:49 +02:00
Ben Kochie	b10ca77680	Fix /proc/net/dev/ interface name handling * Allow any character (UTF-8) for Linux interface names. Signed-off-by: Ben Kochie <superq@gmail.com>	2018-04-18 12:53:59 +02:00
Ben Kochie	1ab4a460c7	Update ppc64le end-to-end fixture. Signed-off-by: Ben Kochie <superq@gmail.com>	2018-04-18 09:12:21 +02:00
Ben Kochie	a528966dcd	Fix parsing of interface aliases in netdev linux Very old kernels expose interface aliases as `foo0:0`, adjust the line parsing to handle these names. Signed-off-by: Ben Kochie <superq@gmail.com>	2018-04-17 13:15:02 +02:00
Ben Kochie	015b86670a	Update ppc64le e2e output. Signed-off-by: Ben Kochie <superq@gmail.com>	2018-04-14 15:28:06 +02:00
Dmitriy Lukyanchikov	eddd1b9357	Fix netdev collector for linux (#890 ) fix variable name, fix transmitHeader extracting modify fixtures to run tests with updated netdev_linux collector Signed-off-by: dmitriy-lukyanchikov <d.lukyanchikov@anchorfree.com>	2018-04-14 13:58:56 +02:00
Derek Marcotte	fe86e908da	Update ppc64 fixtures to unbreak end-to-end. `efc1fdb` added new labels. Signed-off-by: Derek Marcotte <554b8425@razorfever.net>	2018-04-13 06:33:38 -04:00
Karsten Weiss	7e392e6634	Fix spelling mistakes found by codespell Signed-off-by: Karsten Weiss <knweiss@gmail.com>	2018-04-09 18:27:17 +02:00
Karsten Weiss	efc1fdb6d0	cpu: Add a 2nd label 'package' to metric node_cpu_core_throttles_total (#871 ) * cpu: Add a 2nd label 'package' to metric node_cpu_core_throttles_total This commit fixes the node_cpu_core_throttles_total metrics on multi-socket systems as the core_ids are the same for each package. I.e. we need to count them separately. Rename the node_package_throttles_total metric label `node` to `package`. Reorganize the sys.ttar archive and use the same symlinks as the Linux kernel. Also, the new fixtures now use a dual-socket dual-core cpu w/o HT/SMT (node0: cpu0+1, node1: cpu2+3) as well as processor-less (memory-only) NUMA node 'node2' (this is a very rare case). Signed-off-by: Karsten Weiss <knweiss@gmail.com> * cpu: Use the direct /sys path to the cpu files. Use the direct path /sys/devices/system/cpu/cpu[0-9]* (without symlinks) instead of /sys/bus/cpu/devices/cpu[0-9]. The latter path also does not exist e.g. on RHEL 6.9's kernel. Signed-off-by: Karsten Weiss <knweiss@gmail.com> cpu: Reverse core+package throttle processing order Signed-off-by: Karsten Weiss <knweiss@gmail.com> * cpu: Add documentation URLs Signed-off-by: Karsten Weiss <knweiss@gmail.com>	2018-04-09 18:01:52 +02:00
Brian Brazil	31ce32f1fe	Greatly trim what netstat collector exposes by default (#876 ) Netstat is 40% of the metrics on my laptop, many of which are highly detailed information about IP internals in the kernel. ~300 such metrics on every machine in your fleet is excessive, so focus on key metrics by default, overridable by the user. Fixes #515 Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>	2018-03-30 19:28:08 +01:00
Ben Kochie	cf3edadcbb	Update fixtures * Add oom_kill to fixture. * Update e2e outputs. * Put regexp in order. Signed-off-by: Ben Kochie <superq@gmail.com>	2018-03-29 22:00:02 +01:00
Brian Brazil	499c342fed	Greatly reduce the metrics vmstat returns by default. Vmstat has over 100 fields, most of which are highly detailed debug information. Trim this down to only essential fields by default, configurable by flag. Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>	2018-03-29 22:00:02 +01:00
Ben Kochie	779090db7e	Update ppc64le fixture (#867 ) Update to match standard e2e output. Signed-off-by: Ben Kochie <superq@gmail.com>	2018-03-27 17:05:20 +02:00
Mario Trangoni	1f11a86d59	Fix nfs golint issues (#863 ) * procfs: update vendoring Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com> * procfs: fix e2e tests after nfs changes Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>	2018-03-22 22:25:37 +01:00
Ben Kochie	7b720df1c5	Use lowercase cpu label name in interrupts (#849 ) To match other CPU related metric labels, use a lowercase named label.	2018-03-08 15:04:49 +01:00
Julius Volz	864a6ee935	Treat custom textfile metric timestamps as errors (#769 ) This is clearer behavior and users will notice and fix their textfiles faster than if we just output a warning.	2018-02-27 19:43:38 +01:00
Rene Treffer	c504c7e264	Only report core throttles per core, not per cpu (#836 ) * Only report core throttles per core, not per cpu * Add topology/core_id to the cpu sysfs fixtures * Add new cpu fixtures to ttar file * Merge core_id reading and thermal throttle accounting * Declare core_id	2018-02-27 19:43:15 +01:00
Ben Kochie	e0d54a509c	Cleanup NFS metrics (#834 ) * Cleanup NFS metrics * Update `nfs` metric names to match `nfsd`. * Remove unneeded `tcp` label from TCP connections metric. * Remove unneeded `v` on `nfsd` metrics. * Enable all `nfs` v4 client metrics. * Remove `nfs` metric name overrides. * Add ppc64le fixture. * Fix typo.	2018-02-21 07:25:41 +01:00
Ben Kochie	3f41a2fecb	Update ppc64le fixture (#832 ) Updates fixture for ppc64le arch to latest output.	2018-02-19 20:43:33 +01:00
Ben Kochie	d33a447047	Remove deprecated prometheus.InstrumentHandlerFunc (#831 ) Update Prometheus client golang use to use `promhttp.Handler()` instead of `prometheus.InstrumentHandlerFunc()`.	2018-02-19 15:44:59 +01:00
Richard Elling	d7348a5c78	updates for zfsonlinux 0.7.5 (#779 ) * updates for zfsonlinux 0.7.5 * add constants for KSTAT_DATA_* types * added e2e test for negative values represented by uint64 that can result from ZFS bugs	2018-02-16 15:46:31 +01:00
Ben Kochie	3de2542d21	Fix NFSd metric type (#819 ) RPC Count should be a counter, not a gauge.	2018-02-13 17:03:22 +01:00
Matt Layher	544488ddd6	Fix remaining metric naming issues (#799 )	2018-02-12 18:53:31 +01:00
Ben Kochie	6a041692ed	Add NFS Server metrics collector. (#803 ) * Add NFS Server metrics collector. * Add File Handles metrics. * Add nfsd IO stats. * Add metrics for NFSd threads. * Add metrics for NFSd read ahead cache. * Add NFSd network traffic counters. * Add RPC metrics. * Add V2 requests metrics. * Add NFSv3 metrics. * Add NFSv4 metrics. * Update reply cache comment. * Update help text.	2018-02-12 17:56:05 +01:00
Ben Kochie	14d60958d6	Unify CPU collector conventions (#806 ) * Unify CPU collector conventions Add a common CPU metric description. * All collectors use the same `nodeCpuSecondsDesc`. * All collectors drop the `cpu` prefix for `cpu` label values. * Fix subsystem string in cpu_freebsd. * Fix Linux CPU freq label names.	2018-02-01 18:42:20 +01:00
Ben Kochie	111e3af437	Remove obsolete megacli collector. (#798 ) This collector has been replaced by the textfile collector tool `storcli.py`.	2018-01-23 11:25:42 +01:00
Julius Volz	6cac74f0e0	Add unit suffix to textfile collector mtime metric (#796 )	2018-01-22 14:02:19 +01:00
Brian Brazil	a98067a294	Make metrics better follow guidelines (#787 ) * Improve stat linux metric names. cpu is no longer used. * node_cpu -> node_cpu_seconds_total for Linux * Improve filesystem metric names with units * Improve units and names of linux disk stats Remove sector metrics, the bytes metrics cover those already. * Infiniband counters should end in _total * Improve timex metric names, convert to more normal units. See `3c073991eb/kernel/time/ntp.c (L909)` for what stabil means, looks like a moving average of some form. * Update test fixture * For meminfo metrics that had "kB" units, add _bytes * Interrupts counter should have _total	2018-01-17 17:55:55 +01:00
Ben Kochie	b4d7ba119a	Add fixture for ppc64le (#785 ) * Add support for per-architecture fixtures. * Add output for ppc64le.	2018-01-11 13:56:19 +01:00
Julius Volz	f536857ac6	Fix e2e tests after textfile custom timestamp removal (#768 )	2017-12-24 11:54:33 +01:00
Shubheksha Jalan	1f2458f42c	Filter out testfile metrics correctly when using `collect[]` filters (#763 ) * remove injection hook for textfile metrics, convert them to prometheus format * add support for summaries * add support for histograms * add logic for handling inconsistent labels within a metric family for counter, gauge, untyped * change logic for parsing the metrics textfile * fix logic to adding missing labels * Export time and error metrics for textfiles * Add tests for new textfile collector, fix found bugs * refactor Update() to split into smaller functions * remove parseTextFiles(), fix import issue * add mtime metric directly to channel, fix handling of mtime during testing * rename variables related to labels * refactor: add default case, remove if guard for metrics, remove extra loop and slice * refactor: remove extra loop iterating over metric families * test: add test case for different metric type, fix found bug * test: add test for metrics with inconsistent labels * test: add test for histogram * test: add test for histogram with extra dimension * test: add test for summary * test: add test for summary with extra dimension * remove unnecessary creation of protobuf * nit: remove extra blank line	2017-12-23 20:21:58 +01:00
Ben Kochie	cd2a17176a	Add full make to CircleCI (#761 ) * Add full make to CircleCI Ensure end-to-end test is run. * Fix go fmt error. * Fix end-to-end output.	2017-12-21 16:24:23 +01:00
Ben Kochie	2a80537547	Split out guest cpu metrics on Linux. (#744 ) Linux "guest" metrics for VMs are already accounted for in node_cpu `user` and `nice` metrics. Separate these into their own metric to avoid duplication of data.	2017-11-23 15:04:47 +01:00
Karsten Weiss	a8d7d1101a	cpu: Support processor-less (memory-only) NUMA nodes (#734 ) * cpu: Support processor-less (memory-only) NUMA nodes Processor-less (memory-only) NUMA nodes exist e.g. in systems that use Intel Optane drives for RAM expansion using Intel Memory Drive Technology (IMDT). IMDT RAM expansion supports two modes: * "Unify Remote Memory domains": present a processor-less (memory-only) NUMA domain, which is the default * "Expand local memory domains": to expand each processor’s memory domain with a portion of the memory made available by Optane and IMDT This commit fixes a crash in the first case (when "cpulist" is empty). Here's an example of such a system: $ numastat -m\|head -n5 Per-node system memory usage (in MBs): Node 0 Node 1 Node 2 Total --------------- --------------- --------------- --------------- MemTotal 118239.56 130816.00 464384.00 713439.56 $ for i in {0..2}; do echo -n "$i: " ; cat /sys/bus/node/devices/node$i/cpulist ; done 0: 0-7,16-23 1: 8-15,24-31 2: $ /opt/vsmp/bin/vsmpversion -vvv Memory Drive Technology: 8.2.1455.74 (Sep 28 2017 13:09:59) System configuration: Boards: 3 1 x Proc. + I/O + Memory 2 x NVM devices (Intel SSDPED1K375GAQ) Processors: 2, Cores: 16, Threads: 32 Intel(R) Xeon(R) CPU E5-2667 v4 @ 3.20GHz Stepping 01 Memory (MB): 713472 (of 977450), Cache: 251416, Private: 12562 1 x 249088MB [262036/ 678/12270] 1 x 232192MB [357707/125369/ 146] 82:00.0#1 1 x 232192MB [357707/125369/ 146] 83:00.0#1 * cpu: rename some variables (pkg => node) * cpu: Use %v not %q in log.Debugf() format strings	2017-11-10 15:31:26 +01:00
Ben Kochie	ea250d73f4	Fix off by one in Linux interrupts collector (#721 ) * Fix off by one in Linux interrupts collector * Fix off by one in CPU column handler. * Add test. * Enable interrupts in end-to-end test.	2017-11-02 09:59:46 +01:00
Matt Layher	f9ad88fc03	xfs: expose correct fields, fix metric names	2017-10-20 18:41:51 -04:00
Ben Kochie	deadfef4c9	Update vendoring (#685 ) * Update vendor github.com/coreos/go-systemd/dbus@v15 * Update vendor github.com/ema/qdisc * Update vendor github.com/godbus/dbus * Update vendor github.com/golang/protobuf/proto * Update vendor github.com/lufia/iostat * Update vendor github.com/matttproud/golang_protobuf_extensions/pbutil@v1.0.0 * Update vendor github.com/prometheus/client_golang/... * Update vendor github.com/prometheus/common/... * Update vendor github.com/prometheus/procfs/... * Update vendor github.com/sirupsen/logrus@v1.0.3 Adds vendor golang.org/x/crypto * Update vendor golang.org/x/net/... * Update vendor golang.org/x/sys/... * Update end to end output.	2017-10-05 16:20:47 +02:00
Karsten Weiss	b0d5c00832	cpu: Metric 'package_throttles_total' is per package. (#657 ) * cpu: Metric 'package_throttles_total' is per package. 'package_throttles_total' is per package, not per cpu. This also reduces the total number of cpu time series a lot (esp for multi core cpus). * cpu: Better handling of a cpulist edge-case. * cpu: Extract the package number from the directory name. Do not rely on the range index. * cpu: Add package_throttle_count for node0 cpu1 This file must be ignored by the cpu collector.	2017-09-07 23:24:18 +02:00
Ben Kochie	46c31d8a7e	Enable IPVS collector by default (#623 ) * Silence error output when no IPVS present. * Enable by default. * Update end-to-end fixture. * Update README.	2017-07-26 15:20:28 +02:00
Andrea De Pasquale	1369763067	Change raid0 status line regexp for mdadm collector (#619 )	2017-07-20 17:04:33 +02:00
Aleksey Zhukov	7a914e58f2	Add parsing /proc/net/snmp6 file for netstat-linux (#615 ) * Add parsing /proc/net/snmp6 file * add /proc/net/snmp6 fixture * fix e2e test * gofmt * remove unused variable * safe checks * add tests * change help format	2017-07-08 20:16:35 +02:00
Matt Layher	6e82fd1c56	Add XFS block mapping and block map B-tree stats (#575 )	2017-07-07 07:27:52 +02:00
ideaship	8d90276283	Add bcache collector (#597 ) * Add bcache collector for Linux This collector gathers metrics related to the Linux block cache (bcache) from sysfs. * Removed commented out code * Use project comment style * Add _sectors to metric name to indicate unit * Really use project comment style * Rename bcache.go to bcache_linux.go * Keep collector namespace clean Rename: - metric -> bcacheMetric - periodStatsToMetrics -> bcachePeriodStatsToMetric * Shorten slice initialization * Change label names to backing_device, cache_device * Remove five minute metrics (keep only total) * Include units in additional metric names * Enable bcache collector by default * Provide metrics in seconds, not nanoseconds * remove metrics with label "all" * Add fixtures, update end-to-end for bcache collector * Move fixtures/sys into tar.gz This changeset moves the collector/fixtures/sys directory into collector/fixtures/sys.tar.gz and tweaks the Makefile to unpack the tarball before tests are run. The reason for this change is that Windows does not allow colons in a path (colons are present in some of the bcache fixture files), nor can it (out of the box) deal with pathnames longer than 260 characters (which we would be increasingly likely to hit if we tried to replace colons with longer codes that are guaranteed not the turn up in regular file names). * Add ttar: plain text archive, replacement for tar This changeset adds ttar, a plain text replacement for tar, and uses it for the sysfs fixture archive. The syntax is loosely based on tar(1). Using a plain text archive makes it possible to review changes without downloading and extracting the archive. Also, when working on the repo, git diff and git log become useful again, allowing a committer to verify and track changes over time. The code is written in bash, because bash is available out of the box on all major flavors of Linux and on macOS. 
The feature set used is restricted to bash version 3.2 because that is what Apple is still shipping. The program also works on Windows if bash is installed. Obviously, it does not solve the Windows limitations (path length limited to 260 characters, no symbolic links) that prompted the move to an archive format in the first place.	2017-07-07 07:20:18 +02:00
Rene Treffer	bcc3cd92b8	Fix cpufreq statistics by converting kHz to Hz	2017-06-27 11:05:55 +02:00
Ben Kochie	182810056f	Fix Linux cpu errors (#606 ) Make the Linux cpu collector soft-error on missing `cpufreq` and `thermal_throttle` features.	2017-06-20 07:51:26 +02:00
Rene Treffer	2e9f1913b8	Move stat_linux to cpu_linux and add cpufreq stats (#548 )	2017-06-13 11:21:53 +02:00
Emanuele Rocca	047003b6bb	Add qdisc collector for Linux (#580 ) * Add qdisc collector for Linux This collector gathers basic queueing discipline metrics via netlink, similarly to what `tc -s qdisc show` does. * qdisc collector: nl-specific code moved, names fixed - netlink-specific parts moved to github.com/ema/qdisc - avoid using shortened names - counters renamed into XXX_total * Get rid of parseMessage error checking leftover * Add github.com/ema/qdisc to vendored packages * Update help texts and comments * Add qdisc collector to README file * qdisc collector end-to-end testing * Update qdisc dependency to latest version Update github.com/ema/qdisc dependency to revision 2c7e72d, which includes unit testing. * qdisc collector: rename "iface" label into "device"	2017-05-23 11:55:50 +02:00
Robert Clark	58f50b31f2	Multiply port data XMIT/RCV metrics by 4 (#579 ) According to Mellanox, it is standard practice that the port_xmit_data and port_rcv_data files are split into 4 lanes. To get the actual transmit and receive values for each port, the metric needs to be multiplied by 4. Signed-Off-By: Robert Clark <robert.d.clark@hpe.com>	2017-05-12 07:28:53 +02:00
Matt Layher	1feb091b36	Initial XFS collector	2017-04-22 11:53:07 -04:00
Karsten Weiss	d9703ff7c6	edac: Fix typo in csrow label of node_edac_csrow_uncorrectable_errors_total metric.	2017-04-18 12:45:06 +02:00
Karsten Weiss	45ca8db352	Support the 'guest_nice' cpu mode of /proc/stat. 'guest_nice' is available since Linux 2.6.33.	2017-04-14 12:50:37 +02:00
Sam Kottler	6eafa51fa8	Add ARP collector for Linux (#540 ) * Implement commonalities and linux support for ARP collection * Add ARP collector to fixtures and run as part of e2e tests * Bubble up scanner errors * Use single return values where it makes sense * Add missing annotation * Move arp_common into arp_linux * Add license header to arp_linux.go * Address initial feedback * Use strings.Fields instead of strings.Split * Deal with scanner.Err() rather than throwing away errors * Check for scan errors in-line before interacting with the entries map * Don't interact with potentially empty text from scan * Check for scan errors outside the scan loop * Add comment about moving procfs parsing * Add more direct comment * Update initialism style to match go style guide * Put function args on the same line * Add TODO in front of comment about procfs extraction * Guard against strings.Fields returning an empty slice * Be more defensive about ARP table format and use upcase more broadly * Enable the ARP collector by default * Add ARP collector to the README * Remove 'entry'	2017-04-11 17:45:19 +02:00
Johannes 'fish' Ziemke	9676f5f2dc	Merge pull request #523 from roclark/support-legacy-infiniband Add support for legacy InfiniBand drivers	2017-03-21 10:52:07 +01:00
Matt Layher	2bfe410fb7	Expand wifi collector for more interface types	2017-03-20 12:25:01 -04:00
Robert Clark	3a5917dfdc	Add support for legacy InfiniBand drivers Older versions of the OFED drivers contain 64-bit variants of the port counters and are located in a directory named 'counters_ext'. This patch includes these older metrics that have since been deprecated with OFED 4.0. Signed-Off-By: Robert Clark <robert.d.clark@hpe.com>	2017-03-20 10:37:21 -05:00
Tobias Schmidt	0400e437be	Fix and simplify parsing of raid metrics Fixes the wrong reporting of active+total disk metrics for inactive raids. Also simplifies the code and removes a couple of redundant comments.	2017-03-19 08:03:58 -03:00
Matt Layher	69368b7f9c	Add synthetic node_wifi_station_info metric for BSS information	2017-03-16 16:24:23 -04:00
Brian Brazil	a02e469b07	Report collector success/failure and duration per scrape. (#516 ) This is in line with best practices, and also saves us 63 timeseries on a default Linux setup.	2017-03-16 17:21:00 +00:00
Tobias Schmidt	ce117d7a40	Update vendored packages	2017-02-28 18:20:24 -04:00
Tobias Schmidt	d1dfda86ee	Fix wrong end-to-end expectation	2017-02-28 16:02:43 -04:00
Ben Kochie	38cd07ebb9	Merge pull request #450 from roclark/add-infiniband infiniband: Add new collector for InfiniBand statistics	2017-02-16 14:33:19 +01:00
Ben Kochie	a097dd36b3	Merge pull request #459 from joehandzik/wip-zpool-io-cherrypick ZFS Collector: Add zpool IO statistics	2017-02-16 08:16:55 +01:00
Thorhallur Sverrisson	5ab285e098	Adding buddyinfo to end to end test.	2017-02-15 10:15:44 -06:00
Thorhallur Sverrisson	3ba15c1ddb	Adding support for /proc/buddyinfo for linux free memory fragmentation. /proc/buddyinfo returns data on the free blocks fragments available for use from the kernel. This data is useful when diagnosing possible memory fragmentation. More info can be found in: * https://lwn.net/Articles/7868/ * https://andorian.blogspot.com/2014/03/making-sense-of-procbuddyinfo.html	2017-02-15 10:15:43 -06:00
Joe Handzik	bb8b3fca88	ZFS Collector: Add zpool IO statistics Signed-Off-By: Joe Handzik <joseph.t.handzik@hpe.com>	2017-02-10 13:31:25 -06:00
Robert Clark	4866adcb71	Add new collector for InfiniBand statistics Add new metrics for the InfiniBand network protocol including the amount of packets sent and received, the number of times the link has been downed and how many times the link has recovered from an error state. Signed-Off-By: Robert Clark <robert.d.clark@hpe.com>	2017-02-07 11:09:08 -06:00
Joe Handzik	8c23f5ff54	ZFS Collector: Convert dashes to underscores for metrics This fixes #442, and prevents other ZFS metrics from slipping through in the future. Signed-Off-By: Joe Handzik <joseph.t.handzik@hpe.com>	2017-01-31 14:11:56 -06:00
Ben Kochie	71362d45eb	Merge pull request #432 from joehandzik/wip-zfs-zfetchstats Update ZFS Collector with most non-zpool metrics	2017-01-31 08:52:41 -05:00
Joe Handzik	e5ee274a32	ZFS Collector: Move from camelcase to underscores for metric prefixes Signed-Off-By: Joe Handzik <joseph.t.handzik@hpe.com>	2017-01-29 15:59:01 -06:00
Ben Kochie	5a6db5c8d2	Handle multiple NFS device mounts It's possible to mount an NFS share in multiple locations. * Duplicates contain the same metric values, so they can be ignored. * Update fixture.	2017-01-24 13:44:08 +01:00
Joe Handzik	94fb93a9f3	ZFS Collector: Add dmu_tx functionality Signed-Off-By: Joe Handzik <joseph.t.handzik@hpe.com>	2017-01-23 16:41:15 -06:00
Joe Handzik	07c7ae733a	ZFS Collector: Add fm functionality Signed-Off-By: Joe Handzik <joseph.t.handzik@hpe.com>	2017-01-23 16:31:22 -06:00
Joe Handzik	05048c067d	ZFS Collector: Add xuio_stats functionality Signed-Off-By: Joe Handzik <joseph.t.handzik@hpe.com>	2017-01-23 16:30:37 -06:00
Joe Handzik	3c9e779989	ZFS Collector: Add vdev_cache_stats functionality Signed-Off-By: Joe Handzik <joseph.t.handzik@hpe.com>	2017-01-23 16:29:50 -06:00
Joe Handzik	a02ca9502c	ZFS Collector: Add zil functionality Signed-Off-By: Joe Handzik <joseph.t.handzik@hpe.com>	2017-01-23 16:29:00 -06:00
Joe Handzik	a3125ab4d9	ZFS Collector: Add zfetchstats functionality Signed-Off-By: Joe Handzik <joseph.t.handzik@hpe.com>	2017-01-23 16:28:11 -06:00
Johannes 'fish' Ziemke	2884181cce	Merge pull request #415 from mdlayher/mountstats-nfs-additional Add NFS event metrics to mountstats collector	2017-01-12 14:08:21 +01:00
Matt Layher	e3f99e13b9	Add NFS event metrics to mountstats collector	2017-01-11 11:41:13 -05:00
Matt Layher	efa25665ec	Add initial wifi collector, bump netlink to fix 32-bit builds	2017-01-11 10:08:44 -05:00
Johannes 'fish' Ziemke	55170e8feb	Merge pull request #411 from discordianfish/hwmon-move-label-metrics Use filename as label, move 'label' to own metric	2017-01-10 12:21:18 +01:00
Ben Kochie	38a4a36061	Update end-to-end test.	2017-01-10 10:23:16 +01:00
Ben Kochie	b4fa10ca9d	Add collector for Linux EDAC Collect "Error detection and correction" metrics from memory controllers. * Supported on Linux only. * Add basic fixtures. * Enabled by default.	2017-01-10 10:14:19 +01:00
Johannes 'fish' Ziemke	6aef20f8d8	Use filename as label, move 'label' to own metric This closes #406	2017-01-09 18:33:31 +01:00
Joe Handzik	e7442d6517	end-to-end-test.sh: Add zfs plugin Enables fixture test and updates e2e-output.txt. Signed-Off-By: Joe Handzik <joseph.t.handzik@hpe.com>	2017-01-08 11:13:35 -06:00
Corey Stewart	10ba27bf2c	Remove FreeBSD support for zfs plugin. This also involves removing zfs_zpool code for now. Signed-Off-By: Corey Stewart <stewa169@purdue.edu> Signed-Off-By: Joe Handzik <joseph.t.handzik@hpe.com>	2017-01-08 11:13:35 -06:00
Christian Schwarz	f29f3873ea	Add a collector for ZFS, currently focussed on ARC stats. It is tested on FreeBSD 10.2-RELEASE and Linux (ZFS on Linux 0.6.5.4). On FreeBSD, Solaris, etc. ZFS metrics are exposed through sysctls. ZFS on Linux exposes the same metrics through procfs `/proc/spl/...`. In addition to sysctl metrics, 'computed metrics' are exposed by the collector, which are based on several sysctl values. There is some conditional logic involved in computing these metrics which cannot be easily mapped to PromQL. Not all 92 ARC sysctls are exposed right now but this can be changed with one additional LOC each.	2017-01-08 10:23:58 -06:00
Johannes 'fish' Ziemke	2e47fcb8c5	Only store relevant e2e output This makes commits lighter/more readable when updating the output.	2017-01-06 12:36:26 +01:00
Johannes 'fish' Ziemke	b68a9ec7af	Merge pull request #359 from CloudAndHeat/feature/hwmon_chip_name_metric hwmon: Provide annotation metric to link chip sysfs paths to human-readable chip types	2017-01-03 14:38:43 +01:00
Johannes 'fish' Ziemke	8e50b80d12	Convert remaining collectors to use ConstMetrics	2017-01-03 14:11:10 +01:00
Johannes 'fish' Ziemke	71ea37987f	Merge pull request #365 from EdSchouten/drbd A collector for DRBD	2016-12-25 11:04:43 +01:00
Ed Schouten	4adf7fa96c	Improve the help strings, as proposed in the code review.	2016-12-23 15:55:49 +01:00
Ed Schouten	b7daf27678	Process feedback from the code review. - Use the right number of printf() arguments. Use %q where it makes sense. - Use "DRBD" instead of "Drbd", per Go's style guide. - Add _total suffixes to counter metrics. - Mention the unit (bytes) in documentation strings once more.	2016-12-22 13:57:19 +01:00
Matt Layher	25a93e38e7	Add mountstats collector for detailed NFS statistics	2016-12-20 11:13:02 -05:00
Ed Schouten	6269f7502a	Add a collector for DRBD. This collector exposes most of the useful information that can be found in /proc/drbd. Sizes are normalised to be in bytes, as /proc/drbd uses kibibytes.	2016-12-11 11:55:28 +01:00
Ed Schouten	a696830c38	Add a collector for NFS client statistics. This change adds a new collector called "nfs" that parses the contents of /proc/net/rpc/nfs and turns it into metrics. It can be used to inspect the number of operations per type, but also to keep an eye on an extraneous number of retransmissions, which may indicate connectivity issues. I've picked the name "nfs", as most operating systems use "nfs" for the client component and "nfsd" as the server component. If we want to add stats for the NFS server as well, we'd better call such a collector "nfsd".	2016-12-09 19:58:08 +01:00
Jonas Wielicki	3efaa1a6a8	Update end-to-end tests	2016-12-01 10:00:50 +01:00
dan mcweeney	1f6b5aee39	#219 - add fixes for @samzhang111 super token	2016-11-16 14:49:57 -05:00
dan mcweeney	8d756cab50	Fixes end to end test	2016-11-16 14:47:03 -05:00
dan mcweeney	00c9a88a55	Fixes #219 - use the default to catch personalities that are unknown Assumes all raid configurations start with raid and that anything else is unknown.	2016-11-16 14:47:03 -05:00
Ed Schouten	9749c2c0b3	mdstat: Fix parsing of RAID0 lines that contain additional attributes. We seem to have a small number of Linux servers here that have lines in /proc/mdstat that cannot be parsed by the node exporter, due to them containing attributes that are not matched by the regular expression ("super 1.2"). Extend the regular expression to skip this data, just like we do for all of the other status lines.	2016-11-16 17:21:25 +01:00
Rene Treffer	abe8e297a6	Prefer device path based names over exported names (#334 ) * Prefer device path based names over exported names For some sensors (like coretemp) it is possible that multiple instances exist, thus base the name on the device path and not on the exported name. * Update end-to-end test for dual socket machines Explicitly have 2 coretemp instances with a symlink for the device such that the hwmon collector must pick that name (or fail)	2016-10-28 20:25:44 +01:00
Ben Kochie	c6162312f2	Add Linux NUMA "numastat" metrics (#249 ) * Add Linux NUMA "numastat" metrics Read the `numastat` metrics from /sys/devices/system/node/node* when reading NUMA meminfo metrics. * Update end-to-end test output. * Add `numastat` metrics as counters. * Add tests for error conditions. * Refactor meminfo numa metrics struct * Refactor meminfoKey into a simple struct of metric data. This makes it easier to pass slices of metrics around. * Refactor tests. * Fixup: Add suggested fixes. * Fixup: More fixes * Add another scanner.Err() return * Add "_total" to counter metrics.	2016-10-12 13:07:49 +02:00
Rene Treffer	081ecc5db0	Add hwmon /sensors support (#278 ) * Add hwmon support (mainly known from lm-sensors) This commit adds initial support for linux hardware sensors, exported through sysfs. Details of the interface can be found at https://www.kernel.org/doc/Documentation/hwmon/sysfs-interface * Add end-to-end test with some real life data * Cleanup comments on hwmon collector * Drop raw sensor name from hwmon output * Let the sensor label be "sensor" * Add hwmon short description to README.	2016-10-06 16:33:24 +01:00
Ben Kochie	afac1f7433	Update mdstat fixture based on linux source. Update `Contains` matching for `resync=`	2016-09-19 16:11:16 +02:00
Ben Kochie	64b82596ef	Fix mdadm collector for resync=PENDING. Add fix for mdadm devices in state `resync=PENDING`. * Update test and fixture.	2016-09-18 08:30:20 +02:00
Julius Volz	9128952454	Fix end-to-end tests after netstat conversion	2016-08-12 01:09:20 +02:00
Thomas Frössman	32e3445d72	Fix mdstat tabs parsing	2016-08-06 14:08:11 +02:00

283 commits