node_exporter

mirror of https://github.com/prometheus/node_exporter.git synced 2025-08-20 18:33:52 -07:00

Author	SHA1	Message	Date
Ben Kochie	a1ce712e22	Cleanup unused /proc/mounts fixture. (#1097 ) * Cleanup unused /proc/mounts fixture. * Ignore Uint -> Unit in codespell. Signed-off-by: Ben Kochie <superq@gmail.com>	2018-10-04 18:07:12 +02:00
Mario Trangoni	3659260b66	infiniband: Handle iWARP* RDMA modules N/A (#974 ) * infiniband: Add not connected i40iw0/ports/1 fixtures * infiniband: Handle issue when iWARP* RDMA modules are not available. This is related to #966 and handles this error: Jun 07 13:33:24 hostname node_exporter[81888]: time="2018-06-07T13:33:24+02:00" level=error msg="ERROR: infiniband collector failed after 0.000929s: strconv.ParseUint: parsing \"N/A (no PMA)\": invalid syntax" source="collector.go:132" Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>	2018-10-04 15:05:59 +02:00
Yecheng Fu	0f9842f20a	[continue 912] strip rootfs prefix for run in docker (#1058 ) * strip rootfs prefix for run in docker * Use `/` as default value of path.rootfs, and parse mounts from `/proc/1/mounts`. * No need to mount `/proc` and `/sys` because we share host's PID namespace, which allows processes within the container to see all of the processes on the system. Closes: #66 Signed-off-by: Ivan Mikheykin <ivan.mikheykin@flant.com> Signed-off-by: Yecheng Fu <cofyc.jackson@gmail.com>	2018-10-04 14:11:21 +02:00
gentlejo	2269df255c	Add node_exporter script for init.d (#1059 ) * Add node_exporter script for init.d Signed-off-by: gentlejo <josungil@gmail.com>	2018-10-04 13:57:49 +02:00
Andrew Banchich	5da107b02c	Add missing words and update markdown syntax (#1095 ) Signed-off-by: Andrew Banchich <andrewbanchich@gmail.com>	2018-10-03 09:03:25 +02:00
Ralf Horstmann	9f820bd3ee	Update cpu collector for OpenBSD 6.4 (#1094 ) Starting with (not yet released) OpenBSD 6.4, sysctl KERN_CPTIME2 will return ENODEV for offline CPUs. SMT siblings are reported as offline when hw.smt is disabled, which is the default since one of the later Spectre variants. So this might affect a few systems. For more details see: https://cvsweb.openbsd.org/src/sys/kern/kern_sysctl.c#rev1.348 Signed-off-by: Ralf Horstmann <ralf+github@ackstorm.de>	2018-10-02 10:21:30 +02:00
Ben Kochie	5a461d261c	Add linux/s390x build (#1092 ) Signed-off-by: Ben Kochie <superq@gmail.com>	2018-09-30 16:45:32 +02:00
Ben Kochie	526eac15c5	Add ppc64 build. (#1089 ) Add ppc64 build.	2018-09-30 13:45:47 +02:00
Fabian Heymann	2f381f0c44	Update dependency mattn/go-xmlrpc (#1091 ) Signed-off-by: Fabian Heymann <fabian.heymann@finanzcheck.de>	2018-09-30 09:27:14 +02:00
Daniele Sluijters	d999dacdc6	filesystem: Ignore netns/nsfs mounts (#1047 ) When starting Docker containers a whole bunch of netns (network namespace) mounts are created that the node exporter can't make any sense of (and can't read either). This ignores all nsfs filesystems. Fixes #875 Signed-off-by: Daniele Sluijters <daenney@users.noreply.github.com>	2018-09-26 10:45:51 +02:00
Ben Kochie	c7dfb82dac	Update build (#1081 ) * Update build * Only use CGO when building non-Linux. * Update build to Go 1.11 * Use tab indenting consistently. Signed-off-by: Ben Kochie <superq@gmail.com>	2018-09-25 16:02:42 +02:00
Ben Kochie	0fdc089187	Change systemd unit filtering (#1083 ) * Change systemd unit filtering Get all units from systemd and filter in Go. * Improves compatibility with older versions of systemd. * Improve debugging by printing when units pass the filter. * Remove extraneous newlines from log messages. Signed-off-by: Ben Kochie <superq@gmail.com>	2018-09-24 15:04:55 +02:00
Luca Bruno	4672ea1671	collector/timex: remove cgo dependency (#1079 ) This removes the cgo import from timex collector, as it was only used to define two constants. Those are part of the Linux kernel<->userspace interface, thus there is no need to depend on libc to source them: https://github.com/torvalds/linux/blob/v4.18/include/uapi/linux/timex.h Signed-off-by: Luca Bruno <luca.bruno@coreos.com>	2018-09-20 11:51:34 +02:00
Christopher Blum	6aa5cfba6c	textfile example script rework (#1074 ) * textfile smartmon.sh Added functions to also parse megaraid disks. Added parsing to also detect the grown_defects counters. * textfile storcli.py Reworked the example file to export lots more information about megaraid attached controllers, VDs and PDs. Signed-off-by: Christopher Blum <christopher.blum@profitbricks.com>	2018-09-18 22:43:20 +02:00
Björn Rabenstein	1c9ea46cca	Update vendoring for client_golang and friends (#1076 ) Signed-off-by: beorn7 <beorn@soundcloud.com>	2018-09-17 17:09:52 +02:00
Mateusz Piotrowski	b46cd80200	Note how to get moreutils on FreeBSD (#1073 ) Signed-off-by: Mateusz Piotrowski <0mp@FreeBSD.org>	2018-09-14 14:14:45 +02:00
Ben Kochie	ebdd524123	Correctly cast Darwin memory info (#1060 ) * Correctly cast Darwin memory info * Cast stats to float64 before doing math on them to avoid integer wrapping. * Remove invalid `_total` suffix from gauge values. * Handle counters in `meminfo.go`. Signed-off-by: Ben Kochie <superq@gmail.com>	2018-09-07 22:27:52 +02:00
Marco Tulio R Braga	05e55bddad	Fix typo on description of read_time_seconds_total (#1057 ) Fix typo on unit description of metric `*read_time_seconds_total` from milliseconds to seconds. Signed-off-by: Marco Tulio R Braga <marco.tulio@mtulio.eng.br>	2018-09-02 09:46:45 +02:00
Tariq Ibrahim	834e35112c	Using the recommended syntax for maintainer label (#1053 ) Signed-off-by: Tariq Ibrahim <tariq.ibrahim@microsoft.com>	2018-08-28 19:28:58 +02:00
Dan Fredell	c52e0d3353	Fix SmartOS build #1017 (#1018 ) Signed-off-by: Dan Fredell <Dan.Fredell@gmail.com>	2018-08-23 10:57:15 +00:00
Matt Bostock	9e0aee8ae7	Add metrics exposing extended md RAID info (#958 ) Add metrics that expose more information about MD RAID devices and disks: - the RAID level in use - the RAID set that a disk belongs to This allows for things like alerting on unusually high I/O utilisation for a disk compared to other disks in the same RAID set, which usually means the disk is failing, and for comparing write/read latency across RAID sets. Output looks like: node_md_disk_info{disk_device="/dev/dm-0", md_device="md1", md_set="A"} 1 node_md_disk_info{disk_device="/dev/dm-3", md_device="md1", md_set="B"} 1 node_md_disk_info{disk_device="/dev/dm-2", md_device="md1", md_set="A"} 1 node_md_disk_info{disk_device="/dev/dm-1", md_device="md1", md_set="B"} 1 node_md_disk_info{disk_device="/dev/dm-4", md_device="md1", md_set="A"} 1 node_md_disk_info{disk_device="/dev/dm-5", md_device="md1", md_set="B"} 1 node_md_info{md_device="md1", md_name="foo", raid_level="10", md_metadata_version="1.2"} 1 The `node_md_info` metric, which gives additional information about the RAID array, is intentionally separate to avoid adding all of those labels to each disk. If you need to query using the labels contained in `node_md_info`, you can do that using PromQL: https://www.robustperception.io/how-to-have-labels-for-machine-roles/ I looked at adding the array UUID, but there's no sysfs entry for it and I'm not sure there's a strong use case for it. This patch to add a sysfs entry for the UUID was apparently not accepted: https://www.spinics.net/lists/raid/msg40667.html Add these metrics as a textfile script rather than adding them to the Go 'md' module as they're perhaps less commonly useful. If lots of people find them useful, we can later rewrite this in Go. Signed-off-by: Matt Bostock <mbostock@cloudflare.com>	2018-08-18 08:57:51 +00:00
Matt Layher	d84873727f	vendor: bump github.com/mdlayher/wifi and dependencies (#1045 ) Signed-off-by: Matt Layher <mdlayher@gmail.com>	2018-08-14 21:15:07 +02:00
James Hartig	60c827231a	NRestarts or NRefused aren't available on older systemd versions (#1039 ) * If NRestarts or NRefused are not available, don't ignore the unit itself * Don't report systemd metrics (NRestarts/NRefused) that are not available Signed-off-by: James Hartig <james@getadmiral.com>	2018-08-14 14:28:26 +02:00
Ben Kochie	fe5a117831	Handle vanishing PIDs (#1043 ) PIDs can vanish (exit) from /proc/ between gathering the list of PIDs and getting all of their stats. * Ignore file not found errors. * Explicitly count the PIDs we find. * Cleanup some error style issues. Signed-off-by: Ben Kochie <superq@gmail.com>	2018-08-13 17:27:23 +02:00
Ben Kochie	099c1527f1	Update build (#1041 ) Update build * Update to Go 1.10. * Enable `ppc64le` build. * Enable MIPS builds. Signed-off-by: Ben Kochie <superq@gmail.com>	2018-08-13 17:26:55 +02:00
Ben Kochie	0662673ad6	Disable wifi collector by default (#1037 ) * Disable wifi collector by default Disable the wifi collector by default due to suspected crashing issues and goroutine leaks. * https://github.com/prometheus/node_exporter/issues/870 * https://github.com/prometheus/node_exporter/issues/1008 Signed-off-by: Ben Kochie <superq@gmail.com>	2018-08-07 10:27:20 +02:00
Ben Kochie	5d23ad0ca7	Fix supervisord collector (#978 ) * Replace supervisord xmlrpc library * Use `github.com/mattn/go-xmlrpc` that doesn't leak goroutines. * Fix uptime metric * Use Prometheus best practices for uptime metric. * Use "start time" rather than "uptime". * Don't emit a start time if the process is down. * Add changelog entry. * Add example compatibility rules. Signed-off-by: Ben Kochie <superq@gmail.com>	2018-08-06 16:54:46 +02:00
Julius Volz	2c52b8c761	systemd: Remove unneeded/unhandled error returns (#1035 ) Signed-off-by: Julius Volz <julius.volz@gmail.com>	2018-08-05 16:55:25 +02:00
Christian Hoffmann	6bdc5558ec	build: make staticcheck happy by using real regexp patterns #1025 (#1026 ) Signed-off-by: Christian Hoffmann <mail@hoffmann-christian.info>	2018-07-30 07:57:18 +02:00
Rene Treffer	80a5712b97	Fix sample rules for migration (#1022 ) - add conversion from _ms to _seconds on disk metrics - add missing node_textfile_mtime section - add groups: header to pass promtool check rules Signed-off-by: Rene Treffer <rene.treffer@soundcloud.com>	2018-07-27 14:27:44 +02:00
Hannes Körber	14a4f0028e	Enable nfs protocol (#998 ) * vendor: Update prometheus/procfs Signed-off-by: Hannes Körber <hannes.koerber@haktec.de> * mountstats: Use new NFS protocol field In https://github.com/prometheus/procfs/pull/100, the NFSTransportStats struct was expanded by a field called protocol that specifies the NFS protocol in use, either "tcp" or "udp". This commit adds the protocol as a label to all NFS metrics exported via the mountstats collector. Signed-off-by: Hannes Körber <hannes.koerber@haktec.de> * Update fixtures for UDP mount Signed-off-by: Hannes Körber <hannes.koerber@haktec.de>	2018-07-24 00:47:12 +02:00
Johannes Wienke	5c780d132c	Exclude only subdirectories of /var/lib/docker (#1003 ) It is quite common to put /var/lib/docker itself on a separate partition and that should be monitored as well. Signed-off-by: Johannes Wienke <languitar@semipol.de>	2018-07-23 15:43:42 +02:00
Ben Kochie	ca2fa4684b	Fix docker build (#1016 ) Fix override of make docker target to include new `DOCKER_REPO` variable pattern. Signed-off-by: Ben Kochie <superq@gmail.com>	2018-07-23 10:56:20 +02:00
Ben Kochie	981de58fad	Update build (#1010 ) * Update from upstream `Makefile.common`. * Update CircleCI with simplified upstream templating. * Cleanup `Makefile`. Signed-off-by: Ben Kochie <superq@gmail.com>	2018-07-23 09:38:39 +02:00
Ben Kochie	23f95c8e04	Fix ntp collector thread safety (#1014 ) Make the ntp collector thread safe by wrapping a mutex lock around the leapMidnight variable. Signed-off-by: Ben Kochie <superq@gmail.com>	2018-07-22 14:36:33 +02:00
xginn8	140b8b85c3	Filter out uninstalled systemd units when collecting all units (#1011 ) fixes #567 Signed-off-by: Matthew McGinn <mamcgi@gmail.com>	2018-07-22 09:20:03 +02:00
Sven Lange	2ae8c1c7a7	Add systemd uptime metric collection (#952 ) * Add systemd uptime metric collection Signed-off-by: Sven Lange <tdl@hadiko.de>	2018-07-18 16:02:05 +02:00
Ben Kochie	354115511c	Add note about SYS_TIME capability for Docker. (#1001 ) Signed-off-by: Ben Kochie <superq@gmail.com>	2018-07-16 18:30:19 +02:00
neiledgar	7e4d9bd150	Update wifi stats to support multiple stations (#977 ) (#980 ) Signed-off-by: neiledgar <neil.edgar@btinternet.com>	2018-07-16 16:02:25 +02:00
xginn8	9b97f44a70	Add a counter for refused socket unit connections, available as of systemd 239 (#995 ) Signed-off-by: xginn8 <mamcgi@gmail.com>	2018-07-16 16:01:42 +02:00
Brandon Gilmore	76bbd8dd18	Use /proc/mounts instead of statfs(2) for ro state (#1002 ) While the statfs(2) approach is reliable for normally mounted filesystems, the flags returned can be inconsistent when a filesystem has been remounted read-only after encountering an error. The returned flags do accurately represent the internal state of the filesystem, but they do not reflect whether the VFS layer will accept writes. Instead, it makes sense to parse the current VFS mount state from the options field in /proc/mounts since it takes precedence. Signed-off-by: Brandon Gilmore <bgilmore@valvesoftware.com>	2018-07-16 15:56:27 +02:00
Jan Klat	c4102f1175	Add sys/class/net parsing from procfs and expose its metrics (#851 ) * add sys/class/net parsing from procfs and expose its metrics Signed-off-by: Jan Klat <jenik@klatys.cz> * change code to use int pointers per procfs change, move netclass to separate collector, change metric naming Signed-off-by: Jan Klat <jenik@klatys.cz> * bump year in licence, remove redundant newline, correct fixtures Signed-off-by: Jan Klat <jenik@klatys.cz> * fix style Signed-off-by: Jan Klat <jenik@klatys.cz> * change carrier changes to counter type Signed-off-by: Jan Klat <jenik@klatys.cz> * fix e2e output Signed-off-by: Jan Klat <jenik@klatys.cz> * add fixtures Signed-off-by: Jan Klat <jenik@klatys.cz> * update vendor, use fixtures correctly Signed-off-by: Jan Klat <jenik@klatys.cz> * change fixtures (device in /sys/class/net should be symlinked) Signed-off-by: Jan Klat <jenik@klatys.cz> * correct fixtures for 64k page, updated readme Signed-off-by: Jan Klat <jenik@klatys.cz>	2018-07-16 15:08:18 +02:00
mknapphrt	09b4305090	Changed the way that stuck mounts are handled. If a mount fails to return, it will stop being queried until it returns. (#997 ) Fixed spelling mistakes. Update transport_generic.go Changed to a mutex approach instead of channels and added a timeout before declaring a mount stuck. Removed unnecessary lock channel and clarified some var names. Fixed style nits. Signed-off-by: Mark Knapp <mknapp@hudson-trading.com>	2018-07-14 11:10:28 +02:00
xginn8	ac5a981761	Adding socket stat collection for systemd socket units (#968 ) Signed-off-by: xginn8 <mamcgi@gmail.com>	2018-07-05 16:26:48 +02:00
xginn8	8af84a215d	Add support for NRestarts counter introduced in systemd 235 (#992 ) * Add support for NRestarts counter introduced in systemd 235 `.service` units increment this counter any time the Restart= condition is triggered. Signed-off-by: Matthew McGinn <mamcgi@gmail.com>	2018-07-05 13:31:45 +02:00
Bernd Müller	ee1e1997bc	Add scsi smart data to prometheus exporter (#862 ) Add scsi smart data to prometheus exporter Signed-off-by: mueller <mueller@b1-systems.de>	2018-07-04 00:30:20 +02:00
Ivan Kiselev	ae90bac5b8	Add example of translating new metrics to old format in case of migration to 0.16 version (#982 ) Add additional example of how to save old metrics Signed-off-by: Ivan Kiselev <ivan@messagebird.com>	2018-07-02 12:39:32 +02:00
Ben Kochie	107e5dfecc	Fix mdadm collector issues (#985 ) * Send "Personality unknown" to debug, not info, remove unnecessary newline. * Add support for "linear" personality. * Always set number of active disks to 0 when a device is inactive. * Add total disks calculation to unknown personalities. Signed-off-by: Ben Kochie <superq@gmail.com>	2018-07-02 12:38:20 +02:00
Roman Vynar	55c32fcf02	Add compat rules for filesystem collector. (#973 ) Signed-off-by: Roman Vynar <roman.vynar@goquiq.com>	2018-06-13 18:32:07 +02:00
Matt Bostock	f56e8fcdf4	Fix spelling of celsius in IPMI example script (#967 ) 'Celsius' should be spelt with an 's': https://en.wikipedia.org/wiki/Celsius Signed-off-by: Matt Bostock <mbostock@cloudflare.com>	2018-06-08 19:21:19 +02:00
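
Commit 76bbd8dd18 in the log above describes reading a filesystem's read-only state from the options field of /proc/mounts rather than from statfs(2) flags. The following is a minimal Python sketch of that idea, not the collector's actual Go code. The function name `mount_read_only`, the `SAMPLE` data, and the rule that a later entry for the same mount point overrides an earlier one are all assumptions made for illustration. The sketch also does not decode octal escapes such as `\040` that can appear in mount points.

```python
from typing import Optional

# Hypothetical sample in /proc/mounts layout:
# device, mount point, fs type, options, dump, pass.
SAMPLE = """\
/dev/sda1 / ext4 rw,relatime,errors=remount-ro 0 0
/dev/sdb1 /data ext4 ro,relatime 0 0
"""


def mount_read_only(mounts_text: str, mount_point: str) -> Optional[bool]:
    """Return True if mount_point lists the 'ro' option, False if it
    does not, and None if the mount point is not present."""
    state = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Skip blank or malformed lines that lack an options field.
        if len(fields) < 4:
            continue
        if fields[1] != mount_point:
            continue
        # Compare whole comma-separated options so that values such as
        # 'errors=remount-ro' are not mistaken for 'ro'.
        options = fields[3].split(",")
        # Assumption: a later entry for the same mount point wins.
        state = "ro" in options
    return state


if __name__ == "__main__":
    for point in ("/", "/data", "/missing"):
        print(point, mount_read_only(SAMPLE, point))
```

On a Linux host the same function could be fed the contents of `/proc/mounts` read as text. Splitting the options on commas, rather than searching for the substring `ro`, is the design choice that keeps options like `errors=remount-ro` from producing a false positive.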


1100 commits