Commit graph

540 commits

Author SHA1 Message Date
Dario Maiocchi 01ec8c5c5c Remove continue with label (#1084)
Instead of continue with label use helper function
Signed-off-by: dmaiocchi <dmaiocchi@suse.com>
2018-10-05 13:20:30 +02:00
Ben Kochie a1ce712e22
Cleanup unused /proc/mounts fixture. (#1097)
* Cleanup unused /proc/mounts fixture.
* Ignore Uint -> Unit in codespell.

Signed-off-by: Ben Kochie <superq@gmail.com>
2018-10-04 18:07:12 +02:00
Mario Trangoni 3659260b66 infiniband: Handle iWARP* RDMA modules N/A (#974)
* infiniband: Add not connected i40iw0/ports/1 fixtures
* infiniband: Handle issue when iWARP* RDMA modules are not available

This is related to #966, and handle this error,

Jun 07 13:33:24 hostname node_exporter[81888]: time="2018-06-07T13:33:24+02:00" level=error msg="ERROR: infiniband
collector failed after 0.000929s: strconv.ParseUint: parsing \"N/A (no PMA)\": invalid syntax" source="collector.go:132"

Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>
2018-10-04 15:05:59 +02:00
Yecheng Fu 0f9842f20a [continue 912] strip rootfs prefix for run in docker (#1058)
* strip rootfs prefix for run in docker
* Use `/` as default value of path.rootfs, and parse mounts from `/proc/1/mounts`.
* No need to mount `/proc` and `/sys` because we share host's PID
namespace, which allows processes within the container to see all of the
processes on the system.

Closes: #66

Signed-off-by: Ivan Mikheykin <ivan.mikheykin@flant.com>
Signed-off-by: Yecheng Fu <cofyc.jackson@gmail.com>
2018-10-04 14:11:21 +02:00
Ralf Horstmann 9f820bd3ee Update cpu collector for OpenBSD 6.4 (#1094)
Starting with (not yet released) OpenBSD 6.4, sysctl KERN_CPTIME2 will
return ENODEV for offline CPUs.

SMT siblings are reported as offline when hw.smt is disabled, which is
the default since one of the later Spectre variants. So this might
affect a few systems.

For more details see:
https://cvsweb.openbsd.org/src/sys/kern/kern_sysctl.c#rev1.348

Signed-off-by: Ralf Horstmann <ralf+github@ackstorm.de>
2018-10-02 10:21:30 +02:00
Daniele Sluijters d999dacdc6 filesystem: Ignore netns/nsfs mounts (#1047)
When starting Docker containers a whole bunch of netns (network
namespace) mounts are created that the node exporter can't make any
sense of (and can't read either).

This ignores all nsfs filesystems.

Fixes #875

Signed-off-by: Daniele Sluijters <daenney@users.noreply.github.com>
2018-09-26 10:45:51 +02:00
Ben Kochie 0fdc089187
Change systemd unit filtering (#1083)
* Change systemd unit filtering

Get all units from systemd and filter in Go.
* Improves compatibility with older versions of systemd.
* Improve debugging by printing when units pass the filter.
* Remove extraneous newlines from log messages.

Signed-off-by: Ben Kochie <superq@gmail.com>
2018-09-24 15:04:55 +02:00
Luca Bruno 4672ea1671 collector/timex: remove cgo dependency (#1079)
This removes the cgo import from timex collector, as it was only used
to define two constants. Those are part of the Linux kernel<->userspace
interface, thus there is no need to depend on libc to source them:
https://github.com/torvalds/linux/blob/v4.18/include/uapi/linux/timex.h

Signed-off-by: Luca Bruno <luca.bruno@coreos.com>
2018-09-20 11:51:34 +02:00
Björn Rabenstein 1c9ea46cca Update vendoring for client_golang and friends (#1076)
Signed-off-by: beorn7 <beorn@soundcloud.com>
2018-09-17 17:09:52 +02:00
Ben Kochie ebdd524123
Correctly cast Darwin memory info (#1060)
* Correctly cast Darwin memory info

* Cast stats to float64 before doing math on them to avoid integer
wrapping.
* Remove invalid `_total` suffix from gauge values.
* Handle counters in `meminfo.go`.

Signed-off-by: Ben Kochie <superq@gmail.com>
2018-09-07 22:27:52 +02:00
Marco Tulio R Braga 05e55bddad Fix typo on description of read_time_seconds_total (#1057)
Fix typo on unit description of metric `*read_time_seconds_total` from milliseconds to seconds.

Signed-off-by: Marco Tulio R Braga <marco.tulio@mtulio.eng.br>
2018-09-02 09:46:45 +02:00
Dan Fredell c52e0d3353 Fix SmartOS build #1017 (#1018)
Signed-off-by: Dan Fredell <Dan.Fredell@gmail.com>
2018-08-23 10:57:15 +00:00
James Hartig 60c827231a NRestarts or NRefused aren't available on older systemd versions (#1039)
* If NRestarts or NRefused are not available, don't ignore the unit itself
* Don't report systemd metrics (NRestarts/NRefused) that are not available

Signed-off-by: James Hartig <james@getadmiral.com>
2018-08-14 14:28:26 +02:00
Ben Kochie fe5a117831
Handle vanishing PIDs (#1043)
PIDs can vanish (exit) from /proc/ between gathering the list of PIDs
and getting all of their stats.

* Ignore file not found errors.
* Explicitly count the PIDs we find.
* Cleanup some error style issues.

Signed-off-by: Ben Kochie <superq@gmail.com>
2018-08-13 17:27:23 +02:00
Ben Kochie 0662673ad6
Disable wifi collector by default (#1037)
* Disable wifi collector by default

Disable the wifi collector by default due to suspected cashing issues and goroutine leaks.
* https://github.com/prometheus/node_exporter/issues/870
* https://github.com/prometheus/node_exporter/issues/1008

Signed-off-by: Ben Kochie <superq@gmail.com>
2018-08-07 10:27:20 +02:00
Ben Kochie 5d23ad0ca7
Fix supervisord collector (#978)
* Replace supervisord xmlrpc library
* Use `github.com/mattn/go-xmlrpc` that doesn't leak goroutines.
* Fix uptime metric

* Use Prometheus best practices for uptime metric.
  * Use "start time" rather than "uptime".
  * Don't emit a start time if the process is down.
* Add changelog entry.
* Add example compatibility rules.

Signed-off-by: Ben Kochie <superq@gmail.com>
2018-08-06 16:54:46 +02:00
Julius Volz 2c52b8c761
systemd: Remove unneeded/unhandled error returns (#1035)
Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-08-05 16:55:25 +02:00
Christian Hoffmann 6bdc5558ec build: make staticcheck happy by using real regexp patterns #1025 (#1026)
Signed-off-by: Christian Hoffmann <mail@hoffmann-christian.info>
2018-07-30 07:57:18 +02:00
Hannes Körber 14a4f0028e Enable nfs protocol (#998)
* vendor: Update prometheus/procfs

Signed-off-by: Hannes Körber <hannes.koerber@haktec.de>

* mountstats: Use new NFS protocol field

In https://github.com/prometheus/procfs/pull/100, the NFSTransportStats
struct was expanded by a field called protocol that specifies the NFS
protocol in use, either "tcp" or "udp". This commit adds the protocol as
a label to all NFS metrics exported via the mountstats collector.

Signed-off-by: Hannes Körber <hannes.koerber@haktec.de>

* Update fixtures for UDP mount

Signed-off-by: Hannes Körber <hannes.koerber@haktec.de>
2018-07-24 00:47:12 +02:00
Johannes Wienke 5c780d132c Exclude only subdirectories of /var/lib/docker (#1003)
It is quite common to put /var/lib/docker itself on a separate partition
and that should be monitored as well.

Signed-off-by: Johannes Wienke <languitar@semipol.de>
2018-07-23 15:43:42 +02:00
Ben Kochie 23f95c8e04
Fix ntp collector thread safety (#1014)
Make the ntp collector thread safe by wrapping a mutex lock around the
leapMidnight variable.

Signed-off-by: Ben Kochie <superq@gmail.com>
2018-07-22 14:36:33 +02:00
xginn8 140b8b85c3 Filter out uninstalled systemd units when collecting all units (#1011)
fixes #567

Signed-off-by: Matthew McGinn <mamcgi@gmail.com>
2018-07-22 09:20:03 +02:00
Sven Lange 2ae8c1c7a7 Add systemd uptime metric collection (#952)
* Add systemd uptime metric collection

Signed-off-by: Sven Lange <tdl@hadiko.de>
2018-07-18 16:02:05 +02:00
neiledgar 7e4d9bd150 Update wifi stats to support multiple stations (#977) (#980)
Signed-off-by: neiledgar <neil.edgar@btinternet.com>
2018-07-16 16:02:25 +02:00
xginn8 9b97f44a70 Add a counter for refused socket unit connections, available as of systemd 239 (#995)
Signed-off-by: xginn8 <mamcgi@gmail.com>
2018-07-16 16:01:42 +02:00
Brandon Gilmore 76bbd8dd18 Use /proc/mounts instead of statfs(2) for ro state (#1002)
While the statfs(2) approach is reliable for normally mounted filesystems, the
flags returned can be inconsistent when filesystem has been remounted read-only
after encountering an error. The returned flags do accurately represent the
internal state of the filesystem, but they do not reflect whether the VFS layer
will accept writes. Instead, it makes sense to parse the current VFS mount
state from the options field in /proc/mounts since it takes precedence.

Signed-off-by: Brandon Gilmore <bgilmore@valvesoftware.com>
2018-07-16 15:56:27 +02:00
Jan Klat c4102f1175 Add sys/class/net parsing from procfs and expose its metrics (#851)
* add sys/class/net parsing from procfs and expose its metrics

Signed-off-by: Jan Klat <jenik@klatys.cz>

* change code to use int pointers per procfs change, move netclass to separate collector, change metric naming

Signed-off-by: Jan Klat <jenik@klatys.cz>

* bump year in licence, remove redundant newline, correct fixtures

Signed-off-by: Jan Klat <jenik@klatys.cz>

* fix style

Signed-off-by: Jan Klat <jenik@klatys.cz>

* change carrier changes to counter type

Signed-off-by: Jan Klat <jenik@klatys.cz>

* fix e2e output

Signed-off-by: Jan Klat <jenik@klatys.cz>

* add fixtures

Signed-off-by: Jan Klat <jenik@klatys.cz>

* update vendor, use fixtures correctly

Signed-off-by: Jan Klat <jenik@klatys.cz>

* change fixtures (device in /sys/class/net should be symlinked)

Signed-off-by: Jan Klat <jenik@klatys.cz>

* correct fixtures for 64k page, updated readme

Signed-off-by: Jan Klat <jenik@klatys.cz>
2018-07-16 15:08:18 +02:00
mknapphrt 09b4305090 Changed the way that stuck mounts are handled. If a mount fails to return, it will stop being queried until it returns. (#997)
Fixed spelling mistakes.

Update transport_generic.go

Changed to a mutex approach instead of channels and added a timeout before declaring a mount stuck.

Removed unnecessary lock channel and clarified some var names.

Fixed style nits.

Signed-off-by: Mark Knapp <mknapp@hudson-trading.com>
2018-07-14 11:10:28 +02:00
xginn8 ac5a981761 Adding socket stat collection for systemd socket units (#968)
Signed-off-by: xginn8 <mamcgi@gmail.com>
2018-07-05 16:26:48 +02:00
xginn8 8af84a215d Add support for NRestarts counter introduced in systemd 235 (#992)
* Add support for NRestarts counter introduced in systemd 235

`.service` units increment this counter any time the Restart= condition is
triggered.

Signed-off-by: Matthew McGinn <mamcgi@gmail.com>
2018-07-05 13:31:45 +02:00
Ben Kochie 107e5dfecc
Fix mdadm collector issues (#985)
* Send "Personality unknown" to debug, not info, remove unnecessary newline.
* Add support for "linear" personality.
* Always set number of active disks to 0 when a device is inactive.
* Add total disks calculation to unknown personalites.

Signed-off-by: Ben Kochie <superq@gmail.com>
2018-07-02 12:38:20 +02:00
Derek Marcotte 2678d68dcc Fix for #945, cpu temperature is signed. (#965)
* Fix for #945, cpu temperature is signed.

Added a type conversion to cpu temperature sysctl.  Will still
collect/report -1 when the value is -1, this is because it should be up
to interpretation whether this is the correct value for the system or
not.

Some drivers will report -1 for cpu temperature.  Other sensors will
report "an input into the fan control algorithm", i.e. not the actual
temperature, but how much fan it wants.  Some people cool their machines
with liquid nitrogen.

Signed-off-by: Derek Marcotte <554b8425@razorfever.net>
2018-06-07 15:01:25 +02:00
Brad Beam e3cf1d5187 Adding support for evaluating octal characters in mountpoint (#954)
Signed-off-by: Brad Beam <brad.beam@b-rad.info>
2018-06-06 16:49:19 +02:00
Pavlo Kutishchev 456bf5094a Add processes exporter (#950)
* Add processes exporter

Signed-off-by: Pavel Kutishchev <pavel.kutishchev@olx.com>
Signed-off-by: Ben Kochie <superq@gmail.com>
2018-06-05 19:38:32 +02:00
Alexey Kopytov dd98a09bb2 A couple of ARM64-related fixes (#934)
* Do not rely on AArch64 CPUs to support 32-bit ARM for cross-testing.

Signed-off-by: Alexey Kopytov <akopytov@gmail.com>

* aarch64 like ppc64le reports 64k node_sockstat_TCP_mem_bytes due to 64k pages.

Signed-off-by: Alexey Kopytov <akopytov@gmail.com>
2018-05-14 15:55:49 +02:00
Steve Kotsopoulos 84dc362b05 Align Darwin disk stat names with Linux (#930)
Signed-off-by: Steve Kotsopoulos <sk@fywss.com>
2018-05-02 11:32:55 +02:00
Mario Trangoni 24a28fcc9e Remove unused func, var, and const (#928)
Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>
2018-04-29 14:35:43 +02:00
Mario Trangoni c9f421d0dd Fix some golint issues (#927)
* collector/cpu_*: rename nodeCpuSecondsDesc to nodeCPUSecondsDesc

Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>

* collector/qdisc_linux.go: add NewQdiscStatCollector comment

Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>

* collector/cpu_linux.go: rename core_map to coreMap

Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>
2018-04-29 14:34:47 +02:00
Ben Kochie 361b5bf85d
Merge pull request #852 from prometheus/remove-gmond
Remove gmond collector
2018-04-27 10:02:16 +02:00
Ben Kochie b10ca77680
Fix /proc/net/dev/ interface name handling
* Allow any character (UTF-8) for Linux interface names.

Signed-off-by: Ben Kochie <superq@gmail.com>
2018-04-18 12:53:59 +02:00
Ben Kochie 1ab4a460c7 Update ppc64le end-to-end fixture.
Signed-off-by: Ben Kochie <superq@gmail.com>
2018-04-18 09:12:21 +02:00
Johannes 'fish' Ziemke fd66a86a30 Remove gmond collector
Signed-off-by: Johannes 'fish' Ziemke <github@freigeist.org>
2018-04-17 20:20:24 +02:00
Ben Kochie 0f5be132ac
Merge pull request #904 from prometheus/superq/if_alias
Fix parsing of interface aliases in netdev linux
2018-04-17 13:37:21 +02:00
Ben Kochie a528966dcd Fix parsing of interface aliases in netdev linux
Very old kernels expose interface aliases as `foo0:0`, adjust the line
parsing to handle these names.

Signed-off-by: Ben Kochie <superq@gmail.com>
2018-04-17 13:15:02 +02:00
Ben Kochie f6008b242b
Merge pull request #901 from mischief/bsd_boottime
collector: implement node_boot_time_seconds for OpenBSD/NetBSD/Darwin
2018-04-17 07:48:39 +02:00
Jürgen Hötzel de0632c2e9 Fix memory corruption when number of filesystems > 16 (#900)
Signed-off-by: Juergen Hoetzel <juergen@archlinux.org>
2018-04-16 12:39:15 +02:00
mischief 26a385d7ab collector: implement node_boot_time_seconds for OpenBSD/NetBSD/Darwin
Signed-off-by: mischief <mischief@offblast.org>
2018-04-15 08:26:46 +00:00
Ben Kochie 015b86670a
Update ppc64le e2e output.
Signed-off-by: Ben Kochie <superq@gmail.com>
2018-04-14 15:28:06 +02:00
Ben Kochie 0507b0c9a2
Fix formatting.
Signed-off-by: Ben Kochie <superq@gmail.com>
2018-04-14 15:02:20 +02:00
Dmitriy Lukyanchikov eddd1b9357 Fix netdev collector for linux (#890)
fix variable name, fix transmitHeader extracting
modify fixtures to run tests with updated netdev_linux collector

Signed-off-by: dmitriy-lukyanchikov <d.lukyanchikov@anchorfree.com>
2018-04-14 13:58:56 +02:00