Commit graph

379 commits

Author SHA1 Message Date
Joe Handzik a3125ab4d9 ZFS Collector: Add zfetchstats functionality
Signed-Off-By: Joe Handzik <joseph.t.handzik@hpe.com>
2017-01-23 16:28:11 -06:00
Ben Kochie acb495ccab Merge pull request #425 from mdlayher/wifi-update
Update vendored wifi, handle stations with missing info
2017-01-20 08:43:44 -05:00
Matt Layher dfd661a633 Allow graceful failure in hwmon collector 2017-01-17 11:24:28 -05:00
Matt Layher ca3f07feef Update vendored wifi, handle stations with missing info 2017-01-17 00:54:18 -05:00
Ben Kochie 92537020a3 Fix runit collector flag typo. 2017-01-16 23:41:33 +01:00
Julius Volz 276112c7ef Merge pull request #418 from mdlayher/wifi-graceful-fail
Make wifi collector fail gracefully if metrics not available
2017-01-13 20:31:21 -05:00
Matt Layher d3089f2ce8 Make wifi collector fail gracefully if metrics not available 2017-01-13 13:35:20 -05:00
Matt Layher 1e1775e761 Make ZFS collector fail gracefully when not available 2017-01-12 12:54:16 -05:00
Johannes 'fish' Ziemke 2884181cce Merge pull request #415 from mdlayher/mountstats-nfs-additional
Add NFS event metrics to mountstats collector
2017-01-12 14:08:21 +01:00
Matt Layher e3f99e13b9 Add NFS event metrics to mountstats collector 2017-01-11 11:41:13 -05:00
Matt Layher efa25665ec Add initial wifi collector, bump netlink to fix 32-bit builds 2017-01-11 10:08:44 -05:00
Johannes 'fish' Ziemke 55170e8feb Merge pull request #411 from discordianfish/hwmon-move-label-metrics
Use filename as label, move 'label' to own metric
2017-01-10 12:21:18 +01:00
Ben Kochie 38a4a36061 Update end-to-end test. 2017-01-10 10:23:16 +01:00
Ben Kochie b4fa10ca9d Add collector for Linux EDAC
Collect "Error detection and correction" metrics from memory
controllers.
* Supported on Linux only.
* Add basic fixtures.
* Enabled by default.
2017-01-10 10:14:19 +01:00
Johannes 'fish' Ziemke 6aef20f8d8 Use filename as label, move 'label' to own metric
This closes #406
2017-01-09 18:33:31 +01:00
Joe Handzik e7442d6517 end-to-end-test.sh: Add zfs plugin
Enables fixture test and updates e2e-output.txt.

Signed-Off-By: Joe Handzik <joseph.t.handzik@hpe.com>
2017-01-08 11:13:35 -06:00
Corey Stewart 10ba27bf2c Remove FreeBSD support for zfs plugin.
This also involves removing zfs_zpool code for now.

Signed-Off-By: Corey Stewart <stewa169@purdue.edu>
Signed-Off-By: Joe Handzik <joseph.t.handzik@hpe.com>
2017-01-08 11:13:35 -06:00
Corey Stewart a8c94d48e6 Style changes and cleanup
This patch makes stylistic changes to error strings, unexports method names by lower casing them, removes unused dataSetMetric, and adds copyright/licence information.

Signed-Off-By: Corey Stewart <stewa169@purdue.edu>
2017-01-08 10:23:58 -06:00
Christian Schwarz f29f3873ea Add a collector for ZFS, currently focussed on ARC stats.
It is tested on FreeBSD 10.2-RELEASE and Linux (ZFS on Linux 0.6.5.4).

On FreeBSD, Solaris, etc. ZFS metrics are exposed through sysctls.
ZFS on Linux exposes the same metrics through procfs `/proc/spl/...`.

In addition to sysctl metrics, 'computed metrics' are exposed by
the collector, which are based on several sysctl values.
There is some conditional logic involved in computing these metrics
which cannot be easily mapped to PromQL.

Not all 92 ARC sysctls are exposed right now but this can be changed
with one additional LOC each.
2017-01-08 10:23:58 -06:00
Johannes 'fish' Ziemke 2e47fcb8c5 Only store relevant e2e output
This makes commits lighter/more readable when updating the output.
2017-01-06 12:36:26 +01:00
Johannes 'fish' Ziemke ad2eb4a788 Use Gauge for megacli counters
Without refactoring this to use const metrics, we need to make this a
gauge so we can keep using Set(), which was deprecated for counters.
2017-01-06 12:33:21 +01:00
Johannes 'fish' Ziemke 01a9a37556 Stop using deprecated SetMetricFamilyInjectionHook 2017-01-06 12:21:12 +01:00
Johannes 'fish' Ziemke 3e266e28b9 Merge pull request #397 from dominikh/freebsd-cpu
Collect CPU temperatures on FreeBSD
2017-01-05 17:32:48 +01:00
Johannes 'fish' Ziemke fc1113cd11 Merge pull request #396 from dominikh/bsd-memleak
Don't leak or race in FreeBSD devstat collector
2017-01-05 17:31:57 +01:00
Dominik Honnef d827db8e17 Better error handling when collecting CPU temps
Log why we couldn't collect the temperature, and set metric to NaN if
the CPU should support temperature collection but had an error.
2017-01-05 15:19:56 +01:00
Johannes 'fish' Ziemke 91f4781234 Merge pull request #311 from kpettijohn/solaris-loadavg
Added loadavg collector for Solaris
2017-01-05 11:49:16 +01:00
Dominik Honnef 9847257bc0 Add missing license headers 2017-01-05 06:18:34 +01:00
Dominik Honnef 782eaee100 Collect CPU temperatures on FreeBSD 2017-01-05 06:17:16 +01:00
Dominik Honnef 38c5890428 Reuse devinfo struct
The devstat API expects us to reuse one devinfo for many invocations of
devstat_getstats. In particular, it allocates and resizes memory
referenced by devinfo.
2017-01-05 05:38:26 +01:00
Dominik Honnef ea55d0f5cb Don't race in FreeBSD devstat collector
Querying the number of devices separately from the device list itself is
racy. Devices may be added or removed between the two calls; and removed
devices would lead to a segfault.
2017-01-05 05:38:26 +01:00
Dominik Honnef 5e220c1665 Move cgo portions of FreeBSD devstat collector into own file
Embedding 100 lines of code in a comment doesn't make for good reading,
editing or code quality.
2017-01-05 05:38:26 +01:00
Dominik Honnef 20ca0f1376 Eliminate memory leak in FreeBSD devstat collector
The memory allocated by calloc was never freed. Since the devinfo struct
never leaves the function, anyway, we might as well just allocate it on
the stack.
2017-01-05 05:38:26 +01:00
Dominik Honnef 732dd67729 Fix build of cpu_freebsd.go
Corrects an incorrect merge in 8e50b80
2017-01-05 03:16:51 +01:00
Kevin Pettijohn d2fbeeb3c3 Added loadavg collector for solaris
It seems solaris prefers "sys/loadavg.h" over "stdlib.h" when
fetching the load average.

For Illumos based OSes it was required to include "sys/time.h" to
ensure that "hrtime_t" was defined.

https://www.illumos.org/issues/6002

It also required setting the ldflags "-fno-stack-protector -lssp" to
avoid undefined symbols when linking with gcc.

/opt/local/go/pkg/tool/solaris_amd64/link: running gcc failed: exit status 1
Undefined                       first referenced
 symbol                             in file
 __stack_chk_fail                    /tmp/go-link-138622936/000002.o
 __stack_chk_guard                   /tmp/go-link-138622936/000002.o
2017-01-04 17:45:40 -08:00
Johannes 'fish' Ziemke f9d3f830cb Merge pull request #399 from discordianfish/fish-fs-uniq-metric
Make sure we only return one metric per mounted fs
2017-01-04 16:48:04 +01:00
Johannes 'fish' Ziemke 4c9131b7d8 Make sure we only return one metric per mounted fs 2017-01-04 16:45:25 +01:00
Johannes 'fish' Ziemke 6dd39b15c2 Do not build meminfo on freebsd 2017-01-04 16:02:49 +01:00
Johannes 'fish' Ziemke a97ff2bcda Do not build meminfo on windows 2017-01-04 15:16:13 +01:00
Johannes 'fish' Ziemke d17b1b44a6 Merge pull request #398 from prometheus/fish-netdev-check-scan-errror
Check for errors in netdev scanner
2017-01-03 16:00:08 +01:00
Johannes 'fish' Ziemke 9969f93e7d Merge pull request #387 from discordianfish/fish-fix-meminfo-darwin
Refactor meminfo and add darwin metrics
2017-01-03 14:50:52 +01:00
Johannes 'fish' Ziemke 6576571ac8 Check for errors in netdev scanner 2017-01-03 14:48:52 +01:00
Johannes 'fish' Ziemke 26c6182c84 Move comment and remove superfluous newline 2017-01-03 14:41:05 +01:00
Johannes 'fish' Ziemke b68a9ec7af Merge pull request #359 from CloudAndHeat/feature/hwmon_chip_name_metric
hwmon: Provide annotation metric to link chip sysfs paths to human-readable chip types
2017-01-03 14:38:43 +01:00
Johannes 'fish' Ziemke 4e696d5d31 Merge pull request #391 from discordianfish/fish-add-cpu-darwin
Add cpu collector for darwin
2017-01-03 14:23:50 +01:00
Johannes 'fish' Ziemke 079fd701a0 Merge pull request #389 from prometheus/fish-use-const-metrics
Convert remaining collectors to use ConstMetrics
2017-01-03 14:22:58 +01:00
Johannes 'fish' Ziemke d2ca252457 Merge pull request #393 from discordianfish/fish-add-netdev-darwin
Add netdev collector for darwin
2017-01-03 14:12:36 +01:00
Johannes 'fish' Ziemke 8e50b80d12 Convert remaining collectors to use ConstMetrics 2017-01-03 14:11:10 +01:00
Johannes 'fish' Ziemke 3db2f442ae Limit node-exporter scope, deprecated collectors 2017-01-03 14:03:23 +01:00
Johannes 'fish' Ziemke c21c59dfeb Add meminfo stats for Darwin 2017-01-03 11:22:46 +01:00
Johannes 'fish' Ziemke 2983c4a31d Refactor meminfo collector similar to filesystem
Instead of doing the whole metric exposition in a platform-specific collector
implementation, this creates and updates the metrics in meminfo.go and
expects a platform-specific implementation of getMemInfo on
*meminfoCollector.
2017-01-03 11:20:36 +01:00
Johannes 'fish' Ziemke 3c47ef8e60 Add netdev collector for darwin
Same as for openbsd, this is just slightly adjusted from the freebsd
variant.
2016-12-29 19:17:15 +01:00
Dominik Honnef f0adcd163d Implement CPU collector on FreeBSD without cgo 2016-12-29 04:29:52 +01:00
Dominik Honnef d2a43f7d05 Implement meminfo on BSD without cgo
This removes some error handling, which should be fine. If the calls
fail, we will get zeroes, which is a safe enough fallback.
Additionally, if the first sysctl (page_size) succeeded it is unlikely
that other ones will fail.
2016-12-29 02:19:21 +01:00
Johannes 'fish' Ziemke 050d6f7f13 Add cpu collector for darwin 2016-12-28 18:38:52 +01:00
Dominik Honnef 0f6191987e Implement file systems on FreeBSD without cgo
The code may also work for other BSDs, but I don't have access to those
for testing.
2016-12-26 23:06:17 +01:00
Dominik Honnef 54c74923ee Implement loadavg on FreeBSD without cgo
The code may also work for other BSDs, but I don't have access to those
for testing.
2016-12-26 23:06:05 +01:00
Ben Kochie 10e525ff02 Merge pull request #375 from prometheus/fish-add-runit-servicedir-flag
Add runit service dir flag
2016-12-26 13:01:51 +01:00
Johannes 'fish' Ziemke d506b2266c Merge pull request #374 from prometheus/fish-add-filesystem-errors
Add node_filesystem_device_errors_total metric
2016-12-26 11:51:14 +01:00
Bjørn Forsman 64e637cbcc Ignore autofs filesystems on linux
node_exporter currently triggers autofs to mount the underlying
filesystem on every scrape. This is undesirable. Better ignore autofs.

The underlying filesystem that autofs mounts will be monitored though,
when the (real) filesystem is mounted.
2016-12-25 15:13:45 +01:00
Johannes 'fish' Ziemke 71ea37987f Merge pull request #365 from EdSchouten/drbd
A collector for DRBD
2016-12-25 11:04:43 +01:00
Ed Schouten b0d15eaac6 Reduce the severity of these messages.
They get printed all the time, as there are some tokens in the /proc
file that we simply don't support. It's better to keep these as
debugging messages, which may come in useful if new tags start to
appear.
2016-12-23 15:57:46 +01:00
Ed Schouten 4adf7fa96c Improve the help strings, as proposed in the code review. 2016-12-23 15:55:49 +01:00
Ed Schouten b7daf27678 Process feedback from the code review.
- Use the right number of printf() arguments. Use %q where it makes sense.
- Use "DRBD" instead of "Drbd", per Go's style guide.
- Add _total suffixes to counter metrics.
- Mention the unit (bytes) in documentation strings once more.
2016-12-22 13:57:19 +01:00
Björn Rabenstein 08c9347e88 Merge pull request #367 from mdlayher/mountstats
Add mountstats collector for detailed NFS statistics
2016-12-20 17:20:41 +01:00
Matt Layher 25a93e38e7 Add mountstats collector for detailed NFS statistics 2016-12-20 11:13:02 -05:00
Johannes 'fish' Ziemke 9039a425d0 Add runit service dir flag 2016-12-19 13:10:38 +01:00
Johannes 'fish' Ziemke deebf0aa49 Add node_filesystem_device_errors_total metric
This metric is the total number of errors that occurred when getting stats
for the given device.
2016-12-19 11:48:32 +01:00
Ed Schouten d1fa279105 Use a descriptive name for the file descriptor. 2016-12-16 11:45:14 +01:00
Ben Kochie 677ed28575 Merge pull request #361 from lucasbergman/mips-build-fix
mips64 build fix
2016-12-16 11:39:53 +01:00
Ed Schouten 6ff620e387 Properly propagate parse errors. 2016-12-16 11:36:36 +01:00
Ed Schouten 6269f7502a Add a collector for DRBD.
This collector exposes most of the useful information that can be found
in /proc/drbd. Sizes are normalised to be in bytes, as /proc/drbd uses
kibibytes.
2016-12-11 11:55:28 +01:00
Ed Schouten a696830c38 Add a collector for NFS client statistics.
This change adds a new collector called "nfs" that parses the contents
of /proc/net/rpc/nfs and turns it into metrics. It can be used to
inspect the number of operations per type, but also to keep an eye on an
excessive number of retransmissions, which may indicate connectivity
issues.

I've picked the name "nfs", as most operating systems use "nfs" for the
client component and "nfsd" as the server component. If we want to add
stats for the NFS server as well, we'd better call such a collector
"nfsd".
2016-12-09 19:58:08 +01:00
Jonas Wielicki 3efaa1a6a8 Update end-to-end tests 2016-12-01 10:00:50 +01:00
Jonas Wielicki c481dd19da Re-introduce human-readable chip types
The chip label generation has been changed in #334 to prefer the
unique device path (e.g. the location on the PCI bus) due to #333.

Here, a new annotation metric ``node_hwmon_chip_names`` is
introduced, which links the unique chip sysfs path to a
human-readable chip name that may not be unique among chip sysfs
paths (for example, dual-slot systems have multiple
chipType="coretemp" sensors).

This mitigates the downsides of the solution to #333
(namely that the device path may not be stable across kernels and
reboots) for cases where it does not matter that multiple devices
may have the same human-readable name (e.g. aggregation or where
at most one device with a common chip name is present).

For cases where no human-readable name can be derived, the
annotation metric is not emitted.
2016-12-01 09:59:52 +01:00
Lucas Bergman 4f479e55e0 linux/mips: Unbreak the build
Specifically, uname syscall support on Linux is controlled by a build
tag white list, and both mips64 platforms were missing from the list.
2016-11-30 13:13:49 -06:00
Ben Kochie f8af350ae2 Merge pull request #346 from mcdan/people/mcdan/issues/219
Fix additional mdadm parsing cases
2016-11-17 21:13:38 +01:00
dan mcweeney 13aa37025f Feedback on PR, thanks @tcolgate for the review 2016-11-17 10:23:01 -05:00
Ben Kochie 4fd03c31e4 Merge pull request #323 from stuartnelson3/dfly-devstat
Dragonfly devstat
2016-11-17 13:33:50 +01:00
Ben Kochie 7a9aad01b4 Merge pull request #310 from stuartnelson3/dfly-cpu
export DragonFlyBSD CPU time
2016-11-17 13:33:11 +01:00
stuart nelson e589a2b8af Remove gauges and convert to NewConstMetric format 2016-11-17 13:23:54 +01:00
stuart nelson 2b74cf7498 Export devstat for dragonfly 2016-11-17 13:23:54 +01:00
dan mcweeney 1f6b5aee39 #219 - add fixes for @samzhang111 super token 2016-11-16 14:49:57 -05:00
dan mcweeney 8d756cab50 Fixes end to end test 2016-11-16 14:47:03 -05:00
dan mcweeney 00c9a88a55 Fixes #219 - use the default to catch personalities that are unknown
Assumes all raid configurations start with raid and that anything
else is unknown.
2016-11-16 14:47:03 -05:00
Ed Schouten 9749c2c0b3 mdstat: Fix parsing of RAID0 lines that contain additional attributes.
We seem to have a small number of Linux servers here that have lines in
/proc/mdstat that cannot be parsed by the node exporter, due to them
containing attributes that are not matched by the regular expression
("super 1.2").

Extend the regular expression to skip this data, just like we do for all
of the other status lines.
2016-11-16 17:21:25 +01:00
Rene Treffer abe8e297a6 Prefer device path based names over exported names (#334)
* Prefer device path based names over exported names

For some sensors (like coretemp) it is possible that multiple
instances exist, thus base the name on the device path and not on
the exported name.

* Update end-to-end test for dual socket machines

Explicitly have 2 coretemp instances with a symlink for the device
such that the hwmon collector must pick that name (or fail)
2016-10-28 20:25:44 +01:00
Ben Kochie c6162312f2 Add Linux NUMA "numastat" metrics (#249)
* Add Linux NUMA "numastat" metrics
  Read the `numastat` metrics from /sys/devices/system/node/node* when reading NUMA meminfo metrics.
* Update end-to-end test output.
* Add `numastat` metrics as counters.
* Add tests for error conditions.
* Refactor meminfo numa metrics struct
* Refactor meminfoKey into a simple struct of metric data.
  This makes it easier to pass slices of metrics around.
* Refactor tests.
* Fixup: Add suggested fixes.
* Fixup: More fixes
* Add another scanner.Err() return
* Add "_total" to counter metrics.
2016-10-12 13:07:49 +02:00
Rene Treffer 081ecc5db0 Add hwmon /sensors support (#278)
* Add hwmon support (mainly known from lm-sensors)

This commit adds initial support for linux hardware sensors, exported
through sysfs.

Details of the interface can be found at
https://www.kernel.org/doc/Documentation/hwmon/sysfs-interface

* Add end-to-end test with some real life data

* Cleanup comments on hwmon collector

* Drop raw sensor name from hwmon output

* Let the sensor label be "sensor"

* Add hwmon short description to README.
2016-10-06 16:33:24 +01:00
stuart nelson 450fe0f3ba Add test 2016-09-28 09:10:05 +02:00
stuart nelson cf3710191a Compile meminfo for dfly (#315)
* Compile meminfo for dfly

* Update README.md
2016-09-28 08:08:19 +01:00
stuart nelson ef1925db7d Compile netdev on dragonfly (#314)
* Compile netdev on dragonfly

* Only run netdev bsd test on bsd

* Update README.md
2016-09-27 21:44:13 +01:00
stuart nelson ee37a27d91 Export values as uint64_t 2016-09-20 23:27:56 +02:00
stuart nelson e942d7e234 Maintain granularity in cpu data
Export cpu mode times as original uint64_t data,
and update frequency, and do the conversion to
float64 and subsequent division in go.
2016-09-20 09:10:53 +02:00
Ben Kochie afac1f7433 Update mdstat fixture based on linux source.
Update `Contains` matching for `resync=`
2016-09-19 16:11:16 +02:00
stuart nelson 57f88ac4f6 Update comment 2016-09-19 09:48:53 +02:00
stuart nelson 78c84b1a47 Remove old freq finding code
This is the code that was lifted from the freebsd
implementation, but was not correct.
2016-09-19 09:48:34 +02:00
stuart nelson 45ac033d9e Use correct frequency for calculating cpu time
The correct frequency is the systimer frequency,
not the stathz.

From one of the DragonFly developers:

The bump upon each statclock is:
((cur_systimer - prev_systimer) * systimer_freq) >> 32

systimer_freq can be extracted from following
sysctl in userspace:
sysctl kern.cputimer.freq
2016-09-19 09:35:41 +02:00
stuart nelson 8cc06aab04 Remove unneeded ncpu variable 2016-09-18 17:36:39 +02:00
stuart nelson 9f7822ccdc Remember to bzero string
Duplication was caused by malloc returning a
region of memory that already had data in it.
2016-09-18 16:17:49 +02:00
stuart nelson c02dcdeb35 Remove unused comment. 2016-09-18 14:21:54 +02:00