1192 commits

Author SHA1 Message Date
Ben Kochie 01bd99fb1a
Refactor NFS client collector (#816)
* Update vendor github.com/prometheus/procfs/...

* Refactor NFS collector

Use new procfs library to parse NFS client stats.

* Ignore nfs proc file not existing.

* Refactor with reflection to walk the structs.
2018-02-15 13:40:38 +01:00
Brian Brazil 52c031890e
Add _seconds suffix to node_time. (#823) 2018-02-14 16:59:08 +00:00
Ben Kochie 05eabe60fb
Fix error output in nfsd collector. (#821) 2018-02-14 13:57:35 +01:00
Matt Layher dcb31670d6 Makefile: add checkmetrics target, use in CI (#797) 2018-02-13 18:04:03 +01:00
Ben Kochie 3de2542d21
Fix NFSd metric type (#819)
RPC Count should be a counter, not a gauge.
2018-02-13 17:03:22 +01:00
Matt Layher 544488ddd6 Fix remaining metric naming issues (#799) 2018-02-12 18:53:31 +01:00
Ben Kochie 6a041692ed
Add NFS Server metrics collector. (#803)
* Add NFS Server metrics collector.

* Add File Handles metrics.

* Add nfsd IO stats.

* Add metrics for NFSd threads.

* Add metrics for NFSd read ahead cache.

* Add NFSd network traffic counters.

* Add RPC metrics.

* Add V2 requests metrics.

* Add NFSv3 metrics.

* Add NFSv4 metrics.

* Update reply cache comment.

* Update help text.
2018-02-12 17:56:05 +01:00
Tobias Schmidt 9a5bd5f8e4
Merge pull request #815 from prometheus/debug-log
Fix log level regression in #533
2018-02-07 16:33:14 +01:00
Brian Brazil 1072f2868d Fix log level regression in #533 2018-02-07 15:16:20 +00:00
Brian Brazil 7e41a2b279 Ignore /var/lib/docker by default. (#814)
The node exporter runs unprivileged, so it cannot statfs any filesystems
under this directory causing log spam.  In addition there tends to be
high churn in the filesystems here (as it's basically application
monitoring) which can cause high cardinality and in one case caused
Prometheus's index symbol table to get very large.
Accordingly this should be ignored to reduce log spam and avoid
performance issues. The filesystems themselves can in principle be
monitored via container oriented exporters, and the underlying
filesystems will still be monitored.
2018-02-06 17:10:59 +01:00
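The exclusion described in the commit above can be sketched as a regular-expression filter over mount points. This is only an illustration: the pattern and the `ignored` helper are assumptions made for this example, not the exporter's actual default value or code.

```go
package main

import (
	"fmt"
	"regexp"
)

// ignoredMountPoints is a hypothetical pattern chosen for illustration;
// the exporter's real default may differ.
var ignoredMountPoints = regexp.MustCompile(`^/(dev|proc|sys|var/lib/docker)($|/)`)

// ignored reports whether a mount point matches the ignore pattern.
func ignored(mountPoint string) bool {
	return ignoredMountPoints.MatchString(mountPoint)
}

func main() {
	mountPoints := []string{
		"/",
		"/var/lib/docker/overlay2/abc/merged",
		"/var/lib/dockerdata",
	}
	for _, mp := range mountPoints {
		fmt.Printf("%-40s ignored=%v\n", mp, ignored(mp))
	}
}
```

The trailing `($|/)` group keeps the match anchored to whole path components, so a sibling directory that merely shares the prefix is still monitored.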
tobald 2978728b00 Fix apt.sh syntax (#811)
This patch fixes:

./apt.test: command substitution: line 19: syntax error near unexpected token `|'
./apt.test: command substitution: line 19: `  | /usr/bin/sort   | /usr/bin/uniq -c   | awk '{ gsub(/\\\\/,
2018-02-05 20:43:25 +01:00
Ralf Horstmann 29ac809e48 Use unified CPU metric description on OpenBSD (#810) 2018-02-01 23:59:19 +01:00
Derek Marcotte fde5d2c6c9 Remove unsafe typecasts from sysctl_bsd getStructTimeval. (#741)
There is a simpler way.
2018-02-01 18:43:40 +01:00
Ben Kochie 14d60958d6
Unify CPU collector conventions (#806)
* Unify CPU collector conventions

Add a common CPU metric description.
* All collectors use the same `nodeCpuSecondsDesc`.
* All collectors drop the `cpu` prefix for `cpu` label values.

* Fix subsystem string in cpu_freebsd.

* Fix Linux CPU freq label names.
2018-02-01 18:42:20 +01:00
Ralf Horstmann e3c76b1f0c Add OpenBSD CPU collector (#805) 2018-02-01 18:33:49 +01:00
Tom Wilkie 05d14ef9ee
Merge pull request #807 from tomwilkie/systemd-timers
Export systemd timers last trigger seconds.
2018-02-01 13:05:56 +00:00
Tom Wilkie 6833eec187 Fix tests. 2018-01-31 15:22:17 +00:00
Tom Wilkie 0316bacceb Only use one dbus connection, required some refactoring. 2018-01-31 15:19:18 +00:00
Tom Wilkie a7fd6b8743 Export systemd timer last trigger sec. 2018-01-31 15:07:04 +00:00
Ben Kochie f9e91156d0
Update vendoring (#801)
* Update vendor github.com/godbus/dbus@v4.1.0

* Update vendor github.com/golang/protobuf/proto

* Update vendor github.com/mdlayher/netlink/...

* Update vendor github.com/prometheus/client_golang/prometheus/...

* Update vendor github.com/prometheus/client_model/go

* Update vendor github.com/prometheus/common/...

* Update vendor github.com/prometheus/procfs/...

* Update vendor github.com/sirupsen/logrus@v1.0.4

* Update vendor golang.org/x/...

* Update vendor gopkg.in/alecthomas/kingpin.v2

* Remove obsolete vendor github.com/mdlayher/netlink/genetlink
2018-01-25 18:20:39 +01:00
Shevchenko Vitaliy 4ed49e73fb Escape double quotes in device model family (#772) 2018-01-24 11:35:14 +01:00
Ben Kochie 111e3af437
Remove obsolete megacli collector. (#798)
This collector has been replaced by the textfile collector tool
`storcli.py`.
2018-01-23 11:25:42 +01:00
Ben Kochie 1ad5ba4dc7
Fix smartmon.sh bugs (#792)
* Fix smartmon.sh info label consistency.

* Fix parsing of SMART-ID attributes <= 99.
2018-01-22 16:51:20 +01:00
Julius Volz 6cac74f0e0
Add unit suffix to textfile collector mtime metric (#796) 2018-01-22 14:02:19 +01:00
Brian Brazil a98067a294 Make metrics better follow guidelines (#787)
* Improve stat linux metric names.

cpu is no longer used.

* node_cpu -> node_cpu_seconds_total for Linux

* Improve filesystem metric names with units

* Improve units and names of linux disk stats

Remove sector metrics, the bytes metrics cover those already.

* Infiniband counters should end in _total

* Improve timex metric names, convert to more normal units.

See
3c073991eb/kernel/time/ntp.c (L909)
for what stabil means, looks like a moving average of some form.

* Update test fixture

* For meminfo metrics that had "kB" units, add _bytes

* Interrupts counter should have _total
2018-01-17 17:55:55 +01:00
Ben Kochie b4d7ba119a
Add fixture for ppc64le (#785)
* Add support for per-architecture fixtures.
* Add output for ppc64le.
2018-01-11 13:56:19 +01:00
Ben Kochie bc38ffc538
Update collect[] param documentation (#784)
Improve recommendations and wording around advanced use of the collect[]
param.

Remove example that causes users to copy-and-paste it.
2018-01-10 15:16:33 +01:00
Bruce Lee 8d3484d0ca Update storcli.py (#783) 2018-01-09 09:10:30 +01:00
Nick Owens 0629a081db multiply page size after float64 coercion to avoid signed integer overflow (#780) 2018-01-08 15:36:49 +01:00
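A minimal sketch of why the order of conversion and multiplication matters. The `int32` operand types and both function names are assumptions made for this example; the commit itself does not state which types were involved.

```go
package main

import "fmt"

// bytesOverflowing multiplies in int32 first, so a large product wraps
// around before it is ever converted to float64.
func bytesOverflowing(pages, pageSize int32) float64 {
	return float64(pages * pageSize)
}

// bytesSafe converts each operand to float64 before multiplying, so the
// product is computed in floating point and cannot wrap.
func bytesSafe(pages, pageSize int32) float64 {
	return float64(pages) * float64(pageSize)
}

func main() {
	var pages, pageSize int32 = 1 << 20, 4096 // 2^20 pages of 4 KiB each
	fmt.Printf("multiply then convert: %.0f\n", bytesOverflowing(pages, pageSize))
	fmt.Printf("convert then multiply: %.0f\n", bytesSafe(pages, pageSize))
}
```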
Franz Pletz d432f9857e Use uint64 in the ZFS collector (#714)
ZFS metrics can also be unsigned 64-bit integers that won't fit in
int64, which causes the whole collector to fail.
2018-01-06 12:36:55 +01:00
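The failure mode in the commit above can be reproduced with the standard library alone. In this sketch `parseKstat` is a hypothetical helper name, not a function from the collector.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseKstat parses a raw counter value as an unsigned 64-bit integer.
// The name is hypothetical and used only for this example.
func parseKstat(raw string) (uint64, error) {
	return strconv.ParseUint(raw, 10, 64)
}

func main() {
	const raw = "18446744073709551615" // largest uint64, too large for int64

	// Parsing as a signed integer reports a range error for this value.
	if _, err := strconv.ParseInt(raw, 10, 64); err != nil {
		fmt.Println("ParseInt failed:", err)
	}

	// Parsing as an unsigned integer succeeds.
	v, err := parseKstat(raw)
	fmt.Println(v, err)
}
```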
zloo ae280f2b04 Add Prometheus 2.0 compatible example rules file - new YAML format (#739) 2018-01-04 12:31:25 +01:00
Derek Marcotte 477fe4665a Move FreeBSD/DragonflyBSD out of meminfo add kvm. (#547)
* Move FreeBSD/DragonflyBSD out of meminfo add kvm.

This gives us SwapUsed, and everything under one roof.

* Fix typos per review.

* Update to use newer API.

* Remove premature optimization per PR feedback.
2018-01-04 12:23:26 +01:00
Tobias Schmidt 052422ec61 Fix panic by updating github.com/ema/qdisc dependency (#778) 2018-01-04 12:13:02 +01:00
Sevag Hanssian 4329b0a86b Add summary metrics for systemd exporter (#765) 2018-01-04 11:49:36 +01:00
Ben Kochie 8f9c8a060d Update README
Add OpenBSD to supported list for meminfo collector[0].

[0]: https://github.com/prometheus/node_exporter/pull/724
2018-01-04 10:33:57 +01:00
Matthieu Guegan d6ef10bb56 Add openbsd meminfo (#724)
* Implements meminfo collector for OpenBSD

This is a rework of #151.

* Fix CGO import

* Add some useful metrics

* Rename total -> size for normalization
2018-01-04 10:32:08 +01:00
Ben Kochie 7f6c59e198
Ignore more virtual filesystems (#775)
Add additional Linux virtual filesystem types to the default list.
2018-01-03 17:22:02 +01:00
Netmonk 2aa8d0eb0c [FIX] Exclude Linux proc from filesystem type regexp (#774)
* [FIX] Issue 63, error on excluding proc filesystem on linux, improving regexp

* [FIX] Reordering filter order
2018-01-03 11:40:32 +01:00
Julius Volz f536857ac6
Fix e2e tests after textfile custom timestamp removal (#768) 2017-12-24 11:54:33 +01:00
Shubheksha Jalan 1f2458f42c Filter out textfile metrics correctly when using collect[] filters (#763)
* remove injection hook for textfile metrics, convert them to prometheus format

* add support for summaries

* add support for histograms

* add logic for handling inconsistent labels within a metric family for counter, gauge, untyped

* change logic for parsing the metrics textfile

* fix logic for adding missing labels

* Export time and error metrics for textfiles

* Add tests for new textfile collector, fix found bugs

* refactor Update() to split into smaller functions

* remove parseTextFiles(), fix import issue

* add mtime metric directly to channel, fix handling of mtime during testing

* rename variables related to labels

* refactor: add default case, remove if guard for metrics, remove extra loop and slice

* refactor: remove extra loop iterating over metric families

* test: add test case for different metric type, fix found bug

* test: add test for metrics with inconsistent labels

* test: add test for histogram

* test: add test for histogram with extra dimension

* test: add test for summary

* test: add test for summary with extra dimension

* remove unnecessary creation of protobuf

* nit: remove extra blank line
2017-12-23 20:21:58 +01:00
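One way to handle inconsistent labels within a metric family, as mentioned in the commit above, is to give every sample the union of all label names and fill the gaps with empty values. The sketch below assumes that approach; `fillMissingLabels` and the map-based sample representation are hypothetical and not taken from the collector.

```go
package main

import (
	"fmt"
	"sort"
)

// fillMissingLabels makes every sample carry the same label names by adding
// any name seen elsewhere in the family with an empty value. It returns the
// sorted union of label names and modifies the samples in place.
func fillMissingLabels(samples []map[string]string) []string {
	seen := map[string]struct{}{}
	for _, s := range samples {
		for name := range s {
			seen[name] = struct{}{}
		}
	}

	names := make([]string, 0, len(seen))
	for name := range seen {
		names = append(names, name)
	}
	sort.Strings(names)

	for _, s := range samples {
		for _, name := range names {
			if _, ok := s[name]; !ok {
				s[name] = ""
			}
		}
	}
	return names
}

func main() {
	samples := []map[string]string{
		{"device": "sda"},
		{"device": "sdb", "model": "x"},
	}
	fmt.Println(fillMissingLabels(samples))
	fmt.Println(samples)
}
```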
Ben Kochie cd2a17176a
Add full make to CircleCI (#761)
* Add full make to CircleCI

Ensure end-to-end test is run.

* Fix go fmt error.

* Fix end-to-end output.
2017-12-21 16:24:23 +01:00
Mario Trangoni a40f7e78da StorCli text collector: fix pylint issues and handle StorCli not installed (#758)
* StorCli text collector: fix pylint issues and handle StorCli not installed

* StorCli text collector: Add HELP and TYPE strings.
2017-12-12 18:48:06 +01:00
Filippo Giunchedi af4cf20b46 apt.sh: handle multiple origins in apt-get output (#757)
An upgrade may come from multiple origins, in
which case the origins are separated by ", ", which breaks the
whitespace-based split. For example:

Inst package [1.2.3] (1.2.4 Debian:8.10/oldstable, Debian-Security:8/oldstable [amd64])

To work around this case, mangle the apt-get output to remove whitespace from
the origins list.
2017-12-12 10:45:59 +01:00
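The same idea can be shown in Go, purely as an illustration, since the actual fix lives in the `apt.sh` shell script. `splitInstLine` is a hypothetical helper that joins the origins list before splitting on whitespace.

```go
package main

import (
	"fmt"
	"strings"
)

// splitInstLine removes the space after each comma so that a multi-origin
// list stays in one field, then splits the line on whitespace.
func splitInstLine(line string) []string {
	return strings.Fields(strings.ReplaceAll(line, ", ", ","))
}

func main() {
	line := "Inst package [1.2.3] (1.2.4 Debian:8.10/oldstable, Debian-Security:8/oldstable [amd64])"

	fmt.Println("naive split: ", len(strings.Fields(line)), "fields")
	fields := splitInstLine(line)
	fmt.Println("mangled split:", len(fields), "fields")
	fmt.Println("origins:", fields[4])
}
```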
Wei Li 1e9bb4ec3a textfile: fix duplicate metrics error (#738)
The textfile gatherer should only be added to gatherer list once.

Signed-off-by: Li Wei <liwei@anbutu.com>
2017-12-06 17:05:40 +01:00
Kristian Klausen a96f1738b3 netdev: Change valueType to CounterValue (#749)
All of these metrics only go up, so the type should be counter.
This also adds _total to all the metric names.

Fix: #747
2017-12-06 13:58:35 +01:00
Derek Marcotte 1527789f76 Added text collector conversion for ipmitool output. (#746)
* Added text collector conversion for ipmitool output.

* Sort metrics before exporting, add namespace.

* Added HELP string, tidy up a bit.

* Make status a gauge.
2017-12-01 12:58:39 +01:00
Ben Kochie 2a80537547
Split out guest cpu metrics on Linux. (#744)
Linux "guest" metrics for VMs are already accounted for in node_cpu
`user` and `nice` metrics.  Separate these into their own metric to
avoid duplication of data.
2017-11-23 15:04:47 +01:00
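A rough sketch of the separation described above, assuming the per-CPU line layout of `/proc/stat` documented in proc(5). `splitGuest` and the two-map return shape are hypothetical; the exporter's real implementation and metric names are not shown in this log.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// statFields lists the columns of a "cpuN" line in /proc/stat, in order.
var statFields = []string{
	"user", "nice", "system", "idle", "iowait",
	"irq", "softirq", "steal", "guest", "guest_nice",
}

// splitGuest parses one cpuN line and returns the ordinary modes and the
// guest modes as two separate maps, so guest time is not reported next to
// the user and nice values that already include it. Values are raw ticks.
func splitGuest(line string) (map[string]uint64, map[string]uint64, error) {
	parts := strings.Fields(line)
	if len(parts) != len(statFields)+1 {
		return nil, nil, fmt.Errorf("unexpected field count %d", len(parts))
	}
	modes := map[string]uint64{}
	guest := map[string]uint64{}
	for i, name := range statFields {
		v, err := strconv.ParseUint(parts[i+1], 10, 64)
		if err != nil {
			return nil, nil, err
		}
		switch name {
		case "guest":
			guest["user"] = v
		case "guest_nice":
			guest["nice"] = v
		default:
			modes[name] = v
		}
	}
	return modes, guest, nil
}

func main() {
	modes, guest, err := splitGuest("cpu0 100 20 30 400 5 6 7 8 40 10")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("modes:", modes)
	fmt.Println("guest:", guest)
}
```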
Karsten Weiss a8d7d1101a cpu: Support processor-less (memory-only) NUMA nodes (#734)
* cpu: Support processor-less (memory-only) NUMA nodes

Processor-less (memory-only) NUMA nodes exist e.g. in systems that use
Intel Optane drives for RAM expansion using Intel Memory Drive
Technology (IMDT).

IMDT RAM expansion supports two modes:

* "Unify Remote Memory domains": present a processor-less (memory-only)
  NUMA domain, which is the default
* "Expand local memory domains": to expand each processor’s memory domain
  with a portion of the memory made available by Optane and IMDT

This commit fixes a crash in the first case (when "cpulist" is empty).

Here's an example of such a system:

$ numastat -m|head -n5

Per-node system memory usage (in MBs):
                          Node 0          Node 1          Node 2           Total
                 --------------- --------------- --------------- ---------------
MemTotal               118239.56       130816.00       464384.00       713439.56

$ for i in {0..2}; do echo -n "$i: " ; cat /sys/bus/node/devices/node$i/cpulist ; done
0: 0-7,16-23
1: 8-15,24-31
2:

$ /opt/vsmp/bin/vsmpversion -vvv
Memory Drive Technology: 8.2.1455.74 (Sep 28 2017 13:09:59)
System configuration:
    Boards:      3
       1 x Proc. + I/O + Memory
       2 x NVM devices (Intel SSDPED1K375GAQ)
    Processors:  2, Cores: 16, Threads: 32
        Intel(R) Xeon(R) CPU E5-2667 v4 @ 3.20GHz Stepping 01
    Memory (MB): 713472 (of 977450), Cache: 251416, Private: 12562
       1 x 249088MB   [262036/   678/12270]
       1 x 232192MB   [357707/125369/  146]  82:00.0#1
       1 x 232192MB   [357707/125369/  146]  83:00.0#1

* cpu: rename some variables (pkg => node)

* cpu: Use %v not %q in log.Debugf() format strings
2017-11-10 15:31:26 +01:00
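The empty `cpulist` case from the commit above can be handled explicitly when expanding the list. The following is a sketch only; `parseCPUList` is a hypothetical helper, not the collector's code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseCPUList expands a sysfs cpulist such as "0-7,16-23" into CPU numbers.
// An empty list, as found on a memory-only NUMA node, yields no CPUs and no
// error instead of failing.
func parseCPUList(s string) ([]int, error) {
	s = strings.TrimSpace(s)
	if s == "" {
		return nil, nil
	}
	var cpus []int
	for _, part := range strings.Split(s, ",") {
		bounds := strings.SplitN(part, "-", 2)
		first, err := strconv.Atoi(bounds[0])
		if err != nil {
			return nil, err
		}
		last := first
		if len(bounds) == 2 {
			if last, err = strconv.Atoi(bounds[1]); err != nil {
				return nil, err
			}
		}
		if last < first {
			return nil, fmt.Errorf("invalid range %q", part)
		}
		for c := first; c <= last; c++ {
			cpus = append(cpus, c)
		}
	}
	return cpus, nil
}

func main() {
	// Inputs mirror the three nodes in the example system above.
	for _, list := range []string{"0-7,16-23", "8-15,24-31", ""} {
		cpus, err := parseCPUList(list)
		fmt.Printf("%q -> %d CPUs, err=%v\n", list, len(cpus), err)
	}
}
```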
Matt Layher f6f9c8d6cc Add and use sysReadFile in hwmon collector (#728) 2017-11-07 07:49:37 +01:00
Ben Kochie 4d7aa57da0
Update vendoring (#722)
* Update vendor github.com/beevik/ntp@v0.2.0

* Update vendor github.com/mdlayher/netlink/...

* Update vendor github.com/mdlayher/wifi/...

Adds vendor github.com/mdlayher/genetlink

* Update vendor github.com/prometheus/common/...

* Update vendor github.com/prometheus/procfs/...

* Update vendor golang.org/x/sys/unix

* Update vendor golang.org/x/sys/windows
2017-11-02 12:30:34 +01:00