Commit graph

87 commits

Author SHA1 Message Date
Calle Pettersson 859a825bb8 Replace --collectors.enabled with per-collector flags (#640)
* Move NodeCollector into package collector

* Refactor collector enabling

* Update README with new collector enabled flags

* Fix out-of-date inline flag reference syntax

* Use new flags in end-to-end tests

* Add flag to disable all default collectors

* Track if a flag has been set explicitly

* Add --collectors.disable-defaults to README

* Revert disable-defaults flag

* Shorten flags

* Fixup timex collector registration

* Fix end-to-end tests

* Change procfs and sysfs path flags

* Fix review comments
2017-09-28 15:06:26 +02:00
Sami Kerola 3762191e66 Add timex collector (#664)
This collector is based on adjtimex(2) system call.  The collector returns
three values, status if time is synchronised, offset to remote reference,
and local clock frequency adjustment.

Values are taken from kernel time keeping data structures to avoid getting
involved how the synchronisation is implemented.  By that I mean one should
not care if time is update using ntpd, systemd.timesyncd, ptpd, and so on.
Since all time sync implementation will always end up telling to kernel what
is the status with time one can simply omit the software in between, and
look results of the syncing.  As a positive side effect this makes collector
very quick and conceptually specific, this does not monitor availability of
NTP server, or network in between, or dns resolution, and other unrelated
but necessary things.

Minimum set of values to keep eye on are the following three:

    The node_timex_sync_status tells if local clock is in sync with a remote
    clock.  Value is set to zero when synchronisation to a reliable server
    is lost, or a time sync software is misconfigured.

    The node_timex_offset_seconds tells how much local clock is off when
    compared to reference.  In case of multiple time references this value
    is outcome of RFC 5905 adjustment algorithm.  Ideally offset should be
    close to zero, and it depends about use case how large value is
    acceptable.  For example a typical web server is probably fine if offset
    is about 0.1 or less, but that would not be good enough for mobile phone
    base station operator.

    The node_timex_freq tells amount of adjustment to local clock tick
    frequency.  For example if offset is one second and growing the local
    clock will need instruction to tick quicker.  Number value itself is not
    very important, and occasional small adjustments are fine.  When
    frequency is unusually in stable one can assume quality of time stamps
    will not be accurate to very far in sub second range.  Obviously
    explaining why local clock frequency behaves like a passenger in roller
    coaster is different matter.  Explanations can vary from system load, to
    environmental issues such as a machine being physically too hot.

Rest of the measurements can help when debugging.  If you run a clock server
do probably want to collect and keep track of everything.

Pull-request: https://github.com/prometheus/node_exporter/pull/664
2017-09-19 07:54:06 -07:00
Leonid Evdokimov c169b4b1c5 Add metrics from SNTPv4 packet to ntp collector & add ntpd sanity check (#655)
* Add metrics from SNTPv4 packet to ntp collector & add ntpd sanity check

1. Checking local clock against remote NTP daemon is bad idea, local
ntpd acting as a  client should do it better and avoid excessive load on
remote NTP server so the collector is refactored to query local NTP
server.

2. Checking local clock against remote one does not check local ntpd
itself. Local ntpd may be down or out of sync due to network issues, but
clock will be OK.

3. Checking NTP server using sanity of it's response is tricky and
depends on ntpd implementation, that's why common `node_ntp_sanity`
variable is exported.

* `govendor add golang.org/x/net/ipv4`, it is dependency of github.com/beevik/ntp

* Update github.com/beevik/ntp to include boring SNTP fix

* Use variable name from RFC5905

* ntp: move code to make export of raw metrics more explicit

* Move NTP math to `github.com/beevik/ntp`

* Make `golint` happy

* Add some brief docs explaining `ntp` #655 and `timex` #664 modules

* ntp: drop XXX comment that got its decision

* ntp: add `_seconds` suffix to relevant metrics

* Better `node_ntp_leap` comment

* s/node_ntp_reftime/node_ntp_reference_timestamp_seconds/ as requested by @discordianfish

* Extract subsystem name to const as suggested by @SuperQ
2017-09-19 10:36:14 +02:00
Ben Kochie 9947f602f3 Add buildkite status badge. 2017-08-24 12:29:34 +02:00
Joe Handzik 4b011bfe44 Clarify Infiniband collector support (#643)
Tested a DL360 Gen9 box with an Omni-Path adapter in it. The existing InfiniBand collector can provide support for the same metrics on Omni-Path cards as well.

Signed-Off-By: Joe Handzik <joseph.t.handzik@hpe.com>
2017-08-16 07:32:54 +02:00
Calle Pettersson dfe07eaae8 Switch to kingpin flags (#639)
* Switch to kingpin flags

* Fix logrus vendoring

* Fix flags in main tests

* Fix vendoring versions
2017-08-12 15:07:24 +02:00
Vojtech Galda 1467d845fb Status information in /proc/drbd (#630)
in version 8.4 deprecated (but won’t be removed)
2017-08-02 08:04:13 +02:00
Teoh Han Hui 0b1f64bb15 Fix Docker mountpoint prefix docs 2017-07-28 15:06:28 +08:00
Ben Kochie 46c31d8a7e Enable IPVS collector by default (#623)
* Silence error output when no IPVS present.
* Enable by default.
* Update end-to-end fixture.
* Update README.
2017-07-26 15:20:28 +02:00
ideaship 8d90276283 Add bcache collector (#597)
* Add bcache collector for Linux

This collector gathers metrics related to the Linux block cache
(bcache) from sysfs.

* Removed commented out code

* Use project comment style

* Add _sectors to metric name to indicate unit

* Really use project comment style

* Rename bcache.go to bcache_linux.go

* Keep collector namespace clean

Rename:
- metric -> bcacheMetric
- periodStatsToMetrics -> bcachePeriodStatsToMetric

* Shorten slice initialization

* Change label names to backing_device, cache_device

* Remove five minute metrics (keep only total)

* Include units in additional metric names

* Enable bcache collector by default

* Provide metrics in seconds, not nanoseconds

* remove metrics with label "all"

* Add fixtures, update end-to-end for bcache collector

* Move fixtures/sys into tar.gz

This changeset moves the collector/fixtures/sys directory into
collector/fixtures/sys.tar.gz and tweaks the Makefile to unpack the
tarball before tests are run.

The reason for this change is that Windows does not allow colons in a
path (colons are present in some of the bcache fixture files), nor can
it (out of the box) deal with pathnames longer than 260 characters
(which we would be increasingly likely to hit if we tried to replace
colons with longer codes that are guaranteed not the turn up in regular
file names).

* Add ttar: plain text archive, replacement for tar

This changeset adds ttar, a plain text replacement for tar, and uses it
for the sysfs fixture archive. The syntax is loosely based on tar(1).

Using a plain text archive makes it possible to review changes without
downloading and extracting the archive. Also, when working on the repo,
git diff and git log become useful again, allowing a committer to verify
and track changes over time.

The code is written in bash, because bash is available out of the box on
all major flavors of Linux and on macOS. The feature set used is
restricted to bash version 3.2 because that is what Apple is still
shipping.

The programm also works on Windows if bash is installed. Obviously, it
does not solve the Windows limitations (path length limited to 260
characters, no symbolic links) that prompted the move to an archive
format in the first place.
2017-07-07 07:20:18 +02:00
kadota kyohei a077024f51 add diskstats on Darwin (#593)
* Add diskstats collector for Darwin

* Update year in the header

* Update README.md

* Add github.com/lufia/iostat to vendored packages

* Change stats to follow naming guidelines

* Add a entry of github.com/lufia/iostat into vendor.json

* Remove /proc/diskstats from description
2017-07-06 13:51:24 +02:00
Ben Kochie f3a4afc059 Add go path help to build instructions (#601)
Add `go get` and source directory requirements to the build
instructions.
2017-06-15 09:32:45 +02:00
Rene Treffer 2e9f1913b8 Move stat_linux to cpu_linux and add cpufreq stats (#548) 2017-06-13 11:21:53 +02:00
Johannes 'fish' Ziemke 0451c9c5f6 Make bind-mounts in Docker example read-only (#598) 2017-06-08 18:21:44 +02:00
Emanuele Rocca 047003b6bb Add qdisc collector for Linux (#580)
* Add qdisc collector for Linux

This collector gathers basic queueing discipline metrics via netlink,
similarly to what `tc -s qdisc show` does.

* qdisc collector: nl-specific code moved, names fixed

- netlink-specific parts moved to github.com/ema/qdisc
- avoid using shortened names
- counters renamed into XXX_total

* Get rid of parseMessage error checking leftover

* Add github.com/ema/qdisc to vendored packages

* Update help texts and comments

* Add qdisc collector to README file

* qdisc collector end-to-end testing

* Update qdisc dependency to latest version

Update github.com/ema/qdisc dependency to revision 2c7e72d, which
includes unit testing.

* qdisc collector: rename "iface" label into "device"
2017-05-23 11:55:50 +02:00
Matt Layher 1feb091b36
Initial XFS collector 2017-04-22 11:53:07 -04:00
Tobias Florek 2a38b57a2a Mention copr yum repository, add systemd unit (#529)
* add systemd unit as example

* mention community yum repo

fixes #498

* rename textfile collector dir
2017-04-19 18:54:15 +02:00
Sam Kottler 6eafa51fa8 Add ARP collector for Linux (#540)
* Implement commonalities and linux support for ARP collection

* Add ARP collector to fixtures and run as part of e2e tests

* Bubble up scanner errors

* Use single return values where it makes sense

* Add missing annotation

* Move arp_common into arp_linux

* Add license header to arp_linux.go

* Address initial feedback

* Use strings.Fields instead of strings.Split

* Deal with scanner.Err() rather than throwing away errors

* Check for scan errors in-line before interacting with the entries map

* Don't interact with potentially empty text from scan

* Check for scan errors outside the scan loop

* Add comment about moving procfs parsing

* Add more direct comment

* Update initialism style to match go style guide

* Put function args on the same line

* Add TODO in front of comment about procfs extraction

* Guard against strings.Fields returning an empty slice

* Be more defensive about ARP table format and use upcase more broadly

* Enable the ARP collector by default

* Add ARP collector to the README

* Remove 'entry'
2017-04-11 17:45:19 +02:00
Julius Volz 6580c95305 Add info about flags to README.md 2017-03-22 17:20:34 +01:00
Derek Marcotte 5c28ab044d Add BSD exec statistics collector (#457)
* First pass of a sysctl_bsd source, exec_bsd + exec metrics

* Incorportate PR feedback, including removing pre-build descriptions, unit conversion callback.

* Remove redundant cached_description field, per PR feedback

* Incorporate PR feedback
2017-02-28 17:23:10 -04:00
Tobias Schmidt 2ccd3340df Add goreportcard to README 2017-02-28 14:01:28 -04:00
Ben Kochie 38cd07ebb9 Merge pull request #450 from roclark/add-infiniband
infiniband: Add new collector for InfiniBand statistics
2017-02-16 14:33:19 +01:00
Thorhallur Sverrisson 2f16f6b84e Adding buddyinfo to README.md 2017-02-15 10:15:43 -06:00
Robert Clark b2d2b69af6 Update README to include InfiniBand collector
Signed-Off-By: Robert Clark <robert.d.clark@hpe.com>
2017-02-07 11:09:08 -06:00
Matt Layher ba635842fc Add wifi collector to default collectors (#447) 2017-02-04 07:44:01 +00:00
Ben Kochie 71362d45eb Merge pull request #432 from joehandzik/wip-zfs-zfetchstats
Update ZFS Collector with most non-zpool metrics
2017-01-31 08:52:41 -05:00
Cameron Moore f4d24e5044 Add missing collectors to README 2017-01-25 21:06:10 -06:00
Joe Handzik 1dde3ec31b README.md: Remove note about ZFS being limited to ARC
Because it's not true after this PR goes up.

Signed-Off-By: Joe Handzik <joseph.t.handzik@hpe.com>
2017-01-23 16:41:15 -06:00
Brian Brazil 8d4f36aa42 Point to WMI exporter (#431)
There's a slow trickle of people trying to use the node exporter on windows, so point them in the right direction.
2017-01-23 19:57:08 +00:00
Matt Layher efa25665ec
Add initial wifi collector, bump netlink to fix 32-bit builds 2017-01-11 10:08:44 -05:00
Ben Kochie b4fa10ca9d Add collector for Linux EDAC
Collect "Error detection and correction" metrics from memory
controllers.
* Supported on Linux only.
* Add basic fixtures.
* Enabled by default.
2017-01-10 10:14:19 +01:00
Corey Stewart a8c94d48e6 Style changes and cleanup
This patch makes stylistic changes to error strings, unexports method names by lower casing them, removes unused dataSetMetric, and adds copyright/licence information.

Signed-Off-By: Corey Stewart <stewa169@purdue.edu>
2017-01-08 10:23:58 -06:00
Christian Schwarz c95bfa705e Enable ZFS exporter by default and update README. 2017-01-08 10:23:58 -06:00
Johannes 'fish' Ziemke 7617e8b4be Update archs supported by collectors in README 2017-01-04 12:30:48 +01:00
Johannes 'fish' Ziemke 4e696d5d31 Merge pull request #391 from discordianfish/fish-add-cpu-darwin
Add cpu collector for darwin
2017-01-03 14:23:50 +01:00
Johannes 'fish' Ziemke 3db2f442ae Limit node-exporter scope, deprecated collectors 2017-01-03 14:03:23 +01:00
Johannes 'fish' Ziemke 050d6f7f13 Add cpu collector for darwin 2016-12-28 18:38:52 +01:00
Johannes 'fish' Ziemke 71ea37987f Merge pull request #365 from EdSchouten/drbd
A collector for DRBD
2016-12-25 11:04:43 +01:00
Ben Kochie 481392d75c Merge pull request #376 from prometheus/fish-update-docker-readme
Improve Docker documentation
2016-12-20 18:36:34 +01:00
Matt Layher 25a93e38e7
Add mountstats collector for detailed NFS statistics 2016-12-20 11:13:02 -05:00
Johannes 'fish' Ziemke 21173e21f0 Improve Docker documentation
This adds bind-mounts and ignore flags to Docker example and explains
why it's best run uncontainerized.
2016-12-19 16:17:53 +01:00
Ed Schouten 6269f7502a Add a collector for DRBD.
This collector exposes most of the useful information that can be found
in /proc/drbd. Sizes are normalised to be in bytes, as /proc/drbd uses
kibibytes.
2016-12-11 11:55:28 +01:00
Ed Schouten a696830c38 Add a collector for NFS client statistics.
This change adds a new collector called "nfs" that parses the contents
of /proc/net/rpc/nfs and turns it into metrics. It can be used to
inspect the number of operations per type, but also to keep an eye on an
extraneous number of retransmissions, which may indicate connectivity
issues.

I've picked the name "nfs", as most operating systems use "nfs" for the
client component and "nfsd" as the server component. If we want to add
stats for the NFS server as well, we'd better call such a collector
"nfsd".
2016-12-09 19:58:08 +01:00
stuart nelson 4c2f393af4 Update README.md (#352)
Dragonfly exports CPU, also
2016-11-25 09:02:02 +00:00
stuart nelson 08bc709c35 Update README.md 2016-11-17 13:23:54 +01:00
Rene Treffer 081ecc5db0 Add hwmon /sensors support (#278)
* Add hwmon support (mainly known from lm-sensors)

This commit adds initial support for linux hardware sensors, exported
through sysfs.

Details of the interface can be found at
https://www.kernel.org/doc/Documentation/hwmon/sysfs-interface

* Add end-to-end test with some real life data

* Cleanup comments on hwmon collector

* Drop raw sensor name from hwmon output

* Let the sensor label be "sensor"

* Add hwmon short description to README.
2016-10-06 16:33:24 +01:00
stuart nelson cf3710191a Compile meminfo for dfly (#315)
* Compile meminfo for dfly

* Update README.me
2016-09-28 08:08:19 +01:00
stuart nelson ef1925db7d Compile netdev on dragonfly (#314)
* Compile netdev on dragonfly

* Only run netdev bsd test on bsd

* Update README.md
2016-09-27 21:44:13 +01:00
stuart nelson 61f36ac1ab Activate filesystem collector on DragonFly (#302) 2016-09-11 12:08:00 -04:00
Karsten Weiss 262ed7a8c1 README.md: Mention /proc/sys/fs/file-nr filename. 2016-06-20 18:09:13 +02:00