Commit graph

86 commits

Author SHA1 Message Date
ToMe25 db3a43783a Fix promhttp_metric_handler_errors_total metric not being disabled by flag
Signed-off-by: ToMe25 <ToMe25@gmx.de>
2023-09-22 18:07:18 +02:00
LamGC c13f808619
Allow root path as metrics path. (#2590)
Signed-off-by: LamGC <lam827@lamgc.net>
2023-03-08 17:15:39 +01:00
Ben Kochie c23b76bfbb
Update exporter-toolkit
* Bump exporter-toolkit to the latest release.
* Use new toolkit landing page function.
* Update kingpin flags.

Signed-off-by: Ben Kochie <superq@gmail.com>
2023-03-07 15:18:38 +01:00
Simon Pasquier 2d05f2e0c3 Log current value of GOMAXPROCS
With `--runtime.gomaxprocs=0`, the GOMAXPROXS value will default to the
number of logical CPUs. In this case, it is more useful to log the
actual value than the value set by the user via the command-line.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2022-11-30 17:16:49 +01:00
Ben Kochie fb26b9fc72
Default GOMAXPROCS to 1 (#2530)
Avoid running on all CPUs by limiting the Go runtime to one CPU by
default. Avoids having Go routines schedule on every CPU, driving up the
visible run queue length on high CPU count systems.

This also helps workaround a kernel deadlock issue with reading from
sysfs concurrently.

See:
* https://github.com/prometheus/node_exporter/issues/1880
* https://github.com/prometheus/node_exporter/issues/2500

Signed-off-by: Ben Kochie <superq@gmail.com>
2022-11-29 12:01:58 +01:00
Perry Naseck 440a132c38
Add multiple listeners and systemd socket listener activation (#2393)
Update exporter-toolkit to v0.8.1 to enable new listener support.

Signed-off-by: Perry Naseck <git@perrynaseck.com>
2022-10-23 12:45:29 +02:00
Ben Kochie 49db7c81e1
Fixup codespell (#2455)
* Fix some mistakes
* Switch to an ignore file.

Signed-off-by: Ben Kochie <superq@gmail.com>

Signed-off-by: Ben Kochie <superq@gmail.com>
2022-09-02 10:49:47 +02:00
Ben Kochie 97583ff340
Use new client_golang collectors package.
Signed-off-by: Ben Kochie <superq@gmail.com>
2021-07-06 11:51:16 +02:00
Ben Kochie 3bc9a93c20
Add ErrorLog plumbing to promhttp
Fix the error logging of the promhttp handler by connecting it to the
promlog setup.
* Switch to go-kit/log.
* Cleanup CHANGELOG.

Fixes: https://github.com/prometheus/node_exporter/issues/1886

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-06-03 10:47:41 +02:00
David Leadbeater 0387f55a9b Make node_exporter print usage to STDOUT
Matches updated behaviour in Prometheus and other tools.

Signed-off-by: David Leadbeater <dgl@dgl.cx>
2021-05-11 15:21:09 +01:00
Julien Pivotto 809eed400a Add a warning when node exporter runs as root
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2021-01-22 23:55:34 +01:00
Changmin Kim cfdd9dd0c9
Change exporter-toolkit module name(https -> web) (#1931)
* Change deprecated Listen to ListenAndServe
* Change exporter-toolkit module name(https -> web)

Signed-off-by: changmink <changmin043@gmail.com>
2021-01-18 08:20:33 +01:00
Ben Kochie f0dea09749
Convert to exporter-toolkit/https
Use the new exporter-toolkit https package.

Signed-off-by: Ben Kochie <superq@gmail.com>
2020-12-29 13:47:32 +01:00
Julien Pivotto 202ecf9c9d
Add basic authentication (#1683)
* Add basic authentication

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-05-01 14:26:51 +02:00
Ben Kochie ef7c05816a
Release 1.0.0-rc.0 (#1614)
Update CHANGELOG/VERSION for 1.0.0-rc.0 release.
* Add a note about new https settings to top-level README.
* Mark --web.config flag as experimental.

Signed-off-by: Ben Kochie <superq@gmail.com>
2020-02-20 13:42:47 +01:00
Paul Gier b40954dce5
new flag to disable all default collectors (#1460)
* new flag to disable all default collectors

Signed-off-by: Paul Gier <pgier@redhat.com>

Co-authored-by: Ben Kochie <superq@gmail.com>
2020-02-20 11:03:33 +01:00
Ben Ye 2477c5c67d switch to go-kit/log (#1575)
Signed-off-by: yeya24 <yb532204897@gmail.com>
2019-12-31 17:19:37 +01:00
ksherryBAE aede04172c Adding TLS to node exporter - cleaner version (#1277)
Add support for https connections.

Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>
Signed-off-by: James Ritchie <james.g.ritchie@baesystems.com>
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Ben RIdley <benridley29@gmail.com>
2019-11-16 00:12:57 +01:00
Matthias Rampke b133213c7a Report non-fatal collection errors in the exporter metric. (#1439)
As per prometheus/client_golang#543, pass the Registry for exporter
metrics when setting up the /metrics HTTP handler.

With this, the `promhttp_metric_handler_errors_total` metric will
increment on (possibly non-fatal) collection-time errors, such as
duplicate metrics from text files.

Signed-off-by: Matthias Rampke <mr@soundcloud.com>
2019-07-28 10:37:10 +02:00
Ben Kochie ffefc8e74d Add a limit to the number of in-flight requests (#1166)
In order to avoid stuck collectors using up all system resources, add a
limit to the number of parallel in-flight scrape requests. This will
return a 503 error.

Default to 40 requests, this seems like a reasonable number based on:
* Two Prometheus servers scraping every 15 seconds.
* Failing scrapes after 5 minutes of stuckness.

Signed-off-by: Ben Kochie <superq@gmail.com>
2018-11-20 18:11:40 +01:00
beorn7 cd2331a185 Add --web.disable-exporter-metrics flag
If this flag is set, the metrics about the exporter itself (go_*,
process_*, promhttp_*) will be excluded from /metrics.

The Kingpin way of handling boolean flags makes the negative flag
wording (_dis_able) the most reasonably one.

This also refactors the flow in node_exporter.go quite a bit to avoid
mixing up the global and a local registry and to avoid re-creating a
registry even if no filtering is requested.

Signed-off-by: beorn7 <beorn@soundcloud.com>
2018-11-13 14:22:25 +01:00
Brian Brazil be9d82b66e
Sort collector names in startup logs (#857)
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2018-03-29 13:42:44 +01:00
Ben Kochie d33a447047
Remove deprecated prometheus.InstrumentHandlerFunc (#831)
Update Prometheus client golang use to use `promhttp.Handler()` instead
of `prometheus.InstrumentHandlerFunc()`.
2018-02-19 15:44:59 +01:00
Siavash Safi f3a7022602 Add collect[] parameter (#699)
* Add `collect[]` parameter

* Add TODo comment about staticcheck ignored

* Restore promhttp.HandlerOpts

* Log a warning and return HTTP error instead of failing

* Check collector existence and status, cleanups

* Fix warnings and error messages

* Don't panic, return error if collector registration failed

* Update README
2017-10-14 14:23:42 +02:00
Calle Pettersson 859a825bb8 Replace --collectors.enabled with per-collector flags (#640)
* Move NodeCollector into package collector

* Refactor collector enabling

* Update README with new collector enabled flags

* Fix out-of-date inline flag reference syntax

* Use new flags in end-to-end tests

* Add flag to disable all default collectors

* Track if a flag has been set explicitly

* Add --collectors.disable-defaults to README

* Revert disable-defaults flag

* Shorten flags

* Fixup timex collector registration

* Fix end-to-end tests

* Change procfs and sysfs path flags

* Fix review comments
2017-09-28 15:06:26 +02:00
Sami Kerola 3762191e66 Add timex collector (#664)
This collector is based on adjtimex(2) system call.  The collector returns
three values, status if time is synchronised, offset to remote reference,
and local clock frequency adjustment.

Values are taken from kernel time keeping data structures to avoid getting
involved how the synchronisation is implemented.  By that I mean one should
not care if time is update using ntpd, systemd.timesyncd, ptpd, and so on.
Since all time sync implementation will always end up telling to kernel what
is the status with time one can simply omit the software in between, and
look results of the syncing.  As a positive side effect this makes collector
very quick and conceptually specific, this does not monitor availability of
NTP server, or network in between, or dns resolution, and other unrelated
but necessary things.

Minimum set of values to keep eye on are the following three:

    The node_timex_sync_status tells if local clock is in sync with a remote
    clock.  Value is set to zero when synchronisation to a reliable server
    is lost, or a time sync software is misconfigured.

    The node_timex_offset_seconds tells how much local clock is off when
    compared to reference.  In case of multiple time references this value
    is outcome of RFC 5905 adjustment algorithm.  Ideally offset should be
    close to zero, and it depends about use case how large value is
    acceptable.  For example a typical web server is probably fine if offset
    is about 0.1 or less, but that would not be good enough for mobile phone
    base station operator.

    The node_timex_freq tells amount of adjustment to local clock tick
    frequency.  For example if offset is one second and growing the local
    clock will need instruction to tick quicker.  Number value itself is not
    very important, and occasional small adjustments are fine.  When
    frequency is unusually in stable one can assume quality of time stamps
    will not be accurate to very far in sub second range.  Obviously
    explaining why local clock frequency behaves like a passenger in roller
    coaster is different matter.  Explanations can vary from system load, to
    environmental issues such as a machine being physically too hot.

Rest of the measurements can help when debugging.  If you run a clock server
do probably want to collect and keep track of everything.

Pull-request: https://github.com/prometheus/node_exporter/pull/664
2017-09-19 07:54:06 -07:00
Calle Pettersson dfe07eaae8 Switch to kingpin flags (#639)
* Switch to kingpin flags

* Fix logrus vendoring

* Fix flags in main tests

* Fix vendoring versions
2017-08-12 15:07:24 +02:00
Ben Kochie 46c31d8a7e Enable IPVS collector by default (#623)
* Silence error output when no IPVS present.
* Enable by default.
* Update end-to-end fixture.
* Update README.
2017-07-26 15:20:28 +02:00
ideaship 8d90276283 Add bcache collector (#597)
* Add bcache collector for Linux

This collector gathers metrics related to the Linux block cache
(bcache) from sysfs.

* Removed commented out code

* Use project comment style

* Add _sectors to metric name to indicate unit

* Really use project comment style

* Rename bcache.go to bcache_linux.go

* Keep collector namespace clean

Rename:
- metric -> bcacheMetric
- periodStatsToMetrics -> bcachePeriodStatsToMetric

* Shorten slice initialization

* Change label names to backing_device, cache_device

* Remove five minute metrics (keep only total)

* Include units in additional metric names

* Enable bcache collector by default

* Provide metrics in seconds, not nanoseconds

* remove metrics with label "all"

* Add fixtures, update end-to-end for bcache collector

* Move fixtures/sys into tar.gz

This changeset moves the collector/fixtures/sys directory into
collector/fixtures/sys.tar.gz and tweaks the Makefile to unpack the
tarball before tests are run.

The reason for this change is that Windows does not allow colons in a
path (colons are present in some of the bcache fixture files), nor can
it (out of the box) deal with pathnames longer than 260 characters
(which we would be increasingly likely to hit if we tried to replace
colons with longer codes that are guaranteed not the turn up in regular
file names).

* Add ttar: plain text archive, replacement for tar

This changeset adds ttar, a plain text replacement for tar, and uses it
for the sysfs fixture archive. The syntax is loosely based on tar(1).

Using a plain text archive makes it possible to review changes without
downloading and extracting the archive. Also, when working on the repo,
git diff and git log become useful again, allowing a committer to verify
and track changes over time.

The code is written in bash, because bash is available out of the box on
all major flavors of Linux and on macOS. The feature set used is
restricted to bash version 3.2 because that is what Apple is still
shipping.

The programm also works on Windows if bash is installed. Obviously, it
does not solve the Windows limitations (path length limited to 260
characters, no symbolic links) that prompted the move to an archive
format in the first place.
2017-07-07 07:20:18 +02:00
Matt Layher 1feb091b36
Initial XFS collector 2017-04-22 11:53:07 -04:00
Sam Kottler 6eafa51fa8 Add ARP collector for Linux (#540)
* Implement commonalities and linux support for ARP collection

* Add ARP collector to fixtures and run as part of e2e tests

* Bubble up scanner errors

* Use single return values where it makes sense

* Add missing annotation

* Move arp_common into arp_linux

* Add license header to arp_linux.go

* Address initial feedback

* Use strings.Fields instead of strings.Split

* Deal with scanner.Err() rather than throwing away errors

* Check for scan errors in-line before interacting with the entries map

* Don't interact with potentially empty text from scan

* Check for scan errors outside the scan loop

* Add comment about moving procfs parsing

* Add more direct comment

* Update initialism style to match go style guide

* Put function args on the same line

* Add TODO in front of comment about procfs extraction

* Guard against strings.Fields returning an empty slice

* Be more defensive about ARP table format and use upcase more broadly

* Enable the ARP collector by default

* Add ARP collector to the README

* Remove 'entry'
2017-04-11 17:45:19 +02:00
Brian Brazil a02e469b07 Report collector success/failure and duration per scrape. (#516)
This is in line with best practices, and also saves us
63 timeseries on a default Linux setup.
2017-03-16 17:21:00 +00:00
Tobias Schmidt dace41e3d4 Continue scrape with duplicated metrics
Problems of a single collector, like duplicated metrics read via the
textfile collector, should not fail the collection and export of other
metrics.
2017-03-14 00:38:02 -03:00
Derek Marcotte 5c28ab044d Add BSD exec statistics collector (#457)
* First pass of a sysctl_bsd source, exec_bsd + exec metrics

* Incorportate PR feedback, including removing pre-build descriptions, unit conversion callback.

* Remove redundant cached_description field, per PR feedback

* Incorporate PR feedback
2017-02-28 17:23:10 -04:00
Tobias Schmidt c703435790 Fix all open go lint and vet issues 2017-02-28 13:05:38 -04:00
Robert Clark b0c9133cba Enable InfiniBand by default
Signed-Off-By: Robert Clark <robert.d.clark@hpe.com>
2017-02-07 11:09:08 -06:00
Matt Layher ba635842fc Add wifi collector to default collectors (#447) 2017-02-04 07:44:01 +00:00
Ben Kochie b4fa10ca9d Add collector for Linux EDAC
Collect "Error detection and correction" metrics from memory
controllers.
* Supported on Linux only.
* Add basic fixtures.
* Enabled by default.
2017-01-10 10:14:19 +01:00
Christian Schwarz c95bfa705e Enable ZFS exporter by default and update README. 2017-01-08 10:23:58 -06:00
Johannes 'fish' Ziemke 3983cd58ff Use promhttp and setup logger 2017-01-05 19:30:48 +01:00
Rene Treffer 081ecc5db0 Add hwmon /sensors support (#278)
* Add hwmon support (mainly known from lm-sensors)

This commit adds initial support for linux hardware sensors, exported
through sysfs.

Details of the interface can be found at
https://www.kernel.org/doc/Documentation/hwmon/sysfs-interface

* Add end-to-end test with some real life data

* Cleanup comments on hwmon collector

* Drop raw sensor name from hwmon output

* Let the sensor label be "sensor"

* Add hwmon short description to README.
2016-10-06 16:33:24 +01:00
Steve Durrheimer 60cbc9efc0
Make version informations consistent between prometheus components
This also fixes #231 by adding the '-version' flag
2016-05-04 08:43:33 +02:00
Christian Schwarz 9a189b903e Add FreeBSD 'cpu' exporter to default collectors.
As of `1fc84e2fb69ee3d1f063399b00a6284fc8e27cb8` it does not require root anymore.
2016-02-18 12:15:08 +01:00
Tobias Schmidt 3a96e6881b Remove unused flag -debug.memprofile-file
The option to write out a memory profile to file was removed in a730cff.
Declaring flags as local variable does not only result in cleaner, more
testable code, but also ensures that the program won't compile anymore
when unused flags are left in place.
2016-02-04 20:24:16 -05:00
Richard Hartmann aee580d8d8 Introduce entropy collector for Linux 2016-01-13 18:29:52 +01:00
Florian Koch 5d5346af8a Add vmstat collector, enabled per default 2016-01-11 07:58:30 +01:00
Caskey L. Dickson ab9ee574fb Build cleanly under windows.
Removes unused signal handlers left over from signal based collection
and block the non windows-relevant collectors loadavg and interrupts.

Signal based collection removed in 1c17481a42.
2016-01-07 17:59:16 -08:00
Daniel Bechler fc3931c924 Add build_info metric similar to the one of Prometheus itself 2016-01-06 23:54:33 +01:00
Brian Brazil a82b4c30cb Add linux conntrack collector. 2015-12-20 00:57:52 +00:00
Pavel Borzenkov 46527808aa Filter list of collectors enabled by default
Enabled by default collectors are chosen for Linux, which supports all
of the implemented collectors. But for other OSes (OS X, for example)
this list is not suitable, because they lack most of those collectors.

Because of that, it is not possible to run node_exporter with default
options on such OSes. Fix this by filtering list of enabled by default
collectors based on their availability for current platform.

Closes #149

Signed-off-by: Pavel Borzenkov <pavel.borzenkov@gmail.com>
2015-11-13 10:42:10 +03:00