node_exporter

mirror of https://github.com/prometheus/node_exporter.git synced 2025-08-20 18:33:52 -07:00

Author	SHA1	Message	Date
Ben Kochie	73c9a10d37	Handle small backwards jumps in CPU idle The Linux CPU idle stat can also jump backwards slightly in some cases. Allow the jump back up to 3 seconds before we attempt to reset the CPU counter cache. Fixes: https://github.com/prometheus/node_exporter/issues/1903 Signed-off-by: Ben Kochie <superq@gmail.com>	2021-07-07 12:24:46 +02:00
Ben Kochie	3bc9a93c20	Add ErrorLog plumbing to promhttp Fix the error logging of the promhttp handler by connecting it to the promlog setup. * Switch to go-kit/log. * Cleanup CHANGELOG. Fixes: https://github.com/prometheus/node_exporter/issues/1886 Signed-off-by: Ben Kochie <superq@gmail.com>	2021-06-03 10:47:41 +02:00
Ben Kochie	306a365377	Downgrade CPU counter warnings We've gathered enough evidence that the CPU counter bug workaround is working as intended. Downgrade the message from Warning to Debug. Signed-off-by: Ben Kochie <superq@gmail.com>	2020-10-01 12:41:15 +02:00
Julius Volz	d05aac43e4	Fix capitalization of CPU acronym throughout Signed-off-by: Julius Volz <julius.volz@gmail.com>	2020-09-03 23:34:33 +02:00
domchan	503e4fc848	Expose cpu bugs and flags as info metrics. (#1788) * Expose cpu bugs and flags as info metrics with a regexp filter. * Automatically enable CPU info metrics when using flags or bugs feature. Signed-off-by: domgoer <domdoumc@gmail.com>	2020-07-17 18:32:23 +02:00
Ben Kochie	3565316d7e	Linux CPU: Cache CPU metrics Cache CPU metrics to avoid counters (ie iowait) jumping backwards. Fixes: https://github.com/prometheus/node_exporter/issues/1686 Signed-off-by: Ben Kochie <superq@gmail.com>	2020-05-24 16:31:26 +02:00
Benjamin Drung	34d50e15d5	Add model_name and stepping to node_cpu_info metric The `node_cpu_info` metric contains some information like the `model` (which is an integer), but not the human readable model name. Also the stepping of the processor might be interesting, since different stepping of a processor might behave differently. Signed-off-by: Benjamin Drung <benjamin.drung@cloud.ionos.com>	2020-03-20 17:27:11 +01:00
Julian Kornberger	cfcaeee145	Use strconv.Itoa() instead of fmt.Sprintf() (#1566) Signed-off-by: Julian Kornberger <jk+github@digineo.de>	2020-02-19 14:34:05 +01:00
Ben Ye	2477c5c67d	switch to go-kit/log (#1575) Signed-off-by: yeya24 <yb532204897@gmail.com>	2019-12-31 17:19:37 +01:00
Julian Kornberger	043fecbfd8	Wrap errors in the Go 1.13 way Signed-off-by: Julian Kornberger <jk+github@digineo.de>	2019-12-19 15:26:55 +01:00
Paul Gier	4d72cb8059	add node_cpu_info metric Contains information gathered from /proc/cpuinfo Signed-off-by: Paul Gier <pgier@redhat.com>	2019-09-25 14:38:57 -05:00
Paul Gier	2bc133cd48	update procfs to v0.0.2 (#1376) Signed-off-by: Paul Gier <pgier@redhat.com>	2019-06-12 20:47:16 +02:00
Paul Gier	b1298677aa	Early init of procfs (#1315) Minor change to match naming convention in other collectors. Initialize the proc or sys FS instance once while initializing each collector instead of re-creating for each metric update. Signed-off-by: Paul Gier <pgier@redhat.com>	2019-04-10 18:16:12 +02:00
Paul Gier	cc847f2f44	collector/cpu: split cpu freq metrics into separate collector (#1253) The cpu frequency information is not always needed and/or available. This change allows the cpu frequency metrics to be enabled/disabled separately from the other cpu metrics, and also prevents a frequency metric failure (such as a parse error) from failing the main cpu collector. Fixes #1241 Signed-off-by: Paul Gier <pgier@redhat.com>	2019-02-19 17:22:54 +01:00
mknapphrt	7fbdd0ae93	Update procfs vendor (#1248) Signed-off-by: Mark Knapp <mknapp@hudson-trading.com>	2019-02-04 16:54:41 +01:00
Ben Kochie	a0a164defb	Update cpufreq metrics collector (#1117) * Update Linux cpufreq collector to use new procfs library functions. * Split thermal throttle collection to a separate function. * Add new required fixtures and repack ttar file. Signed-off-by: Ben Kochie <superq@gmail.com>	2018-10-18 17:28:19 +02:00
Mario Trangoni	24a28fcc9e	Remove unused func, var, and const (#928) Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>	2018-04-29 14:35:43 +02:00
Mario Trangoni	c9f421d0dd	Fix some golint issues (#927) * collector/cpu_: rename nodeCpuSecondsDesc to nodeCPUSecondsDesc Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com> collector/qdisc_linux.go: add NewQdiscStatCollector comment Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com> * collector/cpu_linux.go: rename core_map to coreMap Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>	2018-04-29 14:34:47 +02:00
Karsten Weiss	efc1fdb6d0	cpu: Add a 2nd label 'package' to metric node_cpu_core_throttles_total (#871) * cpu: Add a 2nd label 'package' to metric node_cpu_core_throttles_total This commit fixes the node_cpu_core_throttles_total metrics on multi-socket systems as the core_ids are the same for each package. I.e. we need to count them separately. Rename the node_package_throttles_total metric label `node` to `package`. Reorganize the sys.ttar archive and use the same symlinks as the Linux kernel. Also, the new fixtures now use a dual-socket dual-core cpu w/o HT/SMT (node0: cpu0+1, node1: cpu2+3) as well as processor-less (memory-only) NUMA node 'node2' (this is a very rare case). Signed-off-by: Karsten Weiss <knweiss@gmail.com> * cpu: Use the direct /sys path to the cpu files. Use the direct path /sys/devices/system/cpu/cpu[0-9]* (without symlinks) instead of /sys/bus/cpu/devices/cpu[0-9]. The latter path also does not exist e.g. on RHEL 6.9's kernel. Signed-off-by: Karsten Weiss <knweiss@gmail.com> cpu: Reverse core+package throttle processing order Signed-off-by: Karsten Weiss <knweiss@gmail.com> * cpu: Add documentation URLs Signed-off-by: Karsten Weiss <knweiss@gmail.com>	2018-04-09 18:01:52 +02:00
Rene Treffer	c504c7e264	Only report core throttles per core, not per cpu (#836) * Only report core throttles per core, not per cpu * Add topology/core_id to the cpu sysfs fixtures * Add new cpu fixtures to ttar file * Merge core_id reading and thermal throttle accounting * Declare core_id	2018-02-27 19:43:15 +01:00
Ben Kochie	14d60958d6	Unify CPU collector conventions (#806) * Unify CPU collector conventions Add a common CPU metric description. * All collectors use the same `nodeCpuSecondsDesc`. * All collectors drop the `cpu` prefix for `cpu` label values. * Fix subsystem string in cpu_freebsd. * Fix Linux CPU freq label names.	2018-02-01 18:42:20 +01:00
Brian Brazil	a98067a294	Make metrics better follow guidelines (#787) * Improve stat linux metric names. cpu is no longer used. * node_cpu -> node_cpu_seconds_total for Linux * Improve filesystem metric names with units * Improve units and names of linux disk stats Remove sector metrics, the bytes metrics cover those already. * Infiniband counters should end in _total * Improve timex metric names, convert to more normal units. See `3c073991eb/kernel/time/ntp.c (L909)` for what stabil means, looks like a moving average of some form. * Update test fixture * For meminfo metrics that had "kB" units, add _bytes * Interrupts counter should have _total	2018-01-17 17:55:55 +01:00
Ben Kochie	2a80537547	Split out guest cpu metrics on Linux. (#744) Linux "guest" metrics for VMs are already accounted for in node_cpu `user` and `nice` metrics. Separate these into their own metric to avoid duplication of data.	2017-11-23 15:04:47 +01:00
Karsten Weiss	a8d7d1101a	cpu: Support processor-less (memory-only) NUMA nodes (#734) * cpu: Support processor-less (memory-only) NUMA nodes Processor-less (memory-only) NUMA nodes exist e.g. in systems that use Intel Optane drives for RAM expansion using Intel Memory Drive Technology (IMDT). IMDT RAM expansion supports two modes: * "Unify Remote Memory domains": present a processor-less (memory-only) NUMA domain, which is the default * "Expand local memory domains": to expand each processor’s memory domain with a portion of the memory made available by Optane and IMDT This commit fixes a crash in the first case (when "cpulist" is empty). Here's an example of such a system: $ numastat -m|head -n5 Per-node system memory usage (in MBs): Node 0 Node 1 Node 2 Total --------------- --------------- --------------- --------------- MemTotal 118239.56 130816.00 464384.00 713439.56 $ for i in {0..2}; do echo -n "$i: " ; cat /sys/bus/node/devices/node$i/cpulist ; done 0: 0-7,16-23 1: 8-15,24-31 2: $ /opt/vsmp/bin/vsmpversion -vvv Memory Drive Technology: 8.2.1455.74 (Sep 28 2017 13:09:59) System configuration: Boards: 3 1 x Proc. + I/O + Memory 2 x NVM devices (Intel SSDPED1K375GAQ) Processors: 2, Cores: 16, Threads: 32 Intel(R) Xeon(R) CPU E5-2667 v4 @ 3.20GHz Stepping 01 Memory (MB): 713472 (of 977450), Cache: 251416, Private: 12562 1 x 249088MB [262036/ 678/12270] 1 x 232192MB [357707/125369/ 146] 82:00.0#1 1 x 232192MB [357707/125369/ 146] 83:00.0#1 * cpu: rename some variables (pkg => node) * cpu: Use %v not %q in log.Debugf() format strings	2017-11-10 15:31:26 +01:00
Calle Pettersson	859a825bb8	Replace --collectors.enabled with per-collector flags (#640) * Move NodeCollector into package collector * Refactor collector enabling * Update README with new collector enabled flags * Fix out-of-date inline flag reference syntax * Use new flags in end-to-end tests * Add flag to disable all default collectors * Track if a flag has been set explicitly * Add --collectors.disable-defaults to README * Revert disable-defaults flag * Shorten flags * Fixup timex collector registration * Fix end-to-end tests * Change procfs and sysfs path flags * Fix review comments	2017-09-28 15:06:26 +02:00
Karsten Weiss	b0d5c00832	cpu: Metric 'package_throttles_total' is per package. (#657) * cpu: Metric 'package_throttles_total' is per package. 'package_throttles_total' is per package, not per cpu. This also reduces the total number of cpu time series a lot (esp for multi core cpus). * cpu: Better handling of a cpulist edge-case. * cpu: Extract the package number from the directory name. Do not rely on the range index. * cpu: Add package_throttle_count for node0 cpu1 This file must be ignored by the cpu collector.	2017-09-07 23:24:18 +02:00
Rene Treffer	56bf8d4b2d	Add link to kernel documentation for sysfs/cpufreq files	2017-06-27 11:25:06 +02:00
Rene Treffer	bcc3cd92b8	Fix cpufreq statistics by converting kHz to Hz	2017-06-27 11:05:55 +02:00
Ben Kochie	182810056f	Fix Linux cpu errors (#606) Make the Linux cpu collector soft-error on missing `cpufreq` and `thermal_throttle` features.	2017-06-20 07:51:26 +02:00
Rene Treffer	2e9f1913b8	Move stat_linux to cpu_linux and add cpufreq stats (#548)	2017-06-13 11:21:53 +02:00

30 commits
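
Commits 3565316d7e and 73c9a10d37 above describe caching CPU counters so that values such as iowait do not jump backwards, while tolerating an idle jump back of up to 3 seconds before the cache is reset. The following is a minimal Go sketch of that idea, not the collector's actual code: the `cpuStat` type, the `updateCached` function, and the reduction to two fields are assumptions made for illustration, and the exact comparison the collector uses at the 3-second boundary is not stated in the log.

```go
package main

import "fmt"

// jumpBackSeconds is the tolerance named in commit 73c9a10d37 ("up to 3 seconds").
const jumpBackSeconds = 3.0

// cpuStat is a hypothetical, reduced stand-in for one CPU's counters, in seconds.
type cpuStat struct {
	Idle   float64
	Iowait float64
}

// updateCached merges a fresh sample into the cached counters (illustrative helper).
// A large backwards jump in Idle is treated as a counter reset and clears the cache;
// otherwise each field only ever moves forward, so small regressions are ignored.
func updateCached(cached, sample cpuStat) cpuStat {
	if cached.Idle-sample.Idle > jumpBackSeconds {
		cached = cpuStat{} // assumed reset behaviour: start again from zero
	}
	if sample.Idle >= cached.Idle {
		cached.Idle = sample.Idle
	}
	if sample.Iowait >= cached.Iowait {
		cached.Iowait = sample.Iowait
	}
	return cached
}

func main() {
	cached := cpuStat{Idle: 100, Iowait: 10}

	// Small backwards jump (1 second): the cached values are kept.
	cached = updateCached(cached, cpuStat{Idle: 99, Iowait: 9.5})
	fmt.Println(cached.Idle, cached.Iowait)

	// Large backwards jump (95 seconds): the cache is reset and refilled.
	cached = updateCached(cached, cpuStat{Idle: 5, Iowait: 1})
	fmt.Println(cached.Idle, cached.Iowait)
}
```

Keeping the cached maximum means the exported counters stay monotonic for Prometheus `rate()` queries, while the reset threshold lets a genuine counter restart show through instead of freezing the cached value.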