node_exporter

mirror of https://github.com/prometheus/node_exporter.git synced 2024-11-10 07:34:09 -08:00

Author	SHA1	Message	Date
Ben Kochie	c23b76bfbb	Update exporter-toolkit * Bump exporter-toolkit to the latest release. * Use new toolkit landing page function. * Update kingpin flags. Signed-off-by: Ben Kochie <superq@gmail.com>	2023-03-07 15:18:38 +01:00
Haoyu Sun	37d49746bc	Remove metrics of offline CPUs in CPU collector Signed-off-by: Haoyu Sun <hasun@redhat.com>	2023-03-07 14:01:02 +01:00
Jia Xin	39b4556b5b	fix cpustat when some cpus are offline Signed-off-by: Jia Xin <alexjx@gmail.com>	2023-01-20 01:24:06 +00:00
david	c2085cf8ca	flip branches for early return Signed-off-by: david <davidventura27@gmail.com>	2022-07-26 11:21:08 +02:00
david	75c05f3d97	remove error from signature; update doc for function Signed-off-by: david <davidventura27@gmail.com>	2022-07-26 11:21:08 +02:00
david	840d32622f	check for nil isolatedCpus before calling updateIsolated Signed-off-by: david <davidventura27@gmail.com>	2022-07-26 11:21:08 +02:00
david	5340d1ec37	add debug log for not existent file Signed-off-by: david <davidventura27@gmail.com>	2022-07-26 11:21:08 +02:00
david	c05af934af	warn if isolcpus cannot be read and default to an empty slice Signed-off-by: david <davidventura27@gmail.com>	2022-07-26 11:21:08 +02:00
david	9ea9a5f029	only publish metrics for isolated cpus Signed-off-by: david <davidventura27@gmail.com>	2022-07-26 11:21:08 +02:00
david	5d68d5b9ad	move logic to procfs; create a new metric for isolation Signed-off-by: david <davidventura27@gmail.com>	2022-07-26 11:21:08 +02:00
david	512e086dec	Implement #2250: Add "isolated" label on cpu collector on linux Signed-off-by: david <davidventura27@gmail.com>	2022-07-26 11:21:08 +02:00
Park Beomsu	c861ba93aa	Remove redundant nil check (#2206) Signed-off-by: computerphilosopher <bspark@jam2in.com>	2021-11-15 11:23:49 +01:00
Julien Pivotto	68a6c78c0d	Update go to 1.17 (#2159) Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-10-03 13:35:24 +02:00
Sergei Semenchuk	5de46c6bac	collect flag_info and bug_info only for one core (#2156) Signed-off-by: binjip978 <binjip978@gmail.com>	2021-09-28 07:44:03 +02:00
Ben Kochie	84b36c4fd8	Add flag to disable guest CPU metrics In high scale virtualized / cloud environments there are typically no guest VMs. Add a boolean flag to allow disabling the Linux guest CPU metrics. Signed-off-by: Ben Kochie <superq@gmail.com>	2021-08-17 13:04:46 +02:00
Ben Kochie	73c9a10d37	Handle small backwards jumps in CPU idle The Linux CPU idle stat can also jump backwards slightly in some cases. Allow the jump back up to 3 seconds before we attempt to reset the CPU counter cache. Fixes: https://github.com/prometheus/node_exporter/issues/1903 Signed-off-by: Ben Kochie <superq@gmail.com>	2021-07-07 12:24:46 +02:00
Ben Kochie	3bc9a93c20	Add ErrorLog plumbing to promhttp Fix the error logging of the promhttp handler by connecting it to the promlog setup. * Switch to go-kit/log. * Cleanup CHANGELOG. Fixes: https://github.com/prometheus/node_exporter/issues/1886 Signed-off-by: Ben Kochie <superq@gmail.com>	2021-06-03 10:47:41 +02:00
Ben Kochie	306a365377	Downgrade CPU counter warnings We've gathered enough evidence that the CPU counter bug workaround is working as intended. Downgrade the message from Warning to Debug. Signed-off-by: Ben Kochie <superq@gmail.com>	2020-10-01 12:41:15 +02:00
Julius Volz	d05aac43e4	Fix capitalization of CPU acronym throughout Signed-off-by: Julius Volz <julius.volz@gmail.com>	2020-09-03 23:34:33 +02:00
domchan	503e4fc848	Expose cpu bugs and flags as info metrics. (#1788) * Expose cpu bugs and flags as info metrics with a regexp filter. * Automatically enable CPU info metrics when using flags or bugs feature. Signed-off-by: domgoer <domdoumc@gmail.com>	2020-07-17 18:32:23 +02:00
Ben Kochie	3565316d7e	Linux CPU: Cache CPU metrics Cache CPU metrics to avoid counters (ie iowait) jumping backwards. Fixes: https://github.com/prometheus/node_exporter/issues/1686 Signed-off-by: Ben Kochie <superq@gmail.com>	2020-05-24 16:31:26 +02:00
Benjamin Drung	34d50e15d5	Add model_name and stepping to node_cpu_info metric The `node_cpu_info` metric contains some information like the `model` (which is an integer), but not the human readable model name. Also the stepping of the processor might be interesting, since different stepping of a processor might behave differently. Signed-off-by: Benjamin Drung <benjamin.drung@cloud.ionos.com>	2020-03-20 17:27:11 +01:00
Julian Kornberger	cfcaeee145	Use strconv.Itoa() instead of fmt.Sprintf() (#1566) Signed-off-by: Julian Kornberger <jk+github@digineo.de>	2020-02-19 14:34:05 +01:00
Ben Ye	2477c5c67d	switch to go-kit/log (#1575) Signed-off-by: yeya24 <yb532204897@gmail.com>	2019-12-31 17:19:37 +01:00
Julian Kornberger	043fecbfd8	Wrap errors in the Go 1.13 way Signed-off-by: Julian Kornberger <jk+github@digineo.de>	2019-12-19 15:26:55 +01:00
Paul Gier	4d72cb8059	add node_cpu_info metric Contains information gathered from /proc/cpuinfo Signed-off-by: Paul Gier <pgier@redhat.com>	2019-09-25 14:38:57 -05:00
Paul Gier	2bc133cd48	update procfs to v0.0.2 (#1376) Signed-off-by: Paul Gier <pgier@redhat.com>	2019-06-12 20:47:16 +02:00
Paul Gier	b1298677aa	Early init of procfs (#1315) Minor change to match naming convention in other collectors. Initialize the proc or sys FS instance once while initializing each collector instead of re-creating for each metric update. Signed-off-by: Paul Gier <pgier@redhat.com>	2019-04-10 18:16:12 +02:00
Paul Gier	cc847f2f44	collector/cpu: split cpu freq metrics into separate collector (#1253) The cpu frequency information is not always needed and/or available. This change allows the cpu frequency metrics to be enabled/disabled separately from the other cpu metrics, and also prevents a frequency metric failure (such as a parse error) from failing the main cpu collector. Fixes #1241 Signed-off-by: Paul Gier <pgier@redhat.com>	2019-02-19 17:22:54 +01:00
mknapphrt	7fbdd0ae93	Update procfs vendor (#1248) Signed-off-by: Mark Knapp <mknapp@hudson-trading.com>	2019-02-04 16:54:41 +01:00
Ben Kochie	a0a164defb	Update cpufreq metrics collector (#1117) * Update Linux cpufreq collector to use new procfs library functions. * Split thermal throttle collection to a separate function. * Add new required fixtures and repack ttar file. Signed-off-by: Ben Kochie <superq@gmail.com>	2018-10-18 17:28:19 +02:00
Mario Trangoni	24a28fcc9e	Remove unused func, var, and const (#928) Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>	2018-04-29 14:35:43 +02:00
Mario Trangoni	c9f421d0dd	Fix some golint issues (#927) * collector/cpu_: rename nodeCpuSecondsDesc to nodeCPUSecondsDesc Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com> collector/qdisc_linux.go: add NewQdiscStatCollector comment Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com> * collector/cpu_linux.go: rename core_map to coreMap Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>	2018-04-29 14:34:47 +02:00
Karsten Weiss	efc1fdb6d0	cpu: Add a 2nd label 'package' to metric node_cpu_core_throttles_total (#871) * cpu: Add a 2nd label 'package' to metric node_cpu_core_throttles_total This commit fixes the node_cpu_core_throttles_total metrics on multi-socket systems as the core_ids are the same for each package. I.e. we need to count them separately. Rename the node_package_throttles_total metric label `node` to `package`. Reorganize the sys.ttar archive and use the same symlinks as the Linux kernel. Also, the new fixtures now use a dual-socket dual-core cpu w/o HT/SMT (node0: cpu0+1, node1: cpu2+3) as well as processor-less (memory-only) NUMA node 'node2' (this is a very rare case). Signed-off-by: Karsten Weiss <knweiss@gmail.com> * cpu: Use the direct /sys path to the cpu files. Use the direct path /sys/devices/system/cpu/cpu[0-9]* (without symlinks) instead of /sys/bus/cpu/devices/cpu[0-9]. The latter path also does not exist e.g. on RHEL 6.9's kernel. Signed-off-by: Karsten Weiss <knweiss@gmail.com> cpu: Reverse core+package throttle processing order Signed-off-by: Karsten Weiss <knweiss@gmail.com> * cpu: Add documentation URLs Signed-off-by: Karsten Weiss <knweiss@gmail.com>	2018-04-09 18:01:52 +02:00
Rene Treffer	c504c7e264	Only report core throttles per core, not per cpu (#836) * Only report core throttles per core, not per cpu * Add topology/core_id to the cpu sysfs fixtures * Add new cpu fixtures to ttar file * Merge core_id reading and thermal throttle accounting * Declare core_id	2018-02-27 19:43:15 +01:00
Ben Kochie	14d60958d6	Unify CPU collector conventions (#806) * Unify CPU collector conventions Add a common CPU metric description. * All collectors use the same `nodeCpuSecondsDesc`. * All collectors drop the `cpu` prefix for `cpu` label values. * Fix subsystem string in cpu_freebsd. * Fix Linux CPU freq label names.	2018-02-01 18:42:20 +01:00
Brian Brazil	a98067a294	Make metrics better follow guidelines (#787) * Improve stat linux metric names. cpu is no longer used. * node_cpu -> node_cpu_seconds_total for Linux * Improve filesystem metric names with units * Improve units and names of linux disk stats Remove sector metrics, the bytes metrics cover those already. * Infiniband counters should end in _total * Improve timex metric names, convert to more normal units. See `3c073991eb/kernel/time/ntp.c (L909)` for what stabil means, looks like a moving average of some form. * Update test fixture * For meminfo metrics that had "kB" units, add _bytes * Interrupts counter should have _total	2018-01-17 17:55:55 +01:00
Ben Kochie	2a80537547	Split out guest cpu metrics on Linux. (#744) Linux "guest" metrics for VMs are already accounted for in node_cpu `user` and `nice` metrics. Separate these into their own metric to avoid duplication of data.	2017-11-23 15:04:47 +01:00
Karsten Weiss	a8d7d1101a	cpu: Support processor-less (memory-only) NUMA nodes (#734) * cpu: Support processor-less (memory-only) NUMA nodes Processor-less (memory-only) NUMA nodes exist e.g. in systems that use Intel Optane drives for RAM expansion using Intel Memory Drive Technology (IMDT). IMDT RAM expansion supports two modes: * "Unify Remote Memory domains": present a processor-less (memory-only) NUMA domain, which is the default * "Expand local memory domains": to expand each processor’s memory domain with a portion of the memory made available by Optane and IMDT This commit fixes a crash in the first case (when "cpulist" is empty). Here's an example of such a system: $ numastat -m|head -n5 Per-node system memory usage (in MBs): Node 0 Node 1 Node 2 Total --------------- --------------- --------------- --------------- MemTotal 118239.56 130816.00 464384.00 713439.56 $ for i in {0..2}; do echo -n "$i: " ; cat /sys/bus/node/devices/node$i/cpulist ; done 0: 0-7,16-23 1: 8-15,24-31 2: $ /opt/vsmp/bin/vsmpversion -vvv Memory Drive Technology: 8.2.1455.74 (Sep 28 2017 13:09:59) System configuration: Boards: 3 1 x Proc. + I/O + Memory 2 x NVM devices (Intel SSDPED1K375GAQ) Processors: 2, Cores: 16, Threads: 32 Intel(R) Xeon(R) CPU E5-2667 v4 @ 3.20GHz Stepping 01 Memory (MB): 713472 (of 977450), Cache: 251416, Private: 12562 1 x 249088MB [262036/ 678/12270] 1 x 232192MB [357707/125369/ 146] 82:00.0#1 1 x 232192MB [357707/125369/ 146] 83:00.0#1 * cpu: rename some variables (pkg => node) * cpu: Use %v not %q in log.Debugf() format strings	2017-11-10 15:31:26 +01:00
Calle Pettersson	859a825bb8	Replace --collectors.enabled with per-collector flags (#640) * Move NodeCollector into package collector * Refactor collector enabling * Update README with new collector enabled flags * Fix out-of-date inline flag reference syntax * Use new flags in end-to-end tests * Add flag to disable all default collectors * Track if a flag has been set explicitly * Add --collectors.disable-defaults to README * Revert disable-defaults flag * Shorten flags * Fixup timex collector registration * Fix end-to-end tests * Change procfs and sysfs path flags * Fix review comments	2017-09-28 15:06:26 +02:00
Karsten Weiss	b0d5c00832	cpu: Metric 'package_throttles_total' is per package. (#657) * cpu: Metric 'package_throttles_total' is per package. 'package_throttles_total' is per package, not per cpu. This also reduces the total number of cpu time series a lot (esp for multi core cpus). * cpu: Better handling of a cpulist edge-case. * cpu: Extract the package number from the directory name. Do not rely on the range index. * cpu: Add package_throttle_count for node0 cpu1 This file must be ignored by the cpu collector.	2017-09-07 23:24:18 +02:00
Rene Treffer	56bf8d4b2d	Add link to kernel documentation for sysfs/cpufreq files	2017-06-27 11:25:06 +02:00
Rene Treffer	bcc3cd92b8	Fix cpufreq statistics by converting kHz to Hz	2017-06-27 11:05:55 +02:00
Ben Kochie	182810056f	Fix Linux cpu errors (#606) Make the Linux cpu collector soft-error on missing `cpufreq` and `thermal_throttle` features.	2017-06-20 07:51:26 +02:00
Rene Treffer	2e9f1913b8	Move stat_linux to cpu_linux and add cpufreq stats (#548)	2017-06-13 11:21:53 +02:00

45 commits