* Add include and exclude filter for sensors in hwmon collector
Fixes #2242
This commit adds two new flags (`collector.hwmon.sensor-include` and `collector.hwmon.sensor-exclude`) to the `hwmon` collector to allow inclusion or exclusion of specific sensors.
Some devices export nonsensical values for certain sensors. Here is an example:
```
node_hwmon_temp_celsius{chip="platform_nct6775_656",sensor="temp13"} 49.75
node_hwmon_temp_celsius{chip="platform_nct6775_656",sensor="temp15"} 3.892313987e+06
node_hwmon_temp_celsius{chip="platform_nct6775_656",sensor="temp16"} 3.892313987e+06
```
As a user, I would like to exclude only these sensors rather than the complete device (as is currently possible with the `--collector.hwmon.chip-exclude` flag), since other sensor values from the same device might be sensible or desired.
The new option filters on both the device name and the sensor name, separated by a semicolon. For example, to exclude the two sensors above, the following regex can be used:
~~~
--collector.hwmon.sensor-exclude="platform_nct6775_656;temp1[5,6]"
~~~
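
The matching described above could be sketched roughly as follows. This is a minimal illustration, not the collector's actual code: `sensorFilter`, `newSensorFilter`, and `ignored` are hypothetical names, and anchoring the patterns with `^(?:...)$` is an assumption of this sketch.

```go
package main

import "regexp"

// sensorFilter is a hypothetical helper that decides whether a
// "<chip>;<sensor>" key should be skipped.
type sensorFilter struct {
	include *regexp.Regexp // nil means no include pattern was given
	exclude *regexp.Regexp // nil means no exclude pattern was given
}

// newSensorFilter compiles the optional patterns. Anchoring them so that
// they must match the whole key is an assumption of this sketch.
func newSensorFilter(include, exclude string) *sensorFilter {
	f := &sensorFilter{}
	if include != "" {
		f.include = regexp.MustCompile("^(?:" + include + ")$")
	}
	if exclude != "" {
		f.exclude = regexp.MustCompile("^(?:" + exclude + ")$")
	}
	return f
}

// ignored reports whether the given chip/sensor pair should be dropped.
func (f *sensorFilter) ignored(chip, sensor string) bool {
	key := chip + ";" + sensor
	if f.exclude != nil && f.exclude.MatchString(key) {
		return true
	}
	if f.include != nil && !f.include.MatchString(key) {
		return true
	}
	return false
}
```

Note that the character class `[5,6]` also matches a literal comma; `temp1[56]` would be a slightly tighter pattern for the same two sensors.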
---------
Signed-off-by: Simon Krenger <skrenger@redhat.com>
Fix golangci-lint "ineffectual assignment" by correctly capturing any
errors within the hwmon gathering loop.
Signed-off-by: Ben Kochie <superq@gmail.com>
Pass the correct include value to the device filter function.
* Add new bogus hwmon fixture.
* Update end-to-end test to use hwmon chip include flag.
Signed-off-by: Ben Kochie <superq@gmail.com>
* Add include and exclude chip name flags to the hwmon collector, following the example in the systemd collector
---------
Signed-off-by: Conall O'Brien <conall@conall.net>
Co-authored-by: Ben Kochie <superq@gmail.com>
We don't need to fully sanitize the hwmon label values to metric/label
name strings.
* Just make sure they're valid UTF-8.
* Always include the label metric to avoid group_left failures.
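
As a rough illustration of the lighter sanitization, a label value can be kept unchanged apart from invalid UTF-8 byte sequences. `sanitizeLabelValue` is a hypothetical name, and using the Unicode replacement character as the substitute is an assumption of this sketch.

```go
package main

import "strings"

// sanitizeLabelValue is a hypothetical helper. It leaves valid text such as
// spaces and punctuation untouched and only replaces runs of invalid UTF-8
// bytes with the Unicode replacement character.
func sanitizeLabelValue(s string) string {
	return strings.ToValidUTF8(s, "\uFFFD")
}
```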
Signed-off-by: Ben Kochie <superq@gmail.com>
Fix the error logging of the promhttp handler by connecting it to the
promlog setup.
* Switch to go-kit/log.
* Cleanup CHANGELOG.
Fixes: https://github.com/prometheus/node_exporter/issues/1886
Signed-off-by: Ben Kochie <superq@gmail.com>
Many collectors depend on underlying features to be enabled. This causes
confusion about what "success" means. This changes the behavior of the
`node_scrape_collector_success` metric.
* When a collector is unable to find data don't return success.
* Catch the no data error and send to Debug log level to avoid log spam.
* Update collectors to support this new functionality.
* Fix copy-paste mistake in infiniband debug message.
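
The behavior listed above might look roughly like this. It is a sketch under assumptions: `ErrNoData`, `collectFunc`, and `scrapeOutcome` are illustrative names, and the returned log level string stands in for whatever logger call the exporter actually makes.

```go
package main

import "errors"

// ErrNoData is a sentinel error a collector can return when the
// underlying feature is not available on the host.
var ErrNoData = errors.New("collector returned no data")

// collectFunc stands in for a collector's update method.
type collectFunc func() error

// scrapeOutcome runs one collector and returns the value to report for
// its success metric plus the log level to use for any error.
func scrapeOutcome(update collectFunc) (success float64, logLevel string) {
	err := update()
	switch {
	case err == nil:
		return 1, ""
	case errors.Is(err, ErrNoData):
		// No data is not a success, but it is logged quietly.
		return 0, "debug"
	default:
		return 0, "error"
	}
}
```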
Closes: https://github.com/prometheus/node_exporter/issues/1323
Signed-off-by: Ben Kochie <superq@gmail.com>
According to the golang docs, the syscall package is deprecated.
https://golang.org/pkg/syscall
This updates collectors to use the x/sys/unix package instead.
Also updates the vendored x/sys/unix module to latest.
Signed-off-by: Paul Gier <pgier@redhat.com>
Similar to #1228. Update the remaining collectors to use
'path/filepath' instead of 'path' for manipulating file paths.
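
A small sketch of the pattern: file system paths are built with `path/filepath`, which uses the operating system's separator, while `path` is meant for slash-separated paths such as URLs. `hwmonDir` is a hypothetical helper, not a function from the collectors.

```go
package main

import "path/filepath"

// hwmonDir is a hypothetical helper that builds the hwmon class directory
// below a configurable sysfs mount point.
func hwmonDir(sysPath string) string {
	// filepath.Join also cleans the result, so a trailing separator on
	// sysPath does not produce a doubled separator.
	return filepath.Join(sysPath, "class", "hwmon")
}
```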
Signed-off-by: Paul Gier <pgier@redhat.com>
In some cases the file might be called "temp" instead of the usual format "temp<index>_<item>"
as described in the kernel docs: https://www.kernel.org/doc/Documentation/hwmon/sysfs-interface
In this case, treat this as an _input file containing the current temperature reading.
Fixes #1122
Signed-off-by: Paul Gier <pgier@redhat.com>
* Move NodeCollector into package collector
* Refactor collector enabling
* Update README with new collector enabled flags
* Fix out-of-date inline flag reference syntax
* Use new flags in end-to-end tests
* Add flag to disable all default collectors
* Track if a flag has been set explicitly
* Add --collectors.disable-defaults to README
* Revert disable-defaults flag
* Shorten flags
* Fixup timex collector registration
* Fix end-to-end tests
* Change procfs and sysfs path flags
* Fix review comments
Named return variables should only be used to describe the returned type
further, e.g. `err error` doesn't add any new information and is just
stutter.
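
To illustrate the guideline, here is a small sketch with two hypothetical helpers: one where result names would add nothing, and one where they document which value is which.

```go
package main

import (
	"strconv"
	"strings"
)

// parseMilli converts a milli-unit reading such as "49750\n" into base
// units. The results stay unnamed: "(float64, error)" already says it all.
func parseMilli(raw string) (float64, error) {
	v, err := strconv.ParseFloat(strings.TrimSpace(raw), 64)
	if err != nil {
		return 0, err
	}
	return v / 1000, nil
}

// splitKey returns two values of the same type, so naming them tells the
// caller which string is the chip and which is the sensor.
func splitKey(key string) (chip, sensor string) {
	parts := strings.SplitN(key, ";", 2)
	if len(parts) == 2 {
		return parts[0], parts[1]
	}
	return key, ""
}
```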
The chip label generation has been changed in #334 to prefer the
unique device path (e.g. the location on the PCI bus) due to #333.
Here, a new annotation metric ``node_hwmon_chip_names`` is
introduced, which links the unique chip sysfs path to a
human-readable chip name which may not be unique among chip sysfs
paths (for example, dual-slot systems have multiple
chipType="coretemp" sensors).
This mitigates the downsides of the solution to #333
(namely that the device path may not be stable across kernels and
reboots) for cases where it does not matter that multiple devices
may have the same human-readable name (e.g. aggregation or where
at most one device with a common chip name is present).
For cases where no human-readable name can be derived, the
annotation metric is not emitted.
* Prefer device path based names over exported names
For some sensors (like coretemp) it is possible that multiple
instances exist, thus base the name on the device path and not on
the exported name.
* Update end-to-end test for dual socket machines
Explicitly have 2 coretemp instances with a symlink for the device
such that the hwmon collector must pick that name (or fail)
* Add hwmon support (mainly known from lm-sensors)
This commit adds initial support for linux hardware sensors, exported
through sysfs.
Details of the interface can be found at
https://www.kernel.org/doc/Documentation/hwmon/sysfs-interface
* Add end-to-end test with some real life data
* Cleanup comments on hwmon collector
* Drop raw sensor name from hwmon output
* Let the sensor label be "sensor"
* Add hwmon short description to README.