* Add qdisc collector for Linux
This collector gathers basic queueing discipline metrics via netlink,
similarly to what `tc -s qdisc show` does.
* qdisc collector: nl-specific code moved, names fixed
- netlink-specific parts moved to github.com/ema/qdisc
- avoid using shortened names
- counters renamed into XXX_total
* Get rid of parseMessage error checking leftover
* Add github.com/ema/qdisc to vendored packages
* Update help texts and comments
* Add qdisc collector to README file
* qdisc collector end-to-end testing
* Update qdisc dependency to latest version
Update github.com/ema/qdisc dependency to revision 2c7e72d, which
includes unit testing.
* qdisc collector: rename "iface" label into "device"
According to Mellanox, it is standard practice that the port_xmit_data and port_rcv_data
files are split into 4 lanes. To get the actual transmit and receive values for each
port, the metric needs to be multiplied by 4.
Signed-Off-By: Robert Clark <robert.d.clark@hpe.com>
* Implement commonalities and linux support for ARP collection
* Add ARP collector to fixtures and run as part of e2e tests
* Bubble up scanner errors
* Use single return values where it makes sense
* Add missing annotation
* Move arp_common into arp_linux
* Add license header to arp_linux.go
* Address initial feedback
* Use strings.Fields instead of strings.Split
* Deal with scanner.Err() rather than throwing away errors
* Check for scan errors in-line before interacting with the entries map
* Don't interact with potentially empty text from scan
* Check for scan errors outside the scan loop
* Add comment about moving procfs parsing
* Add more direct comment
* Update initialism style to match go style guide
* Put function args on the same line
* Add TODO in front of comment about procfs extraction
* Guard against strings.Fields returning an empty slice
* Be more defensive about ARP table format and use upcase more broadly
* Enable the ARP collector by default
* Add ARP collector to the README
* Remove 'entry'
Older versions of the OFED drivers contain 64-bit variants of the port counters and are located in a directory named 'counters_ext'. This patch includes these older metrics that have since been deprecated with OFED 4.0.
Signed-Off-By: Robert Clark <robert.d.clark@hpe.com>
Add new metrics for the InfiniBand network protocol including the amount of packets sent and received, the number of times the link has been downed and how many times the link has recovered from an error state.
Signed-Off-By: Robert Clark <robert.d.clark@hpe.com>
- Use the right number of printf() arguments. Use %q where it makes sense.
- Use "DRBD" instead of "Drbd", per Go's style guide.
- Add _total suffixes to counter metrics.
- Mention the unit (bytes) in documentation strings once more.
This collector exposes most of the useful information that can be found
in /proc/drbd. Sizes are normalised to be in bytes, as /proc/drbd uses
kibibytes.
This change adds a new collector called "nfs" that parses the contents
of /proc/net/rpc/nfs and turns it into metrics. It can be used to
inspect the number of operations per type, but also to keep an eye on an
extraneous number of retransmissions, which may indicate connectivity
issues.
I've picked the name "nfs", as most operating systems use "nfs" for the
client component and "nfsd" as the server component. If we want to add
stats for the NFS server as well, we'd better call such a collector
"nfsd".
We seem to have a small number of Linux servers here that have lines in
/proc/mdstat that cannot be parsed by the node exporter, due to them
containing attributes that are not matched by the regular expression
("super 1.2").
Extend the regular expression to skip this data, just like we do for all
of the other status lines.
* Prefer device path based names over exported names
For some sensors (like coretemp) it is possible that multiple
instances exist, thus base the name on the device path and not on
the exported name.
* Update end-to-end test for dual socket machines
Explicitly have 2 coretemp instances with a symlink for the device
such that the hwmon collector must pick that name (or fail)
* Add Linux NUMA "numastat" metrics
Read the `numastat` metrics from /sys/devices/system/node/node* when reading NUMA meminfo metrics.
* Update end-to-end test output.
* Add `numastat` metrics as counters.
* Add tests for error conditions.
* Refactor meminfo numa metrics struct
* Refactor meminfoKey into a simple struct of metric data.
This makes it easier to pass slices of metrics around.
* Refactor tests.
* Fixup: Add suggested fixes.
* Fixup: More fixes
* Add another scanner.Err() return
* Add "_total" to counter metrics.
* Add hwmon support (mainly known from lm-sensors)
This commit adds initial support for linux hardware sensors, exported
through sysfs.
Details of the interface can be found at
https://www.kernel.org/doc/Documentation/hwmon/sysfs-interface
* Add end-to-end test with some real life data
* Cleanup comments on hwmon collector
* Drop raw sensor name from hwmon output
* Let the sensor label be "sensor"
* Add hwmon short description to README.
Add new collector which exposes the content of /sys/kernel/mm/ksm
directory. This directory contains control and statistics files for
Kernel Samepage Merging daemon.
The collector is not enabled by default.
Signed-off-by: Pavel Borzenkov <pavel.borzenkov@gmail.com>
It is sometimes useful to understand the distribution of free/occupied
memory between NUMA nodes to deal with performance problems. To do so,
add new meminfo_numa collector that enables exporting of per node
statistics along with unit and end-to-end tests for it.
Signed-off-by: Pavel Borzenkov <pavel.borzenkov@gmail.com>
This test runs a selection of collectors against the fixtures and
compares the output to a reference.
The uname and filesystem collectors are disabled because they use system
calls that cannot be fixtured easily.