Adds a count for TCP packets received out of orders. This can be an
indication that there is packet loss on the way packets travel towards
this server. In that case, the sender will retransmit (and we can
already monitor the Tcp_RetransSegs there), but we have no way to
monitor the packet loss on the receiver side. When a packet is received
and the receiver detects previous one missing, it will increase the
TCPOFOQueue counter and reply with selective ACK to the sender, both
possible indications of packet loss. Confirmation of packet loss can be
achieved by taking packet captures, ignoring wireshark analysis, and
carefully looking at data being retransmitted based on the TCP seq.
Just like RetransSegs, TCPOFOQueue should be interesting for any
deployment as a mean to detect packet loss, so here suggesting adding it
to the default list.
Signed-off-by: François Rigault <frigo@amadeus.com>
Co-authored-by: François Rigault <frigo@amadeus.com>
This attribute was introduced it v6.6-rc1.
The relevant changes in procfs were merged here:
https://github.com/prometheus/procfs/pull/574
and are part of procfs v0.11.2
I have also figured out that the stat should be part of the v4 ops
counters struct, but that will need changes to both procfs and this
code. Since people are already using 6.6-rc1, I think it's better to get
the code out there --- even if they don't care about wdeleg_getattr,
currently they get _no_ nfsd stats with 6.6-rc1.
I will make two follow-up PRs to clean this up in the next releases of
procfs and node-exporter.
Signed-off-by: Tobias Klausmann <klausman@schwarzvogel.de>
Since netdev metrics are now read from netlink instead of `/proc/net/dev`, we
can't easily spoof them for the end-to-end tests by reading a fixture file in
place of `/proc/net/dev`.
Therefore, we only get metrics for `lo` and ignore those that would return
unpredictable values (i.e. the byte and packet counters).
Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
Allow filtering APR entries based on device. Useful for ignoring
entries for network namespaces (containers).
Signed-off-by: Ben Kochie <superq@gmail.com>
* Two new states will be added to the tcpstat collector called rx_queued_bytes and tx_queued_bytes.
For UDP datagrams an additional collector 'udp_queues' can be used to expose the total lengths of the tx_queue and rx_queue.
@SuperQ and @discordianfish this changes gives us the option to check for overloaded UDP + TCP processing.
The names of the new TCP states and the UDP metric can be discussed.
The current reasons are just:
I don't want to add another collector for the same exposed file, so I just added the new states to the tcpstat collector.
I chose the name 'udp_queue' instead of 'udpstat' as UDP has no state.
Signed-off-by: Peter Bueschel <peter.bueschel@logmein.com>
* Implement commonalities and linux support for ARP collection
* Add ARP collector to fixtures and run as part of e2e tests
* Bubble up scanner errors
* Use single return values where it makes sense
* Add missing annotation
* Move arp_common into arp_linux
* Add license header to arp_linux.go
* Address initial feedback
* Use strings.Fields instead of strings.Split
* Deal with scanner.Err() rather than throwing away errors
* Check for scan errors in-line before interacting with the entries map
* Don't interact with potentially empty text from scan
* Check for scan errors outside the scan loop
* Add comment about moving procfs parsing
* Add more direct comment
* Update initialism style to match go style guide
* Put function args on the same line
* Add TODO in front of comment about procfs extraction
* Guard against strings.Fields returning an empty slice
* Be more defensive about ARP table format and use upcase more broadly
* Enable the ARP collector by default
* Add ARP collector to the README
* Remove 'entry'
This change adds a new collector called "nfs" that parses the contents
of /proc/net/rpc/nfs and turns it into metrics. It can be used to
inspect the number of operations per type, but also to keep an eye on an
extraneous number of retransmissions, which may indicate connectivity
issues.
I've picked the name "nfs", as most operating systems use "nfs" for the
client component and "nfsd" as the server component. If we want to add
stats for the NFS server as well, we'd better call such a collector
"nfsd".