For: #14355
This commit updates Prometheus to adopt stdlib's log/slog package in
favor of go-kit/log. As part of converting to use slog, several other
related changes are required to get prometheus working, including:
- removed unused logging util func `RateLimit()`
- forward ported the util/logging/Deduper logging by implementing a small custom slog.Handler that does the deduping before chaining log calls to the underlying real slog.Logger
- move some of the json file logging functionality to use prom/common package functionality
- refactored some of the new json file logging for scraping
- changes to promql.QueryLogger interface to swap out logging methods for relevant slog sugar wrappers
- updated lots of tests that used/replicated custom logging functionality, attempting to keep the logical goal of the tests consistent after the transition
- added a healthy amount of `if logger == nil { $makeLogger }` type conditional checks amongst various functions where none were provided -- old code that used the go-kit/log.Logger interface had several places where there were nil references when trying to use functions like `With()` to add keyvals on the new *slog.Logger type
Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
Sidecar containers are a newish feature in k8s. They're implemented
similar to init containers but actually stay running and allow you to
delay startup of your application pod until the sidecar started (like
init containers always do).
This adds the ports of the sidecar container to the list of discovered
endpoint(slice), allowing you to target those containers as well.
The implementation is a copy of that of Pod discovery
fixes: #14927
Signed-off-by: bas smit <bsmit@bol.com>
This commit removes support for the following API versions:
* `discovery.k8s.io/v1beta1` API version of EndpointSlice (no longer
served as of v1.25).
* `networking.k8s.io/v1beta1` API version of Ingress (no longer served
as of v1.22).
Closes#12884
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
This commit adds 2 new metadata labels for the endpointslice role:
* `__meta_kubernetes_endpointslice_endpoint_node_name`
* `__meta_kubernetes_endpointslice_endpoint_zone`
The latter is only present when the `discovery.k8s.io/v1` API group is
available.
I also updated the configuration doc and added an entry for the
`__meta_kubernetes_endpointslice_endpoint_hostname` label which was
missing.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* fix(discovery/kubernetes/endpoints): react to changes on Pods because some modifications can occur on them without triggering an update on the related Endpoints (The Pod phase changing from Pending to Running e.g.).
---------
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
Co-authored-by: Guillermo Sanchez Gavier <gsanchez@newrelic.com>
SD Managers take over responsibility for SD metrics registration
---------
Signed-off-by: Paulin Todev <paulin.todev@gmail.com>
Signed-off-by: Björn Rabenstein <github@rabenste.in>
Co-authored-by: Björn Rabenstein <github@rabenste.in>
Wiser coders than myself have come to the conclusion that a `switch`
statement is almost always superior to a statement that includes any
`else if`.
The exceptions that I have found in our codebase are just these two:
* The `if else` is followed by an additional statement before the next
condition (separated by a `;`).
* The whole thing is within a `for` loop and `break` statements are
used. In this case, using `switch` would require tagging the `for`
loop, which probably tips the balance.
Why are `switch` statements more readable?
For one, fewer curly braces. But more importantly, the conditions all
have the same alignment, so the whole thing follows the natural flow
of going down a list of conditions. With `else if`, in contrast, all
conditions but the first are "hidden" behind `} else if `, harder to
spot and (for no good reason) presented differently from the first
condition.
I'm sure the aforemention wise coders can list even more reasons.
In any case, I like it so much that I have found myself recommending
it in code reviews. I would like to make it a habit in our code base,
without making it a hard requirement that we would test on the CI. But
for that, there has to be a role model, so this commit eliminates all
`if else` occurrences, unless it is autogenerated code or fits one of
the exceptions above.
Signed-off-by: beorn7 <beorn@grafana.com>
We haven't updated golint-ci in our CI yet, but this commit prepares
for that.
There are a lot of new warnings, and it is mostly because the "revive"
linter got updated. I agree with most of the new warnings, mostly
around not naming unused function parameters (although it is justified
in some cases for documentation purposes – while things like mocks are
a good example where not naming the parameter is clearer).
I'm pretty upset about the "empty block" warning to include `for`
loops. It's such a common pattern to do something in the head of the
`for` loop and then have an empty block. There is still an open issue
about this: https://github.com/mgechev/revive/issues/810 I have
disabled "revive" altogether in files where empty blocks are used
excessively, and I have made the effort to add individual
`// nolint:revive` where empty blocks are used just once or twice.
It's borderline noisy, though, but let's go with it for now.
I should mention that none of the "empty block" warnings for `for`
loop bodies were legitimate.
Signed-off-by: beorn7 <beorn@grafana.com>
While originally the resync period also forced refreshing from Kubernetes API server, this has been removed for some years now because watching the API server got more stable [1]. Today, this just results in all entities being sent to the service discovery again, which is valid from a general Prometheus perspective, but results in unnecessary CPU load and also breaks service discovery metrics. In especially, this makes monitoring "do we actually observe changes from Kubernetes API server" impossible (receiving constant updates from Kubernetes service discovery is a pretty valid assumption, for example nodes get frequent status updates, ...).
Signed-off-by: Jens Erat <jens.erat@mercedes-benz.com>
A new API is available for AddEventHandlers, to get errors but also be
able to cancel handlers.
Doing the easy thing for the release, which is just to log errors.
We could see how to improve this in the future to handle the errors
properly and cancel the handlers.
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
This commits adds a __meta_kubernetes_pod_container_image as a new
metadata label. This can be used to alert on mismatched versions of
targets who don't have a build_info metric, as well as injecting it into
log lines for other consumers of discovery/kubernetes (e.g., Promtail).
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
The Kubernetes service discovery can only add node labels to
targets from the pod role.
This commit extends this functionality to the endpoints and
endpointslices roles.
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
The tests for Kubernetes SD rely on comparing target groups by first
serializing them to JSON. However, the target group MarshalJSON function
only serializes the __address__ label, which makes eliminates all other
labels from the comparison.
This commit implements a separate marshaling function intended for use in
Kubernetes SD tests. The function serializes all target labels, making
comparisons much more reliable. The commit also fixes all tests that
started to fail due to the newly introduced change.
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>