Originally, the resync period also forced a refresh from the Kubernetes API server, but this was removed some years ago because watching the API server became more stable [1]. Today, a resync just results in all entities being sent to the service discovery again. This is valid from a general Prometheus perspective, but it causes unnecessary CPU load and also breaks service discovery metrics. In particular, it makes monitoring "do we actually observe changes from the Kubernetes API server" impossible (expecting constant updates from Kubernetes service discovery is a pretty valid assumption; for example, nodes get frequent status updates).
Signed-off-by: Jens Erat <jens.erat@mercedes-benz.com>
A new API is available for AddEventHandlers, which returns errors and
also makes it possible to cancel handlers.
For this release, do the easy thing, which is just to log errors.
We could improve this in the future to handle the errors properly and
cancel the handlers.
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
This commit adds __meta_kubernetes_pod_container_image as a new
metadata label. This can be used to alert on mismatched versions of
targets that don't have a build_info metric, as well as to inject it
into log lines for other consumers of discovery/kubernetes (e.g.,
Promtail).
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
The Kubernetes service discovery can only add node labels to
targets from the pod role.
This commit extends this functionality to the endpoints and
endpointslices roles.
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
The tests for Kubernetes SD rely on comparing target groups by first
serializing them to JSON. However, the target group MarshalJSON function
only serializes the __address__ label, which eliminates all other
labels from the comparison.
This commit implements a separate marshaling function intended for use in
Kubernetes SD tests. The function serializes all target labels, making
comparisons much more reliable. The commit also fixes all tests that
started to fail due to the newly introduced change.
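
The difference between the two serializers can be sketched as follows.
This is a minimal illustration, not the actual test helper: the `group`
type and both function names are hypothetical stand-ins for the real
target group type and its marshaling functions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// group is a hypothetical, simplified stand-in for a discovery target group.
type group struct {
	Targets []map[string]string // per-target labels, including __address__
	Labels  map[string]string   // labels shared by all targets
}

// marshalAddressOnly mimics a serializer that keeps only __address__,
// silently dropping every other target label.
func marshalAddressOnly(g group) ([]byte, error) {
	addrs := make([]string, 0, len(g.Targets))
	for _, t := range g.Targets {
		addrs = append(addrs, t["__address__"])
	}
	return json.Marshal(map[string]interface{}{"targets": addrs, "labels": g.Labels})
}

// marshalAllLabels keeps every target label. encoding/json sorts map
// keys, so the output is stable enough to compare as a string.
func marshalAllLabels(g group) ([]byte, error) {
	return json.Marshal(map[string]interface{}{"targets": g.Targets, "labels": g.Labels})
}

func main() {
	// Two groups that differ only in a non-address label.
	a := group{Targets: []map[string]string{{"__address__": "10.0.0.1:9100", "__meta_kubernetes_pod_name": "a"}}}
	b := group{Targets: []map[string]string{{"__address__": "10.0.0.1:9100", "__meta_kubernetes_pod_name": "b"}}}

	a1, _ := marshalAddressOnly(a)
	b1, _ := marshalAddressOnly(b)
	a2, _ := marshalAllLabels(a)
	b2, _ := marshalAllLabels(b)

	fmt.Println("address-only output equal:", string(a1) == string(b1))
	fmt.Println("all-labels output equal:  ", string(a2) == string(b2))
}
```

With the address-only form, a test comparing the two groups would pass
even though their labels differ.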
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
* refactor: move from io/ioutil to io and os packages
* use fs.DirEntry instead of os.FileInfo after os.ReadDir
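
As a rough illustration of what this migration looks like in practice,
the sketch below uses `os.ReadDir`, `os.ReadFile`, `os.WriteFile` and
`os.MkdirTemp` in place of their `io/ioutil` counterparts. The
`listFiles`, `isRegular` and `demo` helpers are hypothetical and not
taken from the Prometheus code base.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// isRegular works on fs.DirEntry, which os.ReadDir returns instead of
// the os.FileInfo values returned by ioutil.ReadDir.
func isRegular(e fs.DirEntry) bool {
	return e.Type().IsRegular()
}

// listFiles returns the names of the regular files directly inside dir.
func listFiles(dir string) ([]string, error) {
	entries, err := os.ReadDir(dir) // replaces ioutil.ReadDir
	if err != nil {
		return nil, err
	}
	var names []string
	for _, e := range entries {
		if isRegular(e) {
			names = append(names, e.Name())
		}
	}
	return names, nil
}

// demo creates a temporary directory with one file and one
// subdirectory, then reads both the file and the directory listing.
func demo() ([]string, string, error) {
	dir, err := os.MkdirTemp("", "readdir-example") // replaces ioutil.TempDir
	if err != nil {
		return nil, "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "a.txt")
	if err := os.WriteFile(path, []byte("hello"), 0o600); err != nil { // replaces ioutil.WriteFile
		return nil, "", err
	}
	if err := os.Mkdir(filepath.Join(dir, "sub"), 0o700); err != nil {
		return nil, "", err
	}
	data, err := os.ReadFile(path) // replaces ioutil.ReadFile
	if err != nil {
		return nil, "", err
	}
	names, err := listFiles(dir)
	return names, string(data), err
}

func main() {
	names, content, err := demo()
	fmt.Println(names, content, err)
}
```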
Signed-off-by: MOREL Matthieu <matthieu.morel@cnp.fr>
Fail configuration unmarshalling if a kubeconfig or API URL is set
together with "own namespace".
Only read namespace file if needed.
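
A validation of this kind might look like the sketch below. The
`sdConfig` struct, its field names and the error text are assumptions
made for illustration; they are not the real configuration type.

```go
package main

import (
	"errors"
	"fmt"
)

// sdConfig is a hypothetical, simplified stand-in for the Kubernetes SD
// configuration. Field names are assumptions.
type sdConfig struct {
	APIServerURL   string
	KubeConfigFile string
	OwnNamespace   bool
}

var errOwnNamespace = errors.New(
	"own namespace cannot be combined with an API server URL or a kubeconfig file")

// validate rejects own-namespace discovery whenever the configuration
// points at an explicitly configured cluster, because the own namespace
// is only known when running inside the cluster.
func (c sdConfig) validate() error {
	if c.OwnNamespace && (c.APIServerURL != "" || c.KubeConfigFile != "") {
		return errOwnNamespace
	}
	return nil
}

func main() {
	inCluster := sdConfig{OwnNamespace: true}
	external := sdConfig{OwnNamespace: true, APIServerURL: "https://kubernetes.example:6443"}

	fmt.Println("in-cluster:", inCluster.validate())
	fmt.Println("external:  ", external.validate())
}
```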
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
When using Kubernetes service discovery on a Prometheus instance that's
not running inside Kubernetes, the creation of the service discovery
fails with a "no such file or directory" error as the special
`/var/run/secrets/kubernetes.io/serviceaccount/namespace` file is not
there. This commit moves the code that reads this file into the
if-branch where no `APIServer.URL` is given (that one basically makes
Prometheus assume it is running inside of a Kubernetes cluster).
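
The idea of only touching the file on the in-cluster branch can be
sketched like this. The `ownNamespace` function and its injected
`readFile` parameter are hypothetical; only the file path comes from
the description above.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

const namespaceFile = "/var/run/secrets/kubernetes.io/serviceaccount/namespace"

// ownNamespace reads the service account namespace file only when no
// API server URL is given, i.e. when running inside a cluster is
// assumed. readFile is injected so the behaviour can be exercised
// without the real file.
func ownNamespace(apiServerURL string, readFile func(string) ([]byte, error)) (string, error) {
	if apiServerURL != "" {
		// Outside the cluster: never touch the namespace file.
		return "", nil
	}
	data, err := readFile(namespaceFile)
	if err != nil {
		return "", fmt.Errorf("unable to determine own namespace: %w", err)
	}
	return strings.TrimSpace(string(data)), nil
}

func main() {
	// With an explicit API server URL the file is not read, so this
	// succeeds even on a machine that is not part of a cluster.
	ns, err := ownNamespace("https://kubernetes.example:6443", os.ReadFile)
	fmt.Printf("namespace=%q err=%v\n", ns, err)
}
```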
Signed-off-by: Georg Gadinger <nilsding@nilsding.org>
This commit adds support for discovering targets from the same
Kubernetes namespace as the Prometheus pod itself. Own-namespace
discovery can be indicated by using "." as the namespace.
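
One way to picture the "." convention is a small expansion step over
the configured namespace list, as in the sketch below. The
`expandNamespaces` helper is hypothetical, and the own namespace is
passed in rather than discovered.

```go
package main

import "fmt"

// expandNamespaces replaces every "." entry with the namespace the
// Prometheus pod itself runs in and leaves all other entries untouched.
func expandNamespaces(configured []string, own string) []string {
	out := make([]string, 0, len(configured))
	for _, ns := range configured {
		if ns == "." {
			ns = own
		}
		out = append(out, ns)
	}
	return out
}

func main() {
	fmt.Println(expandNamespaces([]string{".", "kube-system"}, "monitoring"))
}
```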
Fixes #9782
Signed-off-by: fpetkovski <filip.petkovsky@gmail.com>
When using Kubernetes on cloud providers, nodes will have the
spec.providerID field populated to contain the cloud provider specific
name of the EC2/GCE/... instance.
Let's expose this information as an additional label, so that it's
easier to annotate metrics and alerts to contain the cloud provider
specific name of the instance to which it pertains.
Signed-off-by: Ed Schouten <eschouten@apple.com>
When running tests in parallel, 10 milliseconds may not be enough for
all discoverers to register, which makes the test flaky.
This commit changes the waiting logic to wait until the number of
discoverers stops increasing during a given time frame, which should be
large enough for a single discoverer to register in a test environment.
The following run passes with this commit:
go test -failfast -race -count 100 -v ./discovery/kubernetes/
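
The waiting strategy described above could be sketched roughly as
follows. The `waitForStable` helper, its parameters and the simulated
registration loop are assumptions for illustration, not the code used
in the test suite.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// waitForStable polls count once per window and returns as soon as the
// value did not increase between two consecutive polls, or when the
// timeout elapses. The window should be long enough for a single
// discoverer to register.
func waitForStable(count func() int, window, timeout time.Duration) int {
	deadline := time.Now().Add(timeout)
	last := count()
	for time.Now().Before(deadline) {
		time.Sleep(window)
		cur := count()
		if cur <= last {
			return cur // no new registrations during this window
		}
		last = cur
	}
	return last
}

func main() {
	var registered int32

	// Simulate discoverers registering a few milliseconds apart.
	go func() {
		for i := 0; i < 5; i++ {
			atomic.AddInt32(&registered, 1)
			time.Sleep(2 * time.Millisecond)
		}
	}()

	n := waitForStable(
		func() int { return int(atomic.LoadInt32(&registered)) },
		50*time.Millisecond,
		2*time.Second,
	)
	fmt.Println("discoverers registered:", n)
}
```

Compared to a fixed sleep, this adapts to slow runs at the cost of
always waiting at least one full window.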
Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
We are re-enabling HTTP/2. There have been a few bugfixes upstream
in Go, and we have also enabled ReadIdleTimeout.
Fix #7588
Fix #9068
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
This change sets the scheme to https when a rule specified by the
Ingress matches a wildcard DNS entry in the Ingress TLS hosts.
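
A minimal sketch of such a match is shown below, assuming the usual
Ingress wildcard semantics where "*.example.com" covers exactly one
additional leading DNS label. The `matchesHost` and `schemeFor` names
are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesHost reports whether host is covered by pattern. A pattern of
// the form "*.example.com" matches hosts with exactly one extra leading
// label, so "api.example.com" matches but "example.com" and
// "a.b.example.com" do not.
func matchesHost(pattern, host string) bool {
	if pattern == host {
		return true
	}
	if !strings.HasPrefix(pattern, "*.") {
		return false
	}
	suffix := pattern[1:] // ".example.com"
	if !strings.HasSuffix(host, suffix) {
		return false
	}
	label := strings.TrimSuffix(host, suffix)
	return label != "" && !strings.Contains(label, ".")
}

// schemeFor returns "https" if any TLS host entry covers the rule host,
// and "http" otherwise.
func schemeFor(ruleHost string, tlsHosts []string) string {
	for _, h := range tlsHosts {
		if matchesHost(h, ruleHost) {
			return "https"
		}
	}
	return "http"
}

func main() {
	tls := []string{"*.example.com"}
	for _, host := range []string{"api.example.com", "example.com", "a.b.example.com"} {
		fmt.Println(host, "->", schemeFor(host, tls))
	}
}
```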
Signed-off-by: Philip Gough <philip.p.gough@gmail.com>
This PR introduces support for follow_redirect, to enable users to
disable following HTTP redirects.
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
Label selectors can be "set-based"
(https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#set-based-requirement),
but such a selector causes Prometheus to fail at startup with an error
like "unexpected error: parsing YAML file ...: invalid selector:
'foo in (bar,baz)'; can't understand 'baz)'".
This is caused by the `fields.ParseSelector(string)` function, which
simply splits an expression as a CSV list, so a comma confuses the
parser and leads to the error.
Use `labels.Parse(string)` to use a valid lexer to parse a selector
expression.
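
To illustrate why comma splitting breaks on set-based selectors, the
sketch below contrasts a naive split with a parenthesis-aware one. Both
helpers are hypothetical and use only the standard library; they are
not the implementation of `fields.ParseSelector` or `labels.Parse`,
the latter of which relies on a proper lexer.

```go
package main

import (
	"fmt"
	"strings"
)

// naiveSplit treats the selector as a plain comma-separated list.
func naiveSplit(expr string) []string {
	return strings.Split(expr, ",")
}

// splitTopLevel only splits on commas that are outside parentheses, so
// the value list of a set-based requirement stays intact.
func splitTopLevel(expr string) []string {
	var parts []string
	depth, start := 0, 0
	for i, r := range expr {
		switch r {
		case '(':
			depth++
		case ')':
			depth--
		case ',':
			if depth == 0 {
				parts = append(parts, expr[start:i])
				start = i + 1
			}
		}
	}
	return append(parts, expr[start:])
}

func main() {
	expr := "foo in (bar,baz)"
	fmt.Printf("%q\n", naiveSplit(expr))    // ["foo in (bar" "baz)"]
	fmt.Printf("%q\n", splitTopLevel(expr)) // ["foo in (bar,baz)"]
}
```

The second fragment of the naive split, `baz)`, is the piece the error
message complains about.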
Closes #8284.
Signed-off-by: Alexey Shumkin <Alex.Crezoff@gmail.com>