Fixes https://github.com/prometheus/common/issues/36
Move logic handling this into the labels package,
so all the cases are handled in one place and we're
less likely to have this come up again.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
This is an estimate of churn, with series being added to the cache being
considered churn. This will have both false positives (e.g. series
appearing and disappearing) and false negatives (e.g. series hit
sample_limit, but still created in head block), but should be generally
useful as-is.
Relevant docs live in another repo.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
From the documentation:
> The default HTTP client's Transport may not
> reuse HTTP/1.x "keep-alive" TCP connections if the Body is
> not read to completion and closed.
This effectively enable keep-alive for the fixed requests.
Signed-off-by: Romain Baugue <romain.baugue@elwinar.com>
* Reload certificates from disk automatically
This change bumps github.com/prometheus/common to include
https://github.com/prometheus/common/pull/173
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* scrape: close idle connections on reload/stop
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* use v0.3.0 tag
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Now that we're not losing the scrape cache across failed
scrape, a scrape that continually failed but had varying
series or metadata (e.g. timestamps in metric names,
plus hitting smaple_limit) would grow the cache indefinitely.
Add some code to catch that, and flush the cache anyway.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
i) Uses the more idiomatic Wrap and Wrapf methods for creating nested errors.
ii) Fixes some incorrect usages of fmt.Errorf where the error messages don't have any formatting directives.
iii) Does away with the use of fmt package for errors in favour of pkg/errors
Signed-off-by: tariqibrahim <tariq181290@gmail.com>
* scrape: Add global jitter for HA server
Covers issue in https://github.com/prometheus/prometheus/pull/4926#issuecomment-449039848
where the HA setup become a problem for targets unable to be scraped simultaneously.
The new jitter per server relies on the hostname and external labels which necessarily to be uniq.
As before, scrape offset will be calculated with regard the absolute time, so even
restart/reload doesn't change scrape time per scrape target + prometheus instance.
Use fqdn if possible, otherwise fall back to the hostname. It adds extra random seed
to calculate server hash to be distinguish on machines with the same hostname, but
different DC.
Signed-off-by: Aleksei Semiglazov <xjewer@gmail.com>
* scrape: catch errors when creating HTTP clients
This change makes sure that no scrape pool is created with a nil HTTP
client.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Address Tariq's comment
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Address Brian's comment
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* *: use latest release of staticcheck
It also fixes a couple of things in the code flagged by the additional
checks.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Use official release of staticcheck
Also run 'go list' before staticcheck to avoid failures when downloading packages.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* buf Has been declared in scrape.go in line 785, I think it is unnecessary to declare a new variable again here.
Signed-off-by: arugakiWei <arugaki.wei@daocloud.io>
* delete the buf in line 785 because it is never used.
Signed-off-by: arugakiWei <arugaki.wei@daocloud.io>
* *: remove use of golang.org/x/net/context
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* scrape: fix TestTargetScrapeScrapeCancel
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Pass content type down to text parser.
Add layer of indirection in front of text parser,
and rename to avoid future clashes.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
The scrape manage receiver's channel now just saves the target sets
and another backgorund runner updates the scrape loops every 5 seconds.
This is so that the scrape manager doesn't block the receiving channel
when it does the long background reloading of the scrape loops.
Active and dropped targets are now saved in each scrape pool instead of
the scrape manager. This is mainly to avoid races when getting the
targets via the web api.
When reloading the scrape loops now happens in parallel to speed up the
final disared state and this also speeds up the prometheus's shutting
down.
Also updated some funcs signatures in the web package for consistency.
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
There are many more (mostly finalizers like Close/Stop/etc.), but most of
the others seemed like one couldn't do much about them anyway.
Signed-off-by: Julius Volz <julius.volz@gmail.com>
This adds a per-target cache of scraped metadata. The metadata is only
available for the lifecycle of the attached target. An API endpoint allows
to select metadata by metric name and a label selection of targets.
Signed-off-by: Fabian Reinartz <freinartz@google.com>
Extends the parser to allow retrieving metadata.
The lexer now yields proper tokens that are fed into a hand-written
parser on top.
Signed-off-by: Fabian Reinartz <freinartz@google.com>
This commit avoids passing the full scrape configuration down to the
scrape loop to fix data races when the scrape configuration is being
reloaded.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Fix race by properly locking access to scrape pools. Use separate mutex for information needed by UI so that UI isn't blocked when targets are being updated.
The semantics of honor_labels are that if a target exposes
and empty label it will override the target labels. This PR
fixes that by once again distinguishing between empty labels
and missing labels in this one use case.
Beyond that empty labels should be pruned and not added to storage,
which this also fixes.
Fixes#3841