Commit graph

77 commits

Author SHA1 Message Date
Boqin Qin cdbd42393e
scrape: fix goroutine leak in test (#6812)
* scrape: fix goroutine leak in test

Signed-off-by: BurtonQin <bobbqqin@gmail.com>
2020-02-13 07:53:07 +00:00
Julien Pivotto 9c67fce6e0
Scrape: test samples_post_metric_relabeling when metrics are dropped (#6720)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-01-29 17:47:36 +00:00
gotjosh 8b49c9285d
scrape: Add metrics to track bytes and entries in the metadata cache (#6675)
Signed-off-by: gotjosh <josue@grafana.com>
2020-01-29 11:13:18 +00:00
Julien Pivotto fafb7940b1 Pass over scrape cache to the next scrape (#6670)
* Pass over scrape cache to the next scrape

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-01-22 12:13:47 +00:00
Julien Pivotto 46d18112a3 tsdb: error on series with duplicate labels (#6664)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-01-20 11:05:27 +00:00
Julien Pivotto 31700a05df Improve testutil.ErrorEqual (#6471)
Also improves TestPopulateLabels: testutil.ErrorEqual just returned a
bool without failing the test.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2019-12-17 21:11:33 +00:00
gotjosh 05842176a6 Make the scrape.metricMetadataStore interface public
To test the implementation of our metric metadata API, we need to represent various states of metadata in the scrape metadata store. That is currently not possible as the interface and method to set the store are private.

This changes the interface, list and get methods, and the SetMetadaStore function to be public.

Incidentally, the scrapeCache implementation needs to be renamed to match the new signature.

Signed-off-by: gotjosh <josue@grafana.com>
2019-12-05 10:29:58 +00:00
Geoffrey Beausire 5cb7987314 Fix relabaling collision when using exported label
When using both a label and the suffix+label in the
relabel config. It's possible that Prometheus remove
the suffx+label for no obvious reason. It's due to a
collision when merging labels from target and from
the sample.

Signed-off-by: Geoffrey Beausire <g.beausire@criteo.com>
2019-11-26 11:03:11 +01:00
Dustin Hooten ca60bf298c React UI: Implement /targets page (#6276)
* Add LastScrapeDuration to targets endpoint

Signed-off-by: Dustin Hooten <dhooten@splunk.com>

* Add Scrape job name to targets endpoint

Signed-off-by: Dustin Hooten <dhooten@splunk.com>

* Implement the /targets page in react

Signed-off-by: Dustin Hooten <dhooten@splunk.com>

* Add state query param to targets endpoint

Signed-off-by: Dustin Hooten <dhooten@splunk.com>

* Use state filter in api call

Signed-off-by: Dustin Hooten <dhooten@splunk.com>

* api feedback

Signed-off-by: Dustin Hooten <dhooten@splunk.com>

* pr feedback frontend

Signed-off-by: Dustin Hooten <dhooten@splunk.com>

* Implement and use localstorage hook

Signed-off-by: Dustin Hooten <dhooten@splunk.com>

* PR feedback

Signed-off-by: Dustin Hooten <dhooten@splunk.com>
2019-11-11 22:42:24 +01:00
Alex Dzyoba 1a38075f83 scrape: Move tests to testutil (#6187)
Part of the fix for #3242.

Signed-off-by: Alex Dzyoba <alex@dzyoba.com>
2019-11-04 16:43:42 -07:00
yuxiaobo 47e51c8b2b Correct spelling mistakes
Signed-off-by: yuxiaobo <yuxiaobogo@163.com>
2019-10-10 18:46:27 +08:00
johncming 4757c69157 scrape: close manager gracefully at end. (#6044)
Signed-off-by: johncming <johncming@yahoo.com>
2019-09-23 12:28:37 +02:00
johncming 1fa5a75a3a Ctx name (#5961)
* scrape: rename ctx name for readability

Signed-off-by: johncming <johncming@yahoo.com>

* scrape: use self ctx instead of parent ctx.

Signed-off-by: johncming <johncming@yahoo.com>
2019-08-28 15:55:09 +02:00
Julius Volz b5c833ca21
Update go.mod dependencies before release (#5883)
* Update go.mod dependencies before release

Signed-off-by: Julius Volz <julius.volz@gmail.com>

* Add issue for showing query warnings in promtool

Signed-off-by: Julius Volz <julius.volz@gmail.com>

* Revert json-iterator back to 1.1.6

It produced errors when marshaling Point values with special float
values.

Signed-off-by: Julius Volz <julius.volz@gmail.com>

* Fix expected step values in promtool tests after client_golang update

Signed-off-by: Julius Volz <julius.volz@gmail.com>

* Update generated protobuf code after proto dep updates

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2019-08-14 11:00:39 +02:00
Brian Brazil e62f30d497
Correctly handle empty labels from alert templates. (#5845)
Fixes https://github.com/prometheus/common/issues/36

Move logic handling this into the labels package,
so all the cases are handled in one place and we're
less likely to have this come up again.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2019-08-13 11:19:17 +01:00
Chris Marchbanks 529ccff07b
Remove all usages of stretchr/testify
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
2019-08-08 19:49:27 -06:00
Chris Marchbanks 0685eb5395
Refactor testutil.NewStorage into a new package
This avoids a circular dependency between the testutil and storage
packages.

Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
2019-08-08 19:43:04 -06:00
AllenZMC 9e47bb8b46 fix word 'parmeters' to 'parameters' (#5826)
Signed-off-by: czm <zhongming.chang@daocloud.io>
2019-08-02 14:52:15 +01:00
yeya24 b7bb278e95 make targets active parallel (#5740)
Signed-off-by: yeya24 <yb532204897@gmail.com>
2019-07-29 17:08:54 +01:00
Simon Pasquier 9a1935d641
scrape: remove unused type (#5761)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-07-15 08:54:22 +02:00
Brian Brazil b98e818876
Add scrape_series_added per-scrape metric. (#5546)
This is an estimate of churn, with series being added to the cache being
considered churn. This will have both false positives (e.g. series
appearing and disappearing) and false negatives (e.g. series hit
sample_limit, but still created in head block), but should be generally
useful as-is.

Relevant docs live in another repo.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2019-05-08 22:24:00 +01:00
Simon Pasquier 45506841e6
*: enable all default linters (#5504)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-05-03 15:11:28 +02:00
Romain Baugue 95193fa027 Exhaust every request body before closing it (#5166) (#5479)
From the documentation:
> The default HTTP client's Transport may not
> reuse HTTP/1.x "keep-alive" TCP connections if the Body is
> not read to completion and closed.

This effectively enable keep-alive for the fixed requests.

Signed-off-by: Romain Baugue <romain.baugue@elwinar.com>
2019-04-18 09:50:37 +01:00
Simon Pasquier c1682adb2f Bump prometheus/common to v0.3.0 (#5344)
* Reload certificates from disk automatically

This change bumps github.com/prometheus/common to include
https://github.com/prometheus/common/pull/173

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* scrape: close idle connections on reload/stop

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* use v0.3.0 tag

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-04-10 13:20:00 +01:00
Brian Brazil f7184978f4 Protect against memory exhaustion when scraping.
Now that we're not losing the scrape cache across failed
scrape, a scrape that continually failed but had varying
series or metadata (e.g. timestamps in metric names,
plus hitting smaple_limit) would grow the cache indefinitely.

Add some code to catch that, and flush the cache anyway.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2019-04-04 19:09:11 +01:00
Brian Brazil dd3073616c Don't lose the scrape cache on a failed scrape.
This avoids CPU usage increasing when the target comes back.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2019-04-04 19:09:11 +01:00
Tariq Ibrahim 8fdfa8abea refine error handling in prometheus (#5388)
i) Uses the more idiomatic Wrap and Wrapf methods for creating nested errors.
ii) Fixes some incorrect usages of fmt.Errorf where the error messages don't have any formatting directives.
iii) Does away with the use of fmt package for errors in favour of pkg/errors

Signed-off-by: tariqibrahim <tariq181290@gmail.com>
2019-03-26 00:01:12 +01:00
Tom Wilkie 807fd33ecc Review feedback.
- Update read path to use labels.Labels.
- Fix the tests.
- Remove pack.
- Remove unused function.
- Fix race in tests.

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2019-03-18 20:31:12 +00:00
Simon Pasquier 23069b87dc scrape: fallback to hostname if lookup fails (#5366)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-03-15 12:02:16 +00:00
Julien Pivotto 4397916cb2 Add honor_timestamps (#5304)
Fixes #5302

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2019-03-15 10:04:15 +00:00
xjewer 0d1a69353e scrape: Add global jitter for HA server (#5181)
* scrape: Add global jitter for HA server

Covers issue in https://github.com/prometheus/prometheus/pull/4926#issuecomment-449039848
where the HA setup become a problem for targets unable to be scraped simultaneously.
The new jitter per server relies on the hostname and external labels which necessarily to be uniq.

As before, scrape offset will be calculated with regard the absolute time, so even
restart/reload doesn't change scrape time per scrape target + prometheus instance.

Use fqdn if possible, otherwise fall back to the hostname. It adds extra random seed
to calculate server hash to be distinguish on machines with the same hostname, but
different DC.

Signed-off-by: Aleksei Semiglazov <xjewer@gmail.com>
2019-03-12 10:46:15 +00:00
Julien Pivotto 04ce817c49 scrape: Rewrite scrape loop options as a struct (#5314)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2019-03-12 10:26:18 +00:00
Nguyen Hai Truong aed9ea144a Remove duplicated words in comments
Although it is spelling mistakes, it might make an affects
while reading.

Co-Authored-By: Kim Bao Long longkb@vn.fujitsu.com
Signed-off-by: Nguyen Hai Truong <truongnh@vn.fujitsu.com>
2019-02-20 17:41:02 -08:00
Simon Pasquier 12708acd15
scrape: catch errors when creating HTTP clients (#5182)
* scrape: catch errors when creating HTTP clients

This change makes sure that no scrape pool is created with a nil HTTP
client.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Address Tariq's comment

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Address Brian's comment

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-02-13 14:24:22 +01:00
JoeWrightss 4cb6c202ff Fix fmt.Errorf error message (#5199)
Signed-off-by: zhoulin xie <zhoulin.xie@daocloud.io>
2019-02-10 15:16:20 +05:30
Matt Layher 302148fd69 *: apply gofmt -s
Signed-off-by: Matt Layher <mdlayher@gmail.com>
2019-01-16 17:28:14 -05:00
Simon Pasquier f678e27eb6
*: use latest release of staticcheck (#5057)
* *: use latest release of staticcheck

It also fixes a couple of things in the code flagged by the additional
checks.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Use official release of staticcheck

Also run 'go list' before staticcheck to avoid failures when downloading packages.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-01-04 14:47:38 +01:00
Bartek Płotka 62c8337e77 Moved configuration into relabel package. (#4955)
Adapted top dir relabel to use pkg relabel structs.

Removal of this in a separate tracked here: https://github.com/prometheus/prometheus/issues/3647

Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
2018-12-18 11:26:36 +00:00
F4ncyMooN cd491e2d3a wait for interval-now%interval to make sure target will be collected with a fixed interval when restart prometheus (#4926)
Signed-off-by: hlv <hlv@freewheel.tv>
2018-12-05 09:58:39 +00:00
Ben Kochie c6399296dc
Fix spelling/typos (#4921)
* Fix spelling/typos

Fix spelling/typos reported by codespell/misspell.
* UK -> US spelling changes.

Signed-off-by: Ben Kochie <superq@gmail.com>
2018-11-27 17:44:29 +01:00
Brian Brazil d2f0f54d68
Pass through content-type for non-compressed output. (#4912)
Fixes #4911

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2018-11-26 13:05:07 +00:00
Wei Guo 996fd958ac fix deadlock in scrape manager (#4894)
Scrape manager will fall in deadlock when we reload configs frequently.
2018-11-23 11:23:55 +02:00
arugaki b98a5acb57 fix buf redeclared in scrapeloop (#4873)
* buf Has been declared in scrape.go in line 785, I think it is unnecessary to declare a new variable again here.

Signed-off-by: arugakiWei <arugaki.wei@daocloud.io>

* delete the buf in line 785 because  it is never used.

Signed-off-by: arugakiWei <arugaki.wei@daocloud.io>
2018-11-22 12:13:52 +08:00
Simon Pasquier ed19373a78
*: remove use of golang.org/x/net/context (#4869)
* *: remove use of golang.org/x/net/context

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* scrape: fix TestTargetScrapeScrapeCancel

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-11-19 12:31:16 +01:00
Brian Brazil 9c03e11c2c Hook OpenMetrics parser into scraping.
Extend metadata api to support units.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2018-10-18 13:58:00 +01:00
Brian Brazil ffe7efb411 Prepare for multiple text formats
Pass content type down to text parser.

Add layer of indirection in front of text parser,
and rename to avoid future clashes.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2018-10-18 13:58:00 +01:00
Will Hegedus 193ebe7e34 Updates to /targets and /rules (scrape duration, last evaluation time) (#4722)
* Add evaluationTimestamp (Last Evaluation) column to display on /rules
Signed-off-by: Will Hegedus <wbhegedus@liberty.edu>

* Add lastScrapeDuration ("Scrape Duration") to display on /targets
Signed-off-by: Will Hegedus <wbhegedus@liberty.edu>

* Updates based on Julius' feedback

Signed-off-by: Will Hegedus <wbhegedus@liberty.edu>

* Update to set timestamp to when eval started (after eval completes)

Signed-off-by: Will Hegedus <wbhegedus@liberty.edu>

* Update /rules to display time since last evaluation

Signed-off-by: Will Hegedus <wbhegedus@liberty.edu>

* Re-order Last Eval/Eval Time to be consistent with targets page

Signed-off-by: Will Hegedus <wbhegedus@liberty.edu>
2018-10-12 18:26:59 +02:00
Krasi Georgiev 47a673c3a0
process scrape loops reloading in parallel (#4526)
The scrape manage receiver's channel now just saves the target sets
and another backgorund runner updates the scrape loops every 5 seconds.
This is so that the scrape manager doesn't block the receiving channel
when it does the long background reloading of the scrape loops.

Active and dropped targets are now saved in each scrape pool instead of
the scrape manager. This is mainly to avoid races when getting the
targets via the web api.

When reloading the scrape loops now happens in parallel to speed up the
final disared state and this also speeds up the prometheus's shutting
down.

Also updated some funcs signatures in the web package for consistency.

Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
2018-09-26 12:20:56 +03:00
Julius Volz 8fbe1b5133
Handle a bunch of unchecked errors (#4461)
There are many more (mostly finalizers like Close/Stop/etc.), but most of
the others seemed like one couldn't do much about them anyway.

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-08-17 17:24:35 +02:00
Krasi Georgiev 9f2f6accba fix the TestManagerReloadNoChange test (#4267)
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
2018-07-04 12:01:19 +01:00