Commit graph

8302 commits

Author SHA1 Message Date
gotjosh 4eca4dffb8
Allow metric metadata to be propagated via Remote Write. (#6815)
* Introduce a metadata watcher

Similarly to the WAL watcher, its purpose is to observe the scrape manager and pull metadata. Then, send it to a remote storage.

Signed-off-by: gotjosh <josue@grafana.com>

* Additional fixes after rebasing.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Rework samples/metadata metrics.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Use more descriptive variable names in MetadataWatcher collect.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Fix issues caused during rebasing.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Fix missing metric add and unneeded config code.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Address some review comments.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Fix metrics and docs

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Replace assert with require

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Bring back max_samples_per_send metric

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix tests

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

Co-authored-by: Callum Styan <callumstyan@gmail.com>
Co-authored-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-11-19 20:53:03 +05:30
Ganesh Vernekar 3ad25a6dc3
Merge pull request #8204 from prometheus/release-2.22
Backport v2.22.2 to master
2020-11-19 19:57:52 +05:30
Ganesh Vernekar 8cc94fac93
Update dependencies for v2.23 (#8203)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-11-19 19:32:19 +05:30
Marco Pracucci db19e05d93
Add option to customise head chunks write buffer size (#8201)
* Add option to customise head chunks write buffer size

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed tests

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2020-11-19 18:30:47 +05:30
Peter Wu 1797192f02
Fix the alerting rules name description (#7083) (#8197)
commit 9875afc491 changed the type from
metric names to label values, we might as well adjust the description.
The alternative is to revert that commit and restrict names of alerting
rules again even if that was not really enforced.

Signed-off-by: Peter Wu <pwu@cloudflare.com>
2020-11-18 19:29:01 +00:00
Kuntal Majumder 3093755174
bump the minimum required go version to build to 1.4 (#8196)
Signed-off-by: Kuntal Majumder <kuntal@infracloud.io>
2020-11-18 17:16:33 +00:00
Björn Rabenstein 6e872dee43
Merge pull request #8189 from prometheus/beorn7/mixin
prometheus-mixin: Make PrometheusRemoteWriteBehind more generic
2020-11-17 13:40:24 +01:00
beorn7 638e99c814 prometheus-mixin: Make PrometheusRemoteWriteBehind more generic
Currently, it relies on `job, instance` being the labels completely
identifying a Prometheus instance. However, what's intended is to
simply not match on `remote_name, url`.

Signed-off-by: beorn7 <beorn@grafana.com>
2020-11-17 13:29:49 +01:00
Frederic Branczyk de1c1243f4
Merge pull request #8187 from brancz/cp-8176
Cut v2.22.2
2020-11-16 13:37:34 +01:00
Frederic Branczyk 84b186b3fd
*: Cut v2.22.2
Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com>
2020-11-16 10:50:14 +01:00
Brian Brazil 983a8eae4f
Protect sp.loops from concurrent access. (#8176)
Manager.reload takes the mutex that would make it safe, however
releases it before the goroutines spawned are finished with it.
Thus more explicit locking of scrapePool.Sync/stop/reload is needed.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2020-11-16 10:44:14 +01:00
Brian Brazil ebe0da7a72
Protect sp.loops from concurrent access. (#8176)
Manager.reload takes the mutex that would make it safe, however
releases it before the goroutines spawned are finished with it.
Thus more explicit locking of scrapePool.Sync/stop/reload is needed.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2020-11-12 16:06:25 +00:00
Björn Rabenstein 5c65cb066b
Merge pull request #8170 from prometheus/beorn7/mixin
prometheus-mixin: add HA-group aware alerts
2020-11-12 16:00:22 +01:00
Brian Brazil bef9d4e182
Don't include rendered expression in parse errors. (#8177)
For a sufficiently complex expression, having to render
thousands of nodes for every subquery type error can take
a while - so don't do that.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2020-11-12 15:25:52 +01:00
Julien Pivotto f97fba7cc9
React UI: Change "metrics autocomplete" with "autocomplete" (#8174)
- First, it is currently not only removing "metric" autocomplete, but
also "query history autocomplete", so the checkbox is confusing.
- Then, in the future, we will want also "functions" autocomplete.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-11-12 11:48:48 +01:00
beorn7 371ca9ff46 prometheus-mixin: add HA-group aware alerts
There is certainly a potential to add more of these. This is mostly
meant to introduce the concept and cover a few critical parts.

Signed-off-by: beorn7 <beorn@grafana.com>
2020-11-11 19:45:34 +01:00
Julien Pivotto cda52234eb
Fix panic with double close() of channel on /-/quit/ (#8166)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-11-10 00:09:39 +01:00
Bartlomiej Plotka 4513537034
Exposed DeletionIterator and CompactMetas functions. (#8161)
* Exposed DeletionIterator and CompactMetas functions.

Required for CLI for deletions in Thanos: https://github.com/thanos-io/thanos/pull/3421

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Removed Thanos usage mentions.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-11-09 16:51:25 +00:00
Ganesh Vernekar d30da66d77
Fix timestamp() method for vector selector inside paren (#8164)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-11-09 18:21:50 +05:30
Julien Pivotto f8ba0ed906
Rebase docker swarm test (#8163)
Fix master tests

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-11-09 18:08:08 +05:30
Julien Pivotto 3509647462
Docker swarm: add filtering of services (#8074)
* Docker swarm: add filtering of services

Add filters on all docker swarm roles (nodes, tasks and services).

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-11-09 12:41:02 +01:00
Julien Pivotto 386e5010b8
Make the brand link in classic UI come back to classic UI home (#8160)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-11-08 21:54:29 +01:00
Julien Pivotto 8dd0a3e9a3
Fix button display when there is no panels (#8155)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-11-06 16:34:35 +01:00
Julien Pivotto f81536d020
Merge pull request #8154 from roidelapluie/merge222
Merge 2.22 to master
2020-11-05 23:05:44 +01:00
Julien Pivotto 9bf48721d8 Merge remote-tracking branch 'origin/release-2.22' into master
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-11-05 23:00:32 +01:00
Björn Rabenstein b3a02c3d04
Merge pull request #8153 from prometheus/beorn7/release
Volunteer for v2.23 release
2020-11-05 19:51:46 +01:00
beorn7 b5513d3dee Add volunteers for releases 2.23 and 2.24
Signed-off-by: beorn7 <beorn@grafana.com>
2020-11-05 17:55:29 +01:00
Frederic Branczyk 00f16d1ac3
Merge pull request #8146 from brancz/cp-8104-8061
Cut v2.22.1
2020-11-05 14:57:50 +01:00
Frederic Branczyk 8e0e326f37
*: Cut v2.22.1
Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com>
2020-11-04 14:14:19 +01:00
Brian Brazil e7c6feabe1
More granular locking for scrapeLoop. (#8104)
Don't lock for all of Sync/stop/reload as that holds up /metrics and the
UI when they want a list of active/dropped targets. Instead take
advantage of the fact that Sync/stop/reload cannot be called
concurrently by the scrape Manager and lock just on the targets
themselves.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2020-11-04 09:08:04 +01:00
Ganesh Vernekar 601a3ef0bb
Read repair empty last file in chunks_head (#8061)
* Read repair empty file in chunks_head

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Refactor and introduce repairLastChunkFile

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Attempt windows test fix

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix review comments

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix review comments

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-11-04 09:08:04 +01:00
Frederic Branczyk f3da817de9
Merge pull request #8147 from roidelapluie/duc
Calculate head chunk size based on actual disk usage (#8139)
2020-11-04 09:06:53 +01:00
Tobias Schmidt 4ee2e1fd93
Install promu v0.7.0 (#8148)
Will start producing ZIP archives for Windows releases in addition to tarballs.

Signed-off-by: Tobias Schmidt <tobidt@gmail.com>
2020-11-04 00:33:59 +01:00
Julius Volz 3470ee1fbf
Make React UI the default, keep old UI under /classic (#8142)
The React app's assets are now served under /assets, while all old
custom web assets (including the ones for console templates) are now
served from /classic/static.

I tested different combinations of --web.external-url and
--web.route-prefix with proxies in front, and I couldn't find a problem
yet with the routing. Console templates also still work.

While migrating old endpoints to /classic, I noticed that /version was
being treated like a lot of the old UI pages, with readiness check
handler in front of it, etc. I kept it in /version and removed that
readiness wrapper, since it doesn't seem to be needed for that endpoint.

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2020-11-03 14:51:48 +01:00
Julien Pivotto df6eed2358 Calculate head chunk size based on actual disk usage (#8139)
Cherry-picks 8bc369bf9b

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-11-03 13:02:17 +01:00
Julien Pivotto 8bc369bf9b
Calculate head chunk size based on actual disk usage (#8139)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-11-03 15:34:59 +05:30
Jarod Watkins 388a2d20c9
UI: Add toggle to enable/disable metric autocomplete (#8134)
* UI: Add toggle to enable/disable metric autocomplete

This change adds a toggle to enable or disable the metric autocomplete
functionality. By default it is enabled. This is a port of a change I
did in [Thanos][1].

[1]: https://github.com/thanos-io/thanos/pull/3381

Signed-off-by: Jarod Watkins <jarod@42lines.net>

* Adding full variable name

Signed-off-by: Jarod Watkins <jarod@42lines.net>
2020-11-02 16:16:54 +01:00
Julien Pivotto 7558e9d3c3
API: Make snapshot length always the same size (#8138)
Fixes #8135

Minor improvement, also we now use Int64 even on 32 bit systems.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-11-01 21:07:38 +01:00
Marco Pracucci 63be30dcee
Fixed WAL corruption on partial writes within a page (#8125)
* Fixed WAL corruption on partial writes

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Renamed variable

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed test

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Addressed review comments

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2020-10-29 16:07:03 +05:30
Julien Pivotto 318190021d
Enforce the usage of github.com/stretchr/testify/require (#8129)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-10-29 11:09:08 +01:00
Julien Pivotto 6c56a1faaa
Testify: move to require (#8122)
* Testify: move to require

Moving testify to require to fail tests early in case of errors.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>

* More moves

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-10-29 09:43:23 +00:00
Bartlomiej Plotka 3d8826a3d4
MultiError: Refactored MultiError for more concise and safe usage. (#8066)
* MultiError: Refactored MultiError for more concise and safe usage.

* Less lines
* Goland IDE was marking every usage of old MultiError "potential nil" error
* It was easy to forgot using Err() when error was returned, now it's safely assured on compile time.

NOTE: Potentially I would rename package to merrors. (: In different PR.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed review comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fix after rebase.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-10-28 15:24:58 +00:00
johncming 28ca0965f0
tsdb/chunkenc: fix typo of return error. (#7670)
* tsdb/chunkenc: fix typo of return error.

Signed-off-by: johncming <johncming@yahoo.com>

* tsdb: fix typo of function in markdonw.

Signed-off-by: johncming <johncming@yahoo.com>
2020-10-28 12:03:11 +00:00
Jorge Luis Betancourt 4dc755cd27
Add a metric for tracking max_samples_per_send (#8102)
Currently there is no way of tracking the value of the
`max_samples_per_send` configuration option, which is commonly tweaked
when integrating with a remote write backend.

Signed-off-by: Jorge Luis Betancourt Gonzalez <jorge-luis.betancourt@trivago.com>
2020-10-28 11:39:36 +00:00
Ganesh Vernekar 3245b3267b
Don't use returned DB to close resources on TSDB startup error (#8113)
* Don't use returned DB to close resources on TSDB startup error

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Add unit test and fix another panic

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix review comment

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-10-28 15:39:03 +05:30
Julien Pivotto daf382e4a9
Merge pull request #8112 from prometheus/release-2.22
Merge release 2.22 to master
2020-10-27 11:07:13 +01:00
Julien Pivotto 1282d1b39c
Refactor test assertions (#8110)
* Refactor test assertions

This pull request gets rid of assert.True where possible to use
fine-grained assertions.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-10-27 11:06:53 +01:00
Anthony D'Atri 2cbc0f9bfe
Various doc tweaks (#8111)
Signed-off-by: Anthony D'Atri <anthony.datri@gmail.com>
2020-10-27 09:50:37 +00:00
Brian Brazil 3f8e51738c
More granular locking for scrapeLoop. (#8104)
Don't lock for all of Sync/stop/reload as that holds up /metrics and the
UI when they want a list of active/dropped targets. Instead take
advantage of the fact that Sync/stop/reload cannot be called
concurrently by the scrape Manager and lock just on the targets
themselves.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2020-10-26 14:46:20 +00:00
David Leadbeater e7e60623ff
promtool: Calculate mint and maxt per test (#8096)
* promtool: Calculate mint and maxt per test

Previously a single test that used a later eval time would make all
other tests in the file share the [mint, maxt] and potentially evaluate
far more samples than needed.

Fixes: #8019

Signed-off-by: David Leadbeater <dgl@dgl.cx>
2020-10-24 12:03:55 +01:00