Commit graph

7712 commits

Author SHA1 Message Date
Muhammad Falak R Wani 2d1a80aa82
rules: manager: clarify doc string for NewGroupMetrics (#7084)
* rules: manager: clarify doc string for NewGroupMetrics

Signed-off-by: Muhammad Falak R Wani <falakreyaz@gmail.com>
2020-04-07 14:06:01 +02:00
Brian Brazil cd73b3d33e
Reduce how much old WAL we keep around. (#7098)
Previously we were keeping up to around 6 hours of WAL around by
removing 1/3 every hours. This was excessive, so switch to removing 2/3
which will keep up to around 3 hours of WAL around.

This will roughly halve the size of the WAL and halve startup time for
those who are I/O bound. This may increase the checkpoint size for
those with certain churn patterns, but by much less than we're saving
from the segments.
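
The 6-hour and 3-hour figures above can be reproduced with a small simulation. This is only a sketch: it assumes truncation runs once every 2 hours (the message does not state the interval), and `peakHours` is a hypothetical helper, not TSDB code.

```go
package main

import "fmt"

// peakHours simulates repeated WAL truncation. Every `interval` hours of new
// WAL is written, then `fraction` of everything kept so far is removed.
// It returns the amount of WAL (in hours) held right before a truncation
// once the process has settled.
func peakHours(interval, fraction float64, rounds int) float64 {
	kept, peak := 0.0, 0.0
	for i := 0; i < rounds; i++ {
		kept += interval        // WAL written since the last truncation
		peak = kept             // size just before truncating
		kept -= kept * fraction // drop the oldest fraction
	}
	return peak
}

func main() {
	// Assumed 2-hour truncation interval.
	fmt.Printf("remove 1/3: %.2f hours\n", peakHours(2, 1.0/3.0, 100))
	fmt.Printf("remove 2/3: %.2f hours\n", peakHours(2, 2.0/3.0, 100))
}
```

Under that assumption the peak settles at about 6 hours when removing 1/3 and about 3 hours when removing 2/3, which matches the "roughly halve" claim.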

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2020-04-07 15:55:57 +05:30
Brad Walker 3348930df5
Replace fileutil.ReadDir with ioutil.ReadDir (#7029) (#7033)
* tsdb: Replace fileutil.ReadDir with ioutil.ReadDir (#7029)

Signed-off-by: Brad Walker <brad@bradmwalker.com>

* tsdb: Remove fileutil.ReadDir (#7029)

Signed-off-by: Brad Walker <brad@bradmwalker.com>
2020-04-06 19:04:20 +05:30
Julian Taylor 05442b31c8
register federation failure metrics (#7081)
Closes gh-7080

Signed-off-by: Julian Taylor <juliantaylor108@gmail.com>
2020-04-06 09:05:01 +01:00
Chris Marchbanks 62bd77bf93
Fix react tests (#7077)
https://github.com/facebook/create-react-app/issues/8689 is causing our
tests to fail in the CI pipeline. As the comments suggest, downgrading
to react-scripts 3.4.0 fixes the problem.

In addition, fix a test warning due to a missing id field.

Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
2020-04-01 16:37:52 +02:00
Frederic Hemberger fe47c9c86e
[Docs] consul_sd_config: Add default value for allow_stale (#7075)
Ref: https://github.com/prometheus/prometheus/blob/master/discovery/consul/consul.go#L97
Signed-off-by: Frederic Hemberger <mail@frederic-hemberger.de>
2020-03-31 18:55:25 +01:00
MengZeLee a7982ffc0f
Fix typo (#7068)
Fix typo.

Signed-off-by: MengZn <adnt587@gmail.com>
2020-03-30 13:18:34 +05:30
Brian Brazil 7646cbca32
Use .UTC everywhere we use time.Unix (#7066)
time.Unix attaches the local timezone, which can then
leak out (e.g. in the alert json). While this is harmless,
we should be consistent.
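
A minimal sketch of the pattern, using only the standard library. The helper names `unixUTC` and `formatUnixUTC` are hypothetical and only illustrate the `.UTC()` call described above.

```go
package main

import (
	"fmt"
	"time"
)

// unixUTC converts a Unix timestamp in seconds to a time.Time whose
// location is UTC instead of the process-local timezone.
func unixUTC(sec int64) time.Time {
	return time.Unix(sec, 0).UTC()
}

// formatUnixUTC renders the timestamp as RFC 3339 in UTC.
func formatUnixUTC(sec int64) string {
	return unixUTC(sec).Format(time.RFC3339)
}

func main() {
	// time.Unix alone carries the local timezone, so this output
	// depends on where the program runs.
	fmt.Println(time.Unix(0, 0).Format(time.RFC3339))
	// With .UTC() the output is the same on every machine.
	fmt.Println(formatUnixUTC(0))
}
```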

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2020-03-29 17:35:39 +01:00
Deepjyoti Mondal c38ca2ca95
Fix #6999 : Add architecture meta label for EC2 (#7000)
This PR adds architecture meta labels for EC2 instances

Signed-off-by: Deepjyoti Mondal <djmdeveloper060796@gmail.com>
2020-03-28 20:41:37 +00:00
Julien Pivotto 0c4ec8d9dd
Merge pull request #6911 from mjtrangoni/remove-buildnametocertificate
scrape/target_test.go: remove deprecated function BuildNameToCertificate()
2020-03-27 17:00:19 +01:00
Julien Pivotto 9057decce2
Merge pull request #7060 from prometheus/release-2.17
Release 2.17
2020-03-27 15:57:07 +01:00
Julien Pivotto ae041f97cf
Release 2.17.1 (#7055)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-03-26 17:14:12 +01:00
Julien Pivotto 635fff9ea4
Merge pull request #7051 from roidelapluie/revertopt
Revert head posting optimization
2020-03-26 14:54:53 +01:00
Bartlomiej Plotka 104a1313d4
testutil: Enriched Equals with diff on error. (#7053)
## Example:

```

            exp: []chunks.Meta{chunks.Meta{Ref:0x0, Chunk:(*chunkenc.XORChunk)(0xc00000e160), MinTime:1, MaxTime:2}, chunks.Meta{Ref:0x0, Chunk:(*chunkenc.XORChunk)(0xc00000e180), MinTime:3, MaxTime:3}, chunks.Meta{Ref:0x0, Chunk:(*chunkenc.XORChunk)(0xc00000e1a0), MinTime:5, MaxTime:5}, chunks.Meta{Ref:0x0, Chunk:(*chunkenc.XORChunk)(0xc00000e1c0), MinTime:24, MaxTime:6}}

            got: []chunks.Meta{chunks.Meta{Ref:0x0, Chunk:(*chunkenc.XORChunk)(0xc00000e280), MinTime:1, MaxTime:2}, chunks.Meta{Ref:0x0, Chunk:(*chunkenc.XORChunk)(0xc00000e2a0), MinTime:3, MaxTime:3}, chunks.Meta{Ref:0x0, Chunk:(*chunkenc.XORChunk)(0xc00000e220), MinTime:5, MaxTime:5}, chunks.Meta{Ref:0x0, Chunk:(*chunkenc.XORChunk)(0xc00000e240), MinTime:6, MaxTime:6}}

```

Now:

```

            exp: []chunks.Meta{chunks.Meta{Ref:0x0, Chunk:(*chunkenc.XORChunk)(0xc00000e740), MinTime:1, MaxTime:2}, chunks.Meta{Ref:0x0, Chunk:(*chunkenc.XORChunk)(0xc00000e760), MinTime:3, MaxTime:3}, chunks.Meta{Ref:0x0, Chunk:(*chunkenc.XORChunk)(0xc00000e780), MinTime:5, MaxTime:5}, chunks.Meta{Ref:0x0, Chunk:(*chunkenc.XORChunk)(0xc00000e7a0), MinTime:24, MaxTime:6}}

            got: []chunks.Meta{chunks.Meta{Ref:0x0, Chunk:(*chunkenc.XORChunk)(0xc00000e800), MinTime:1, MaxTime:2}, chunks.Meta{Ref:0x0, Chunk:(*chunkenc.XORChunk)(0xc00000e820), MinTime:3, MaxTime:3}, chunks.Meta{Ref:0x0, Chunk:(*chunkenc.XORChunk)(0xc00000e860), MinTime:5, MaxTime:5}, chunks.Meta{Ref:0x0, Chunk:(*chunkenc.XORChunk)(0xc00000e880), MinTime:6, MaxTime:6}}

            Diff:
            --- Expected
            +++ Actual
            @@ -50,3 +50,3 @@
               }),
            -  MinTime: (int64) 24,
            +  MinTime: (int64) 6,
               MaxTime: (int64) 6


```
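
The idea of attaching a diff to an equality failure can be sketched as follows. This is a naive, position-by-position line diff written for illustration only. It is not the diff implementation the commit uses, and `lineDiff` and `equalsMsg` are hypothetical names.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// lineDiff compares two line slices position by position and reports
// differing lines with "-" (expected) and "+" (actual) prefixes.
func lineDiff(exp, got []string) string {
	var b strings.Builder
	n := len(exp)
	if len(got) > n {
		n = len(got)
	}
	for i := 0; i < n; i++ {
		hasE, hasG := i < len(exp), i < len(got)
		if hasE && hasG && exp[i] == got[i] {
			continue // identical line, nothing to report
		}
		if hasE {
			fmt.Fprintf(&b, "-%s\n", exp[i])
		}
		if hasG {
			fmt.Fprintf(&b, "+%s\n", got[i])
		}
	}
	return b.String()
}

// equalsMsg returns an empty string when the values are deeply equal,
// otherwise a failure message that ends with the diff.
func equalsMsg(exp, got []string) string {
	if reflect.DeepEqual(exp, got) {
		return ""
	}
	return fmt.Sprintf("exp: %v\ngot: %v\nDiff:\n%s", exp, got, lineDiff(exp, got))
}

func main() {
	exp := []string{"MinTime: 24", "MaxTime: 6"}
	got := []string{"MinTime: 6", "MaxTime: 6"}
	fmt.Print(equalsMsg(exp, got))
}
```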
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-03-26 11:48:23 +02:00
Callum Styan c453def8c5
Separate scrape add error checking out into its own function. (#6930)
* Separate scrape add error checking out into its own function.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* pass sampleLimitError to checkAddError instead of returning an error

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Return bool, error from checkAddError so we can properly handle
ErrNotFound for AddFast. This should in theory never happen, but the
previous code path handled this case. Adds a test for this, which master
passes and the previous commit fails.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Address comment changes.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Move sampleAdded inside the loop iteration within append, since that's
the only block the variable is used in.

Signed-off-by: Callum Styan <callumstyan@gmail.com>
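
A rough sketch of the shape described in the bullets above: a helper that returns `(bool, error)` and receives the sample-limit error by pointer. The sentinel errors and the simplified signature are assumptions made for illustration and do not reproduce the real scrape loop.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinel errors standing in for the storage errors.
var (
	errNotFound    = errors.New("not found")
	errOutOfOrder  = errors.New("out of order sample")
	errSampleLimit = errors.New("sample limit exceeded")
)

// checkAddError classifies the error returned by an append call.
// The bool reports whether the sample counts as added; the returned
// error is non-nil only when the caller has to react to it.
func checkAddError(err error, sampleLimitErr *error) (bool, error) {
	switch {
	case err == nil:
		return true, nil
	case errors.Is(err, errNotFound):
		// Surface this so the caller can fall back to a full add.
		return false, errNotFound
	case errors.Is(err, errOutOfOrder):
		// Skip the sample but keep processing the scrape.
		return false, nil
	case errors.Is(err, errSampleLimit):
		// Remember the limit error and keep parsing.
		*sampleLimitErr = err
		return false, nil
	default:
		return false, err
	}
}

func main() {
	var sampleLimitErr error
	added := 0
	for _, err := range []error{nil, errOutOfOrder, errSampleLimit, nil} {
		ok, abortErr := checkAddError(err, &sampleLimitErr)
		if abortErr != nil {
			fmt.Println("abort:", abortErr)
			break
		}
		if ok {
			added++
		}
	}
	fmt.Println("added:", added, "limit error:", sampleLimitErr)
}
```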
2020-03-25 19:31:48 -07:00
Julien Pivotto ceef10cee4 Reset comment
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-03-26 00:17:56 +01:00
Julien Pivotto 73228b1b68 Those links should not be reverted
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-03-25 20:37:26 +01:00
Julien Pivotto 653f343547 Revert head posting optimization
This reverts commit 52630ad0c7.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-03-25 20:19:33 +01:00
Bartlomiej Plotka b1fcfcf9c4
release: Volunteering for 2.18 release. (#7046)
👋 

I am happy to do the next release. We have spent a lot of time in recent weeks on query optimizations in Prometheus, Thanos, and Cortex, so the 2.18 release might have significant changes like:

* Chunk Iterators finally ❤️ 
* @pstibrany and @pracucci major optimizations to postings
* Potentially https://github.com/prometheus/prometheus/issues/6878 with @mkabischev's help.

I know I was a release shepherd just in December, but given that I have full context for those, I feel like I can help a lot in releasing 2.18.
2020-03-25 18:43:42 +00:00
Callum Styan be13a4ba7e
Compare querier storage to primary storage via reflect.DeepEqual. (#7050)
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2020-03-25 18:42:32 +00:00
Julien Pivotto 12d53dde55
Merge pull request #7044 from prometheus/release-2.17
Backport Release 2.17.0 into master
2020-03-24 21:41:05 +01:00
Bartlomiej Plotka d5c33877f9
storage: Added Chunks{Queryable/Querier/SeriesSet/Series/Iteratable}. Added generic Merge{SeriesSet/Querier} implementation. (#7005)
* storage: Added Chunks{Queryable/Querier/SeriesSet/Series/Iteratable}. Added generic Merge{SeriesSet/Querier} implementation.

## Rationales:

In many places (e.g. chunk Remote read, Thanos Receive fetching chunks from TSDB), we operate on encoded chunks, not samples.
This means that we unnecessarily decode and re-encode, wasting CPU, time and memory.
This PR adds chunk iterator interfaces and makes the merge code reusable between both series sets.

I will make use of it in a following PR inside tsdb itself. For now, fanout and the mergers implement it.

All merges now also allow passing series mergers. This opens the door to custom deduplication other than the TSDB vertical one (e.g. the offline one we have in Thanos).

## Changes

* Added Chunk versions of all iterating methods. It all starts in Querier/ChunkQuerier. The plan is that
Storage will implement both the chunk and the sample variants.
* Added Seek to chunks.Iterator interface for iterating over chunks.
* NewMergeChunkQuerier was added; both this and NewMergeQuerier now use genericMergeQuerier to share the code. Generic code was added.
* Improved tests.
* Added some TODOs for further simplifications in next PRs.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed Brian's comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Moved s/Labeled/SeriesLabels as per Krasi's suggestion.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed Krasi's comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Second iteration of Krasi comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Another round of comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-03-24 20:15:47 +00:00
Julien Pivotto 39e01b369d
Release 2.17.0 (#7034)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-03-24 17:30:41 +01:00
Ben Kochie fac7a4a050
Merge pull request #7037 from prometheus/bjk/golint
Enable golint in CI
2020-03-24 09:20:08 +01:00
Ben Kochie 269e7c8091
Fix golint issues.
Signed-off-by: Ben Kochie <superq@gmail.com>
2020-03-23 20:38:43 +01:00
Ganesh Vernekar 6fdc852813
Fix TestHeadDeleteSimple to test reloaded Head too (#7021)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-03-23 16:55:25 +02:00
Ben Kochie 51057daaa4
Enable golint in CI
Enable the golint module in golangci-lint.

Fixes: https://github.com/prometheus/prometheus/issues/4125

Signed-off-by: Ben Kochie <superq@gmail.com>
2020-03-23 15:32:37 +01:00
Ganesh Vernekar e64a149984
Close Head in DBReadOnly.FlushWAL (#7022)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-03-23 14:49:44 +05:30
zhulongcheng e813f60fd6
tsdb: fix sequence check for WAL segments (#7032)
Signed-off-by: zhulongcheng <zhulongcheng.dev@gmail.com>
2020-03-23 13:16:28 +05:30
Ben Kochie 24ecae9956
Update yarn deps (#7027)
Fix security warnings.

Signed-off-by: Ben Kochie <superq@gmail.com>
2020-03-22 20:38:48 +01:00
zhulongcheng dbb8f5861d
tsdb: add tombstonesHeaderSize constant (#7028)
Signed-off-by: zhulongcheng <zhulongcheng.dev@gmail.com>
2020-03-22 12:59:35 +05:30
Julien Pivotto f1984bb007
Merge pull request #7025 from prometheus/release-2.17
Merge release 2.17 into master
2020-03-21 21:39:07 +01:00
Julien Pivotto d47bdb9d12
Release 2.17.0-rc.4 (#7016)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-03-21 20:04:16 +01:00
Bartlomiej Plotka 85a2e94987
Merge pull request #7018 from roidelapluie/backport6971
Backport #6971 to release-2.17
2020-03-21 15:31:32 +00:00
johncming 51c824543b
fix bug missing an error. (#7020)
Signed-off-by: johncming <johncming@yahoo.com>
2020-03-21 12:05:19 +00:00
johncming bdc45c2b9e
remove unused code. (#7019)
Signed-off-by: johncming <johncming@yahoo.com>
2020-03-21 07:15:51 +00:00
beorn7 526cff39b9 Fix tests that were broken by #7009
Signed-off-by: beorn7 <beorn@grafana.com>
2020-03-20 21:22:58 +01:00
Bartlomiej Plotka c4eefd1b3a storage: Removed SelectSorted method; Simplified interface; Added requirement for remote read to sort response.
This is technically a BREAKING CHANGE, but it was like this from the beginning: I just noticed that in
Prometheus we rely on remote read being sorted. This is because we use the data selected from remote reads in MergeSeriesSet,
which relies on sorting.

I found during work on https://github.com/prometheus/prometheus/pull/5882 that
we have so many repetitions because of this, for no good reason. I think
I found a good balance between convenience and readability with just one method.
The smaller the interface, the better.

Also, I don't know what TestSelectSorted was testing, but now it's testing sorting.
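
Why a merge depends on sorted input can be seen in a small two-way merge. This is a simplified sketch over plain strings standing in for series label sets. `mergeSorted` is a hypothetical helper, not the MergeSeriesSet implementation.

```go
package main

import "fmt"

// mergeSorted merges two ascending slices into one ascending slice and
// drops entries present in both. The result is only correct when both
// inputs are already sorted, because the merge only ever compares the
// current heads of the two inputs.
func mergeSorted(a, b []string) []string {
	out := make([]string, 0, len(a)+len(b))
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] < b[j]:
			out = append(out, a[i])
			i++
		case a[i] > b[j]:
			out = append(out, b[j])
			j++
		default: // same key in both inputs, emit it once
			out = append(out, a[i])
			i++
			j++
		}
	}
	out = append(out, a[i:]...)
	out = append(out, b[j:]...)
	return out
}

func main() {
	// Sorted inputs merge into a sorted, de-duplicated result.
	fmt.Println(mergeSorted([]string{"a", "c"}, []string{"b", "c"}))
	// An unsorted input silently yields an unsorted result.
	fmt.Println(mergeSorted([]string{"c", "a"}, []string{"b"}))
}
```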

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-03-20 21:14:43 +01:00
Julien Pivotto d6ad5551c9
Scrape: do not put staleness marker when cache is reused (#7011)
* Scrape: do not put staleness marker when cache is reused

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-03-20 17:43:26 +01:00
Callum Styan f802f1e8ca
Fix bug with WAL watcher and Live Reader metrics usage. (#6998)
* Fix bug with WAL watcher and Live Reader metrics usage.

Calling NewXMetrics when creating a Watcher or LiveReader results in a
registration error, which we were ignoring. As a result, we had no metrics
for any Watcher/Reader other than the first one created. So we would
only have metrics like Watcher Records Read for the first remote write
config in a user's config file.
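
The failure mode can be sketched with a toy registry that rejects duplicate names. Everything here (`registry`, `newWatcherMetrics`, `watcher`) is a hypothetical stand-in and not the Prometheus client library or the actual fix. It only shows why creating the metrics once and sharing them avoids the ignored registration error.

```go
package main

import (
	"errors"
	"fmt"
)

// registry is a toy stand-in for a metrics registry that refuses to
// register the same metric name twice.
type registry struct{ names map[string]bool }

func newRegistry() *registry { return &registry{names: map[string]bool{}} }

func (r *registry) register(name string) error {
	if r.names[name] {
		return errors.New("duplicate registration: " + name)
	}
	r.names[name] = true
	return nil
}

// watcherMetrics holds the counters shared by all watchers.
type watcherMetrics struct{ recordsRead int }

// newWatcherMetrics registers the metrics and returns them. Calling it a
// second time against the same registry fails.
func newWatcherMetrics(r *registry) (*watcherMetrics, error) {
	if err := r.register("watcher_records_read"); err != nil {
		return nil, err
	}
	return &watcherMetrics{}, nil
}

type watcher struct {
	name    string
	metrics *watcherMetrics
}

func main() {
	// Buggy pattern: every watcher builds its own metrics on a shared registry.
	reg := newRegistry()
	_, err1 := newWatcherMetrics(reg)
	_, err2 := newWatcherMetrics(reg)
	fmt.Println("first:", err1, "second:", err2)

	// Fixed pattern: build the metrics once and hand them to every watcher.
	shared, err := newWatcherMetrics(newRegistry())
	if err != nil {
		panic(err)
	}
	a := watcher{name: "remote-a", metrics: shared}
	b := watcher{name: "remote-b", metrics: shared}
	a.metrics.recordsRead++
	b.metrics.recordsRead++
	fmt.Println(a.name, b.name, "records read:", shared.recordsRead)
}
```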

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2020-03-20 17:34:15 +01:00
Brian Brazil 445d48f4ce
Fix small docs typo (#7014)
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2020-03-20 12:11:32 +01:00
qinng e31b7b2679
[Doc] Fix wrong description in kubernetes example (#7012)
Signed-off-by: guoruyi1 <guoruyi1@xiaomi.com>

Co-authored-by: guoruyi1 <guoruyi1@xiaomi.com>
2020-03-20 08:03:43 +00:00
Bartlomiej Plotka 8fa4ada9ae
Merge pull request #7010 from prometheus/beorn7/fix-test
Fix tests that were broken by #7009
2020-03-19 17:41:02 +00:00
Ganesh Vernekar e50fdbc70c
Live m-mapping of chunks on disk (#6830)
* Live m-mapping of chunks on disk

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix review comments

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix review comments Part 2

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix review comments Part 3

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix review comments Part 4

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Attempt to fix windows bug

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-03-19 22:03:44 +05:30
beorn7 c0ecbb38af Fix tests that were broken by #7009
Signed-off-by: beorn7 <beorn@grafana.com>
2020-03-19 16:28:23 +01:00
Björn Rabenstein 1da83305be
Merge pull request #7009 from prometheus/release-2.17
Merge release-2.17 into master
2020-03-19 13:46:28 +01:00
johncming bbacd2dd09
remove needless break. (#7008)
Signed-off-by: johncming <johncming@yahoo.com>
2020-03-19 11:21:00 +00:00
Julien Pivotto 7920305e4e
release 2.17.0-rc.3 (#7006)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-03-18 22:36:59 +01:00
Julien Pivotto 7f86017126
Release 2.17.0-rc.2 (#7003)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-03-18 17:32:31 +01:00
zhulongcheng 5f5c7a4477
tsdb: sort checkpoints by segment number (#6987)
Signed-off-by: zhulongcheng <zhulongcheng.dev@gmail.com>
2020-03-18 20:40:41 +05:30
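
For the "sort checkpoints by segment number" change above, the difference between lexicographic and numeric ordering can be sketched like this. The `checkpoint.` prefix handling and both helper names are assumptions made for illustration and are not taken from the tsdb code.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

const checkpointPrefix = "checkpoint."

// checkpointIndex extracts the numeric suffix of a checkpoint name.
func checkpointIndex(name string) (int, error) {
	if !strings.HasPrefix(name, checkpointPrefix) {
		return 0, fmt.Errorf("not a checkpoint: %q", name)
	}
	return strconv.Atoi(name[len(checkpointPrefix):])
}

// sortCheckpoints orders checkpoint names by their numeric suffix rather
// than by plain string comparison.
func sortCheckpoints(names []string) ([]string, error) {
	type ref struct {
		name string
		idx  int
	}
	refs := make([]ref, 0, len(names))
	for _, n := range names {
		idx, err := checkpointIndex(n)
		if err != nil {
			return nil, err
		}
		refs = append(refs, ref{name: n, idx: idx})
	}
	sort.Slice(refs, func(i, j int) bool { return refs[i].idx < refs[j].idx })
	out := make([]string, len(refs))
	for i, r := range refs {
		out[i] = r.name
	}
	return out, nil
}

func main() {
	names := []string{"checkpoint.1000000", "checkpoint.999999"}

	// Plain string sorting puts 1000000 before 999999.
	lexical := append([]string(nil), names...)
	sort.Strings(lexical)
	fmt.Println("lexicographic:", lexical)

	numeric, err := sortCheckpoints(names)
	if err != nil {
		panic(err)
	}
	fmt.Println("numeric:", numeric)
}
```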