Previously we were keeping up to around 6 hours of WAL around by
removing 1/3 every hours. This was excessive, so switch to removing 2/3
which will up to around 3 hours of WAL around.
This will roughly halve the size of the WAL and halve startup time for
those who are I/O bound. This may increase the checkpoint size for
those with certain churn patterns, but by much less than we're saving
from the segments.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
https://github.com/facebook/create-react-app/issues/8689 is causing our
tests to fail in the CI pipeline. As the comments suggest, downgrading
to react-scripts 3.4.0 fixes the problem.
In addition, fix a test warning due to a missing id field.
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
time.Unix attaches the local timezone, which can then
leak out (e.g. in the alert json). While this is harmless,
we should be consistent.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Separate scrape add error checking out into it's own function.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* pass sampleLimitError to checkAddError instead of returning an error
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Return bool, error from checkAddError so we can properly handle
ErrNotFound for AddFast. This should in theory never happen, but the
previous code path handled this case. Adds a test for this, which master
passes and the previous commit fails.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Address comment changes.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Move sampleAdded inside the loop iteration within append, since that's
the only block the variable is used in.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
👋
I am happy to do next release. This is because we spent a lot time recent weeks in query optimizations on Prometheus, Thanos, and Cortex, so 2.18 release might have significant changes like:
* Chunk Iterators finally ❤️
* @pstibrany and @pracucci major optimizations to postings
* Potentially https://github.com/prometheus/prometheus/issues/6878 with @mkabischev help.
I know I was a release Shephard just in December, but given that I have full context for those I feel like I can help in releasing 2.18 a lot.
* storage: Added Chunks{Queryable/Querier/SeriesSet/Series/Iteratable. Added generic Merge{SeriesSet/Querier} implementation.
## Rationales:
In many places (e.g. chunk Remote read, Thanos Receive fetching chunk from TSDB), we operate on encoded chunks not samples.
This means that we unnecessary decode/encode, wasting CPU, time and memory.
This PR adds chunk iterator interfaces and makes the merge code to be reused between both seriesSets
I will make the use of it in following PR inside tsdb itself. For now fanout implements it and mergers.
All merges now also allows passing series mergers. This opens doors for custom deduplications other than TSDB vertical ones (e.g. offline one we have in Thanos).
## Changes
* Added Chunk versions of all iterating methods. It all starts in Querier/ChunkQuerier. The plan is that
Storage will implement both chunked and samples.
* Added Seek to chunks.Iterator interface for iterating over chunks.
* NewMergeChunkQuerier was added; Both this and NewMergeQuerier are now using generigMergeQuerier to share the code. Generic code was added.
* Improved tests.
* Added some TODO for further simplifications in next PRs.
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
* Addressed Brian's comments.
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
* Moved s/Labeled/SeriesLabels as per Krasi suggestion.
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
* Addressed Krasi's comments.
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
* Second iteration of Krasi comments.
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
* Another round of comments.
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
This is technically BREAKING CHANGE, but it was like this from the beginning: I just notice that we rely in
Prometheus on remote read being sorted. This is because we use selected data from remote reads in MergeSeriesSet
which rely on sorting.
I found during work on https://github.com/prometheus/prometheus/pull/5882 that
we do so many repetitions because of this, for not good reason. I think
I found a good balance between convenience and readability with just one method.
Smaller the interface = better.
Also I don't know what TestSelectSorted was testing, but now it's testing sorting.
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
* Fix bug with WAL watcher and Live Reader metrics usage.
Calling NewXMetrics when creating a Watcher or LiveReader results in a
registration error, which we're ignoring, and as a result other than the
first Watcher/Reader created, we had no metrics for either. So we would
only have metrics like Watcher Records Read for the first remote write
config in a users config file.
Signed-off-by: Callum Styan <callumstyan@gmail.com>