Commit graph

492 commits

Author SHA1 Message Date
Ganesh Vernekar 23ce9ad9f0
Introduce evaluation delay for rule groups (#155)
* Allow having evaluation delay for rule groups

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix lint

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Move the option to ManagerOptions

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Include evaluation_delay in the group config

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2022-03-14 13:20:07 +00:00
Marco Pracucci f644c5867f
Make linter happy
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2022-02-10 15:32:02 +01:00
Marco Pracucci 583e746a82
Fix error reported by asyncBlockWriter.addSeries()
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2022-02-10 15:07:13 +01:00
Peter Štibraný cd86e92b74
Merge pull request #132 from grafana/reintroduce-old-chunk-mapper
Partial revert of PR 109 – This reintroduces old chunk disk mapper without a queue, that is used when queue size is configured to 0.
2022-02-10 14:33:35 +01:00
Mauro Stettler ff2443c6b9
addressing PR feedback
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
2022-01-12 20:06:32 +00:00
Mauro Stettler 33a0fb58d5
fixing merge mistakes
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
2022-01-12 19:21:46 +00:00
Mauro Stettler 6ed72dadca
fixing merge mistakes
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
2022-01-12 18:49:24 +00:00
Mauro Stettler f4d628d419
resolving merge conflicts 2022-01-10 19:41:06 +00:00
Mauro Stettler 0df3489275
Write chunks via queue, predicting the refs (#10051)
* Write chunks via queue, predicting the refs

Our load tests have shown that there is a latency spike in the
remote write handler whenever the head chunks need to be written,
because chunkDiskMapper.WriteChunk() blocks until the chunks are written
to disk.

This adds a queue to the chunk disk mapper which makes the WriteChunk()
method non-blocking unless the queue is full. Reads can still be served
from the queue.

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* address PR feeddback

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* initialize metrics without .Add(0)

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* change isRunningMtx to normal lock

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* do not re-initialize chunkrefmap

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* update metric outside of lock scope

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* add benchmark for adding job to chunk write queue

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* remove unnecessary "success" var

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* gofumpt -extra

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* avoid WithLabelValues call in addJob

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* format comments

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* addressing PR feedback

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* rename cutExpectRef to cutAndExpectRef

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* use head.Init() instead of .initTime()

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* address PR feedback

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* PR feedback

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* update test according to PR feedback

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* replace callbackWg -> awaitCb

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* better test of truncation with empty files

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* replace callbackWg -> awaitCb

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
2022-01-10 13:36:45 +00:00
Nick Pillitteri fc643a4854 Merge remote-tracking branch 'upstream/main' into 56quarters/upstream-update 2022-01-07 10:18:20 -05:00
Oleg Zaytsev a83d46ee9c
Tidy postingsWithIndexHeap (#10123)
Unexported postingsWithIndexHeap's methods that don't need to be
exported, and added detailed comments.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2022-01-06 16:03:44 +05:30
Bryan Boreham 82860a770c
tsdb: use simpler map key to improve exemplar ingest performance (#10111)
* tsdb: fix exemplar benchmarks

Go benchmarks are expected to do an amount of work that varies with
the `b.N` parameter. Previously these benchmarks would report a result
like 0.01 ns/op, which is nonsense.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* tsdb: use simpler map key to improve exemplar perf

Prometheus holds an index of exemplars so it can discard the oldest one
for a series when a new one is added.
Since the keys are not for human eyes, we can use a simpler format
and save the effort of quoting label values.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Exemplars: allocate index map with estimated size

This avoids Go having to re-size the map several times as it grows.
16 exemplars per series is a guess; if it is too low then the map will
be sparse, while if it is too high then the map will have to resize once
or twice.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-01-06 15:58:58 +05:30
Oleg Zaytsev 701545286d
Pop intersected postings heap without popping (#10092)
See this comment for detailed explanation:

https://github.com/prometheus/prometheus/pull/9907#issuecomment-1002189932

TL;DR: if we don't call Pop() on the heap implementation, we don't need
to return our param as an `interface{}` so we save an allocation.
This would be popped for every label value, so it can be thousands of
saved allocations here (see benchmarks).

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2022-01-05 16:16:43 +05:30
Peter Štibraný 61e6173900 Merge remote-tracking branch 'upstream/main' into upgrade-prometheus 2022-01-05 10:44:37 +01:00
Peter Štibraný e51a17b501
CompactBlockMetas should produce correct mint/maxt for overlapping blocks. (#10108)
Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
2022-01-05 15:10:00 +05:30
Oleg Zaytsev 3f510ea6c9 Merge prometheus/a961062 2022-01-03 12:00:39 +01:00
Oleg Zaytsev 3947238ce0
Label values with matchers by intersecting postings (#9907)
* LabelValues w/matchers by intersecting postings

Instead of iterating all matched series to find the values, this
checks if each one of the label values is present in the matched series
(postings).

Pending to be benchmarked.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Benchmark labelValuesWithMatchers

name                                                                   old time/op    new time/op
Querier/Head/labelValuesWithMatchers/i_with_n="1"                         157ms ± 0%        48ms ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="^.+$"                      1.80s ± 0%       0.46s ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="1",j!="foo"                144ms ± 0%        57ms ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="1",i=~"^.*$",j!="foo"      304ms ± 0%       111ms ± 0%
Querier/Head/labelValuesWithMatchers/n_with_j!="foo"                      761ms ± 0%       164ms ± 0%
Querier/Head/labelValuesWithMatchers/n_with_i="1"                        6.11µs ± 0%      6.62µs ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1"                        117ms ± 0%        62ms ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="^.+$"                     1.44s ± 0%       0.24s ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1",j!="foo"              92.1ms ± 0%      70.3ms ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1",i=~"^.*$",j!="foo"     196ms ± 0%       115ms ± 0%
Querier/Block/labelValuesWithMatchers/n_with_j!="foo"                     1.23s ± 0%       0.21s ± 0%
Querier/Block/labelValuesWithMatchers/n_with_i="1"                       1.06ms ± 0%      0.88ms ± 0%

name                                                                   old alloc/op   new alloc/op
Querier/Head/labelValuesWithMatchers/i_with_n="1"                        29.5MB ± 0%      26.9MB ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="^.+$"                     46.8MB ± 0%     251.5MB ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="1",j!="foo"               29.5MB ± 0%      22.3MB ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="1",i=~"^.*$",j!="foo"     46.8MB ± 0%      23.9MB ± 0%
Querier/Head/labelValuesWithMatchers/n_with_j!="foo"                     10.3kB ± 0%  138535.2kB ± 0%
Querier/Head/labelValuesWithMatchers/n_with_i="1"                        5.54kB ± 0%      7.09kB ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1"                       39.1MB ± 0%      28.5MB ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="^.+$"                     287MB ± 0%       253MB ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1",j!="foo"              34.3MB ± 0%      23.9MB ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1",i=~"^.*$",j!="foo"    51.6MB ± 0%      25.5MB ± 0%
Querier/Block/labelValuesWithMatchers/n_with_j!="foo"                     144MB ± 0%       139MB ± 0%
Querier/Block/labelValuesWithMatchers/n_with_i="1"                       6.43kB ± 0%      8.66kB ± 0%

name                                                                   old allocs/op  new allocs/op
Querier/Head/labelValuesWithMatchers/i_with_n="1"                          104k ± 0%        500k ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="^.+$"                       204k ± 0%        600k ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="1",j!="foo"                 104k ± 0%        500k ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="1",i=~"^.*$",j!="foo"       204k ± 0%        500k ± 0%
Querier/Head/labelValuesWithMatchers/n_with_j!="foo"                       66.0 ± 0%       255.0 ± 0%
Querier/Head/labelValuesWithMatchers/n_with_i="1"                          61.0 ± 0%       205.0 ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1"                         304k ± 0%        600k ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="^.+$"                     5.20M ± 0%       0.70M ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1",j!="foo"                204k ± 0%        600k ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1",i=~"^.*$",j!="foo"      304k ± 0%        600k ± 0%
Querier/Block/labelValuesWithMatchers/n_with_j!="foo"                     3.00M ± 0%       0.00M ± 0%
Querier/Block/labelValuesWithMatchers/n_with_i="1"                         61.0 ± 0%       247.0 ± 0%

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Don't expand postings to intersect them

Using a min heap we can check whether matched postings intersect with
each one of the label values postings. This avoid expanding postings
(and thus having all of them in memory at any point).

Slightly slower than the expanding postings version for some cases, but
definitely pays the price once the cardinality grows.

Still offers 10x latency improvement where previous latencies were
reaching 1s.

Benchmark results:

name \ time/op                                                         old.txt      intersect.txt    intersect_noexpand.txt
Querier/Head/labelValuesWithMatchers/i_with_n="1"                       157ms ± 0%        48ms ± 0%              110ms ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="^.+$"                    1.80s ± 0%       0.46s ± 0%              0.18s ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="1",j!="foo"              144ms ± 0%        57ms ± 0%              125ms ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="1",i=~"^.*$",j!="foo"    304ms ± 0%       111ms ± 0%              177ms ± 0%
Querier/Head/labelValuesWithMatchers/n_with_j!="foo"                    761ms ± 0%       164ms ± 0%              134ms ± 0%
Querier/Head/labelValuesWithMatchers/n_with_i="1"                      6.11µs ± 0%      6.62µs ± 0%             4.29µs ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1"                      117ms ± 0%        62ms ± 0%              120ms ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="^.+$"                   1.44s ± 0%       0.24s ± 0%              0.15s ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1",j!="foo"            92.1ms ± 0%      70.3ms ± 0%            125.4ms ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1",i=~"^.*$",j!="foo"   196ms ± 0%       115ms ± 0%              170ms ± 0%
Querier/Block/labelValuesWithMatchers/n_with_j!="foo"                   1.23s ± 0%       0.21s ± 0%              0.14s ± 0%
Querier/Block/labelValuesWithMatchers/n_with_i="1"                     1.06ms ± 0%      0.88ms ± 0%             0.92ms ± 0%

name \ alloc/op                                                        old.txt      intersect.txt    intersect_noexpand.txt
Querier/Head/labelValuesWithMatchers/i_with_n="1"                      29.5MB ± 0%      26.9MB ± 0%             19.1MB ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="^.+$"                   46.8MB ± 0%     251.5MB ± 0%             36.3MB ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="1",j!="foo"             29.5MB ± 0%      22.3MB ± 0%             19.1MB ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="1",i=~"^.*$",j!="foo"   46.8MB ± 0%      23.9MB ± 0%             20.7MB ± 0%
Querier/Head/labelValuesWithMatchers/n_with_j!="foo"                   10.3kB ± 0%  138535.2kB ± 0%              6.4kB ± 0%
Querier/Head/labelValuesWithMatchers/n_with_i="1"                      5.54kB ± 0%      7.09kB ± 0%             4.30kB ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1"                     39.1MB ± 0%      28.5MB ± 0%             20.7MB ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="^.+$"                   287MB ± 0%       253MB ± 0%               38MB ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1",j!="foo"            34.3MB ± 0%      23.9MB ± 0%             20.7MB ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1",i=~"^.*$",j!="foo"  51.6MB ± 0%      25.5MB ± 0%             22.3MB ± 0%
Querier/Block/labelValuesWithMatchers/n_with_j!="foo"                   144MB ± 0%       139MB ± 0%                0MB ± 0%
Querier/Block/labelValuesWithMatchers/n_with_i="1"                     6.43kB ± 0%      8.66kB ± 0%             5.86kB ± 0%

name \ allocs/op                                                       old.txt      intersect.txt    intersect_noexpand.txt
Querier/Head/labelValuesWithMatchers/i_with_n="1"                        104k ± 0%        500k ± 0%               300k ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="^.+$"                     204k ± 0%        600k ± 0%               400k ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="1",j!="foo"               104k ± 0%        500k ± 0%               300k ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="1",i=~"^.*$",j!="foo"     204k ± 0%        500k ± 0%               300k ± 0%
Querier/Head/labelValuesWithMatchers/n_with_j!="foo"                     66.0 ± 0%       255.0 ± 0%              139.0 ± 0%
Querier/Head/labelValuesWithMatchers/n_with_i="1"                        61.0 ± 0%       205.0 ± 0%               87.0 ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1"                       304k ± 0%        600k ± 0%               400k ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="^.+$"                   5.20M ± 0%       0.70M ± 0%              0.50M ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1",j!="foo"              204k ± 0%        600k ± 0%               400k ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1",i=~"^.*$",j!="foo"    304k ± 0%        600k ± 0%               400k ± 0%
Querier/Block/labelValuesWithMatchers/n_with_j!="foo"                   3.00M ± 0%       0.00M ± 0%              0.00M ± 0%
Querier/Block/labelValuesWithMatchers/n_with_i="1"                       61.0 ± 0%       247.0 ± 0%              129.0 ± 0%

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Apply comment suggestions from the code review

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

* Change else { if } to else if

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Remove sorting of label values

We were not sorting them before, so no need to sort them now

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
2021-12-28 15:59:03 +01:00
Ganesh Vernekar 6c7577177c
Merge pull request #10056 from charlesxsh/fix-TestWALRestoreCorrupted
fix potential goroutine leaks at TestWALRestoreCorrupted
2021-12-21 14:54:45 +05:30
Shihao Xia 3696d7dedb fix potential goroutine leaks
Signed-off-by: Shihao Xia <charlesxsh@hotmail.com>
2021-12-17 18:35:30 -05:00
Julien Pivotto 27343277fa
Merge release-2.32 forward into main (#10032)
* storage: expose bug in iterators #10027

Signed-off-by: beorn7 <beorn@grafana.com>

* storage: fix bug #10027 in iterators' Seek method

Signed-off-by: beorn7 <beorn@grafana.com>

* Append reporting metrics without limit

If reporting metrics fails due to reaching the limit, this makes the
target appear as UP in the UI, but the metrics are missing.

This commit bypasses that limit for report metrics.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>

* Remove check against cfg so interval/ timeout are always set (#10023) (#10031)

Signed-off-by: Nicholas Blott <blottn@tcd.ie>

Co-authored-by: Nicholas Blott <blottn@tcd.ie>

* Cut v2.32.1

Signed-off-by: Julius Volz <julius.volz@gmail.com>

* Apply suggestions from code review

Signed-off-by: Julius Volz <julius.volz@gmail.com>

Co-authored-by: Levi Harrison <git@leviharrison.dev>

Co-authored-by: Julien Pivotto <roidelapluie@inuits.eu>
Co-authored-by: Nicholas Blott <blottn@tcd.ie>
Co-authored-by: Julius Volz <julius.volz@gmail.com>
Co-authored-by: Levi Harrison <git@leviharrison.dev>
2021-12-17 23:18:38 +01:00
beorn7 0ede6ae321 storage: fix bug #10027 in iterators' Seek method
Signed-off-by: beorn7 <beorn@grafana.com>
2021-12-16 12:07:35 +01:00
Ganesh Vernekar fbe3167fda
Support having an empty queue for writing m-map chunks immediately (#84)
* Support having an empty queue for flushing m-map chunks

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Use old ChunkDiskMapper for 0 length

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix tests

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix lint

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-12-10 17:30:34 +05:30
Marco Pracucci 0b925be33d
Nits after PR 57 merge (#85)
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2021-12-10 15:40:04 +05:30
Ganesh Vernekar df36efb98a
Merge remote-tracking branch 'upstream/main' into syncp 2021-12-09 16:07:02 +05:30
Ganesh Vernekar 227d155e3f
Fix queries after a failed snapshot replay (#9980)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-12-09 15:15:27 +05:30
Ganesh Vernekar 05d4d97bcd
Fix queries after a failed snapshot replay (#9980)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-12-08 15:32:14 +00:00
Mauro Stettler cc9ddf50c9
Write m-map chunks in async via a queue while predicting the m-map chunk refs (#57)
* write chunks via queue, predicting chunkrefs

* fixes

* linter

* cleanup

* better test coverage

* deduplicate operations when cutting file

* better comments

* fix race

* improvements based on PR feedback

* comments

* concurrent correctness test of write queue

* linter

* use isStarted flag in chunk write queue

* separate mutex for isStarted

* fixing tests

* fix race in test

* PR feedback

* add license headers

* PR feedback

* pr feedback

* dont recreate workerCtrl channel

* address PR feedback in high concurrency read and write test

* refactor the high concurrency write queue test

* pr feedback

* use require.Eventually

* typo

* Fixing comment

Co-authored-by: Peter Štibraný <peter.stibrany@grafana.com>

* use buffered chan for chunk write queue

* fix races

* address feedback on PR

* better comment

* less type casts

* dont reeimplement varint

* move mtx above property it protects

* improve tests by using wg instead of inspecting queue

* add comment

* add comment

* writeChunkF comment

* rename isStarted -> isRunning

* prevent panic when double stopping

* fix variable name in comment

* delete outdated comment

* naming and comment fixes in TestChunkWriteQueue_WrappingAroundSizeLimit

* use require.False instead of require.Equal(t, false

* always enable chunk write queue

* dont call callback with lock held

* undo rename of curFileSequence to curFileSeq

* fix accidental change

* fix comment

* Better syntax

Co-authored-by: Peter Štibraný <peter.stibrany@grafana.com>

* Better wording in comment

Co-authored-by: Peter Štibraný <peter.stibrany@grafana.com>

* better comments to explain the use of coordinator chans in test

* fix test which broke to the async writing of chunks

* undo previous renaming of shadowing variable

* rename var threadID to workerID

* rename variables based on PR feedback

* rename pos -> ts

* better implementation of whileNotCanceled

* update querySeriesRef before using it

* rename labels -> lbls

* initialize the chunk write queue operations metric with all possible labels

* dont return error from CutNewFile

* recreate chunk ref map regularly to free

* limit how often we recreate the chunk ref map

* fix typo in comment

* better comment

Co-authored-by: Peter Štibraný <peter.stibrany@grafana.com>
2021-12-08 13:14:26 +05:30
Nick Pillitteri 084bd70708
Convert atomic Int64 to native type when logging value (#9938)
The atomic Int64 type wasn't able to be represented when logging
via go-kit log and ended up as `"unsupported value type"`.

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
2021-12-06 22:25:22 +01:00
Peter Štibraný c31dd6c8b5
Compactor: Open blocks concurrently (#67)
* Open blocks concurrently.
2021-12-02 11:42:29 +00:00
Ganesh Vernekar 2fe806f93f
Register the out of order histogram (#68)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-12-02 11:41:16 +00:00
Peter Štibraný cc9bc8fe9f
Introduced some options for compactor concurrency (#66)
* Tool for CLI compactions.
* Use concurrency when populating symbols for multiple blocks.
* Use concurrency when writing to multiple output blocks.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
2021-12-02 10:34:52 +01:00
Robert Fratto 4cbddb41eb
tsdb/agent: Synchronize appender code with grafana/agent main (#9664)
* tsdb/agent: synchronize appender code with grafana/agent main

This commit synchronize the appender code with grafana/agent main. This
includes adding support for appending exemplars.

Closes #9610

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb/agent: fix build error

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb/agent: introduce some exemplar tests, refactor tests a little

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb/agent: address review feedback

- Re-use hash when creating a new series
- Fix typo in exemplar append error

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb/agent: remove unused AddFast method

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb/agent: close wal reader after test

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb/agent: add out-of-order tracking, change series TS in commit

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* address review feedback

Signed-off-by: Robert Fratto <robertfratto@gmail.com>
2021-11-30 21:14:40 +05:30
Dieter Plaetinck 757f57e509
track OOO-ness in tsdb into a histogram (#56)
* track OOO-ness in tsdb into a histogram

This should help with building an understanding of customer requirements
as far as how far back samples tend to go.
Note: this has already been discussed, reviewed and approved on:
https://github.com/grafana/mimir/pull/488

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* update the test files which were not in mimir/vendor
2021-11-29 16:44:35 +05:30
Björn Rabenstein b866db009b
storage: Fix and improve the Seek method of various iterators (#9878)
There was a subtle and nasty bug in listSeriesIterator.Seek.

In addition, the Seek call is defined to be a no-op if the current
position of the iterator is already pointing to a suitable
sample. This commit adds fast paths for this case to several
potentially expensive Seek calls.

Another bug was in concreteSeriesIterator.Seek. It always searched the
whole series and not from the current position of the iterator.

Signed-off-by: beorn7 <beorn@grafana.com>
2021-11-29 15:17:56 +05:30
Marco Pracucci 9c4410a4b9
Merge remote-tracking branch 'prometheus/main' into upgrade-upstream 2021-11-25 12:01:25 +01:00
Bryan Boreham 1b74a3812e
Fix panic, out of order chunks, and race warning during WAL replay (#9856)
* Fix panic on WAL replay

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Refactor: introduce walSubsetProcessor

walSubsetProcessor packages up the `processWALSamples()` function and
its input and output channels, helping to clarify how these things
relate.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Refactor: extract more methods onto walSubsetProcessor

This makes the main logic easier to follow.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Fix race warning by locking processWALSamples

Although we have waited for the processor to finish, we still get a
warning from the race detector because it doesn't know how the different
parts relate.

Add a lock round each batch of samples, so the race detector can see
that we never access series owned by the processor outside of a lock.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Added test to reproduce issue 9859

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Remove redundant unit test

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix out of order chunks during WAL replay

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix nits

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>
2021-11-25 13:36:14 +05:30
Marco Pracucci 9662f60f13
Add debug log for out-of-order chunks (#51)
* Add log to debug out-of-order chunks when compacting

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Revert test changes

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Do not use golang 1.17 specific functions

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2021-11-24 16:18:24 +01:00
Oleg Zaytsev 5e746e4e88
Check postings bytes length when decoding (#9766)
Added validation to expected postings length compared to the bytes slice
length. With 32bit postings, we expect to have 4 bytes per each posting.
If the number doesn't add up, we know that the input data is not
compatible with our code (maybe it's cut, or padded with trash, or even
written in a different coded).

This is needed in downstream projects to correctly identify cached
postings written with an unknown codec, but it's also a good idea to
validate it here.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2021-11-24 15:26:37 +05:30
Peter Štibraný f93d38aca7 Merge remote-tracking branch 'upstream/main' into merge-upstream 2021-11-19 12:11:26 +01:00
Darshan Chaudhary 9dcf8b2208
Add the ability to disable tsdb isolation (#9270)
* Disable isolation in isolation struct

Signed-off-by: darshanime <deathbullet@gmail.com>

* Run tsdb tests with isolation disabled

Signed-off-by: darshanime <deathbullet@gmail.com>

* Check for isolation disabled in isoState.Close()

Signed-off-by: darshanime <deathbullet@gmail.com>

* use t.Skip to skip isolation tests when disabled

Signed-off-by: darshanime <deathbullet@gmail.com>

* address review comments

Signed-off-by: darshanime <deathbullet@gmail.com>

* fix test for defaultIsolationState

Signed-off-by: darshanime <deathbullet@gmail.com>

* Change flag name. Set flag in DB. Do not init txRing. Close isoState.

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Test disabled isolation in CircleCI test_go

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Skip isolation related tests in db_test.go

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-11-19 15:41:32 +05:30
Peter Štibraný c73dcb6259 Fix test. 2021-11-18 16:21:16 +01:00
Peter Štibraný 294c155bb6 Merge remote-tracking branch 'upstream/main' into merge-upstream 2021-11-18 15:48:40 +01:00
Dieter Plaetinck 067efc3725
clarify HeadChunkID type and usage (#9726)
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
2021-11-17 18:35:10 +05:30
Sunil Thaha a484a83d4a
fix: panic when checkpoint directory is empty (#9687)
Calling `wal.NewSegmentBufReader()` without any segments would cause a
`panic` resulting in prometheus crashing. This patch fixes the panic by
making segmentBufReader return a EOF if there are not segments.

This also means an empty checkpoint directory which should never be the
case unless it has been tampered with (or has issues due to the
underlying filesystem e.g. NFS) would be ignored by Prometheus and would
continue to run instead of the current behaviour which is to panic.

Fixes: https://github.com/prometheus/prometheus/issues/9605

Signed-off-by: Sunil Thaha <sthaha@redhat.com>
2021-11-17 16:39:04 +05:30
Dieter Plaetinck 0fac9bb859
Add basic initial developer docs for TSDB (#9451)
* Add basic initial developer docs for TSDB

There's a decent amount of content already out there (blog posts,
conference talks, etc), but:
* when they get stale, they don't tend to get updated
* they still leave me with questions that I'ld like to answer
  for developers (like me) who want to use, or work with, TSDB

What I propose is developer docs inside the prometheus
repository.  Easy to find and harness the power of the community
to expand it and keep it up to date.

* perfect is the enemy of good.  Let's have a base and incrementally improve
* Markdown docs should be broad but not too deep.  Source code comments
  can complement them, and are the ideal place for implementation details.

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* use example code that works out of the box

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* Apply suggestions from code review

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* PR feedback

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* more docs

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* PR feedback

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* Apply suggestions from code review

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Apply suggestions from code review

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

* feedback

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* Update tsdb/docs/usage.md

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

* final tweaks

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* workaround docs versioning issue

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* Move example code to real executable, testable example.

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* cleanup example test and make sure it always reproduces

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* obtain temp dir in a way that works with older Go versions

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* Fix Ganesh's comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-11-17 15:51:27 +05:30
Marco Pracucci 6525385b30
Merge pull request #29 from grafana/add-jitter-to-chunk-end
Add jitter to head chunks flushing
2021-11-16 11:05:07 +01:00
Mauro Stettler 8a4f659126 fix error message
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
2021-11-12 21:45:46 +01:00
Robert Fratto 72a9f7fee9
Share TSDB locker code with agent (#9623)
* share tsdb db locker code with agent

Closes #9616

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* add flag to disable lockfile for agent

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* use agentOnlySetting instead of PreAction

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb: address review feedback

1. Rename Locker to DirLocker
2. Move DirLocker to tsdb/tsdbutil
3. Name metric using fmt.Sprintf
4. Refine error checking in DirLocker test

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb: create test utilities to assert expected DirLocker behavior

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb/tsdbutil: fix lint errors

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb/agent: fix windows test failure

Use new DB variable instead of overriding the old one.

Signed-off-by: Robert Fratto <robertfratto@gmail.com>
2021-11-11 11:45:25 -05:00
Peter Štibraný 422e7839d4
Add more size checks when writing individual sections in the index. (#9710)
* Add more size checks when writing individual sections in the index.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Use uint and add comment about it.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
2021-11-11 15:44:28 +05:30
Mateusz Gozdek 83086aee00 tsdb/agent: use unique registry per tests
So tests can run in parallel.

Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
2021-11-11 01:37:24 +01:00