Commit graph

23 commits

Author SHA1 Message Date
Bryan Boreham 00ec720c29
tsdb: extract functions to encode and decode labels (#11045)
* tsdb/record: Extract functions to encode and decode labels

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* tsdb: make use of Encode/Decode Labels

Simplify the code by re-using routines from tsdb/record.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-07-26 20:12:00 +05:30
Paschalis Tsilias d1122e0743
Introduce TSDB changes for appending metadata to the WAL (#10972)
* Append metadata to the WAL

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Remove extra whitespace; Reword some docstrings and comments

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Use RLock() for hasNewMetadata check

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Use single byte for metric type in RefMetadata

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Update proposed WAL format for single-byte type metadata

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Implementa MetadataAppender interface for the Agent

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Address first round of review comments

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Amend description of metadata in wal.md

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Correct key used to retrieve metadata from cache

When we're setting metadata entries in the scrapeCace, we're using the
p.Help(), p.Unit(), p.Type() helpers, which retrieve the series name and
use it as the cache key. When checking for cache entries though, we used
p.Series() as the key, which included the metric name _with_ its labels.
That meant that we were never actually hitting the cache. We're fixing
this by utiling the __name__ internal label for correctly getting the
cache entries after they've been set by setHelp(), setType() or
setUnit().

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Put feature behind a feature flag

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Fix AppendMetadata docstring

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Reorder WAL format document

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Change error message of AppendMetadata; Fix access of s.meta in AppendMetadata

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Reuse temporary buffer in Metadata encoder

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Only keep latest metadata for each refID during checkpointing

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Fix test that's referencing decoding metadata

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Avoid creating metadata block if no new metadata are present

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Add tests for corrupt metadata block and relevant record type

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Fix CR comments

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Extract logic about changing metadata in an anonymous function

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Implement new proposed WAL format and amend relevant tests

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Use 'const' for metadata field names

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Apply metadata to head memSeries in Commit, not in AppendMetadata

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Add docstring and rename extracted helper in scrape.go

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Add tests for tsdb-related cases

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Fix linter issues vol1

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Fix linter issues vol2

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Fix Windows test by closing WAL reader files

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Use switch instead of two if statements in metadata decoding

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Fix review comments around TestMetadata* tests

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Add code for replaying WAL; test correctness of in-memory data after a replay

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Remove scrape-loop related code from PR

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Address first round of comments

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Simplify tests by sorting slices before comparison

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Fix test to use separate transactions

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Empty out buffer and record slices after encoding latest metadata

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Fix linting issue

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Update calculation for DroppedMetadata metric

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Rename MetadataAppender interface and AppendMetadata method to MetadataUpdater/UpdateMetadata

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Reuse buffer when encoding latest metadata for each series

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Fix review comments; Check all returned error values using two helpers

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Simplify use of helpers

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Satisfy linter

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
2022-07-19 10:58:52 +02:00
Jesus Vazquez 6cfe44d7fd
WaitUntilIdle optimize idling time (#10878)
Relates to @bboreham optimization in https://github.com/prometheus/prometheus/pull/10859

Bryan did reduce the sleep time improving the deltas on the benchmark by
quite a lot. However I've been working on a similar implementation for
out of order and I noticed that we actually get into this method
thousands of times.

@ywwg had the brilliant idea of not always sleeping before the select
but actually make it a case in the select so we only sleep if we need
to.

The benchmark deltas are amazing

```
❯ benchstat old_implementation.txt new_implementation_using_time_after.txt
name                                                                                                     old time/op  new time/op  delta
LoadWAL/batches=10,seriesPerBatch=100,samplesPerSeries=7200,exemplarsPerSeries=0,mmappedChunkT=0-8        521ms ±25%   253ms ± 6%  -51.47%  (p=0.008 n=5+5)
LoadWAL/batches=10,seriesPerBatch=100,samplesPerSeries=7200,exemplarsPerSeries=36,mmappedChunkT=0-8       773ms ± 3%   369ms ±31%  -52.23%  (p=0.008 n=5+5)
LoadWAL/batches=10,seriesPerBatch=100,samplesPerSeries=7200,exemplarsPerSeries=72,mmappedChunkT=0-8       592ms ±28%   297ms ±28%  -49.80%  (p=0.008 n=5+5)
LoadWAL/batches=10,seriesPerBatch=100,samplesPerSeries=7200,exemplarsPerSeries=360,mmappedChunkT=0-8      547ms ± 2%  999ms ±187%     ~     (p=0.690 n=5+5)
LoadWAL/batches=10,seriesPerBatch=10000,samplesPerSeries=50,exemplarsPerSeries=0,mmappedChunkT=0-8        11.3s ± 4%    1.3s ±44%  -88.48%  (p=0.008 n=5+5)
LoadWAL/batches=10,seriesPerBatch=10000,samplesPerSeries=50,exemplarsPerSeries=2,mmappedChunkT=0-8        11.1s ± 1%    1.2s ±20%  -89.08%  (p=0.008 n=5+5)
LoadWAL/batches=10,seriesPerBatch=1000,samplesPerSeries=480,exemplarsPerSeries=0,mmappedChunkT=0-8        1.24s ± 3%   0.18s ± 7%  -85.76%  (p=0.008 n=5+5)
LoadWAL/batches=10,seriesPerBatch=1000,samplesPerSeries=480,exemplarsPerSeries=2,mmappedChunkT=0-8        1.24s ± 2%   0.18s ± 5%  -85.24%  (p=0.008 n=5+5)
LoadWAL/batches=10,seriesPerBatch=1000,samplesPerSeries=480,exemplarsPerSeries=5,mmappedChunkT=0-8        1.23s ± 5%   0.27s ±33%  -77.73%  (p=0.008 n=5+5)
LoadWAL/batches=10,seriesPerBatch=1000,samplesPerSeries=480,exemplarsPerSeries=24,mmappedChunkT=0-8       1.28s ± 1%   0.36s ± 7%  -71.51%  (p=0.008 n=5+5)
LoadWAL/batches=100,seriesPerBatch=1000,samplesPerSeries=480,exemplarsPerSeries=0,mmappedChunkT=3800-8    12.1s ± 1%    3.1s ± 6%  -74.33%  (p=0.008 n=5+5)
LoadWAL/batches=100,seriesPerBatch=1000,samplesPerSeries=480,exemplarsPerSeries=2,mmappedChunkT=3800-8    12.1s ± 1%    3.4s ± 4%  -71.94%  (p=0.008 n=5+5)
LoadWAL/batches=100,seriesPerBatch=1000,samplesPerSeries=480,exemplarsPerSeries=5,mmappedChunkT=3800-8    12.1s ± 1%    3.8s ±17%  -68.35%  (p=0.008 n=5+5)
LoadWAL/batches=100,seriesPerBatch=1000,samplesPerSeries=480,exemplarsPerSeries=24,mmappedChunkT=3800-8   12.4s ± 1%    4.0s ±18%  -67.71%  (p=0.008 n=5+5)
```

Benchmarked on Linux
```
goos: linux
goarch: amd64
pkg: github.com/prometheus/prometheus/tsdb
cpu: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
```

Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
2022-06-30 15:00:04 +02:00
Bryan Boreham 542b9ecdbd
tsdb: reduce sleep time when reading WAL (#10859)
The code sleeps for a short time to allow goroutines to finish, however
it seems the duration can be reduced a lot, speeding up the reading
process.

I checked using some WAL data from production, and the queue is almost
always empty at the time we enter `waitForIdle()` so there is no danger
of spinning in the tight loop.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-06-12 11:54:11 +05:30
Matthieu MOREL e2ede285a2
refactor: move from io/ioutil to io and os packages (#10528)
* refactor: move from io/ioutil to io and os packages
* use fs.DirEntry instead of os.FileInfo after os.ReadDir

Signed-off-by: MOREL Matthieu <matthieu.morel@cnp.fr>
2022-04-27 11:24:36 +02:00
chavacava 0b41fd6e71
Fix data races in WAL replay (#10571)
Signed-off-by: chavacava <salvadorcavadini+github@gmail.com>
2022-04-12 16:00:20 +05:30
Ganesh Vernekar 05d4d97bcd
Fix queries after a failed snapshot replay (#9980)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-12-08 15:32:14 +00:00
Bryan Boreham 1b74a3812e
Fix panic, out of order chunks, and race warning during WAL replay (#9856)
* Fix panic on WAL replay

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Refactor: introduce walSubsetProcessor

walSubsetProcessor packages up the `processWALSamples()` function and
its input and output channels, helping to clarify how these things
relate.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Refactor: extract more methods onto walSubsetProcessor

This makes the main logic easier to follow.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Fix race warning by locking processWALSamples

Although we have waited for the processor to finish, we still get a
warning from the race detector because it doesn't know how the different
parts relate.

Add a lock round each batch of samples, so the race detector can see
that we never access series owned by the processor outside of a lock.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Added test to reproduce issue 9859

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Remove redundant unit test

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix out of order chunks during WAL replay

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix nits

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>
2021-11-25 13:36:14 +05:30
beorn7 c954cd9d1d Move packages out of deprecated pkg directory
This creates a new `model` directory and moves all data-model related
packages over there:
  exemplar labels relabel rulefmt textparse timestamp value

All the others are more or less utilities and have been moved to `util`:
  gate logging modetimevfs pool runtime

Signed-off-by: beorn7 <beorn@grafana.com>
2021-11-09 08:03:10 +01:00
Dieter Plaetinck cda025b5b5
TSDB: demistify SeriesRefs and ChunkRefs (#9536)
* TSDB: demistify seriesRefs and ChunkRefs

The TSDB package contains many types of series and chunk references,
all shrouded in uint types.  Often the same uint value may
actually mean one of different types, in non-obvious ways.

This PR aims to clarify the code and help navigating to relevant docs,
usage, etc much quicker.

Concretely:

* Use appropriately named types and document their semantics and
  relations.
* Make multiplexing and demuxing of types explicit
  (on the boundaries between concrete implementations and generic
  interfaces).
* Casting between different types should be free.  None of the changes
  should have any impact on how the code runs.

TODO: Implement BlockSeriesRef where appropriate (for a future PR)

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* feedback

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* agent: demistify seriesRefs and ChunkRefs

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
2021-11-06 15:40:04 +05:30
Mateusz Gozdek b7bdf6fab2 Fix imports formatting
According to
2829908806 (r58457095).

Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
2021-11-02 19:52:34 +01:00
Mateusz Gozdek 1a6c2283a3 Format Go source files using 'gofumpt -w -s -extra'
Part of #9557

Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
2021-11-02 19:52:34 +01:00
Ganesh Vernekar 10bc6e80ee
Fix panic on failed snapshot replay and don't hard fail replay on disabled exemplars (#9438)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-05 10:51:25 +05:30
Ganesh Vernekar b30db03f35
Cut v2.30.2 (#9426)
* Don't error on overlapping m-mapped chunks during WAL replay (#9381)

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Reduce log level during WAL replay on overlapping m-map chunks (#9425)

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Cut v2.30.2

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-01 17:00:22 +05:30
Ganesh Vernekar 8944520ccc
Fix deletion of old snapshots (#9314)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-09-08 19:53:44 +05:30
Ganesh Vernekar 35b1a82594
Exemplars in snapshot (#9255)
* Exemplars in snapshot

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix lint

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Add docs

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix lint

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-08-30 19:34:38 +05:30
Callum Styan cc55e57c1b
Fix a data race in the loadWAL function caused by reusing the same error var in multiple goroutines (#9259)
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2021-08-27 11:49:34 +05:30
Ganesh Vernekar 328a74ca36
Fix bugs and add enhancements to the chunk snapshot (#9185)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-08-17 18:08:16 +01:00
Bryan Boreham 040ef175eb
Optimise WAL loading by removing extra map and caching min-time (#9160)
* BenchmarkLoadWAL: close WAL after use

So that goroutines are stopped and resources released

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* BenchmarkLoadWAL: make series IDs co-prime with #workers

Series are distributed across workers by taking the modulus of the
ID with the number of workers, so multiples of 100 are a poor choice.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* BenchmarkLoadWAL: simulate mmapped chunks

Real Prometheus cuts chunks every 120 samples, then skips those samples
when re-reading the WAL. Simulate this by creating a single mapped chunk
for each series, since the max time is all the reader looks at.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Fix comment

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Remove series map from processWALSamples()

The locks that is commented to reduce contention in are now sharded
32,000 ways, so won't be contended. Removing the map saves memory and
goes just as fast.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* loadWAL: Cache the last mmapped chunk time

So we can skip calling append() for samples it will reject.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Improvements from code review

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Full stops and capitals on comments

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Cache max time in both places mmappedChunks is updated

Including refactor to extract function `setMMappedChunks`, to reduce
code duplication.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Update head min/max time when mmapped chunks added

This ensures we have the correct values if no WAL samples are added for
that series.

Note that `mSeries.maxTime()` was always `math.MinInt64` before, since
that function doesn't consider mmapped chunks.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2021-08-10 14:53:31 +05:30
Bryan Boreham 7407457243
Avoid deadlock when processing duplicate series record (#9170)
* Avoid deadlock when processing duplicate series record

`processWALSamples()` needs to be able to send on its output channel
before it can read the input channel, so reads to allow this in case the
output channel is full.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* processWALSamples: update comment

Previous text seems to relate to an earlier implementation.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2021-08-10 13:32:42 +05:30
Ganesh Vernekar ee7e0071d1
Snapshot in-memory chunks on shutdown for faster restarts (#7229)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-08-06 17:51:01 +01:00
Ganesh Vernekar 848cb5a6d6
Enhanced WAL replay for duplicate series record (#7438)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-08-03 20:03:54 +05:30
Ganesh Vernekar 8002a3ab80
Breakdown tsdb/head.go into multiple files (#9147)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-08-03 14:14:26 +02:00