* Introduce out-of-order TSDB support
This implementation is based on this design doc:
https://docs.google.com/document/d/1Kppm7qL9C-BJB1j6yb6-9ObG3AbdZnFUBYPNNWwDBYM/edit?usp=sharing
This commit adds support for accepting out-of-order ("OOO") samples into the TSDB
up to a configurable time allowance. If OOO is enabled, overlapping queries
are automatically enabled.
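As a rough illustration of the time allowance, the sketch below classifies an
incoming sample relative to the newest timestamp of a series. The names
`classify`, `maxTime` and `window` are hypothetical and are not the
identifiers used in the TSDB; the real appender handles more cases (for
example duplicate timestamps).

```go
package main

import "fmt"

// classify is a hypothetical helper, not the TSDB's API. It reports how a
// sample with timestamp ts would be treated for a series whose newest
// sample is at maxTime, given an out-of-order allowance of window
// (all values in milliseconds). For simplicity ts == maxTime counts as
// in-order here.
func classify(ts, maxTime, window int64) string {
	switch {
	case ts >= maxTime:
		return "in-order"
	case ts >= maxTime-window:
		return "out-of-order, accepted"
	default:
		return "too old, rejected"
	}
}

func main() {
	const maxTime, window = 10_000, 3_000
	for _, ts := range []int64{10_500, 8_000, 6_000} {
		fmt.Println(ts, classify(ts, maxTime, window))
	}
}
```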
Most of the additions have been borrowed from
https://github.com/grafana/mimir-prometheus/
Here is the list of the original commits cherry-picked
from mimir-prometheus into this branch:
- 4b2198d7ec
- 2836e5513f
- 00b379c3a5
- ff0dc75758
- a632c73352
- c6f3d4ab33
- 5e8406a1d4
- abde1e0ba1
- e70e769889
- df59320886
Co-authored-by: Jesus Vazquez <jesus.vazquez@grafana.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
Co-authored-by: Dieter Plaetinck <dieter@grafana.com>
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* gofumpt files
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* Add license header to missing files
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* Fix OOO tests due to existing chunk disk mapper implementation
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* Fix truncate int overflow
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* Add Sync method to the WAL and update tests
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* remove useless sync
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* Update minOOOTime after truncating Head
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Fix lint
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Add a unit test
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* Load OutOfOrderTimeWindow only once per appender
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* Fix OOO Head LabelValues and PostingsForMatchers
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* Fix replay of OOO mmap chunks
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Remove unnecessary err check
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* Prevent panic with ApplyConfig
Signed-off-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* Run OOO compaction after restart if there is OOO data from WBL
Signed-off-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* Apply Bartek's suggestions
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* Refactor OOO compaction
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Address comments and TODOs
- Added a comment explaining why we need the allow overlapping
compaction toggle
- Clarified TSDBConfig OutOfOrderTimeWindow doc
- Added an owner to all the TODOs in the code
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* Run go format
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* Fix remaining review comments
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Fix tests
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Change wbl reference when truncating ooo in TestHeadMinOOOTimeUpdate
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* Fix TestWBLAndMmapReplay test failure on windows
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Address most of the feedback
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Refactor the block meta for out of order
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Fix windows error
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Fix review comments
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Signed-off-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
Co-authored-by: Dieter Plaetinck <dieter@grafana.com>
Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com>
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
The chunk pool belongs to the head not to the series. Pass it down where
required, and remove the copy of the pointer that `memSeries` was
holding.
`safeChunk` also needs to hold it, because in scenarios where it is used
we don't have a reference to the head. However it was already holding
`chunkDiskMapper` for the same reason, so no big change.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Append metadata to the WAL
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Remove extra whitespace; Reword some docstrings and comments
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Use RLock() for hasNewMetadata check
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Use single byte for metric type in RefMetadata
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Update proposed WAL format for single-byte type metadata
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Implement MetadataAppender interface for the Agent
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Address first round of review comments
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Amend description of metadata in wal.md
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Correct key used to retrieve metadata from cache
When we're setting metadata entries in the scrapeCache, we're using the
p.Help(), p.Unit(), p.Type() helpers, which retrieve the series name and
use it as the cache key. When checking for cache entries though, we used
p.Series() as the key, which included the metric name _with_ its labels.
That meant that we were never actually hitting the cache. We're fixing
this by using the __name__ internal label to correctly get the
cache entries after they've been set by setHelp(), setType() or
setUnit().
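A minimal sketch of the key mismatch, using a plain map in place of the
scrape cache. The `metaCache` type and its `has` method are hypothetical and
only illustrate why a lookup by the full series string can never hit an entry
stored under the bare metric name.

```go
package main

import "fmt"

// metaCache is a hypothetical stand-in for the metadata part of the scrape
// cache; entries are keyed by metric name only.
type metaCache map[string]string

// has reports whether an entry exists under key.
func (c metaCache) has(key string) bool {
	_, ok := c[key]
	return ok
}

func main() {
	cache := metaCache{}
	// The setters store the entry under the bare metric name.
	cache["http_requests_total"] = "Total HTTP requests."

	series := `http_requests_total{code="200"}` // name together with labels
	name := "http_requests_total"               // value of the __name__ label

	fmt.Println(cache.has(series), cache.has(name)) // prints "false true"
}
```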
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Put feature behind a feature flag
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Fix AppendMetadata docstring
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Reorder WAL format document
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Change error message of AppendMetadata; Fix access of s.meta in AppendMetadata
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Reuse temporary buffer in Metadata encoder
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Only keep latest metadata for each refID during checkpointing
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Fix test that's referencing decoding metadata
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Avoid creating metadata block if no new metadata are present
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Add tests for corrupt metadata block and relevant record type
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Fix CR comments
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Extract logic about changing metadata in an anonymous function
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Implement new proposed WAL format and amend relevant tests
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Use 'const' for metadata field names
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Apply metadata to head memSeries in Commit, not in AppendMetadata
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Add docstring and rename extracted helper in scrape.go
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Add tests for tsdb-related cases
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Fix linter issues vol1
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Fix linter issues vol2
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Fix Windows test by closing WAL reader files
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Use switch instead of two if statements in metadata decoding
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Fix review comments around TestMetadata* tests
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Add code for replaying WAL; test correctness of in-memory data after a replay
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Remove scrape-loop related code from PR
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Address first round of comments
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Simplify tests by sorting slices before comparison
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Fix test to use separate transactions
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Empty out buffer and record slices after encoding latest metadata
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Fix linting issue
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Update calculation for DroppedMetadata metric
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Rename MetadataAppender interface and AppendMetadata method to MetadataUpdater/UpdateMetadata
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Reuse buffer when encoding latest metadata for each series
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Fix review comments; Check all returned error values using two helpers
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Simplify use of helpers
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Satisfy linter
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
"Labels is a sorted set of labels. Order has to be guaranteed upon
instantiation." says the comment, so fix all the tests that break this
rule.
For `BenchmarkLabelValuesWithMatchers()` and
`BenchmarkHeadLabelValuesWithMatchers()` the amount of work done changes
significantly if you put the labels in order, because all series refs
get neatly partitioned by the `tens` label, so I renamed the labels
to maintain the previous behaviour.
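The ordering rule can be illustrated with a simplified label type. This is a
sketch only: `label` and `sortedByName` are hypothetical and do not use the
Prometheus labels package.

```go
package main

import (
	"fmt"
	"sort"
)

// label is a simplified stand-in for a name/value pair.
type label struct{ Name, Value string }

// sortedByName returns a copy of ls ordered by label name, leaving the
// input slice untouched.
func sortedByName(ls []label) []label {
	out := append([]label(nil), ls...)
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

func main() {
	// Written out of order, as some tests did before this change.
	unsorted := []label{{"tens", "value3"}, {"a_unique", "value42"}}
	fmt.Println(sortedByName(unsorted))
}
```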
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* refactor: move from io/ioutil to io and os packages
* use fs.DirEntry instead of os.FileInfo after os.ReadDir
Signed-off-by: MOREL Matthieu <matthieu.morel@cnp.fr>
* Add a test with variable samples rate append
This test overflows the chunk created in memseries, and the total amount
of samples in the (only) mmapped chunk is 29, instead of the 65565
appended ones.
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
* Cut new chunk when rate prediction was wrong
When appending samples at a slow rate, and then appending at a higher
rate, the prediction we made to cut a new chunk is no longer valid.
Sometimes this can even cause an overflow in the chunk, if more samples
than uint16 can hold are appended.
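The overflow can be reproduced with a plain 16-bit counter. The helper name
`countWithUint16` is hypothetical; the sketch only shows the wrap-around
arithmetic behind the 29-versus-65565 figure from the test, not the chunk
encoding itself.

```go
package main

import "fmt"

// countWithUint16 increments a 16-bit counter once per appended sample and
// returns what the counter reports afterwards.
func countWithUint16(appended int) uint16 {
	var n uint16
	for i := 0; i < appended; i++ {
		n++ // wraps around to 0 after 65535
	}
	return n
}

func main() {
	fmt.Println(countWithUint16(65565)) // prints 29, i.e. 65565 - 65536
}
```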
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
* Improve comment on 2*samplesPerChunk
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
* Assert that all chunks have less than 240 samples
Also, trigger new chunk at 240, not at more than 240
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
* Write chunks via queue, predicting the refs
Our load tests have shown that there is a latency spike in the
remote write handler whenever the head chunks need to be written,
because chunkDiskMapper.WriteChunk() blocks until the chunks are written
to disk.
This adds a queue to the chunk disk mapper which makes the WriteChunk()
method non-blocking unless the queue is full. Reads can still be served
from the queue.
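The blocking behaviour described above matches that of a buffered channel,
sketched below. `writeQueue`, `enqueue` and `close` are hypothetical names,
and serving reads from the queue is left out for brevity.

```go
package main

import "fmt"

// writeQueue is a hypothetical, simplified queue in front of a slow writer.
type writeQueue struct {
	jobs chan []byte
	done chan struct{}
}

// newWriteQueue starts a background goroutine that passes queued chunks to
// write, one at a time.
func newWriteQueue(size int, write func([]byte)) *writeQueue {
	q := &writeQueue{jobs: make(chan []byte, size), done: make(chan struct{})}
	go func() {
		defer close(q.done)
		for c := range q.jobs {
			write(c)
		}
	}()
	return q
}

// enqueue returns immediately while the buffer has room and blocks only
// when the queue is full.
func (q *writeQueue) enqueue(chunk []byte) { q.jobs <- chunk }

// close stops accepting jobs and waits until all queued ones are written.
func (q *writeQueue) close() {
	close(q.jobs)
	<-q.done
}

func main() {
	total := 0
	q := newWriteQueue(4, func(c []byte) { total += len(c) })
	for i := 0; i < 10; i++ {
		q.enqueue([]byte("chunk"))
	}
	q.close()
	fmt.Println(total) // prints 50
}
```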
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* address PR feedback
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* initialize metrics without .Add(0)
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* change isRunningMtx to normal lock
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* do not re-initialize chunkrefmap
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* update metric outside of lock scope
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* add benchmark for adding job to chunk write queue
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* remove unnecessary "success" var
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* gofumpt -extra
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* avoid WithLabelValues call in addJob
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* format comments
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* addressing PR feedback
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* rename cutExpectRef to cutAndExpectRef
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* use head.Init() instead of .initTime()
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* address PR feedback
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* PR feedback
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* update test according to PR feedback
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* replace callbackWg -> awaitCb
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* better test of truncation with empty files
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* replace callbackWg -> awaitCb
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
* Disable isolation in isolation struct
Signed-off-by: darshanime <deathbullet@gmail.com>
* Run tsdb tests with isolation disabled
Signed-off-by: darshanime <deathbullet@gmail.com>
* Check for isolation disabled in isoState.Close()
Signed-off-by: darshanime <deathbullet@gmail.com>
* use t.Skip to skip isolation tests when disabled
Signed-off-by: darshanime <deathbullet@gmail.com>
* address review comments
Signed-off-by: darshanime <deathbullet@gmail.com>
* fix test for defaultIsolationState
Signed-off-by: darshanime <deathbullet@gmail.com>
* Change flag name. Set flag in DB. Do not init txRing. Close isoState.
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Test disabled isolation in CircleCI test_go
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Skip isolation related tests in db_test.go
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
This creates a new `model` directory and moves all data-model related
packages over there:
exemplar labels relabel rulefmt textparse timestamp value
All the others are more or less utilities and have been moved to `util`:
gate logging modetimevfs pool runtime
Signed-off-by: beorn7 <beorn@grafana.com>
* TSDB: demystify seriesRefs and ChunkRefs
The TSDB package contains many types of series and chunk references,
all shrouded in uint types. Often the same uint value may
actually mean one of different types, in non-obvious ways.
This PR aims to clarify the code and help navigating to relevant docs,
usage, etc much quicker.
Concretely:
* Use appropriately named types and document their semantics and
relations.
* Make multiplexing and demuxing of types explicit
(on the boundaries between concrete implementations and generic
interfaces).
* Casting between different types should be free. None of the changes
should have any impact on how the code runs.
TODO: Implement BlockSeriesRef where appropriate (for a future PR)
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
* feedback
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
* agent: demystify seriesRefs and ChunkRefs
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
* Use dedicated Ref type
Throughout the code base, there are reference types masked as
regular integers. Let's use dedicated types. They are
equivalent, but clearer semantically.
This also makes it trivial to find where they are used,
and from uses, find the centralized docs.
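A small sketch of the idea, with type names chosen for illustration only.
Conversions between a dedicated type and its underlying integer are explicit
in the source but cost nothing at run time, while mixing up two kinds of
reference becomes a compile error.

```go
package main

import "fmt"

// SeriesRef and ChunkRef are illustrative names for this sketch. Both share
// the same underlying type but are distinct to the compiler.
type SeriesRef uint64
type ChunkRef uint64

// lookupSeries only accepts a SeriesRef.
func lookupSeries(ref SeriesRef) string {
	return fmt.Sprintf("series #%d", ref)
}

func main() {
	var raw uint64 = 42
	s := SeriesRef(raw) // explicit conversion at the boundary
	fmt.Println(lookupSeries(s))
	// lookupSeries(ChunkRef(raw)) would not compile: the types differ.
}
```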
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
* postpone some work until after possible return
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
* clarify
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
* rename feedback
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
* skip header is up to caller
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
* Decrement active_appenders metric when no samples added
Also add a test that the metric is incremented and decremented as
expected with and without samples.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Fix comment
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
This saves memory, effort and locking.
Since every symbol is also added to postings, `Symbols()` can be
implemented there instead. This now has to build a map for
deduplication, but `Symbols()` is only called for compaction, and `gc()`
used to rebuild the symbols map after every compaction so not an
additional cost.
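A minimal sketch of deriving symbols from postings with a deduplication map.
The nested-map layout and the name `symbolsFrom` are assumptions made for
illustration and do not mirror the real postings structure.

```go
package main

import (
	"fmt"
	"sort"
)

// symbolsFrom collects every label name and value found in postings
// (label name -> label value -> series refs), deduplicated and sorted.
func symbolsFrom(postings map[string]map[string][]uint64) []string {
	seen := map[string]struct{}{}
	for name, values := range postings {
		seen[name] = struct{}{}
		for value := range values {
			seen[value] = struct{}{}
		}
	}
	out := make([]string, 0, len(seen))
	for s := range seen {
		out = append(out, s)
	}
	sort.Strings(out)
	return out
}

func main() {
	postings := map[string]map[string][]uint64{
		"job": {"api": {1}, "db": {2}},
		"env": {"api": {1}}, // "api" appears twice but is emitted once
	}
	fmt.Println(symbolsFrom(postings))
}
```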
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* BenchmarkLoadWAL: close WAL after use
So that goroutines are stopped and resources released
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* BenchmarkLoadWAL: make series IDs co-prime with #workers
Series are distributed across workers by taking the modulus of the
ID with the number of workers, so multiples of 100 are a poor choice.
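The effect is easy to check numerically. In the sketch below the worker
count of 8 is an assumption chosen for illustration, and `shardsUsed` is a
hypothetical helper that counts how many workers receive at least one series.

```go
package main

import "fmt"

// shardsUsed distributes the IDs step, 2*step, ..., n*step across workers by
// modulus and returns how many distinct workers get at least one ID.
func shardsUsed(step, n, workers uint64) int {
	used := map[uint64]struct{}{}
	for k := uint64(1); k <= n; k++ {
		used[(k*step)%workers] = struct{}{}
	}
	return len(used)
}

func main() {
	fmt.Println(shardsUsed(100, 1000, 8)) // prints 2: only workers 0 and 4
	fmt.Println(shardsUsed(101, 1000, 8)) // prints 8: 101 is co-prime with 8
}
```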
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* BenchmarkLoadWAL: simulate mmapped chunks
Real Prometheus cuts chunks every 120 samples, then skips those samples
when re-reading the WAL. Simulate this by creating a single mapped chunk
for each series, since the max time is all the reader looks at.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Fix comment
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Remove series map from processWALSamples()
The comment says the map is there to reduce lock contention, but those locks
are now sharded 32,000 ways, so they won't be contended. Removing the map
saves memory and goes just as fast.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* loadWAL: Cache the last mmapped chunk time
So we can skip calling append() for samples it will reject.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Improvements from code review
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Full stops and capitals on comments
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Cache max time in both places mmappedChunks is updated
Including refactor to extract function `setMMappedChunks`, to reduce
code duplication.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Update head min/max time when mmapped chunks added
This ensures we have the correct values if no WAL samples are added for
that series.
Note that `mSeries.maxTime()` was always `math.MinInt64` before, since
that function doesn't consider mmapped chunks.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Push the matchers for LabelNames all the way into the index.
NB This doesn't actually implement it in the index, just plumbs it through for now...
Signed-off-by: Tom Wilkie <tom@grafana.com>
* Hack it up. Does not work.
Signed-off-by: Tom Wilkie <tom@grafana.com>
* Revert changes I don't understand
I can't see why we need to hold a mutex on symbols, or what the purpose of
the LabelNamesFor method is.
Maybe I'll need to re-add this later.
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
* Implement LabelNamesFor
This method provides the label names that appear in the postings
provided. We do that deeper than the label values because we know
beforehand that most of the label names will be the same across
different postings, and we don't want to go down and up looking up the
same symbols for all the different series.
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
* Mutex on symbols should be unlocked
However, I still don't understand why we need a mutex here.
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
* Fix head.LabelNamesFor
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
* Implement mockIndex LabelNames with matchers
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
* Nitpick on slice initialisation
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
* Add tests for LabelNamesWithMatchers
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
* Fix the mutex mess on head.LabelValues/LabelNames
I still don't see why we need to grab that unrelated mutex, but at least
now we're grabbing it consistently
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
* Check error after iterating postings
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
* Use the error from posting when there was an error in postings
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
* Update storage/interface.go comment
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
* Update tsdb/index/index.go comment
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
* Update tsdb/index/index.go wrapped error msg
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
* Update tsdb/index/index.go wrapped error msg
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
* Update tsdb/index/index.go wrapped error msg
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
* Remove unneeded comment
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
* Add testcases for LabelNames w/matchers in api.go
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
* Use t.Cleanup() instead of defer in tests
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Co-authored-by: Tom Wilkie <tom@grafana.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
* Create experimental circular buffer resize method, benchmarks
Signed-off-by: Martin Disibio <mdisibio@gmail.com>
* Optimize exemplar resize to only replay as many exemplars as needed
Signed-off-by: Martin Disibio <mdisibio@gmail.com>
* More comments, benchmark AddExemplar
Signed-off-by: Martin Disibio <mdisibio@gmail.com>
* optimizations
Signed-off-by: Martin Disibio <mdisibio@gmail.com>
* comment
Signed-off-by: Martin Disibio <mdisibio@gmail.com>
* Slight refactor of resize benchmark + make use of resize via runtime
reloadable storage config.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Some more config related changes.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Address some review comments.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Address more review comments.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Refactor to remove usage of noopExemplarStorage and avoid race condition
when resizing from Head code.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Fix or add comments to clarify some of the new behaviour.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* fix potential panics related to negative exemplar buffer lengths
Signed-off-by: Callum Styan <callumstyan@gmail.com>
Co-authored-by: Callum Styan <callumstyan@gmail.com>
* Added walreplay API endpoint
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Added starting page to react-ui
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Documented the new endpoint
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Fixed typos
Signed-off-by: Levi Harrison <git@leviharrison.dev>
Co-authored-by: Julius Volz <julius.volz@gmail.com>
* Removed logo
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Changed isResponding to isUnexpected
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Changed width of progress bar
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Changed width of progress bar
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Added DB stats object
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Updated starting page to work with new fields
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Passing nil
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Passing nil (pt. 2)
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Passing nil (pt. 3)
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Passing nil (and also implementing a method this time) (pt. 4)
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Passing nil (and also implementing a method this time) (pt. 5)
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Changed const to let
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Passing nil (pt. 6)
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Remove SetStats method
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Added comma
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Changed api
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Changed to triple equals
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Fixed data response types
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Don't return pointer
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Changed version
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Fixed interface issue
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Fixed pointer
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Fixed copying lock value error
Signed-off-by: Levi Harrison <git@leviharrison.dev>
Co-authored-by: Julius Volz <julius.volz@gmail.com>
* Write exemplars to the WAL and send them over remote write.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Update example for exemplars, print data in a more obvious format.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Add metrics for remote write of exemplars.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Fix incorrect slices passed to send in remote write.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* We need to unregister the new metrics.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Address review comments
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Order of exemplar append vs write exemplar to WAL needs to change.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Several fixes to prevent sending uninitialized or incorrect samples with an exemplar. Fix dropping exemplar for missing series. Add tests for queue_manager sending exemplars
Signed-off-by: Martin Disibio <mdisibio@gmail.com>
* Store both samples and exemplars in the same timeseries buffer to remove the alloc when building final request, keep sub-slices in separate buffers for re-use
Signed-off-by: Martin Disibio <mdisibio@gmail.com>
* Condense sample/exemplar delivery tests to parameterized sub-tests
Signed-off-by: Martin Disibio <mdisibio@gmail.com>
* Rename test methods for clarity now that they also handle exemplars
Signed-off-by: Martin Disibio <mdisibio@gmail.com>
* Rename counter variable. Fix instances where metrics were not updated correctly
Signed-off-by: Martin Disibio <mdisibio@gmail.com>
* Add exemplars to LoadWAL benchmark
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* last exemplars timestamp metric needs to convert value to seconds with
ms precision
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Process exemplar records in a separate go routine when loading the WAL.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Address review comments related to clarifying comments and variable
names. Also refactor sample/exemplar to enqueue prompb types.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Regenerate types proto with comments, update protoc version again.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Put remote write of exemplars behind a feature flag.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Address some of Ganesh's review comments.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Move exemplar remote write feature flag to a config file field.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Address Bartek's review comments.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Don't allocate exemplar buffers in queue_manager if we're not going to
send exemplars over remote write.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Add ValidateExemplar function, validate exemplars when appending to head
and log them all to WAL before adding them to exemplar storage.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Address more review comments from Ganesh.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Add exemplar total label length check.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Address a few last review comments
Signed-off-by: Callum Styan <callumstyan@gmail.com>
Co-authored-by: Martin Disibio <mdisibio@gmail.com>
* Add range query test cases
This includes a couple of failing ones that double count some points due
to the iterator seek bug.
Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com>
* Add Seek() implementation for memSafeIterator
Previously, calling memSafeIterator.Seek() would call the Seek() method
on its embedded iterator. This was causing the embedded iterator and the
memSafeIterator to get out of sync because when the embedded Seek()
moved to the next element of the embedded iterator, memSafeIterator
didn't "know" about it. memSafeIterator has to "know" when the embedded
iterator has moved to be able to work out when it should be reading from
its buffer rather than the embedded iterator.
Used the same logic as xorIterator.Seek() (which at runtime is used as
the embedded iterator) - return false if the iterator has an error and
try to move to next element if the required time hasn't been reached, or
if no elements have been read yet. The memSafeIterator.Next() method is
being called so memSafeIterator.i is always accurate.
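The approach can be sketched with a simplified iterator whose Seek() is built
on its own Next(), so the position counter can never drift. `sliceIter` is a
hypothetical type over a slice of timestamps; it is not memSafeIterator and
has no buffer.

```go
package main

import "fmt"

// sliceIter is a hypothetical iterator over sorted timestamps.
type sliceIter struct {
	ts  []int64
	i   int // index of the current element, -1 before the first Next
	err error
}

// Next advances by one element and keeps i accurate.
func (it *sliceIter) Next() bool {
	if it.err != nil || it.i+1 >= len(it.ts) {
		return false
	}
	it.i++
	return true
}

// At returns the current timestamp.
func (it *sliceIter) At() int64 { return it.ts[it.i] }

// Seek moves to the first element with timestamp >= t. It only advances via
// Next, so the iterator always "knows" where it is.
func (it *sliceIter) Seek(t int64) bool {
	if it.err != nil {
		return false
	}
	// Advance if nothing has been read yet or the current element is too old.
	for it.i < 0 || it.ts[it.i] < t {
		if !it.Next() {
			return false
		}
	}
	return true
}

func main() {
	it := &sliceIter{ts: []int64{10, 20, 30}, i: -1}
	fmt.Println(it.Seek(15), it.At()) // prints "true 20"
	fmt.Println(it.Seek(15), it.At()) // prints "true 20": no double advance
	fmt.Println(it.Seek(40))          // prints "false"
}
```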
Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com>
* Add tsdb package test
Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com>
Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com>
This moves the label lookup into TSDB, whilst still keeping the cached-ref optimisation for repeated Appends.
This makes the API easier to consume and implement. In particular this change is motivated by the scrape-time-aggregation work, which I don't think is possible to implement without it as it needs access to label values.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
* Set the min time of Head properly after truncation
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
* Fix lint
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
* Enhance compaction plan logic for completely deleted small block
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
* Fix review comments
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
* Testify: move to require
Moving testify to require to fail tests early in case of errors.
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
* More moves
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
* Refactor test assertions
This pull request gets rid of assert.True where possible to use
fine-grained assertions.
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
As we're looking to expand what's in the WAL,
having old Prometheus servers ignore the new record types
rather than treating them as corruption allows for better
upgrade/downgrade paths.
Adjust some tests accordingly, so they're still testing what they're
meant to test.
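A sketch of the tolerant decoding loop. The record type values and the
`replay` helper are hypothetical; the point is only that an unrecognised type
falls through to a skip instead of a corruption error.

```go
package main

import "fmt"

type recordType uint8

// Illustrative type values for this sketch, not the real WAL encoding.
const (
	recSeries  recordType = 1
	recSamples recordType = 2
)

// replay walks over record types and counts how many were handled and how
// many were skipped because their type is unknown.
func replay(types []recordType) (handled, skipped int) {
	for _, t := range types {
		switch t {
		case recSeries, recSamples:
			handled++
		default:
			// Unknown type, possibly written by a newer version:
			// ignore it rather than report corruption.
			skipped++
		}
	}
	return handled, skipped
}

func main() {
	fmt.Println(replay([]recordType{recSeries, recSamples, 99, recSeries}))
}
```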
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>