This implementation is based on this design doc:
https://docs.google.com/document/d/1Kppm7qL9C-BJB1j6yb6-9ObG3AbdZnFUBYPNNWwDBYM/edit?usp=sharing
This commit adds support to accept out-of-order ("OOO") sample into the TSDB
up to a configurable time allowance. If OOO is enabled, overlapping querying
are automatically enabled.
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Co-authored-by: Jesus Vazquez <jesus.vazquez@grafana.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
Co-authored-by: Dieter Plaetinck <dieter@grafana.com>
"Labels is a sorted set of labels. Order has to be guaranteed upon
instantiation." says the comment, so fix all the tests that break this
rule.
For `BenchmarkLabelValuesWithMatchers()` and
`BenchmarkHeadLabelValuesWithMatchers()` the amount of work done changes
significantly if you put the labels in order, because all series refs
get neatly partitioned by the `tens` label, so I renamed the labels
to maintain the previous behaviour.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* refactor: move from io/ioutil to io and os packages
* use fs.DirEntry instead of os.FileInfo after os.ReadDir
Signed-off-by: MOREL Matthieu <matthieu.morel@cnp.fr>
* Add a test with variable samples rate append
This test overflows the chunk created in memseries, and the total amount
of samples in the (only) mmapped chunk is 29, instead of the 65565
appended ones.
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
* Cut new chunk when rate prediction was wrong
When appending samples at a slow rate, and then appending at a higher
rate, the prediction we made to cut a new chunk is no longer valid.
Sometimes this can even cause an overflow in the chunk, if more samples
than uint16 can hold are appended.
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
* Improve comment on 2*samplesPerChunk
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
* Assert that all chunks have less than 240 samples
Also, trigger new chunk at 240, not at more than 240
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
This commit disables some unused workflows on our CI.
Also uses grafana/regexp instead of regexp which is blackisted.
Also updates head_test TestHeadReadWriterRepair increasing
ChunkWriteQueueSize to 1 so that the chunk disk mapper uses the async
queue. This seems to be default behaviour in upstream prometheus and
without this option our test fails.
* Chunks replay skips chunks with unknown encodings
We've changed the logic of loadMmappedChunks to skip chunks that have
unknown encodings. To do so we've modified IterateAllChunks to accept an
extra encoding argument in the callback function.
Also added unit tests in the head and chunk disk mapper.
* Also add an unit test for the old chunk diskmapper
* s/createUnsupportedChunk/writeUnsupportedChunk/g
* Write chunks via queue, predicting the refs
Our load tests have shown that there is a latency spike in the
remote write handler whenever the head chunks need to be written,
because chunkDiskMapper.WriteChunk() blocks until the chunks are written
to disk.
This adds a queue to the chunk disk mapper which makes the WriteChunk()
method non-blocking unless the queue is full. Reads can still be served
from the queue.
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* address PR feeddback
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* initialize metrics without .Add(0)
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* change isRunningMtx to normal lock
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* do not re-initialize chunkrefmap
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* update metric outside of lock scope
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* add benchmark for adding job to chunk write queue
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* remove unnecessary "success" var
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* gofumpt -extra
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* avoid WithLabelValues call in addJob
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* format comments
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* addressing PR feedback
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* rename cutExpectRef to cutAndExpectRef
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* use head.Init() instead of .initTime()
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* address PR feedback
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* PR feedback
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* update test according to PR feedback
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* replace callbackWg -> awaitCb
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* better test of truncation with empty files
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
* replace callbackWg -> awaitCb
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
* write chunks via queue, predicting chunkrefs
* fixes
* linter
* cleanup
* better test coverage
* deduplicate operations when cutting file
* better comments
* fix race
* improvements based on PR feedback
* comments
* concurrent correctness test of write queue
* linter
* use isStarted flag in chunk write queue
* separate mutex for isStarted
* fixing tests
* fix race in test
* PR feedback
* add license headers
* PR feedback
* pr feedback
* dont recreate workerCtrl channel
* address PR feedback in high concurrency read and write test
* refactor the high concurrency write queue test
* pr feedback
* use require.Eventually
* typo
* Fixing comment
Co-authored-by: Peter Štibraný <peter.stibrany@grafana.com>
* use buffered chan for chunk write queue
* fix races
* address feedback on PR
* better comment
* less type casts
* dont reeimplement varint
* move mtx above property it protects
* improve tests by using wg instead of inspecting queue
* add comment
* add comment
* writeChunkF comment
* rename isStarted -> isRunning
* prevent panic when double stopping
* fix variable name in comment
* delete outdated comment
* naming and comment fixes in TestChunkWriteQueue_WrappingAroundSizeLimit
* use require.False instead of require.Equal(t, false
* always enable chunk write queue
* dont call callback with lock held
* undo rename of curFileSequence to curFileSeq
* fix accidental change
* fix comment
* Better syntax
Co-authored-by: Peter Štibraný <peter.stibrany@grafana.com>
* Better wording in comment
Co-authored-by: Peter Štibraný <peter.stibrany@grafana.com>
* better comments to explain the use of coordinator chans in test
* fix test which broke to the async writing of chunks
* undo previous renaming of shadowing variable
* rename var threadID to workerID
* rename variables based on PR feedback
* rename pos -> ts
* better implementation of whileNotCanceled
* update querySeriesRef before using it
* rename labels -> lbls
* initialize the chunk write queue operations metric with all possible labels
* dont return error from CutNewFile
* recreate chunk ref map regularly to free
* limit how often we recreate the chunk ref map
* fix typo in comment
* better comment
Co-authored-by: Peter Štibraný <peter.stibrany@grafana.com>
* track OOO-ness in tsdb into a histogram
This should help with building an understanding of customer requirements
as far as how far back samples tend to go.
Note: this has already been discussed, reviewed and approved on:
https://github.com/grafana/mimir/pull/488
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
* update the test files which were not in mimir/vendor
* Disable isolation in isolation struct
Signed-off-by: darshanime <deathbullet@gmail.com>
* Run tsdb tests with isolation disabled
Signed-off-by: darshanime <deathbullet@gmail.com>
* Check for isolation disabled in isoState.Close()
Signed-off-by: darshanime <deathbullet@gmail.com>
* use t.Skip to skip isolation tests when disabled
Signed-off-by: darshanime <deathbullet@gmail.com>
* address review comments
Signed-off-by: darshanime <deathbullet@gmail.com>
* fix test for defaultIsolationState
Signed-off-by: darshanime <deathbullet@gmail.com>
* Change flag name. Set flag in DB. Do not init txRing. Close isoState.
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Test disabled isolation in CircleCI test_go
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Skip isolation related tests in db_test.go
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
This creates a new `model` directory and moves all data-model related
packages over there:
exemplar labels relabel rulefmt textparse timestamp value
All the others are more or less utilities and have been moved to `util`:
gate logging modetimevfs pool runtime
Signed-off-by: beorn7 <beorn@grafana.com>
* TSDB: demistify seriesRefs and ChunkRefs
The TSDB package contains many types of series and chunk references,
all shrouded in uint types. Often the same uint value may
actually mean one of different types, in non-obvious ways.
This PR aims to clarify the code and help navigating to relevant docs,
usage, etc much quicker.
Concretely:
* Use appropriately named types and document their semantics and
relations.
* Make multiplexing and demuxing of types explicit
(on the boundaries between concrete implementations and generic
interfaces).
* Casting between different types should be free. None of the changes
should have any impact on how the code runs.
TODO: Implement BlockSeriesRef where appropriate (for a future PR)
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
* feedback
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
* agent: demistify seriesRefs and ChunkRefs
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
* Use dedicated Ref type
Throughout the code base, there are reference types masked as
regular integers. Let's use dedicated types. They are
equivalent, but clearer semantically.
This also makes it trivial to find where they are used,
and from uses, find the centralized docs.
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
* postpone some work until after possible return
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
* clarify
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
* rename feedback
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
* skip header is up to caller
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
* Fix tests not closing head/block properly
Close Head properly in TestWalRepair_DecodingError
Close Block properly in TestReadIndexFormatV1
Co-authored-by: Christian Simon <simon@swine.de>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
* Implement PostingsCloner
PostingsCloner allows obtaining independend clones of a Postings, that
can be used for subsequent calls.
Co-authored-by: Christian Simon <simon@swine.de>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
* IndexReader.PostingsForMatchers() and implement Cache
IndexReader now offers the PostingsForMatchers functionality, which is
provided by PostingsForMatchersCache.
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>
Co-authored-by: Peter Štibraný <peter.stibrany@grafana.com>
Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>
Co-authored-by: Peter Štibraný <peter.stibrany@grafana.com>
* Decrement active_appenders metric when no samples added
Also add a test that the metric is incremented and decremented as
expected with and without samples.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Fix comment
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
This saves memory, effort and locking.
Since every symbol is also added to postings, `Symbols()` can be
implemented there instead. This now has to build a map for
deduplication, but `Symbols()` is only called for compaction, and `gc()`
used to rebuild the symbols map after every compaction so not an
additional cost.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>