* Added test to reproduce panic on TSDB head chunks truncated while querying
Signed-off-by: Marco Pracucci <marco@pracucci.com>
* Added test for Querier too
Signed-off-by: Marco Pracucci <marco@pracucci.com>
* Stop the bleed on mmap-ed head chunks panic
Signed-off-by: Marco Pracucci <marco@pracucci.com>
* Lower memory pressure in tests to ensure it doesn't OOM
Signed-off-by: Marco Pracucci <marco@pracucci.com>
* Skip TestQuerier_ShouldNotPanicIfHeadChunkIsTruncatedWhileReadingQueriedChunks
Signed-off-by: Marco Pracucci <marco@pracucci.com>
* Experiment to not trigger runtime.GC() continuously
Signed-off-by: Marco Pracucci <marco@pracucci.com>
* Try to fix test in CI
Signed-off-by: Marco Pracucci <marco@pracucci.com>
* Do not call runtime.GC() at all
Signed-off-by: Marco Pracucci <marco@pracucci.com>
* I have no idea why it's failing in CI, skipping tests
Signed-off-by: Marco Pracucci <marco@pracucci.com>
Snappy cannot encode records larger than ~3.7 GB and will panic if an
encoding is attempted. Check to make sure that the record is smaller
than this before encoding.
In the future, we could improve this behavior to still compress large
records (or break them up into smaller records), but this avoids the
panic for users with very large single scrape targets.
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
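
The size guard described above can be sketched without pulling in a Snappy library. This is only an illustration: `maxEncodedLen` and `canCompress` are hypothetical names, and the bound `32 + n + n/6` capped at 32 bits is an assumption about how the Snappy block format limits its input, which is consistent with the ~3.7 GB figure in the message.

```go
package main

// maxEncodedLen returns an assumed worst-case compressed size for a
// source of srcLen bytes, or -1 if the source is too large to encode.
// The formula is an assumption modelled on the Snappy block format.
func maxEncodedLen(srcLen int) int {
	n := uint64(srcLen)
	if n > 0xffffffff {
		return -1
	}
	n = 32 + n + n/6
	if n > 0xffffffff {
		return -1
	}
	return int(n)
}

// canCompress reports whether a record of recLen bytes may be handed
// to the encoder. Empty or oversized records are written uncompressed.
func canCompress(recLen int) bool {
	return recLen > 0 && maxEncodedLen(recLen) >= 0
}
```

With this check, a record just above the limit is stored as-is instead of triggering a panic in the encoder.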
* Add range query test cases
This includes a couple of failing ones that double count some points due
to the iterator seek bug.
Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com>
* Add Seek() implementation for memSafeIterator
Previously, calling memSafeIterator.Seek() would call the Seek() method
on its embedded iterator. This was causing the embedded iterator and the
memSafeIterator to get out of sync because when the embedded Seek()
moved to the next element of the embedded iterator, memSafeIterator
didn't "know" about it. memSafeIterator has to "know" when the embedded
iterator has moved to be able to work out when it should be reading from
its buffer rather than the embedded iterator.
This uses the same logic as xorIterator.Seek() (xorIterator is the
embedded iterator at runtime): return false if the iterator has an
error, and advance to the next element while the requested time has not
been reached or no elements have been read yet. Advancing goes through
memSafeIterator.Next(), so memSafeIterator.i is always accurate.
Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com>
* Add tsdb package test
Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com>
Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com>
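
A minimal sketch of the Seek() approach described above, using simplified stand-in types rather than the real memSafeIterator: `sample`, `sliceIter` and `bufferedIter` are hypothetical, and the buffer is just a tail slice. The point it illustrates is that Seek() advances only through the wrapper's own Next(), so the wrapper's position counter stays in sync with the embedded iterator.

```go
package main

type sample struct {
	t int64
	v float64
}

// sliceIter stands in for the embedded chunk iterator.
type sliceIter struct {
	s   []sample
	i   int // index of the current sample, -1 before the first Next()
	err error
}

func (it *sliceIter) Next() bool {
	if it.i+1 >= len(it.s) {
		return false
	}
	it.i++
	return true
}

func (it *sliceIter) At() (int64, float64) { return it.s[it.i].t, it.s[it.i].v }

// bufferedIter serves the last len(buf) samples from buf and the rest
// from the embedded iterator. i tracks the wrapper's own position.
type bufferedIter struct {
	inner *sliceIter
	i     int
	total int
	buf   []sample
}

func newBufferedIter(head, tail []sample) *bufferedIter {
	return &bufferedIter{
		inner: &sliceIter{s: head, i: -1},
		i:     -1,
		total: len(head) + len(tail),
		buf:   tail,
	}
}

func (it *bufferedIter) Next() bool {
	if it.i+1 >= it.total {
		return false
	}
	it.i++
	if it.i < it.total-len(it.buf) {
		return it.inner.Next()
	}
	return true
}

func (it *bufferedIter) At() (int64, float64) {
	if it.i < 0 {
		return 0, 0 // nothing read yet
	}
	if it.i < it.total-len(it.buf) {
		return it.inner.At()
	}
	s := it.buf[it.i-(it.total-len(it.buf))]
	return s.t, s.v
}

// Seek never calls inner.Seek(); it steps with the wrapper's Next().
func (it *bufferedIter) Seek(t int64) bool {
	if it.inner.err != nil {
		return false
	}
	ts, _ := it.At()
	for it.i == -1 || t > ts {
		if !it.Next() {
			return false
		}
		ts, _ = it.At()
	}
	return true
}
```

Stepping one sample at a time is slower than delegating to an embedded Seek(), but it keeps the hand-over between the embedded iterator and the buffer correct.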
The purpose of GetRef() is to allow Append() to be called without
the caller needing to copy the labels. To avoid a race where a series
is removed from TSDB between the calls to GetRef() and Append(), we
return TSDB's copy of the labels.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
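
To make the race concrete, here is a toy in-memory store, not the TSDB API: `store`, `Labels`, `Delete` and the exact method signatures are hypothetical. It assumes Append() retains the labels it is given when it has to create a series, which is why GetRef() hands back the store's own copy instead of the caller's possibly reused buffer.

```go
package main

import "sync"

type Label struct{ Name, Value string }
type Labels []Label

// key builds a simple lookup key for a label set.
func key(l Labels) string {
	k := ""
	for _, lb := range l {
		k += lb.Name + "\xff" + lb.Value + "\xff"
	}
	return k
}

type store struct {
	mtx     sync.Mutex
	nextRef uint64
	byKey   map[string]uint64
	series  map[uint64]Labels
	samples map[uint64]int
}

func newStore() *store {
	return &store{
		byKey:   map[string]uint64{},
		series:  map[uint64]Labels{},
		samples: map[uint64]int{},
	}
}

// GetRef returns the reference and the store's own copy of the labels,
// or 0 and nil if the series is unknown.
func (s *store) GetRef(l Labels) (uint64, Labels) {
	s.mtx.Lock()
	defer s.mtx.Unlock()
	ref, ok := s.byKey[key(l)]
	if !ok {
		return 0, nil
	}
	return ref, s.series[ref]
}

// Append records a sample. If ref is unknown (for example because the
// series was removed), the series is created and l is retained as-is.
func (s *store) Append(ref uint64, l Labels) uint64 {
	s.mtx.Lock()
	defer s.mtx.Unlock()
	if _, known := s.series[ref]; !known {
		k := key(l)
		existing, ok := s.byKey[k]
		if !ok {
			s.nextRef++
			existing = s.nextRef
			s.byKey[k] = existing
			s.series[existing] = l
		}
		ref = existing
	}
	s.samples[ref]++
	return ref
}

// Delete removes a series, standing in for head garbage collection.
func (s *store) Delete(ref uint64) {
	s.mtx.Lock()
	defer s.mtx.Unlock()
	if l, ok := s.series[ref]; ok {
		delete(s.byKey, key(l))
		delete(s.series, ref)
	}
}
```

If the series disappears between GetRef() and Append(), the recreated series is built from labels the store already owned, so later reuse of the caller's buffer cannot corrupt it.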
* Add method to get reference number for TSDB Appender
In situations where we need to copy labels before calling Add(),
GetRef() lets the caller check first and then call AddFast() if the
series is already known.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Add explicit interface for GetRef() method
Suggested in code review by @bwplotka
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Rename OptionalGetRef to GetRef
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Simplify return value of GetRef()
0 can be relied on to mean 'no reference'
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
The main branch tests are not passing due to the fact that #8489 was not
rebased on top of #8007.
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
This moves the label lookup into TSDB, whilst still keeping the cached-ref optimisation for repeated Appends.
This makes the API easier to consume and implement. In particular this change is motivated by the scrape-time-aggregation work, which I don't think is possible to implement without it as it needs access to label values.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
Right now a new segment might be created unnecessarily if the
uncompressed record would not fit, but the compressed record (which is
typically about half the size) would.
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
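
The ordering issue above can be sketched as follows. `planWrite` is a hypothetical helper, not the WAL code, and the compressor is passed in as a plain function so the example has no dependencies. The idea is simply to compress first and only then compare the payload against the space left in the segment.

```go
package main

// planWrite returns the bytes that would be written and whether a new
// segment is needed. left is the number of bytes still available in
// the current segment. A nil compress means compression is disabled.
func planWrite(rec []byte, left int, compress func([]byte) []byte) (payload []byte, newSegment bool) {
	payload = rec
	if compress != nil && len(rec) > 0 {
		payload = compress(rec)
	}
	// The size check uses the payload that will actually hit the disk.
	return payload, len(payload) > left
}
```

Checking `len(rec)` before compression would have asked for a new segment even when the compressed payload fits.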
* CleanupTombstones refactored, now reloading blocks after every compaction.
The goal is to remove deletable blocks after every compaction and, thus, decrease disk space used when cleaning tombstones.
Signed-off-by: arthursens <arthursens2005@gmail.com>
* Protect DB against parallel reloads
Signed-off-by: ArthurSens <arthursens2005@gmail.com>
* Fix typos
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Co-authored-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
In the previous version, 1.18.0, the "megacheck" linter paid attention
to the '//lint:ignore' comment, but that is no longer there.
Newer versions pay attention to '//nolint:<linter>,<linter>,...'
comments, optionally followed by a "second" comment introduced by '//'.
Update the directives to use this style.
This is related to prometheus/blackbox_exporter#738 and
prometheus/blackbox_exporter#745.
Signed-off-by: Marcelo E. Magallon <marcelo.magallon@grafana.com>
We're seeing compactions in Cortex that take hours, which this is
missing. While that is not common in Prometheus, I am pretty sure
there are setups where compaction takes longer than 512s. On our own
Prometheus the average compaction duration is 566s.
Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
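
As an illustration of where a 512s ceiling could come from, assume the duration is observed in a histogram whose buckets double from 1s over ten steps (an assumption; the message does not say so). The helper below is a hypothetical stand-in for a client library's exponential bucket generator, and the count of 14 is an arbitrary example, not taken from the change.

```go
package main

// exponentialBuckets returns count upper bounds, starting at start and
// multiplying by factor each time.
func exponentialBuckets(start, factor float64, count int) []float64 {
	buckets := make([]float64, count)
	for i := range buckets {
		buckets[i] = start
		start *= factor
	}
	return buckets
}

// largestBucket returns the highest finite upper bound.
func largestBucket(b []float64) float64 {
	return b[len(b)-1]
}
```

Under that assumption, `largestBucket(exponentialBuckets(1, 2, 10))` is 512, so a 566s compaction falls into the overflow bucket, while fourteen doublings would reach 8192s.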
* Fix TSDB head struct dump on querier error
Signed-off-by: Marco Pracucci <marco@pracucci.com>
* Added mint/maxt to RangeHead.String()
Signed-off-by: Marco Pracucci <marco@pracucci.com>
* test: cleanup tempdir for TestBlockWriter
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
* test: cleanup tempdir for TestLogPartialWrite
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
* fix: remove pre-2.21 tmp blocks on start
Signed-off-by: Nguyen Le Vu Long <vulongvn98@gmail.com>
* fix: commenting
Signed-off-by: Nguyen Le Vu Long <vulongvn98@gmail.com>
* tsdb: Expose total number of label pairs in head
Signed-off-by: Nguyen Le Vu Long <vulongvn98@gmail.com>
* fix: add comment for NumLabelPairs
Signed-off-by: Nguyen Le Vu Long <vulongvn98@gmail.com>
* fix: remove comment
Signed-off-by: Nguyen Le Vu Long <vulongvn98@gmail.com>
* Logging added for when compaction takes more than the block time range
Signed-off-by: arthursens <arthursens2005@gmail.com>
* Log only if no errors were already logged
Signed-off-by: arthursens <arthursens2005@gmail.com>
* Log duration as human-readable string
Signed-off-by: arthursens <arthursens2005@gmail.com>
* Move logging from compactHead() to Compact()
Signed-off-by: arthursens <arthursens2005@gmail.com>
* Compute duration of all head compactions plus wal truncation
Signed-off-by: arthursens <arthursens2005@gmail.com>
* Remove named return added in first commits
Signed-off-by: arthursens <arthursens2005@gmail.com>
* Address nits
Signed-off-by: arthursens <arthursens2005@gmail.com>
* Change milliseconds to seconds to make fuzzit tests happy
Signed-off-by: ArthurSens <arthursens2005@gmail.com>
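
The check introduced by the commits above can be sketched roughly like this. `compactionOverrun` is a hypothetical helper, and taking both values in milliseconds is an assumption made for the example. It returns the elapsed time as a human-readable string together with a flag saying whether the compaction took longer than the block time range.

```go
package main

import "time"

// compactionOverrun formats the elapsed compaction time and reports
// whether it exceeded the block time range. Both inputs are in
// milliseconds (an assumption for this sketch).
func compactionOverrun(elapsedMs, blockRangeMs int64) (string, bool) {
	d := time.Duration(elapsedMs) * time.Millisecond
	return d.String(), elapsedMs > blockRangeMs
}
```

A caller would log a warning only when the flag is true and no earlier error has already been logged.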
* Set the min time of Head properly after truncation
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
* Fix lint
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
* Enhance compaction plan logic for completely deleted small block
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
* Fix review comments
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>