* Additional logging in compact.go - logged time needed for writing blocks to disk
Signed-off-by: Radoslaw Lesniewski <Radoslaw.Lesniewski@sabre.com>
* Additional logging in compact.go - code formatted
Signed-off-by: Radoslaw Lesniewski <Radoslaw.Lesniewski@sabre.com>
A failed reload immediately after a compaction should delete the
resulting block to avoid creating blocks with the same time range.
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
Added the methods needed to retain data based on a byte limit rather than time. The limit is only applied if the flag is set (it defaults to 0). Both the blocks that are older than the retention period and the blocks that make the storage too large are removed.
Added 2 new metrics that track the size of the local storage folder and the number of times data has been deleted because the size limit was exceeded.
Signed-off-by: Mark Knapp <mknapp@hudson-trading.com>
Changes:
* Make `NewReader` method useful. It was impossible to use it, because closer was always nil.
* ReadSymbols, TOC and ReadOffsetTable are now public functions (used by Thanos).
* decbufXXX are now functions.
* More verbose errors.
* Removed unused crc32 field.
* Some var name changes to make it more verbose:
  * symbols -> allocatedSymbols
  * symbolsSlice -> symbolsV1
  * symbols -> symbolsV2
* Pre-calculate symbolsTableSize.
* Initialized symbols for Symbols() method with valid length.
* Added test for Symbol method.
* Made Decoder LookupSymbol method public. Kept Decode public as it is useful as helper from index package.
Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
Avoid a tree of merge objects, which can result in
what I suspect is n^2 calls to Seek when using Without.
With 100k metrics, and a regex of ^$ in BenchmarkHeadPostingForMatchers:
Before:
BenchmarkHeadPostingForMatchers-8 1 51633185216 ns/op 29745528 B/op 200357 allocs/op
After:
BenchmarkHeadPostingForMatchers-8 10 108924996 ns/op 25715025 B/op 101748 allocs/op
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
With 1M series:
Before:
BenchmarkHeadPostingForMatchers-8 1 3501996117 ns/op 61311520 B/op 78 allocs/op
After:
BenchmarkHeadPostingForMatchers-8 1 1403072952 ns/op 69261568 B/op 72 allocs/op
This works out as 3X faster, as the above time includes other things.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
It is easy to forget to close the block returned by createPopulatedBlock,
which causes failures on Windows. Instead it now returns the block dir,
which can be passed to OpenBlock explicitly.
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
The WALFlushInterval is not used anywhere in the code base.
The WAL is no longer an interface, to save some lookup time, so NopWAL can't be used in the tests. Instead the tests can just pass nil, as the code checks for that and it is essentially a noop.
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
This reports the cardinality of each label,
the total number of label pairs,
and how many series' worth of time is "uncovered"
by series data, which is basically how much churn there is.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
Closes https://github.com/prometheus/tsdb/issues/471
After implementing the new WAL this metric was missing, so this adds it again.
Also added it to a test to make sure it works as expected.
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
* refactor NewSegmentsRangeReader to take multi WAL ranges
In case of an error when checkpointing the WAL, the error doesn't show
the exact WAL index that is corrupted. This is because it uses
MultiReader to read multiple WAL files.
This refactoring allows the NewSegmentsRangeReader to take more than a
single WAL range and it reads all of the ranges by iterating each one.
This changes the logs from
create checkpoint: read segments: corruption after 4841144384 bytes:...
to
create checkpoint: read segments: corruption in segment
data/wal/00017351 at 123142208: ...
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
* repair wal when the record cannot be decoded
Currently repair is run only when the error happens in the reader.
A corruption can occur after the record is read and when it is decoded.
This change wraps the error at decoding as a CorruptionErr as this error
is expected to trigger a repair.
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
* return an error when the last wal segment record is torn.
This ensures that a repair will be run when the last record in a segment
is torn.
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
* *: support Go modules
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Update go.mod and Makefile.common
Signed-off-by: Simon Pasquier <spasquie@redhat.com>