closes https://github.com/prometheus/tsdb/issues/471
After implementing the new WAL, this metric was missing, so this adds it again.
Also added it to a test to make sure it works as expected.
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
* refactor NewSegmentsRangeReader to take multiple WAL ranges
In case of an error when checkpointing the WAL, the error doesn't show
the exact WAL index that is corrupted. This is because it uses
MultiReader to read multiple WAL files.
This refactoring allows NewSegmentsRangeReader to take more than a
single WAL range; it reads all of the ranges by iterating over each one.
This changes the logs from
create checkpoint: read segments: corruption after 4841144384 bytes:...
to
create checkpoint: read segments: corruption in segment
data/wal/00017351 at 123142208: ...
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
* repair wal when the record cannot be decoded
Currently repair is run only when the error happens in the reader.
A corruption can also surface after the record is read, when it is decoded.
This change wraps the decoding error as a CorruptionErr, as this error
is expected to trigger a repair.
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
Unexported NewMemTombstones, as it returns the unexported memTombstones
type, which will not be shown in godoc.
Added missing comments for exported methods.
Removed unused RecordLogger and RecordReader interfaces.
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
Calculating the modulus in each worker was a hotspot,
and meant that you had more work to do the more cores you had.
This cuts CPU usage (on my 8 core, 4 real core machine) by
33%, and walltime by 3%.
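For illustration only, and assuming the fix is to shard in the dispatcher:
if every worker scans all samples and filters with ref % n, the number of
modulus operations is workers x samples; computing it once up front makes
it just samples. The sample type and shardSamples are hypothetical.

```go
package main

import "fmt"

// sample is a hypothetical stand-in for a WAL sample keyed by series ref.
type sample struct {
	Ref   uint64
	Value float64
}

// shardSamples computes ref % n exactly once per sample and returns one
// bucket per worker, so workers no longer need to filter the full input.
func shardSamples(samples []sample, n int) [][]sample {
	shards := make([][]sample, n)
	for _, s := range samples {
		i := s.Ref % uint64(n)
		shards[i] = append(shards[i], s)
	}
	return shards
}

func main() {
	samples := []sample{{Ref: 0}, {Ref: 1}, {Ref: 4}, {Ref: 5}, {Ref: 6}}
	for i, shard := range shardSamples(samples, 4) {
		fmt.Println("worker", i, "gets", len(shard), "samples")
	}
}
```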
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
This is read far more than it changes.
This cuts ~14% off walltime and ~27% off CPU for WAL reading.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
With the various goroutines running, the locking
in getByID is notable. This cuts CPU usage by ~25%
and walltime by ~20%.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
Add some benchmarks for HEAD, and allocate the correct slice size in LabelValues, since we already know what it will be.
This is a ~15% time improvement and a ~25% allocation improvement:
```
benchmark                             old ns/op      new ns/op      delta
BenchmarkHeadPostingForMatchers-4     74452          63514          -14.69%

benchmark                             old allocs     new allocs     delta
BenchmarkHeadPostingForMatchers-4     20             13             -35.00%

benchmark                             old bytes      new bytes      delta
BenchmarkHeadPostingForMatchers-4     5425           3137           -42.18%
```
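A small sketch of the preallocation pattern, assuming the values are
collected from a set whose size is known up front. labelValues here is a
hypothetical helper, not the tsdb method itself.

```go
package main

import (
	"fmt"
	"sort"
)

// labelValues collects the keys of a set into a sorted slice. The slice is
// created with the final capacity, so append never has to grow it.
func labelValues(set map[string]struct{}) []string {
	vals := make([]string, 0, len(set))
	for v := range set {
		vals = append(vals, v)
	}
	sort.Strings(vals)
	return vals
}

func main() {
	set := map[string]struct{}{"b": {}, "a": {}, "c": {}}
	vals := labelValues(set)
	fmt.Println(vals, cap(vals))
}
```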
Signed-off-by: Thomas Jackson <jacksontj.89@gmail.com>
This reverts commit 98fe30438c.
After some discussion, it was concluded that we want the full
`prometheus_tsdb_...` prefix hardcoded in the library.
Signed-off-by: beorn7 <beorn@soundcloud.com>
This fixes various issues when initializing the head time range
under different starting conditions.
Signed-off-by: Fabian Reinartz <freinartz@google.com>
Blocks are half-open intervals [a, b), while all other intervals
(chunks, head, ...) are closed intervals [a, b].
Make that distinction explicit by defining `OverlapsClosedInterval()`
methods for blocks and chunks, and using them in place of the more
generic `intervalOverlap()` function.
This change also fixes `db.Querier()` and `db.Delete()`, which could
previously return one extraneous block at the end of the specified
interval.
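The distinction can be sketched as follows. blockMeta and chunkMeta are
simplified, hypothetical types; only the comparison operators matter. The
boundary case is a query that starts exactly at MaxTime: a chunk still
overlaps it, a block does not.

```go
package main

import "fmt"

// blockMeta is a hypothetical block covering the half-open interval
// [MinTime, MaxTime).
type blockMeta struct{ MinTime, MaxTime int64 }

// chunkMeta is a hypothetical chunk covering the closed interval
// [MinTime, MaxTime].
type chunkMeta struct{ MinTime, MaxTime int64 }

// OverlapsClosedInterval reports whether the block overlaps [mint, maxt].
// MaxTime is exclusive, hence the strict comparison on that side.
func (b blockMeta) OverlapsClosedInterval(mint, maxt int64) bool {
	return b.MinTime <= maxt && mint < b.MaxTime
}

// OverlapsClosedInterval reports whether the chunk overlaps [mint, maxt].
// Both ends are inclusive.
func (c chunkMeta) OverlapsClosedInterval(mint, maxt int64) bool {
	return c.MinTime <= maxt && mint <= c.MaxTime
}

func main() {
	b := blockMeta{MinTime: 0, MaxTime: 100}
	c := chunkMeta{MinTime: 0, MaxTime: 100}
	fmt.Println(b.OverlapsClosedInterval(100, 200)) // block ends before 100
	fmt.Println(c.OverlapsClosedInterval(100, 200)) // chunk includes 100
}
```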
Signed-off-by: Benoît Knecht <benoit.knecht@fsfe.org>
This has been a frequent source of debugging pain since errors are
potentially delayed to a much later point. They bubble up in an
unrelated execution path.