* Add unit tests for PostingsForMatcher.
* Selector methods are all stateless, don't need a reference.
* Be smarter in how we look at matchers.
Look at all matchers to see if a label can be empty.
Optimise Not handling, so i!="2" is a simple lookup
rather than an inverse postings list.
Add all the Withouts together, rather than
having to subtract each from all postings.
Change the postings pre-expansion logic so that it always runs before
doing a Without only. Skip it if the postings are already a list.
The initial goal here was that the oft-seen pattern
i=~"something.+",i!="foo",i!="bar" becomes more efficient.
benchmark                                                         old ns/op  new ns/op  delta
BenchmarkHeadPostingForMatchers/n="1"-4                           5888       6160       +4.62%
BenchmarkHeadPostingForMatchers/n="1",j="foo"-4                   7190       6640       -7.65%
BenchmarkHeadPostingForMatchers/j="foo",n="1"-4                   6038       5923       -1.90%
BenchmarkHeadPostingForMatchers/n="1",j!="foo"-4                  6030884    4850525    -19.57%
BenchmarkHeadPostingForMatchers/i=~".*"-4                         887377940  230329137  -74.04%
BenchmarkHeadPostingForMatchers/i=~".+"-4                         490316101  319931758  -34.75%
BenchmarkHeadPostingForMatchers/i=~""-4                           594961991  130279313  -78.10%
BenchmarkHeadPostingForMatchers/i!=""-4                           537542388  318751015  -40.70%
BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo"-4           10460243   8565195    -18.12%
BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo"-4    44964267   8561546    -80.96%
BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo"-4             42244885   29137737   -31.03%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo"-4           35285834   32774584   -7.12%
BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo"-4          8951047    8379024    -6.39%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo"-4    63813335   30672688   -51.93%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo"-4  45381112   44924397   -1.01%
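As a rough illustration of adding the Withouts together, the sketch below merges the postings of the negated values first and subtracts them from the base list once. It is a simplified model over plain sorted `[]uint64` slices, not the tsdb implementation; `merge`, `without` and the sample IDs are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// merge returns the sorted union of several posting lists.
func merge(lists ...[]uint64) []uint64 {
	seen := map[uint64]struct{}{}
	var out []uint64
	for _, l := range lists {
		for _, id := range l {
			if _, ok := seen[id]; !ok {
				seen[id] = struct{}{}
				out = append(out, id)
			}
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

// without returns the entries of full that are not in drop.
// Both inputs must be sorted in ascending order.
func without(full, drop []uint64) []uint64 {
	var out []uint64
	j := 0
	for _, id := range full {
		for j < len(drop) && drop[j] < id {
			j++
		}
		if j < len(drop) && drop[j] == id {
			continue
		}
		out = append(out, id)
	}
	return out
}

func main() {
	base := []uint64{1, 2, 3, 4, 5, 6} // hypothetical postings for i=~"something.+"
	foo := []uint64{2, 5}              // hypothetical postings for i="foo"
	bar := []uint64{3}                 // hypothetical postings for i="bar"

	// One subtraction of the merged list instead of one subtraction per matcher.
	fmt.Println(without(base, merge(foo, bar))) // prints [1 4 6]
}
```

Each `!=` matcher here is a plain lookup of the positive value, which is the property the commit relies on.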
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
Since Go 1.12, no special handling is required for file.Sync().
@pborzenkov thanks for the pointer.
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
Test that createBlock creates blocks that can be opened.
Checking the result of os.RemoveAll for errors catches un-closed files on Windows.
Added many missing .Close() calls to fix the failing os.RemoveAll calls.
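A minimal sketch of the pattern, assuming a hypothetical helper `writeAndCleanup` that is not part of the tsdb code: the file is closed before the directory is removed, and the os.RemoveAll error is returned rather than discarded.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeAndCleanup creates a file in a temporary directory, closes it, and
// removes the directory, returning the first error encountered.
func writeAndCleanup() (err error) {
	dir, err := os.MkdirTemp("", "close_example")
	if err != nil {
		return err
	}
	// Surface the RemoveAll error instead of ignoring it.
	defer func() {
		if rerr := os.RemoveAll(dir); rerr != nil && err == nil {
			err = rerr
		}
	}()

	f, err := os.Create(filepath.Join(dir, "chunk"))
	if err != nil {
		return err
	}
	if _, werr := f.WriteString("data"); werr != nil {
		f.Close()
		return werr
	}
	// Close before the deferred RemoveAll runs; a handle that is still
	// open makes the removal fail on Windows.
	return f.Close()
}

func main() {
	if err := writeAndCleanup(); err != nil {
		fmt.Println("cleanup failed:", err)
	}
}
```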
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
Use a heap for Next for merges, and
pre-compute whether there are many postings on the
unset path.
Add posting lookup benchmarks
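For readers unfamiliar with the technique, a heap-based merge of sorted posting lists could look roughly like the following. This is a standalone sketch using `container/heap` over `[]uint64` slices; the `cursor`, `cursorHeap` and `mergeHeap` names are hypothetical and do not mirror the actual tsdb types.

```go
package main

import (
	"container/heap"
	"fmt"
)

// cursor is a position within one sorted posting list.
type cursor struct {
	list []uint64
	pos  int
}

func (c *cursor) cur() uint64 { return c.list[c.pos] }

// cursorHeap orders cursors by the value they currently point at.
type cursorHeap []*cursor

func (h cursorHeap) Len() int            { return len(h) }
func (h cursorHeap) Less(i, j int) bool  { return h[i].cur() < h[j].cur() }
func (h cursorHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *cursorHeap) Push(x interface{}) { *h = append(*h, x.(*cursor)) }
func (h *cursorHeap) Pop() interface{} {
	old := *h
	n := len(old)
	x := old[n-1]
	*h = old[:n-1]
	return x
}

// mergeHeap returns the sorted, de-duplicated union of sorted lists.
func mergeHeap(lists ...[]uint64) []uint64 {
	h := &cursorHeap{}
	for _, l := range lists {
		if len(l) > 0 {
			*h = append(*h, &cursor{list: l})
		}
	}
	heap.Init(h)

	var out []uint64
	for h.Len() > 0 {
		c := (*h)[0] // cursor with the smallest current value
		v := c.cur()
		if len(out) == 0 || out[len(out)-1] != v {
			out = append(out, v)
		}
		c.pos++
		if c.pos < len(c.list) {
			heap.Fix(h, 0) // restore heap order after advancing
		} else {
			heap.Pop(h) // this list is exhausted
		}
	}
	return out
}

func main() {
	fmt.Println(mergeHeap(
		[]uint64{1, 4, 7},
		[]uint64{2, 4, 8},
		[]uint64{3},
	)) // prints [1 2 3 4 7 8]
}
```

Each step costs O(log k) for k input lists, instead of comparing against every list on every call.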
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
Added the methods needed to limit retained data by size in bytes as well as by time. The size limit is only applied if the flag is set (it defaults to 0). Blocks older than the retention period and blocks that push the storage size over the limit are both removed.
Added two new metrics: one tracking the size of the local storage folder, and one counting the number of times data has been deleted because the size limit was exceeded.
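The selection logic described above can be sketched as follows. This is an assumption-laden model, not the code added by the commit: the `block` struct, the `deletable` function and the sample values are hypothetical, and blocks are assumed to be ordered newest first.

```go
package main

import "fmt"

// block is a hypothetical stand-in for a persisted block's metadata.
type block struct {
	name    string
	maxTime int64 // newest sample timestamp in the block
	size    int64 // bytes on disk
}

// deletable returns the names of blocks to remove. blocks must be ordered
// newest first. A maxBytes of 0 disables the size limit.
func deletable(blocks []block, retention, maxBytes int64) []string {
	if len(blocks) == 0 {
		return nil
	}
	var out []string
	var total int64
	newest := blocks[0].maxTime
	for _, b := range blocks {
		total += b.size
		tooOld := retention > 0 && newest-b.maxTime > retention
		tooBig := maxBytes > 0 && total > maxBytes
		if tooOld || tooBig {
			out = append(out, b.name)
		}
	}
	return out
}

func main() {
	blocks := []block{
		{"a", 100, 40},
		{"b", 80, 40},
		{"c", 60, 40},
		{"d", 10, 10},
	}
	fmt.Println(deletable(blocks, 50, 100)) // prints [c d]
	fmt.Println(deletable(blocks, 50, 0))   // prints [d]
}
```

Because the running total only grows, once one block exceeds the size limit every older block is selected as well.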
Signed-off-by: Mark Knapp <mknapp@hudson-trading.com>
Changes:
* Make `NewReader` method useful. It was impossible to use it, because closer was always nil.
* ReadSymbols, TOC and ReadOffsetTable are now public functions (used by Thanos).
* decbufXXX are now functions.
* More verbose errors.
* Removed unused crc32 field.
* Some var name changes to make it more verbose:
  * symbols -> allocatedSymbols
  * symbolsSlice -> symbolsV1
  * symbols -> symbolsV2
* Pre-calculate symbolsTableSize.
* Initialized symbols for Symbols() method with valid length.
* Added test for Symbol method.
* Made Decoder LookupSymbol method public. Kept Decode public as it is useful as helper from index package.
Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
This change also uses the latest staticcheck version which comes with
new verifications, hence some clean up in the code.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Avoid a tree of merge objects, which can result in
what I suspect is n^2 calls to Seek when using Without.
With 100k metrics, and a regex of ^$ in BenchmarkHeadPostingForMatchers:
Before:
BenchmarkHeadPostingForMatchers-8 1 51633185216 ns/op 29745528 B/op 200357 allocs/op
After:
BenchmarkHeadPostingForMatchers-8 10 108924996 ns/op 25715025 B/op 101748 allocs/op
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
This saves memory, about a quarter of the size of the postings map
itself with high-cardinality labels (not including the post ids).
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
This reduces memory by only having to store the string's 16
bytes+map overhead once per label name, rather than duplicating it in every
entry for the label value.
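One way to picture the layout is a map keyed by label name whose values are maps keyed by label value, so the name string header is stored once per name. The sketch below assumes that shape; `memPostings`, `add` and `get` are hypothetical names used only for illustration.

```go
package main

import "fmt"

// memPostings is a hypothetical, simplified postings index keyed by label
// name first and label value second.
type memPostings struct {
	m map[string]map[string][]uint64
}

func newMemPostings() *memPostings {
	return &memPostings{m: map[string]map[string][]uint64{}}
}

// add records that series id carries the label name=value.
func (p *memPostings) add(id uint64, name, value string) {
	values, ok := p.m[name]
	if !ok {
		values = map[string][]uint64{}
		p.m[name] = values
	}
	values[value] = append(values[value], id)
}

// get returns the postings for name=value, or nil if there are none.
func (p *memPostings) get(name, value string) []uint64 {
	return p.m[name][value] // indexing a nil inner map is safe and yields nil
}

func main() {
	p := newMemPostings()
	p.add(1, "job", "api")
	p.add(2, "job", "api")
	p.add(3, "job", "db")

	fmt.Println(p.get("job", "api"))  // prints [1 2]
	fmt.Println(len(p.m))             // prints 1: "job" is stored as a key once
	fmt.Println(p.get("zone", "eu"))  // prints []
}
```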
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
Reuse the string already allocated for symbols
in the posting tables.
Use a slice for symbols in v2 format.
Move symbol size logic into the index code.
Avoid duplication of lookupSymbol logic.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
fixes: https://github.com/prometheus/tsdb/issues/426
Use `filepath.Join()` instead of strings containing forward-slash path delimiters (needed for non-*nix OSes), as suggested by @krasi-georgiev.
Use more meaningful names for the serializedStringTuples and stringTuples structs.
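As a small illustration of the `filepath.Join()` change, the hypothetical helper below builds a sub-directory path without hard-coding a separator. The `chunkDir` name and the directory names are made up for the example.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// chunkDir builds the path of a hypothetical "chunks" directory inside a
// block directory, using the separator of the host OS.
func chunkDir(blockDir string) string {
	return filepath.Join(blockDir, "chunks")
}

func main() {
	// Prints data/chunks on *nix and data\chunks on Windows.
	fmt.Println(chunkDir("data"))
}
```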
Signed-off-by: knrt10 <tripathi.kautilya@gmail.com>
Co-authored-by: Krasi Georgiev <kgeorgie@redhat.com>
Currently the offsets are cast into uint32 even though the index can
grow larger than 4GiB.
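The effect of the cast can be reproduced in isolation. The sketch below uses a hypothetical `truncate` helper to show how an offset past 4GiB wraps when converted to uint32.

```go
package main

import "fmt"

// truncate mimics casting a file offset to uint32; values of 4GiB and
// above wrap modulo 2^32.
func truncate(off uint64) uint32 {
	return uint32(off)
}

func main() {
	fmt.Println(truncate(1 << 30)) // prints 1073741824: fits in 32 bits
	fmt.Println(truncate(5 << 30)) // prints 1073741824: 5GiB wrapped to 1GiB
}
```

Two different offsets map to the same 32-bit value, so a reader would seek to the wrong place in a large index.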
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>