* Limit FastRegexMatcher by size (bytes) and add a TTL to each entry
Signed-off-by: Marco Pracucci <marco@pracucci.com>
* Add TestNewFastRegexMatcher_CacheSizeLimit
Signed-off-by: Marco Pracucci <marco@pracucci.com>
* Tolerate ristretto goroutines when checking goroutine leaks
Signed-off-by: Marco Pracucci <marco@pracucci.com>
* Tolerate ristretto goroutines when checking goroutine leaks
Signed-off-by: Marco Pracucci <marco@pracucci.com>
---------
Signed-off-by: Marco Pracucci <marco@pracucci.com>
This is a method used by some downstream projects; it was created to
optimize the implementation in `labels_string.go` but we should have one
for both implementations so the same code works with either.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Deleted labels are remembered, even if they were not in `base` or were
removed from `add`, so `base+add-del` could go negative.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Use a prefix trie for long alternate lists
* Add test for non terminal node
* Fix panic in FuzzFastRegexMatcher_WithFuzzyRegularExpressions when the fuzzy regex is invalid
Signed-off-by: Marco Pracucci <marco@pracucci.com>
* Address PR feedback
* Update model/labels/regexp_test.go
Co-authored-by: Marco Pracucci <marco@pracucci.com>
* Replace trie with slice or map depending on input size
* Fix tests
* Pull in tests from @pracucci's branch
* Add setMatches back in
* Use stringMatcher when it's faster
* Fix linter
* Estimate alternates ahead of time
* Simplify construction with `IndexByte`
* Add test and early return for empty regexp.
* Fix race conditions in tests
---------
Signed-off-by: Marco Pracucci <marco@pracucci.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>
Go spends some time initializing all the elements of these arrays to
zero, so reduce the size from 1024 to 128. This is still much bigger
than we ever expect for a set of labels.
(If someone does have more than 128 labels it will still work, but via
heap allocation.)
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
It took a `Labels` where the memory could be re-used, but in practice
this hardly ever benefitted. Especially after converting `relabel.Process`
to `relabel.ProcessBuilder`.
Comparing the parameter to `nil` was a bug; `EmptyLabels` is not `nil`
so the slice was reallocated multiple times by `append`.
Lastly `Builder.Labels()` now estimates that the final size will depend
on labels added and deleted.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Although we had a different slice, the underlying memory was the same so
any changes meant we could skip some values.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Save work converting between Builder and Labels.
Also expose ProcessBuilder, so callers can supply a Builder.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
This lets relabelling work on a `Builder` rather than converting to and
from `Labels` on every rule.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
The difference is modest, but we've used `slices.Sort` in lots of other
places so why not here.
name old time/op new time/op delta
Builder 1.04µs ± 3% 0.95µs ± 3% -8.27% (p=0.008 n=5+5)
name old alloc/op new alloc/op delta
Builder 312B ± 0% 288B ± 0% -7.69% (p=0.008 n=5+5)
name old allocs/op new allocs/op delta
Builder 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.008 n=5+5)
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
This makes the buffer the correct size for the common case that labels
have only been added. It will be too large for the case that labels are
changed, but the current buffer resize logic in `appendLabelTo` doubles
the buffer, so a small over-estimate is better.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
This change removes restrictions to allow adding exemplars
to all time series. It also contains some improvements in test values
so that it is easier to track what is tested.
The advantage of doing this is having a little less error-prone tests:
"yy" is not really descriptive but "counter-test" can give people
a better idea about what is tested so it is harder to make mistakes.
Closes gh-11982
Signed-off-by: Jonatan Ivanov <jonatan.ivanov@gmail.com>
* Optimized very long case insensitive alternations
Signed-off-by: Marco Pracucci <marco@pracucci.com>
* Run common regexps in BenchmarkFastRegexMatcher
Signed-off-by: Marco Pracucci <marco@pracucci.com>
* Modify BenchmarkNewFastRegexMatcher to benchmark the NewFastRegexMatcher() function
Signed-off-by: Marco Pracucci <marco@pracucci.com>
* Reduced allocations by optimizeEqualStringMatchers()
Signed-off-by: Marco Pracucci <marco@pracucci.com>
* Fixed typo in comments
Signed-off-by: Marco Pracucci <marco@pracucci.com>
* Fixed typo in test case name
Signed-off-by: Marco Pracucci <marco@pracucci.com>
---------
Signed-off-by: Marco Pracucci <marco@pracucci.com>
* Do not optimize regexps with being/end text anchors inside
Signed-off-by: Marco Pracucci <marco@pracucci.com>
* Explicit case for begin/end text in stringMatcherFromRegexpInternal()
Signed-off-by: Marco Pracucci <marco@pracucci.com>
* Added more test cases
Signed-off-by: Marco Pracucci <marco@pracucci.com>
---------
Signed-off-by: Marco Pracucci <marco@pracucci.com>
* Optimized FastRegexMatcher when the regex contains a case insensitive alternation made with concats too
Signed-off-by: Marco Pracucci <marco@pracucci.com>
* Do not use a pointer to hold whether the matches are case sensitive
Signed-off-by: Marco Pracucci <marco@pracucci.com>
* Improved unit tests based on review feedback
Signed-off-by: Marco Pracucci <marco@pracucci.com>
---------
Signed-off-by: Marco Pracucci <marco@pracucci.com>
Previous code was effectively doing BigEndian.Uint64, so call that and save time.
An md5.Sum result is always 16 bytes. The first 8 are not used in the result, just as before.
Signed-off-by: Renning Bruns <ren@renmail.net>
These benchmarks were testing things related to what Prometheus does, but not testing actual Prometheus code.
Moved the label-copying benchmark into the labels package.
This commit adds an alternate implementation for `labels.Labels`, behind
a build tag `stringlabels`.
Instead of storing label names and values as individual strings, they
are all concatenated into one string in this format:
[len][name0][len][value0][len][name1][len][value1]...
The lengths are varint encoded so usually a single byte.
The previous `[]string` had 24 bytes of overhead for the slice and 16
for each label name and value; this one has 16 bytes overhead plus 1
for each name and value.
In `ScratchBuilder.Overwrite` and `Labels.Hash` we use an unsafe
conversion from string to byte slice. `Overwrite` is explicitly unsafe,
but for `Hash` this is a pure performance hack.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Parsing errors in the Prometheus HTTP format parser are very hard to
investigate since they only approximately indicate what is going wrong
in the parser and don't provide any information about the incorrect
input. As such it is very hard to tell what is wrong in the format
exposed by the application.
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
Sometimes label matchers know that they match values with a specific
prefix. This information can be valuable in some downstream storage
implementations.
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>