Commit graph

67 commits

Author SHA1 Message Date
Oleg Zaytsev 4086a5f042 Merge branch 'main' into prometheus-2023-04-03-3923e83 2023-04-13 09:15:24 +02:00
Patrick Oyarzun ae170f644c
Optimize long alternate lists (#463)
* Use a prefix trie for long alternate lists

* Add test for non terminal node

* Fix panic in FuzzFastRegexMatcher_WithFuzzyRegularExpressions when the fuzzy regex is invalid

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Address PR feedback

* Update model/labels/regexp_test.go

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Replace trie with slice or map depending on input size

* Fix tests

* Pull in tests from @pracucci's branch

* Add setMatches back in

* Use stringMatcher when it's faster

* Fix linter

* Estimate alternates ahead of time

* Simplify construction with `IndexByte`

* Add test and early return for empty regexp.

* Fix race conditions in tests

---------

Signed-off-by: Marco Pracucci <marco@pracucci.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>
2023-04-01 08:35:35 +02:00
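
The "Replace trie with slice or map depending on input size" step in commit ae170f644c can be pictured with the minimal sketch below. The names `stringSet`, `sliceSet`, `mapSet`, `newStringSet` and the cut-over value `mapThreshold` are hypothetical: the log does not state the real identifiers or the threshold used.

```go
package main

import "fmt"

// mapThreshold is a made-up cut-over point for this sketch; the commit
// log does not state the value actually used.
const mapThreshold = 16

// stringSet answers whether a value is one of the alternates.
type stringSet interface{ Matches(s string) bool }

// sliceSet scans linearly, which tends to be cheap for short lists.
type sliceSet []string

func (m sliceSet) Matches(s string) bool {
	for _, v := range m {
		if v == s {
			return true
		}
	}
	return false
}

// mapSet uses a hash lookup, which pays off for long lists.
type mapSet map[string]struct{}

func (m mapSet) Matches(s string) bool {
	_, ok := m[s]
	return ok
}

// newStringSet picks the representation from the number of alternates.
func newStringSet(values []string) stringSet {
	if len(values) <= mapThreshold {
		return sliceSet(values)
	}
	m := make(mapSet, len(values))
	for _, v := range values {
		m[v] = struct{}{}
	}
	return m
}

func main() {
	set := newStringSet([]string{"api", "db", "web"})
	fmt.Println(set.Matches("db"), set.Matches("cache"))
}
```

Where the best cut-over lies depends on the workload, so a value like this would normally be chosen from benchmarks rather than fixed up front.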
Marco Pracucci 9fe1bd2d63
Improve TestAnalyzeRealQueries (#464)
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2023-03-31 08:27:43 +00:00
Marco Pracucci 05a3a79015
Cache optimized regexp matchers (#465)
* Cache optimized regexp matchers

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added BenchmarkNewFastRegexMatcher_CacheMisses

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Improved benchmark

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Improved benchmark

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Use LRU cache v2

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Run gofumpt

Signed-off-by: Marco Pracucci <marco@pracucci.com>

---------

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2023-03-31 04:05:26 +02:00
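
To illustrate the idea behind commit 05a3a79015, here is a minimal sketch of caching compiled matchers by pattern with least-recently-used eviction. It is not the project's code: the commit uses an LRU cache library ("LRU cache v2"), whereas this sketch hand-rolls a small LRU on top of `container/list`, and `regexpCache`, `entry`, `newRegexpCache`, and `get` are hypothetical names. Wrapping the pattern in `^(?:...)$` is also an assumption made here for illustration.

```go
package main

import (
	"container/list"
	"fmt"
	"regexp"
	"sync"
)

// entry is what each list element holds.
type entry struct {
	pattern string
	re      *regexp.Regexp
}

// regexpCache is a hypothetical LRU cache of compiled patterns.
type regexpCache struct {
	mu    sync.Mutex
	max   int
	order *list.List               // front = most recently used
	items map[string]*list.Element // pattern -> element holding *entry
}

func newRegexpCache(max int) *regexpCache {
	return &regexpCache{max: max, order: list.New(), items: map[string]*list.Element{}}
}

// get returns the cached matcher for pattern, compiling it on a miss and
// evicting the least recently used entry when the cache is over capacity.
func (c *regexpCache) get(pattern string) (*regexp.Regexp, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.items[pattern]; ok {
		c.order.MoveToFront(el)
		return el.Value.(*entry).re, nil
	}
	re, err := regexp.Compile("^(?:" + pattern + ")$")
	if err != nil {
		return nil, err
	}
	c.items[pattern] = c.order.PushFront(&entry{pattern: pattern, re: re})
	if c.order.Len() > c.max {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).pattern)
	}
	return re, nil
}

func main() {
	c := newRegexpCache(2)
	re, _ := c.get("foo|bar")
	fmt.Println(re.MatchString("bar"))    // true
	fmt.Println(re.MatchString("barbaz")) // false, because the pattern is anchored
}
```

The cache only helps when the same pattern strings recur; a benchmark such as the `BenchmarkNewFastRegexMatcher_CacheMisses` mentioned in the commit is the kind of check that shows what a miss costs.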
Bryan Boreham ee1157c14a labels: shrink stack arrays in Builder.Range
Go spends some time initializing all the elements of these arrays to
zero, so reduce the size from 1024 to 128. This is still much bigger
than we ever expect for a set of labels.

(If someone does have more than 128 labels it will still work, but via
heap allocation.)

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-03-22 17:14:43 +00:00
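
The trade-off described in commit ee1157c14a can be sketched as follows. This is a simplified illustration, not the actual `Builder.Range` code: `collect` is a hypothetical function, and whether the array really stays on the stack is decided by the compiler's escape analysis.

```go
package main

import "fmt"

// collect gathers n items into a slice backed by a fixed-size local array.
// Go zeroes all 128 elements on every call, so a smaller array means less
// initialisation work. If more than 128 items arrive, append transparently
// moves the data to a larger heap-allocated array.
func collect(n int) (length, capacity int) {
	var stack [128]int
	buf := stack[:0]
	for i := 0; i < n; i++ {
		buf = append(buf, i)
	}
	return len(buf), cap(buf)
}

func main() {
	fmt.Println(collect(10))  // fits in the local array: capacity stays 128
	fmt.Println(collect(200)) // outgrows it: capacity is larger than 128
}
```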
Ganesh Vernekar 41649ceb1b
Merge remote-tracking branch 'upstream/main' into codesome/sync-prom
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2023-03-22 08:35:08 +05:30
Bryan Boreham 3743d87c56 labels: cope with mutating Builder during Range call
Although we had a different slice, the underlying memory was the same so
any changes meant we could skip some values.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-03-16 13:28:15 +00:00
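
The aliasing problem fixed in commit 3743d87c56 can be reproduced with a small stand-alone sketch. The `builder` type and its methods below are hypothetical and far simpler than the real `Builder`; they only show how iterating over a slice that shares memory with the structure being mutated can skip values, and how iterating over a copy avoids that.

```go
package main

import "fmt"

// builder is a hypothetical, much-simplified stand-in for a label builder.
type builder struct{ names []string }

// del removes name by filtering in place, reusing the same backing array.
func (b *builder) del(name string) {
	kept := b.names[:0]
	for _, n := range b.names {
		if n != name {
			kept = append(kept, n)
		}
	}
	b.names = kept
}

// rangeShared iterates over a slice that aliases the builder's own memory.
func (b *builder) rangeShared(f func(string)) {
	for _, n := range b.names {
		f(n)
	}
}

// rangeCopied iterates over a private snapshot, so mutations made by f
// cannot shift elements underneath the loop.
func (b *builder) rangeCopied(f func(string)) {
	snapshot := append([]string(nil), b.names...)
	for _, n := range snapshot {
		f(n)
	}
}

// visit deletes "a" while ranging and returns the names seen, concatenated.
func visit(useCopy bool) string {
	b := &builder{names: []string{"a", "b", "c", "d"}}
	seen := ""
	f := func(n string) {
		seen += n
		if n == "a" {
			b.del("a")
		}
	}
	if useCopy {
		b.rangeCopied(f)
	} else {
		b.rangeShared(f)
	}
	return seen
}

func main() {
	fmt.Println(visit(false)) // acdd: "b" is skipped and "d" is seen twice
	fmt.Println(visit(true))  // abcd: every name is seen exactly once
}
```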
Bryan Boreham 3c4ab7a069 labels: add test for Builder.Range
Including mutating the Builder being Ranged over.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-03-16 13:25:55 +00:00
Ganesh Vernekar 7e74f73733
Merge remote-tracking branch 'upstream/main' into sync-prom
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2023-03-13 12:38:59 +05:30
Julien Pivotto 5583c77b3a
Merge pull request #12095 from damnever/unnecessary-sort
Remove unnecessary sort
2023-03-09 13:12:02 +01:00
Marco Pracucci fa57574183
Merge pull request #449 from grafana/yuri/merge-upstream
Merging remotes/prometheus/main into origin/main
2023-03-09 11:17:40 +01:00
Marco Pracucci 242e82b8e6
Optimize regex star operation (#448)
* Optimize .* regex matcher

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Consistent benchmark runs for BenchmarkFastRegexMatcher

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed TestParseExpressions

Signed-off-by: Marco Pracucci <marco@pracucci.com>

---------

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2023-03-09 09:38:41 +01:00
Yuri Nikolic c7d730f549 Fixing conflicts with commit c9b85afd93 2023-03-08 17:27:44 +01:00
Yuri Nikolic 88d9726b20 Fixing conflicts with commit 666f61a4d5 2023-03-08 16:54:40 +01:00
Xiaochao Dong (@damnever) 36fc1158b5 Remove unnecessary sort
Signed-off-by: Xiaochao Dong (@damnever) <the.xcdong@gmail.com>
2023-03-08 15:36:02 +08:00
Bryan Boreham d740abf0c6 model/labels: add Get and Range to Builder
This lets relabelling work on a `Builder` rather than converting to and
from `Labels` on every rule.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-03-07 17:20:54 +00:00
Bryan Boreham ff993b279a
Merge pull request #12073 from bboreham/slices-sort2
labels: use slices.Sort for better performance
2023-03-07 09:31:50 +00:00
Bryan Boreham 38c6d3da9f labels: use slices.Sort for better performance
The difference is modest, but we've used `slices.Sort` in lots of other
places so why not here.

name     old time/op    new time/op    delta
Builder    1.04µs ± 3%    0.95µs ± 3%   -8.27%  (p=0.008 n=5+5)

name     old alloc/op   new alloc/op   delta
Builder      312B ± 0%      288B ± 0%   -7.69%  (p=0.008 n=5+5)

name     old allocs/op  new allocs/op  delta
Builder      2.00 ± 0%      1.00 ± 0%  -50.00%  (p=0.008 n=5+5)

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-03-06 18:22:49 +00:00
Bryan Boreham a07a0be024 Add benchmark for labels.Builder
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-03-06 18:21:58 +00:00
Bryan Boreham 30297f0d9b stringlabels: size buffer for added labels
This makes the buffer the correct size for the common case that labels
have only been added. It will be too large for the case that labels are
changed, but the current buffer resize logic in `appendLabelTo` doubles
the buffer, so a small over-estimate is better.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-03-06 16:29:21 +00:00
Marco Pracucci 1e7ad0ec11
Optimized very long case insensitive alternations (#444)
* Optimized very long case insensitive alternations

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Run common regexps in BenchmarkFastRegexMatcher

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Modify BenchmarkNewFastRegexMatcher to benchmark the NewFastRegexMatcher() function

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Reduced allocations by optimizeEqualStringMatchers()

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed typo in comments

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed typo in test case name

Signed-off-by: Marco Pracucci <marco@pracucci.com>

---------

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2023-03-02 17:20:52 +01:00
Marco Pracucci 383ea59ce1
Add TestAnalyzeRealQueries (#443)
* Add TestAnalyzeRealQueries

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Add nolint directive

Signed-off-by: Marco Pracucci <marco@pracucci.com>

---------

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2023-03-01 15:50:04 +01:00
Marco Pracucci eeecfee885
Do not optimize regexps with begin/end text anchors inside (#433)
* Do not optimize regexps with begin/end text anchors inside

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Explicit case for begin/end text in stringMatcherFromRegexpInternal()

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added more test cases

Signed-off-by: Marco Pracucci <marco@pracucci.com>

---------

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2023-03-01 14:50:26 +01:00
Marco Pracucci 2e0ecc013f
Fix containsStringMatcher() when the text contains multiple occurrences of a substring (#431)
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2023-03-01 11:18:30 +00:00
Marco Pracucci c77900d58e
Optimized FastRegexMatcher when the regex contains a case insensitive alternation made with concats too (#430)
* Optimized FastRegexMatcher when the regex contains a case insensitive alternation made with concats too

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Do not use a pointer to hold whether the matches are case sensitive

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Improved unit tests based on review feedback

Signed-off-by: Marco Pracucci <marco@pracucci.com>

---------

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2023-03-01 10:49:25 +01:00
Bryan Boreham 35026fb26d
Merge pull request #11746 from prometheus/remove-microbenchmarks
These benchmarks were testing things related to what Prometheus does, but not testing actual Prometheus code. 
Moved the label-copying benchmark into the labels package.
2023-02-23 12:33:24 +01:00
Bryan Boreham f03b8d0968 Add benchmark copying labels
Taken from previous tsdb/test/BenchmarkLabelsClone.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-02-22 16:36:45 +00:00
Bryan Boreham 6136ae67e0 labels: shrink by making internals a single string
This commit adds an alternate implementation for `labels.Labels`, behind
a build tag `stringlabels`.

Instead of storing label names and values as individual strings, they
are all concatenated into one string in this format:

    [len][name0][len][value0][len][name1][len][value1]...

The lengths are varint-encoded, so each one usually takes a single byte.

The previous `[]string` had 24 bytes of overhead for the slice and 16
for each label name and value; this one has 16 bytes overhead plus 1
for each name and value.

In `ScratchBuilder.Overwrite` and `Labels.Hash` we use an unsafe
conversion from string to byte slice. `Overwrite` is explicitly unsafe,
but for `Hash` this is a pure performance hack.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-02-22 15:34:23 +00:00
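
The single-string layout described in commit 6136ae67e0 can be sketched as below. This is an illustration only, not the `stringlabels` implementation: the `label` type and the `encodeLabels`, `decodeString`, and `decodeLabels` functions are hypothetical, and the sketch assumes the unsigned varint encoding from `encoding/binary`.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// label is a hypothetical name/value pair used only for this sketch.
type label struct{ name, value string }

// appendString appends s to buf, prefixed by its length as an unsigned varint.
func appendString(buf []byte, s string) []byte {
	var tmp [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(tmp[:], uint64(len(s)))
	buf = append(buf, tmp[:n]...)
	return append(buf, s...)
}

// encodeLabels concatenates all names and values into one string:
// [len][name0][len][value0][len][name1][len][value1]...
func encodeLabels(ls []label) string {
	var buf []byte
	for _, l := range ls {
		buf = appendString(buf, l.name)
		buf = appendString(buf, l.value)
	}
	return string(buf)
}

// decodeString reads one length-prefixed string starting at offset i and
// returns it together with the offset of the next field.
func decodeString(data string, i int) (string, int) {
	var size uint64
	var shift uint
	for {
		b := data[i]
		i++
		size |= uint64(b&0x7f) << shift
		if b < 0x80 {
			break
		}
		shift += 7
	}
	return data[i : i+int(size)], i + int(size)
}

// decodeLabels walks the encoded string and rebuilds the pairs.
func decodeLabels(data string) []label {
	var out []label
	for i := 0; i < len(data); {
		var l label
		l.name, i = decodeString(data, i)
		l.value, i = decodeString(data, i)
		out = append(out, l)
	}
	return out
}

func main() {
	enc := encodeLabels([]label{{"__name__", "up"}, {"job", "api"}})
	fmt.Println(len(enc)) // 20: 16 bytes of text plus 4 one-byte lengths
	fmt.Println(decodeLabels(enc))
}
```

Every string shorter than 128 bytes gets a one-byte length prefix, which is where the "plus 1 for each name and value" figure in the commit message comes from.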
Marco Pracucci 950c177c72
Hardcode the labels stable hash function instead of taking it as an option
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2023-01-30 14:21:18 +01:00
Oleg Zaytsev 2512c019d3 Expose Matcher.Prefix()
Sometimes label matchers know that they match values with a specific
prefix. This information can be valuable in some downstream storage
implementations.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2023-01-19 17:21:28 +01:00
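
One way a storage layer might use a known prefix, as commit 2512c019d3 suggests, is to narrow a sorted list of label values before running the full matcher. The sketch below is hypothetical: the log does not show the signature of `Matcher.Prefix()`, so the prefix is passed in as a plain string, and `candidates` is an invented helper.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// candidates returns the contiguous run of values in a sorted slice that
// start with prefix. Binary search finds the first possible match, so values
// sorting before the prefix are never inspected.
func candidates(sorted []string, prefix string) []string {
	start := sort.SearchStrings(sorted, prefix)
	end := start
	for end < len(sorted) && strings.HasPrefix(sorted[end], prefix) {
		end++
	}
	return sorted[start:end]
}

func main() {
	values := []string{"dev-a", "prod-a", "prod-b", "staging"}
	fmt.Println(candidates(values, "prod-")) // [prod-a prod-b]
}
```

Only the returned candidates would then need to be checked against the full matcher.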
György Krajcsovits 103c4fd289 Merge remote-tracking branch 'upstream/main' into main
# Conflicts:
#	.github/workflows/ci.yml
#	tsdb/block.go
#	tsdb/compact.go
#	tsdb/compact_test.go
#	tsdb/head_read.go
#	tsdb/index/index.go
#	tsdb/ooo_head_read.go
#	tsdb/querier_test.go
2023-01-08 14:55:44 +01:00
Bryan Boreham 10b27dfb84 Simplify IndexReader.Series interface
Instead of passing in a `ScratchBuilder` and `Labels`, just pass the
builder and the caller can extract labels from it. In many cases the
caller didn't use the Labels value anyway.

Now in `labels.ScratchBuilder` we need a slightly different API: one
to assign what will be the result, instead of overwriting some other
`Labels`. This is safer and easier to reason about.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-12-19 15:22:09 +00:00
Bryan Boreham b10fd9aea3 model/labels: add a basic test for ScratchBuilder
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-12-19 15:22:09 +00:00
Bryan Boreham cbf432d2ac Update package labels tests for new labels.Labels type
Re-did the FromStrings test to avoid assumptions about how it works.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-12-19 15:22:09 +00:00
Bryan Boreham 617bee60f1 labels: use ScratchBuilder in ReadLabels
Instead of relying on being able to append to it like a slice.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-12-19 15:22:09 +00:00
Bryan Boreham 2b8b8d9ac7 labels: new methods to work without access to internals
Without changing the definition of `labels.Labels`, add methods which
enable code using it to work without knowledge of the internals.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-12-19 15:22:09 +00:00
Bryan Boreham ea7345a09c labels: improve comment on Builder.Set
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-12-19 15:22:09 +00:00
Bryan Boreham a19b369f9e labels: avoid lint warning on New()
This code is a bit cleaner.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-12-19 15:22:09 +00:00
Julien Pivotto bb323db613
Merge pull request #11074 from damnever/fix/datamodelvalidation
Validate the metric name and label names
2022-12-08 14:31:12 +01:00
Xiaochao Dong (@damnever) 9979024a30 Report error if the series contains invalid metric names or labels during scrape
Signed-off-by: Xiaochao Dong (@damnever) <the.xcdong@gmail.com>
2022-12-08 20:01:20 +08:00
Bryan Boreham 8d4140a06e labels: note that Hash may change
For performance reasons we may use a different implementation of Hash()
in future, so note this so callers can be warned.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-11-28 16:17:32 +00:00
George Krajcsovits 71fee62838
Comment fixes (#364)
* matcher.go: restore comment from upstream

There's no reason to remove this comment.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* head_append.go: Cortex to Mimir

This repo is used in Mimir.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2022-11-23 12:26:20 +00:00
Ganesh Vernekar 83d9ee3ab7
Merge remote-tracking branch 'upstream/main' into sync-prom
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2022-10-05 20:33:36 +05:30
Bryan Boreham 5421c778ba labels: in tests use labels.FromStrings
Replacing code which assumes the internal structure of `Labels`.

Add a convenience function `EmptyLabels()` which is more efficient than
calling `New()`.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-09-09 13:34:49 +02:00
Cosrider bef6556ca5
delete redundant alias (#11180)
Signed-off-by: Cosrider <cosrider7@gmail.com>

Signed-off-by: Cosrider <cosrider7@gmail.com>
2022-08-31 15:50:38 +02:00
Bryan Boreham 8b863c42dd
Optimise relabeling by re-using memory (#11147)
* model/relabel: Add benchmark

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* model/relabel: re-use Builder across relabels

Saves memory allocations.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* labels.Builder: allow re-use of result slice

This reduces memory allocations where the caller has a suitable slice available.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* model/relabel: re-use source values slice

To reduce memory allocations.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Unwind one change causing test failures

Restore original behaviour in PopulateLabels, where we must not overwrite the input set.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* relabel: simplify values optimisation

Use a stack-based array for up to 16 source labels, which will be the
vast majority of cases.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* lint

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-08-19 15:27:52 +05:30
Bryan Boreham a7f19b5775 labels: add a test for JSON and YAML marshalling
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-08-03 11:18:02 +02:00
Bryan Boreham 10699c37a3 labels: test BytesWithoutLabels does not remove __name__ by default
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-08-03 11:18:02 +02:00
Bryan Boreham d46ef0aa8e labels: tweak BenchmarkLabels_Get()
So the benchmark works without requiring `Labels` internals to be a slice.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-08-03 11:18:02 +02:00
Bryan Boreham 24ebff9c4a labels: don't test that Hash() works on unordered labels
Labels are required to be ordered.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-08-03 11:18:02 +02:00