Commit graph

125 commits

Author SHA1 Message Date
Patrick Oyarzun 84841a4c4e
Fix regexp set matches for literal matchers
I was surprised to find out that posting lookups for `foo=~"(bar|bar)"`
are faster than `foo=~"bar"`. It turns out we introduced a performance
regression in https://github.com/grafana/mimir-prometheus/pull/463.

When we added the `optimizeAlternatingLiterals` function, we subtly
broke one edge case. A regexp matcher which matches a single literal,
like `foo=~"bar"` used to return `bar` from `SetMatches()`, but
currently does not. The implication is that the tsdb will first do a
LabelValues call to get all values for `foo`, then match them against
the regexp `bar`. This PR restores the previous behavior which is able
to directly lookup postings for `foo="bar"` instead.
2024-01-02 13:42:01 -06:00
Marco Pracucci b894301c3e
Fix FastRegexMatcher to skip nested capture groups
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2023-12-20 13:00:59 +01:00
Marco Pracucci 0a92839843
Use slices.Clone()
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2023-12-07 11:50:08 +01:00
Marco Pracucci 986ca1a16a
Ensure manipulating the returned FastRegexMatcher.SetMatches() doesn't pollute the matcher's internal state
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2023-12-07 11:34:45 +01:00
Arve Knudsen f047647bb0 Merge remote-tracking branch 'prometheus/main' into arve/sync-upstream
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2023-11-24 14:36:13 +01:00
Bryan Boreham a3e02f35d6 labels: extract common code between slice and stringlabels
This reduces bulk and should avoid issues if a fix is made in one file
and not the other.

A few methods now call `Range()` instead of `range`, but nothing
performance-sensitive.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-11-17 18:20:03 +00:00
Bryan Boreham 1bfb3ed062
Labels: reduce allocations when creating from TSDB WAL (#13044)
* Labels: reduce allocations when creating from TSDB

When reading the WAL, by passing references into the buffer we can avoid
copying strings under `-tags stringlabels`.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-11-14 11:36:35 +00:00
György Krajcsovits 1149f7e9e1 Fix lint errors: dot at comment end
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2023-11-02 21:09:38 +01:00
György Krajcsovits d42e296516 Merge remote-tracking branch 'upstream/main' into krajo/merge-upstream 2023-11-02 20:45:05 +01:00
Dimitar Dimitrov 601a2f91ed
Attempt to remove unused unused linter directives 2023-11-01 14:19:50 +01:00
Oleksandr Redko fa90ca46e5 ci(lint): enable godot; append dot at the end of comments
Signed-off-by: Oleksandr Redko <Oleksandr_Redko@epam.com>
2023-10-31 19:53:38 +02:00
Arve Knudsen 35ab75918a Merge remote-tracking branch 'prometheus/main' into arve/upgrade-exp
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2023-10-06 16:11:40 +02:00
Oleg Zaytsev cd91345b76
Fix BenchmarkOptimizeEqualStringMatchers
The logic has changed, and we create a slice-based matcher for smaller
number of alternations. This fixes the benchmark.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2023-10-02 12:00:18 +02:00
Marco Pracucci 5d7eb6d8f9
Make linter happy
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2023-09-29 15:33:38 +02:00
Marco Pracucci 0df8e0fac1
Clarify ordering
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2023-09-29 15:33:38 +02:00
Marco Pracucci 79640b3eaf
Optimize .? in the FastRegexMatcher
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2023-09-29 15:33:36 +02:00
Marco Pracucci 3ed2ca3c4c
Address offline feedback
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2023-09-29 15:14:37 +02:00
Marco Pracucci 5ae9c194ce
Fix
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2023-09-29 14:17:17 +02:00
Marco Pracucci c28d940234
Improved regexp matcher in TestAnalyzeRealQueries
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2023-09-29 12:48:55 +02:00
Marco Pracucci 20c56ae9e0
Revert change
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2023-09-29 12:26:00 +02:00
Marco Pracucci fe9124ffa3
Optimize regexp matcher with list of alternations ending with .*
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2023-09-29 12:21:30 +02:00
Goutham Veeramachaneni 86729d4d7b
Update exp package (#12650) 2023-09-21 22:53:51 +02:00
Dimitar Dimitrov 77ac7ad40a
Merge remote-tracking branch 'upstream/main' into dimitar/pull-upstream 2023-09-05 16:19:00 +02:00
Bryan Boreham d6e1b1acdb
Merge pull request #12681 from prometheus/labels-unused-code
labels: remove some unused code
2023-08-14 15:48:17 +01:00
Bryan Boreham ce260b1fe1 labels: remove some unused code
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-08-13 15:14:53 +01:00
Bryan Boreham b5c6807fea Labels.Has quick check on first character
Exit early if we've gone past - labels are sorted in order.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-08-13 15:11:22 +01:00
Bryan Boreham 33aab1b2cc labels: extend benchmark for Has()
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-08-13 14:55:50 +01:00
Bryan Boreham b72f01414a lint
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-07-04 15:01:02 +00:00
Oleg Zaytsev 65d38b6771 mv labels_string.go labels_stringlabels.go (#12328)
This is a minor cosmetical change, but my IDE (and I guess many of them)
nests `labels_string.go` under `labels.go` because it assumes it's the
file generated by the `stringer` tool, which follows that naming
pattern.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2023-07-04 15:01:01 +00:00
Michael Hoffmann 7f9fe4cb5f feat: dont compile regex matcher if we know its a literal (#12434)
labels: dont compile regex matcher if we know its a literal

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

Co-authored-by: Sharad <sharadgaur@gmail.com>
2023-07-04 15:00:51 +00:00
Bryan Boreham 1e16c60bae labels: improve Get method for stringlabels build (#12485)
Inline one call to `decodeString`, and skip decoding the value string
until we find a match for the name.
Do a quick check on the first character in each string,
and exit early if we've gone past - labels are sorted in order.

Also improve tests and benchmark:
* labels: test Get with varying lengths - it's not typical for Prometheus labels to all be the same length.
* extend benchmark with label not found

---------

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-07-03 20:34:35 +01:00
Bryan Boreham 19c9db3c07 labels: faster Compare function when using -tags stringlabels (#12451)
Instead of unpacking every individual string, we skip to the point
where there is a difference, going 8 bytes at a time where possible.

Add benchmark for Compare; extend tests too.

---------

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com>
2023-07-03 20:34:29 +01:00
Bryan Boreham e1115ae58d
labels: improve Get method for stringlabels build (#12485)
Inline one call to `decodeString`, and skip decoding the value string
until we find a match for the name.
Do a quick check on the first character in each string,
and exit early if we've gone past - labels are sorted in order.

Also improve tests and benchmark:
* labels: test Get with varying lengths - it's not typical for Prometheus labels to all be the same length.
* extend benchmark with label not found

---------

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-06-26 18:35:22 +01:00
Bryan Boreham 87d08abe11
labels: faster Compare function when using -tags stringlabels (#12451)
Instead of unpacking every individual string, we skip to the point
where there is a difference, going 8 bytes at a time where possible.

Add benchmark for Compare; extend tests too.

---------

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com>
2023-06-20 20:58:47 +01:00
Oleg Zaytsev 6a18962cfa
mv labels_string.go labels_stringlabels.go (#12328)
This is a minor cosmetical change, but my IDE (and I guess many of them)
nests `labels_string.go` under `labels.go` because it assumes it's the
file generated by the `stringer` tool, which follows that naming
pattern.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2023-06-13 09:38:00 +01:00
Michael Hoffmann 344c8ff97c
feat: dont compile regex matcher if we know its a literal (#12434)
labels: dont compile regex matcher if we know its a literal

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

Co-authored-by: Sharad <sharadgaur@gmail.com>
2023-06-07 21:54:30 +01:00
Jeanette Tan 1be0816b46 Merge remote-tracking branch 'upstream/main' 2023-05-23 00:20:36 +08:00
Bryan Boreham a073e04a9b
Merge pull request #12366 from prometheus/release-2.44
Merge release 2.44 back to main
2023-05-16 18:06:29 +01:00
Oleg Zaytsev 1f14d27d40
mv sharding_string.go sharding_stringlabels.go
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2023-05-05 12:53:56 +02:00
Oleg Zaytsev 864e582a14
Add TestStableHash
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2023-05-04 16:54:39 +02:00
Bryan Boreham 7a48a266b6
labels: respect Set after Del in Builder (#12322)
* labels: respect Set after Del in Builder

The implementations are not symmetric between `Set()` and `Del()`, so
we must be careful. Add tests for this, both in labels and in relabel
where the issue was reported.

Also make the slice implementation consistent re `slices.Contains`.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-05-03 11:59:27 +01:00
Bryan Boreham 2a7a27f91a labels: add stringlabels version of StableHash
Inlined the `Range()` implementation to avoid adding overhead.
2023-05-02 15:51:17 +01:00
György Krajcsovits 65b8edbed4 Merge remote-tracking branch 'upstream/main' into sync-upstream-28-apr-2023 2023-04-28 18:04:02 +02:00
Jeanette Tan a064ec6fc1 Fix lint 2023-04-26 22:22:31 +08:00
Jeanette Tan 0fccba0db9 Merge remote-tracking branch 'upstream/main' 2023-04-26 21:25:21 +08:00
cui fliter 276ca6a883 fix some comments
Signed-off-by: cui fliter <imcusg@gmail.com>
2023-04-25 14:19:16 +08:00
beorn7 5b53aa1108 style: Replace else if cascades with switch
Wiser coders than myself have come to the conclusion that a `switch`
statement is almost always superior to a statement that includes any
`else if`.

The exceptions that I have found in our codebase are just these two:

* The `if else` is followed by an additional statement before the next
  condition (separated by a `;`).
* The whole thing is within a `for` loop and `break` statements are
  used. In this case, using `switch` would require tagging the `for`
  loop, which probably tips the balance.

Why are `switch` statements more readable?

For one, fewer curly braces. But more importantly, the conditions all
have the same alignment, so the whole thing follows the natural flow
of going down a list of conditions. With `else if`, in contrast, all
conditions but the first are "hidden" behind `} else if `, harder to
spot and (for no good reason) presented differently from the first
condition.

I'm sure the aforemention wise coders can list even more reasons.

In any case, I like it so much that I have found myself recommending
it in code reviews. I would like to make it a habit in our code base,
without making it a hard requirement that we would test on the CI. But
for that, there has to be a role model, so this commit eliminates all
`if else` occurrences, unless it is autogenerated code or fits one of
the exceptions above.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-04-19 17:22:31 +02:00
Đurica Yuri Nikolić d162bb51b6
Merge pull request #485 from grafana/yuri/bring-prometheus-upstream
Synch with Prometheus up to 2023-04-18 (b028112)
2023-04-18 15:07:26 +02:00
George Krajcsovits 21131bf6ad
Merge pull request #480 from grafana/sync-upstream-14-apr-2023
Sync with Prometheus up to 2023-04-14 (7309ac27)
2023-04-18 10:21:17 +02:00
Dhanu Saputra ad4dc380d5
track-number-of-optimized-regexp-label-matchers - make isRegexOptimized (#481)
public

Signed-off-by: dhanu <andreasdhanu@gmail.com>
2023-04-18 07:24:41 +00:00