Jorge Creixell
e9e3d64b7c
PromQL engine: Delay deletion of __name__ label to the end of the query evaluation ( #14477 )
...
PromQL engine: Delay deletion of __name__ label to the end of the query evaluation
- This change allows optionally preserving the `__name__` label via the `label_replace` and `label_join` functions, and helps prevent the dreaded "vector cannot contain metrics with the same labelset" error.
- The implementation extends the `Series` and `Sample` structs with a boolean flag indicating whether the `__name__` label should be deleted at the end of the query evaluation.
- The `label_replace` and `label_join` functions can still access the value of the `__name__` label, even if it has been previously marked for deletion. If `__name__` is used as target label, it won't be dropped at the end of the query evaluation.
- Fixes https://github.com/prometheus/prometheus/issues/11397
- See https://github.com/jcreixell/prometheus/pull/2 for previous discussion, including the decision to create this PR and benchmark it before considering other alternatives (like refactoring `labels.Labels`).
- See https://github.com/jcreixell/prometheus/pull/1 for an alternative implementation using a special label instead of boolean flags.
- Note: a feature flag `promql-delayed-name-removal` has been added as it changes the behavior of some "weird" queries (see https://github.com/prometheus/prometheus/issues/11397#issuecomment-1451998792 )
Example (this always fails, as `__name__` is being dropped by `count_over_time`):
```
count_over_time({__name__!=""}[1m])
=> Error executing query: vector cannot contain metrics with the same labelset
```
Before:
```
label_replace(count_over_time({__name__!=""}[1m]), "__name__", "count_$1", "__name__", "(.+)")
=> Error executing query: vector cannot contain metrics with the same labelset
```
After:
```
label_replace(count_over_time({__name__!=""}[1m]), "__name__", "count_$1", "__name__", "(.+)")
=>
count_go_gc_cycles_automatic_gc_cycles_total{instance="localhost:9090", job="prometheus"} 1
count_go_gc_cycles_forced_gc_cycles_total{instance="localhost:9090", job="prometheus"} 1
...
```
Signed-off-by: Jorge Creixell <jcreixell@gmail.com>
---------
Signed-off-by: Jorge Creixell <jcreixell@gmail.com>
Signed-off-by: Björn Rabenstein <github@rabenste.in>
2024-08-29 15:50:39 +02:00
Björn Rabenstein
849215d90c
Merge pull request #14585 from fatsheep9146/covert-TestNativeHistogram_Sum_Count_Add_AvgOperator-to-framework
...
convert TestNativeHistogram_Sum_Count_Add_AvgOperator into testing framework
2024-08-28 17:21:32 +02:00
Björn Rabenstein
7fad1ec8ee
Merge pull request #14655 from suntala/suntala/sort-by-label-enhancement
...
promql: Fall back to full label sets when sorting by label
2024-08-21 12:28:55 +02:00
Ziqi Zhao
8f828d45c1
convert TestNativeHistogram_Sum_Count_Add_AvgOperator into testing framework
...
Signed-off-by: Ziqi Zhao <zhaoziqi9146@gmail.com>
2024-08-21 09:24:50 +08:00
Charles Korn
52818a97e2
Merge branch 'main' into sum-and-avg-over-mixed-custom-exponential-histograms
...
# Conflicts:
# promql/promqltest/testdata/native_histograms.test
2024-08-14 07:52:08 +10:00
György Krajcsovits
386fc8b9f6
Update from review comments.
...
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-08-13 15:26:07 +02:00
György Krajcsovits
6aee5b4b38
fix typo
...
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-08-12 12:04:45 +02:00
György Krajcsovits
06a8886b94
Native histograms: define behavior when rate is null.
...
Histogram quantile returns NaN in this case, which might be
surprising, so add a unit test that clarifies that this is
intentional.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-08-12 10:40:21 +02:00
suntala
fd2f44af7f
Fall back to comparing by label set when sorting by label desc
...
Co-authored-by: Aleks Fazlieva <britishrum@users.noreply.github.com>
Signed-off-by: suntala <arati.rana@grafana.com>
2024-08-11 21:44:03 +02:00
suntala
94ad489328
Fall back to comparing by label set when sorting by label
...
Co-authored-by: Aleks Fazlieva <britishrum@users.noreply.github.com>
Signed-off-by: suntala <arati.rana@grafana.com>
2024-08-11 21:44:03 +02:00
Charles Korn
f992f81bd0
Merge branch 'main' into sum-and-avg-over-mixed-custom-exponential-histograms
...
Signed-off-by: Charles Korn <charleskorn@users.noreply.github.com>
2024-08-09 13:58:54 +10:00
Charles Korn
5cfdde327c
Address PR feedback: add extra test case
...
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-08-09 13:57:37 +10:00
Charles Korn
82bb35fabb
Address PR feedback: fix typo and rename variable
...
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-08-09 13:51:31 +10:00
Charles Korn
f07b3ae67b
Fix issue where avg
over mixed exponential and custom buckets, or incompatible custom buckets, produces incorrect results or panics
...
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-08-07 15:32:35 +10:00
Charles Korn
5ee94f49a2
Fix issue where sum
over mixed exponential and custom buckets, or incompatible custom buckets, produces incorrect results
...
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-08-07 15:30:01 +10:00
Charles Korn
424cefcf5e
Fix "cannot reduce resolution to custom buckets schema" panic in rate
over native histograms with mix of custom and exponential buckets
...
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-08-07 14:45:02 +10:00
Björn Rabenstein
ee5bba07c0
Merge pull request #14413 from prometheus/beorn7/promql
...
CI / Go tests (push) Waiting to run
CI / More Go tests (push) Waiting to run
CI / Go tests with previous Go version (push) Waiting to run
CI / UI tests (push) Waiting to run
CI / Go tests on Windows (push) Waiting to run
CI / Mixins tests (push) Waiting to run
CI / Build Prometheus for common architectures (0) (push) Waiting to run
CI / Build Prometheus for common architectures (1) (push) Waiting to run
CI / Build Prometheus for common architectures (2) (push) Waiting to run
CI / Build Prometheus for all architectures (0) (push) Waiting to run
CI / Build Prometheus for all architectures (1) (push) Waiting to run
CI / Build Prometheus for all architectures (10) (push) Waiting to run
CI / Build Prometheus for all architectures (11) (push) Waiting to run
CI / Build Prometheus for all architectures (2) (push) Waiting to run
CI / Build Prometheus for all architectures (3) (push) Waiting to run
CI / Build Prometheus for all architectures (4) (push) Waiting to run
CI / Build Prometheus for all architectures (5) (push) Waiting to run
CI / Build Prometheus for all architectures (6) (push) Waiting to run
CI / Build Prometheus for all architectures (7) (push) Waiting to run
CI / Build Prometheus for all architectures (8) (push) Waiting to run
CI / Build Prometheus for all architectures (9) (push) Waiting to run
CI / Report status of build Prometheus for all architectures (push) Blocked by required conditions
CI / Check generated parser (push) Waiting to run
CI / golangci-lint (push) Waiting to run
CI / fuzzing (push) Waiting to run
CI / codeql (push) Waiting to run
CI / Publish main branch artifacts (push) Blocked by required conditions
CI / Publish release artefacts (push) Blocked by required conditions
CI / Publish UI on npm Registry (push) Blocked by required conditions
Scorecards supply-chain security / Scorecards analysis (push) Waiting to run
promql: more Kahan summation (avg) and less incremental mean calculation (avg, avg_over_time)
2024-08-06 19:56:32 +02:00
Charles Korn
aadec25faf
promql: Fix issue where some native histogram-related annotations are not emitted by rate
( #14575 )
...
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-08-06 09:10:40 +01:00
beorn7
c39776c5b5
promql: Add NHCB tests
...
This adds equivalent NHCB tests to the existing classic histogram
tests.
Signed-off-by: beorn7 <beorn@grafana.com>
2024-07-16 12:20:43 +02:00
beorn7
cff0429b1a
promql: make avg_over_time faster and more precise
...
Same idea as for the avg aggregator before: Most of the time, there is
no overflow, so we don't have to revert to the more expensive and less
precise incremental calculation of the mean value.
Signed-off-by: beorn7 <beorn@grafana.com>
2024-07-10 19:20:24 +02:00
beorn7
c46074f4dd
promql: make avg aggregation more precise and less expensive
...
The basic idea here is that the previous code was always doing
incremental calculation of the mean value, which is more costly and
can be less precise. It protects against overflows, but in most cases,
an overflow doesn't happen anyway.
The other idea applied here is to expand on #14074 , where Kahan
summation was applied to sum().
With this commit, the average is calculated in a conventional way
(adding everything up and divide in the end) as long as the sum isn't
overflowing float64. This is combined with Kahan summation so that the
avg aggregation, in most cases, is really equivalent to the sum
aggregation with a following division (which is the user's expectation
as avg is supposed to be syntactic sugar for sum with a following
divison).
If the sum hits ±Inf, the calculation reverts to incremental
calculation of the mean value. Kahan summation is also applied here,
although it cannot fully compensate for the numerical errors
introduced by the incremental mean calculation. (The tests added in
this commit would fail if incremental mean calculation was always
used.)
Signed-off-by: beorn7 <beorn@grafana.com>
2024-07-10 19:20:24 +02:00
darshanime
bd4ea118e9
Allow durations for number rule
...
Signed-off-by: darshanime <deathbullet@gmail.com>
2024-07-10 15:50:22 +02:00
darshanime
cfad8ff3b2
Deprecate duration token
...
Signed-off-by: darshanime <deathbullet@gmail.com>
2024-07-10 15:49:58 +02:00
darshanime
8c8860d2d6
Allow number literals as duration
...
Signed-off-by: darshanime <deathbullet@gmail.com>
2024-07-10 15:49:30 +02:00
Filip Petkovski
acb6c1ae4b
Fix decoding buckets for native histograms in binops
...
The optimizer which detects cases where histogram buckets can be skipped
does not take into account binary expressions. This can lead to buckets
not being decoded if a metric is used with both histogram_fraction/quantile and
histogram_sum/count in the same expression.
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2024-07-10 11:55:29 +02:00
JuanJo Ciarlante
c94c5b64c3
feat: add limitk() and limit_ratio() operators ( #12503 )
...
* rebase 2024-07-01, picks previous renaming to `limitk()` and `limit_ratio()`
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* gofumpt -d -extra
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* more lint fixes
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* more lint fixes+
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* put limitk() and limit_ratio() behind --enable-feature=promql-experimental-functions
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* EnableExperimentalFunctions for TestConcurrentRangeQueries() also
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* use testutil.RequireEqual to fix tests, WIP equivalent thingie for require.Contains
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* lint fix
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* moar linting
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* rebase 2024-06-19
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* re-add limit(2, metric) testing for N=2 common series subset
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* move `ratio = param` to default switch case, for better readability
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* gofumpt -d -extra util/testutil/cmp.go
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* early break when reaching k elems in limitk(), should have always been so (!)
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* small typo fix
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* no-change small break-loop rearrange for readability
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* remove IsNan(ratio) condition in switch-case, already handled as input validation
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* no-change adding some comments
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* no-change simplify fullMatrix() helper functions used for tests
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* add `limitk(-1, metric)` testcase, which is handled as any k < 1 case
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* engine_test.go: no-change create `requireCommonSeries() helper func (moving code into it) for readability
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* rebase 2024-06-21
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* engine_test.go: HAPPY NOW about its code -> reorg, create and use simpleRangeQuery() function, less lines and more readable ftW \o/
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* move limitk(), limit_ratio() testing to promql/promqltest/testdata/limit.test
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* remove stale leftover after moving tests from engine_test.go to testdata/
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* fix flaky `limit_ratio(0.5, ...)` test case
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* Update promql/engine.go
Co-authored-by: Julius Volz <julius.volz@gmail.com>
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* Update promql/engine.go
Co-authored-by: Julius Volz <julius.volz@gmail.com>
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* Update promql/engine.go
Co-authored-by: Julius Volz <julius.volz@gmail.com>
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* fix AddRatioSample() implementation to use a single conditional (instead of switch/case + fallback return)
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* docs/querying/operators.md: document r < 0
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* add negative limit_ratio() example to docs/querying/examples.md
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* move more extensive docu examples to docs/querying/operators.md
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* typo
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* small docu fix for poor-mans-normality-check, add it to limit.test ;)
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* limit.test: expand "Poor man's normality check" to whole eval range
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* restore mistakenly removed existing small comment
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* expand poors-man-normality-check case(s)
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* Revert "expand poors-man-normality-check case(s)"
This reverts commit f69e1603b2ebe69c0a100197cfbcf6f81644b564, indeed too
flaky 0:)
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* remove humor from docs/querying/operators.md
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* fix signoff
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* add web/ui missing changes
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* expand limit_ratio test cases, cross-fingering they'll not be flaky
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* remove flaky test
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* add missing warnings.Merge(ws) in instant-query return shortcut
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* add missing LimitK||LimitRatio case to codemirror-promql/src/parser/parser.ts
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* fix ui-lint
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
* actually fix returned warnings :]
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
---------
Signed-off-by: JuanJo Ciarlante <juanjosec@gmail.com>
Co-authored-by: Julius Volz <julius.volz@gmail.com>
2024-07-03 22:18:57 +02:00
Ganesh Vernekar
3d54bcc018
Merge pull request #14362 from charleskorn/charleskorn/sum-infinity
2024-07-03 01:05:03 -04:00
Charles Korn
fd6bdf5230
Fix issue where summation of +/- infinity returns NaN instead of infinity
...
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-06-28 11:26:54 +10:00
Björn Rabenstein
2e58d46522
Merge pull request #13662 from prometheus/nhcb
...
Native histograms custom buckets storage
2024-06-27 21:44:20 +02:00
Bryan Boreham
b6aba4ff14
Merge pull request #14074 from bboreham/kahan-sum-sum
...
[ENHANCEMENT] PromQL: use Kahan summation for sum()
2024-06-24 11:13:26 +01:00
Jeanette Tan
dda5f48c9e
Merge branch 'main' into nhcb-review-2
2024-06-20 22:50:00 +08:00
Jeanette Tan
fc9dc72028
remove eval_with_nhcb
...
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2024-06-20 22:49:00 +08:00
Jeanette Tan
a6d788b1be
update missed suggested change from code review
...
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2024-06-20 22:49:00 +08:00
Charles Korn
aeec30f082
Convert TestTimestampFunction_StepsMoreOftenThanSamples
...
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-06-17 16:56:56 +10:00
Charles Korn
987fa5c6a2
Convert range query test cases to test scripting language
...
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-06-17 16:43:01 +10:00
Jeanette Tan
14f8dded39
Merge branch 'main' into nhcb
...
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2024-06-07 19:17:14 +08:00
beorn7
c7fdfe8004
promql: Add tests for histogram counter reset only in bucket
...
This also exercises the "fast path" (only decoding count and sum),
i.e. where the counter reset isn't visible at all in the decoded data.
Signed-off-by: beorn7 <beorn@grafana.com>
2024-06-06 17:47:38 +02:00
Jeanette Tan
f028496133
Merge branch 'main' into nhcb
...
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2024-05-14 16:20:15 +08:00
Neeraj Gartia
661856cb65
removes the added tests from engine_test.go
...
Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>
2024-05-13 22:58:25 +05:30
Neeraj Gartia
6119124d0e
some nits
...
Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>
2024-05-13 22:55:38 +05:30
Neeraj Gartia
adf5a36c1e
adds test for sum, count, stddev, stdvar, quantile and fraction func to promql testing framework
...
Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>
2024-05-13 22:55:38 +05:30
Neeraj Gartia
8b838a05d9
adds test for native histogram rate func in promql testing framework
...
Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>
2024-05-13 22:55:38 +05:30
Neeraj Gartia
548bd9d6fb
adds TestNativeHistogramRate func to promql test framework
...
Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>
2024-05-13 22:55:38 +05:30
Bryan Boreham
ea82b49c33
[ENHANCEMENT] PromQL: use Kahan summation for sum()
...
This can give a more precise result, by keeping a separate running
compensation value to accumulate small errors.
See https://en.wikipedia.org/wiki/Kahan_summation_algorithm
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-05-09 14:29:38 +01:00
Bryan Boreham
11b27d5d22
test: move test files into new promqltest package
...
So that promql package does not bring in test-only dependencies.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-05-08 13:42:55 +01:00