Commit graph

999 commits

Author SHA1 Message Date
Bryan Boreham 773170f372
Merge pull request #13822 from dgl/promtool-test-errors
promtool: Avoid using testify for user rule tests
2024-03-23 09:42:34 +01:00
György Krajcsovits a3d1a46eda Merge branch 'main' into nhcb 2024-03-22 14:51:48 +01:00
Jeanette Tan 9d32754bc0 add unit tests with all negative values for histogram_stddev and var
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2024-03-22 03:42:50 +08:00
David Leadbeater 7ec4a11472 promtool: Avoid using testify for user rule tests
Using testify outside of unit tests results in panics rather than a
useful error for the user.

Fixes #13703

Signed-off-by: David Leadbeater <dgl@dgl.cx>
2024-03-21 22:08:10 +11:00
tdakkota f6834c347a
promql: validate label_join destination label
Signed-off-by: tdakkota <tanc13@yandex.ru>
2024-03-20 22:02:10 +03:00
Bartlomiej Plotka 312e3fd728
Merge pull request #13713 from charleskorn/query-engine-interface
rules: allow using alternative PromQL engines for rule evaluation by callers using Prometheus as a lib.
2024-03-13 14:45:42 +01:00
Björn Rabenstein af3618fd35
Merge pull request #13667 from prometheus/beorn7/promql
Improve TestQueryStatistics and fix bugs exposed by it
2024-03-12 16:17:11 +01:00
Charles Korn 26262a1eb7
Remove unnecessary SetQueryLogger method on QueryEngine interface
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-03-12 22:01:01 +11:00
carrychair 856f6e49c8 fix function and struct name
Signed-off-by: carrychair <linghuchong404@gmail.com>
2024-03-09 17:53:17 +08:00
beorn7 7f912db15a promql: Fix limiting of extrapolation to negative values
This is a bit tough to explain, but I'll try:

`rate` & friends have a sophisticated extrapolation algorithm.
Usually, we extrapolate the result to the total interval specified in
the range selector. However, if the first sample within the range is
too far away from the beginning of the interval, or if the last sample
within the range is too far away from the end of the interval, we
assume the series has just started half a sampling interval before the
first sample or after the last sample, respectively, and shorten the
extrapolation interval correspondingly. We calculate the sampling
interval by looking at the average time between samples within the
range, and we define "too far away" as "more than 110% of that
sampling interval".

However, if this algorithm leads to an extrapolated starting value
that is negative, we limit the start of the extrapolation interval to
the point where the extrapolated starting value is zero.

At least that was the intention.

What we actually implemented is the following: If extrapolating all
the way to the beginning of the total interval would lead to an
extrapolated negative value, we would only extrapolate to the zero
point as above, even if the algorithm above would have selected a
starting point that is just half a sampling interval before the first
sample and that starting point would not have an extrapolated negative
value. In other word: What was meant as a _limitation_ of the
extrapolation interval yielded a _longer_ extrapolation interval in
this case.

There is an exception to the case just described: If the increase of
the extrapolation interval is more than 110% of the sampling interval,
we suddenly drop back to only extrapolate to half a sampling interval.

This behavior can be nicely seen in the testcounter_zero_cutoff test,
where the rate goes up all the way to 0.7 and then jumps back to 0.6.

This commit changes the behavior to what was (presumably) intended
from the beginning: The extension of the extrapolation interval is
only limited if actually needed to prevent extrapolation to negative
values, but the "limitation" never leads to _more_ extrapolation
anymore.

The difference is subtle, and probably it never bothered anyone.
However, if you calculate a rate of a classic histograms, the old
behavior might create non-monotonic histograms as a result (because of
the jumps you can see nicely in the old version of the
testcounter_zero_cutoff test). With this fix, that doesn't happen
anymore.

Signed-off-by: beorn7 <beorn@grafana.com>
2024-03-07 01:20:33 +01:00
Charles Korn 4e77e8e5ef
Allow using alternative PromQL engines for rule evaluation
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-03-06 14:54:33 +11:00
beorn7 f48c7a5503 promql: Add histograms to TestQueryStatistics
Also, fix the bugs exposed by the tests.

Signed-off-by: beorn7 <beorn@grafana.com>
2024-02-29 19:02:40 +01:00
beorn7 f46dd34982 promql: Add code comment
Signed-off-by: beorn7 <beorn@grafana.com>
2024-02-29 19:02:40 +01:00
beorn7 7d364c0451 promql: remove redundant line
Signed-off-by: beorn7 <beorn@grafana.com>
2024-02-29 19:02:40 +01:00
Björn Rabenstein d5f0a240fa
Merge pull request #13674 from prometheus/owilliams/dupeok
fix: restore ability to match __name__ multiple times in selector
2024-02-29 17:49:15 +01:00
Björn Rabenstein 9187bcbdd5
Merge pull request #13536 from bboreham/faster-label-replace
promql: faster range-query of label_replace and label_join
2024-02-29 17:03:00 +01:00
Owen Williams e01e7d36e2 fix: restore ability to match __name__ multiple times in selector
Add tests to cover this case.

Signed-off-by: Owen Williams <owen.williams@grafana.com>
2024-02-29 10:46:31 -05:00
machine424 f477e0539a
Move from golang.org/x/exp/slices into slices now that we only support Go >= 1.21
Prevent adding back golang.org/x/exp/slices.

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2024-02-28 14:54:53 +01:00
György Krajcsovits 5d0a0a7542 Add custom buckets to native histogram model (#13592)
* add custom buckets to native histogram model
* simple copy for custom bounds
* return errors for unsupported add/sub operations
* add test cases for string and update appendhistogram in scrape to account for new schema
* check fields which are supposed to be unused but may affect results in equals
* allow appending custom buckets histograms regardless of max schema

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2024-02-28 14:06:43 +01:00
Björn Rabenstein 11932cd345
Merge pull request #13637 from bboreham/agg-warning
PromQL: improve warning for mixed values in aggregations
2024-02-28 13:52:30 +01:00
Owen Williams ac51a8024c tests(utf8): confirm that other quote marks are handled correctly in promql
Signed-off-by: Owen Williams <owen.williams@grafana.com>
2024-02-27 15:47:47 -05:00
Bryan Boreham 0347148628 promql: fuzz test needs symbol table for parser
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-02-26 11:45:25 +00:00
Bryan Boreham 22890b1eb3 PromQL: improve warning for mixed values in aggregations
Aggregations discard the metric name, so don't try to
include it in the error message.

Add a test that generates this warning.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-02-25 19:57:06 +00:00
Owen Williams a28d7865ad UTF-8: Add support for parsing UTF8 metric and label names
This adds support for the new grammar of `{"metric_name", "l1"="val"}` to promql and some of the exposition formats.
This grammar will also be valid for non-UTF-8 names.
UTF-8 names will not be considered valid unless model.NameValidationScheme is changed.

This does not update the go expfmt parser in text_parse.go, which will be addressed by https://github.com/prometheus/common/issues/554/.

Part of https://github.com/prometheus/prometheus/issues/13095

Signed-off-by: Owen Williams <owen.williams@grafana.com>
2024-02-15 14:34:37 -05:00
Bryan Boreham ff6c83269c
Merge pull request #13452 from bboreham/go-cmp
Tests: Use DeepEqual replacement using go-cmp, which is more flexible
2024-02-12 15:40:08 +01:00
Bryan Boreham 17f48f2b3b Tests: use replacement DeepEquals in more places
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-02-08 19:32:33 +00:00
Bryan Boreham 39af788dbd Tests: use replacement DeepEquals using go-cmp
Use DeepEqual replacement using go-cmp, which is more flexible.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-02-08 19:30:20 +00:00
beorn7 86d7618d84 promql: Fix wrongly scoped range vectors
Fixes #11708.

If a range vector is fixen in time with the @ modifier, it gets still
moved around for different steps in a range query. Since no additional
points are retrieved from the TSDB, this leads to steadily emptying
the range, leading to the weird behavior described in isse #11708.

This only happens for functions listed in `AtModifierUnsafeFunctions`,
and the only of those that takes a range vector is `predict_linear`,
which is the reason why we see it only for this particular function.

Signed-off-by: beorn7 <beorn@grafana.com>
2024-02-07 18:07:51 +01:00
beorn7 384ab025e0 promql: Expose issue #11708
The shorter range in the test makes early points drop out of the range
in range queries, exposing the issue.

Signed-off-by: beorn7 <beorn@grafana.com>
2024-02-07 18:06:05 +01:00
Bryan Boreham 98c4889029
Merge pull request #9298 from Creatone/creatone/use-testify
tests: Move from t.Errorf and others.
2024-02-04 16:27:57 +01:00
Bryan Boreham d3c1f0d8e0 promql: can now remove regex field from EvalNodeHelper
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-02-04 11:32:26 +01:00
Bryan Boreham fdd5b85e06 promql: faster range-query of label_replace and label_join
These functions act on the labels only, so don't need to go step by step
over the samples in a range query.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-02-04 11:12:55 +01:00
Faustas Butkus 6feffeb92e
promql: add histogram_avg function (#13467)
Add histogram_avg function

---------

Signed-off-by: Faustas Butkus <faustas.butkus@gmail.com>
Signed-off-by: Björn Rabenstein <github@rabenste.in>
Co-authored-by: Björn Rabenstein <github@rabenste.in>
2024-02-01 18:28:42 +01:00
Alan Protasio c006c57efc
Proposal to improve FPointSlice and HPointSlice allocation. (#13448)
* Reusing points slice from previous series when the slice is under utilized
* Adding comments on the bench test

Signed-off-by: Alan Protasio <alanprot@gmail.com>
2024-02-01 16:22:38 +00:00
Paweł Szulik 1a47c7d59b Refactor lexer tests to use testify.
Signed-off-by: Paweł Szulik <paul.szulik@gmail.com>
2024-02-01 13:51:31 +00:00
Filip Petkovski a577a0a542
Fix last_over_time for native histograms
The last_over_time retains a histogram sample without making a copy.
This sample is now coming from the buffered iterator used for windowing functions,
and can be reused for reading subsequent samples as the iterator progresses.

I would propose copying the sample in the last_over_time function, similar to
how it is done for rate, sum_over_time and others.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2024-01-26 15:02:40 +01:00
Marco Pracucci f639d7794c
Fix TestParseExpressions
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2024-01-25 14:57:43 +01:00
Bryan Boreham 74b73d1e2c
Labels: Add DropMetricName function, used in PromQL (#13446)
This function is called very frequently when executing PromQL functions,
and we can do it much more efficiently inside Labels.

In the common case that `__name__` comes first in the labels, we simply
re-point to start at the next label, which is nearly free.

`DropMetricName` is now so cheap I removed the cache - benchmarks show
everything still goes faster.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-01-25 11:48:49 +01:00
Filip Petkovski 583f3e587c
Optimize histogram iterators (#13340)
Optimize histogram iterators

Histogram iterators allocate new objects in the AtHistogram and
AtFloatHistogram methods, which makes calculating rates over long
ranges expensive.

In #13215 we allowed an existing object to be reused
when converting an integer histogram to a float histogram. This commit follows
the same idea and allows injecting an existing object in the AtHistogram and
AtFloatHistogram methods. When the injected value is nil, iterators allocate
new histograms, otherwise they populate and return the injected object.

The commit also adds a CopyTo method to Histogram and FloatHistogram which
is used in the BufferedIterator to overwrite items in the ring instead of making
new copies.

Note that a specialized HPoint pool is needed for all of this to work 
(`matrixSelectorHPool`).

---------

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
2024-01-23 17:02:14 +01:00
Ben Ye d778591fd3 add more context cancellation check at evaluation time
Signed-off-by: Ben Ye <benye@amazon.com>
2024-01-21 14:19:39 -08:00
Björn Rabenstein bfbb13cf36
Merge pull request #13267 from linasm/simplify-native-histogram-math
promql: simplify Native Histogram arithmetics
2024-01-18 13:50:59 +01:00
Julien Pivotto 4f941bbf69
Merge pull request #13416 from tylitianrui/feat/remove_obsolete_build_tag
remove  obsolete build tag
2024-01-17 18:12:21 +01:00
zenador a3ddfbd1ee
Add warnings for histogramRate applied with isCounter not matching counter/gauge histogram (#13392)
Add warnings for histogramRate applied with isCounter not matching counter/gauge histogram

---------

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2024-01-17 17:06:35 +01:00
tyltr f97fa2736c remove obsolete build tag
Signed-off-by: tyltr <tylitianrui@126.com>
2024-01-17 22:26:32 +08:00
Ivan Babrou a6b35ff304
promql: use natural sort in sort_by_label and sort_by_label_desc (#13411)
These functions are intended for humans, as robots can already sort the results
however they please. Humans like things sorted "naturally":

* https://blog.codinghorror.com/sorting-for-humans-natural-sort-order/

A similar thing has been done to Grafana, which is also used by humans:

* https://github.com/grafana/grafana/pull/78024
* https://github.com/grafana/grafana/pull/78494

Signed-off-by: Ivan Babrou <github@ivan.computer>
2024-01-16 21:34:09 -03:00
zenador 72a8f1084b
Restore more efficient version of NewPossibleNonCounterInfo annotation (#13022)
Restore more efficient version of NewPossibleNonCounterInfo annotation

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>

---------

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2024-01-16 09:54:16 +01:00
Ayoub Mrini ace9c8a3da
promtool: allow setting multiple matchers to "promtool tsdb dump" command. (#13296)
Conditions are ANDed inside the same matcher but matchers are ORed

Including unit tests for "promtool tsdb dump".

Refactor some matchers scraping utils.

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2024-01-15 10:29:53 +00:00
Bryan Boreham 252031c86f Revert "Adding small test update for temp dir using t.TempDir (#13293)"
This reverts commit 2ddb3596ef.

Various tests are failing in CI after this change; reverting to free up
other work.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-12-30 19:17:30 +00:00
Mile Druzijanic 2ddb3596ef
Adding small test update for temp dir using t.TempDir (#13293)
* Adding small test update for temp dir using t.TempDir

Signed-off-by: Mile Druzijanic <miledruz@gmail.com>
Signed-off-by: Mile Druzijanic <zedsprogramms@gmail.com>

* removing not required cleanup

Signed-off-by: Mile Druzijanic <zedsprogramms@gmail.com>

---------

Signed-off-by: Mile Druzijanic <miledruz@gmail.com>
Signed-off-by: Mile Druzijanic <zedsprogramms@gmail.com>
2023-12-28 21:49:57 +01:00
Filip Petkovski 0e1ae1d1ca
Add comment
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2023-12-25 11:41:07 +01:00
Filip Petkovski 35f9620cd1
Expand benchmark
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2023-12-25 11:30:29 +01:00
Filip Petkovski 5df3820c7a
Copy last histogram point
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2023-12-25 11:20:51 +01:00
Filip Petkovski 1f69dcfa6b
Fix reusing float histograms
In https://github.com/prometheus/prometheus/pull/13276 we started reusing float histogram objects to reduce allocations in PromQL.
That PR introduces a bug where histogram pointers gets copied to the beginning of the histograms slice,
but are still kept in the end of the slice. When a new histogram is read into the last element,
it can overwrite a previous element because the pointer is the same.

This commit fixes the issue by moving outdated points to the end of the slice
so that we don't end up with duplicate pointers in the same buffer. In other words,
the slice gets rotated so that old objects can get reused.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2023-12-14 11:53:58 +01:00
Filip Petkovski bb8363dbb3
Add comment on SampleRingIterator
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2023-12-13 08:30:02 +01:00
Filip Petkovski e2a9f8ac0f
Reuse float histogram objects
This commit reduces the memory needed to query native histogram objects
by reusing existing HPoint instances.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2023-12-11 08:24:58 +01:00
Björn Rabenstein db915b07cb
Merge pull request #13215 from fpetkovski/float-histogram-reuse
Enable reusing memory when converting between histogram types
2023-12-09 22:44:46 +01:00
Bartlomiej Plotka 91a383f52c
Merge pull request #13059 from zenador/add-mad-function
Add mad_over_time function
2023-12-08 11:53:22 +00:00
Filip Petkovski 10a82f87fd
Enable reusing memory when converting between histogram types
The 'ToFloat' method on integer histograms currently allocates new memory
each time it is called.

This commit adds an optional *FloatHistogram parameter that can be used
to reuse span and bucket slices. It is up to the caller to make sure the
input float histogram is not used anymore after the call.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2023-12-08 10:22:59 +01:00
Linas Medziunas 7319ad6a0b promql: simplify Native Histogram arithmetics
Signed-off-by: Linas Medziunas <linas.medziunas@gmail.com>
2023-12-08 10:59:00 +02:00
Matthieu MOREL 9c4782f1cc
golangci-lint: enable testifylint linter (#13254)
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2023-12-07 11:35:01 +00:00
Jeanette Tan 2910b48180 Make mad_over_time experimental and move tests
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2023-12-01 01:56:07 +08:00
Jeanette Tan 9bf4cc993e Add mad_over_time function
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2023-12-01 01:22:58 +08:00
Björn Rabenstein 5dbbadf598
Merge pull request #13216 from prometheus/beorn7/doc
Update “conventional histogram” → “classic histogram”
2023-11-30 10:35:27 +01:00
Oleksandr Redko 2a75604f8e
Enable default revive rules (#13068)
Signed-off-by: Oleksandr Redko <Oleksandr_Redko@epam.com>
2023-11-29 17:23:34 +00:00
beorn7 0eb0ca42c5 Update “conventional histogram” → “classic histogram”
Signed-off-by: beorn7 <beorn@grafana.com>
2023-11-29 15:22:58 +01:00
Julien Pivotto c1ec6ae851 sort_by_label: Switch to feature flag
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
2023-11-28 15:10:12 +01:00
Alexander Trost 5051a993ab promql: add sort_by_label and sort_by_label_desc functions
This adds functions to sort a vector by its label value.

Based on https://github.com/prometheus/prometheus/pull/1533

Signed-off-by: Alexander Trost <galexrt@googlemail.com>
2023-11-28 14:40:07 +01:00
zenador ccfe14d7e7
PromQL: ignore small errors for bucketQuantile (#13153)
promql: Improve histogram_quantile calculation for classic buckets

Tiny differences between classic buckets are most likely caused by floating point precision issues. With this commit, relative changes below a certain threshold are ignored. This makes the result of histogram_quantile more meaningful, and also avoids triggering the _input to histogram_quantile needed to be fixed for monotonicity_ annotations in unactionable cases.

This commit also adds explanation of the new adjustment and of the monotonicity annotation to the documentation of `histogram_quantile`.

---------

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2023-11-25 00:05:38 +01:00
Filip Petkovski 35a15e8f04
Add benchmark for native histograms (#13160)
* Add benchmark for native histograms

This commit adds a PromQL benchmark for queries on native histograms.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2023-11-23 14:09:17 +00:00
Julien Pivotto c92fbf3fdf Add feature flag for PromQL experimental functions.
This PR adds an Experimental flag to the functions.

This can be used by https://github.com/prometheus/prometheus/pull/13059
but also xrate and other future functions.

Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
2023-11-14 17:48:58 +01:00
Linas Medziunas 1cd6c1cde5 ValidateHistogram: strict Count check in absence of NaNs
Signed-off-by: Linas Medziunas <linas.medziunas@gmail.com>
2023-11-03 16:17:24 +02:00
Dimitar Dimitrov 9e3df532d8
Export promql.Engine.FindMinMaxTime
This function is useful to analyze promQL queries. We want to use this in Mimir to record the time range which the query touches.

I also chose to remove the `Engine` receiver because it was unnecessary, and it makes it easier to use, but happy to refactor that if you disagree.

The function is untested on its own. If you prefer to have unit tests now that its exported, I can look into adding some.

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
2023-11-02 13:17:35 +01:00
Björn Rabenstein a43669e611
Merge pull request #12928 from alexandear/ci-enable-godot
ci(lint): enable godot; append dot at the end of comments
2023-11-01 17:15:41 +01:00
Julien Pivotto f568221610
Merge pull request #13057 from prometheus/release-2.48
Merge release-2.48 back into main
2023-10-31 15:24:39 -04:00
Oleksandr Redko fa90ca46e5 ci(lint): enable godot; append dot at the end of comments
Signed-off-by: Oleksandr Redko <Oleksandr_Redko@epam.com>
2023-10-31 19:53:38 +02:00
Oleksandr Redko 8e5f0387a2
ci(lint): enable nolintlint and remove redundant comments (#12926)
Signed-off-by: Oleksandr Redko <Oleksandr_Redko@epam.com>
2023-10-31 12:35:13 +01:00
Bryan Boreham 49c5e7afe1 PromQL: reduce garbage in range-query evaluation
The temporary variable was allocated on the heap, and it is unnecessary.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-10-29 19:45:06 +00:00
zenador 80e977aae6
Remove NewPossibleNonCounterInfo and minimise creating empty annotations (#13012)
* Remove NewPossibleNonCounterInfo until it can be made more efficient, and avoid creating empty annotations as much as possible

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2023-10-24 17:36:07 +01:00
Marc Tuduri af7c31ee10
PR feedback
Signed-off-by: Marc Tuduri <marctc@protonmail.com>
2023-10-18 11:53:50 +02:00
Marc Tuduri 8fededf6ad
promql(histograms): Change sample total calculation for histograms
Signed-off-by: Marc Tuduri <marctc@protonmail.com>
2023-10-18 11:51:11 +02:00
Jeanette Tan 9a8bd8eac6 Fix possible non-counter warning for empty names and native histograms
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2023-10-16 15:52:10 +08:00
Julius Volz 191c24a0ed Fix: Exempt "_bucket" suffix from PossibleNonCounterInfo warning (#12982)
Related to PR #12152

Signed-off-by: Julius Volz <julius.volz@gmail.com>
Signed-off-by: Levi Harrison <git@leviharrison.dev>
2023-10-15 13:47:42 -04:00
Jeanette Tan 0cbf0c1c68 Revise according to code review
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2023-10-06 19:09:32 +08:00
Jeanette Tan feaa93da77 Add warning when monotonicity is forced in the input to histogram_quantile
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2023-10-04 18:53:55 +08:00
Alan Protasio a15e884e7a
Prevent Prometheus from overallocating memory on subquery with large amount of steps. (#12734)
* change initial points slice size

Signed-off-by: Alan Protasio <alanprot@gmail.com>

* refactor on the steps calculation and moving the getXPoint/putXPoint method to the evaluator

Signed-off-by: Alan Protasio <alanprot@gmail.com>

* prevent potential panic

Signed-off-by: Alan Protasio <alanprot@gmail.com>

* Update promql/engine.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Alan Protasio <alanprot@gmail.com>

* Update promql/engine.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Alan Protasio <alanprot@gmail.com>

* Update promql/engine.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Alan Protasio <alanprot@gmail.com>

* Update promql/engine.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Alan Protasio <alanprot@gmail.com>

* Allocating slice with maximum size of 5k

Signed-off-by: Alan Protasio <alanprot@gmail.com>

* adding comments

Signed-off-by: Alan Protasio <alanprot@gmail.com>

---------

Signed-off-by: Alan Protasio <alanprot@gmail.com>
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
2023-09-25 20:15:41 +01:00
Goutham Veeramachaneni 86729d4d7b
Update exp package (#12650) 2023-09-21 22:53:51 +02:00
Bryan Boreham 91054875d6
Merge pull request #12732 from bboreham/simplify-rangeeval
promql: simplify inner loop of rangeEval
2023-09-20 20:22:05 +00:00
zenador 69edd8709b
Add warnings (and annotations) to PromQL query results (#12152)
Return annotations (warnings and infos) from PromQL queries

This generalizes the warnings we have already used before (but only for problems with remote read) as "annotations".

Annotations can be warnings or infos (the latter could be false positives). We do not treat them different in the API for now and return them all as "warnings". It would be easy to distinguish them and return infos separately, should that appear useful in the future.

The new annotations are then used to create a lot of warnings or infos during PromQL evaluations. Partially these are things we have wanted for a long time (e.g. inform the user that they have applied `rate` to a metric that doesn't look like a counter), but the new native histograms have created even more needs for those annotations (e.g. if a query tries to aggregate float numbers with histograms).

The annotations added here are not yet complete. A prominent example would be a warning about a range too short for a rate calculation. But such a warnings is more tricky to create with good fidelity and we will tackle it later.

Another TODO is to take annotations into account when evaluating recording rules.

---------

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2023-09-14 18:57:31 +02:00
Arve Knudsen 156222cc50
Add context argument to LabelQuerier.LabelValues (#12665)
Add context argument to LabelQuerier.LabelValues and
LabelQuerier.SortedLabelValues.

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2023-09-14 16:02:04 +02:00
Arve Knudsen a964349e97
Add context argument to LabelQuerier.LabelNames (#12666)
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2023-09-14 10:39:51 +02:00
Arve Knudsen 4451ba10b4
Add context argument to IndexReader.Postings (#12667)
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2023-09-13 17:45:06 +02:00
Arve Knudsen 6daee89e5f
Add context argument to Querier.Select (#12660)
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2023-09-12 12:37:38 +02:00
Julien Pivotto 284ba3426b
Merge pull request #12758 from bboreham/trim-rangequery-benchmarks
PromQL: reduce numbers of benchmarks
2023-09-08 14:06:21 +02:00
Bryan Boreham e4dd3469ac lint
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-08-28 10:39:16 +01:00
Bryan Boreham 5ce990cabc promql: simplify rangeEval a bit more
We can't have both a float and a histogram at the same timestep.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-08-28 10:28:09 +01:00
Bryan Boreham c5671c6d97
Merge pull request #12755 from bboreham/rangequery-benchmark-mmap
promql: force mmap of head chunks in BenchmarkRangeQuery
2023-08-26 15:56:52 +01:00
Bryan Boreham 1ea57a3f8c PromQL: reduce numbers of benchmarks
Make it more likely that contributors will run the benchmark suite.

count_values needs more than 2GB at 1,000 steps, so just run it for 100.

And remove 10-step variant because it doesn't add much to 100 and
1000-step benchmarks.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-08-26 14:12:28 +00:00
Bryan Boreham 0d283effa8 promql: force mmap of head chunks in BenchmarkRangeQuery
Otherwise we have a highly unusual situation of over 100 chunks
in the headChunks list of each series, which heavily skews
performance.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-08-26 09:40:59 +00:00
Gregor Zeitlinger f01718262a
Unit tests for native histograms (#12668)
promql: Extend testing framework to support native histograms

This includes both the internal testing framework as well as the rules unit test feature of promtool.

This also adds a bunch of basic tests. Many of the code level tests can now be converted to tests within the framework, and more tests can be added easily.

---------

Signed-off-by: Harold Dost <h.dost@criteo.com>
Signed-off-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>
Signed-off-by: Stephen Lang <stephen.lang@grafana.com>
Co-authored-by: Harold Dost <h.dost@criteo.com>
Co-authored-by: Stephen Lang <stephen.lang@grafana.com>
Co-authored-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>
2023-08-25 23:35:42 +02:00
zenador 54aaa2bd7e
Add histogram_stdvar and histogram_stddev functions (#12614)
* Add new function: histogram_stdvar and histogram_stddev

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2023-08-24 21:02:14 +02:00
beorn7 aa82fe198f tsdb: Fix histogram validation
So far, `ValidateHistogram` would not detect if the count did not
include the count in the zero bucket. This commit fixes the problem
and updates all the tests that have been undetected offenders so far.

Note that this problem would only ever create false negatives, so we
never falsely rejected to store a histogram because of it.

On the other hand, `ValidateFloatHistogram` has been to strict with
the count being at least as large as the sum of the counts in all the
buckets. Float precision issues could create false positives here, see
products of PromQL evaluations, it's actually quite hard to put an
upper limit no the floating point imprecision. Users could produce the
weirdest expressions, maxing out float precision problems. Therefore,
this commit simply removes that particular check from
`ValidateFloatHistogram`.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-08-22 23:04:01 +02:00
Bryan Boreham 3879488476 promql: simplify inner loop of rangeEval
Took out the loops with break after one iteration, and extract some
common code to a function.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-08-21 19:52:14 +01:00
Michael Hoffmann 4d8e380269
promql: allow tests to be imported (#12050)
Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
2023-08-18 20:48:59 +02:00
Bryan Boreham 5cea37c069
Merge pull request #12682 from bboreham/contains-same-label-set
promql engine: check unique labels using existing map

ContainsSameLabelset constructs a map with the same hash key as the one used to compile the output of rangeEval, so we can use that one and save work.

Need to hold the timestamp so we can be sure we saw the same series in the same evaluation.
2023-08-14 14:12:47 +01:00
Bryan Boreham 0670e4771a promql engine: check unique labels using existing map
`ContainsSameLabelset` constructs a map with the same hash key as
the one used to compile the output of `rangeEval`, so we can use that
one and save work.

Need to hold the timestamp so we can be sure we saw the same series
in the same evaluation.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-08-13 18:09:10 +01:00
Bryan Boreham 8d47b3d497
Merge pull request #12579 from charleskorn/timestamp
Don't recreate iterator for each series on each timestep when evaluating a query with `timestamp()`
2023-08-05 10:51:38 +01:00
Charles Korn d396282941
Address PR feedback: clarify comment
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2023-08-02 11:48:34 +10:00
Charles Korn 145d7457fe
Address PR feedback: use loop to create expected test result
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2023-08-01 13:30:12 +10:00
Charles Korn 6087c555ed
Address PR feedback: clarify comment
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2023-08-01 13:30:10 +10:00
Charles Korn fb3935e8f9
Address PR feedback: rename method
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2023-08-01 13:30:07 +10:00
Julius Volz 531567d46e Drop metric name for "atan2" binary operator
The operator changes the meaning of the metric, so the metric name should
be dropped. Technically this would be a breaking change, but it's also very
obviously a bug and not likely that anyone depends on it.

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2023-07-24 14:36:02 +02:00
Charles Korn 6903d6edd8
Add test to confirm timestamp() behaves correctly when evaluating a range query.
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2023-07-20 11:25:33 +10:00
Charles Korn fde6ebb17d
Create per-series iterators only once per selector, rather than recreating it for each time step.
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2023-07-20 11:24:21 +10:00
Charles Korn 993618adea
Don't create a new iterator for every time step.
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2023-07-20 11:24:21 +10:00
Charles Korn b114c0888d
Simplify loop
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2023-07-20 11:24:20 +10:00
Charles Korn a142998052
Expand series set just once
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2023-07-20 11:24:19 +10:00
Charles Korn eeface2e17
Inline method
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2023-07-20 11:24:19 +10:00
Charles Korn a2a2cc757e
Extract timestamp special case to its own method.
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2023-07-20 11:24:18 +10:00
Charles Korn 15fa680117
Add benchmark for query using timestamp()
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2023-07-20 11:24:16 +10:00
Julien Pivotto 0a48f93111
Merge pull request #10367 from ianwoolf/pr_add_close_for_query_logger
add Close for ActiveQueryTracker to close the file.
2023-07-18 13:53:18 +02:00
cui fliter 096ceca44f
remove repetitive words (#12556)
Signed-off-by: cui fliter <imcusg@gmail.com>
2023-07-13 15:53:40 +02:00
beorn7 162612ea86 histograms: Improve comment
Oversight during review of #12525.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-07-12 14:52:49 +02:00
Ziqi Zhao 42d9169ba1 enhance histogram_quantile to get min/max value
Signed-off-by: Ziqi Zhao <zhaoziqi9146@gmail.com>
2023-07-12 04:29:54 +08:00
Carrie Edwards 2f9bc98b8a Add tests for min and max functions
Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com>
2023-07-11 21:51:20 +08:00
Carrie Edwards bc0ee4a469 Implement native histogram min and max query functions
Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com>
2023-07-11 21:51:20 +08:00
Bryan Boreham ce153e3fff Replace sort.Sort with faster slices.SortFunc
The generic version is more efficient.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-07-10 09:43:45 +00:00
Giedrius Statkevičius 3f230fc9f8 promql: convert QueryOpts to interface
Convert QueryOpts to an interface so that downstream projects like
https://github.com/thanos-community/promql-engine could extend the query
options with engine specific options that are not in the original
engine.

Will be used to enable query analysis per-query.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2023-07-03 16:20:31 +03:00
Julien Pivotto a605b81b14
Merge pull request #12170 from fpetkovski/parser-inject-functions
parser: Allow parsing arbitrary functions
2023-06-27 13:32:46 +02:00
Bryan Boreham 67d2ef004d Placate lint
I think the version using scoping was better, but I'm out of energy to fight the linter.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-06-01 18:36:34 +00:00
Bryan Boreham bb0d8320dd promql: include parsing in active-query tracking
So that the max-concurrency limit is applied.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-06-01 18:16:05 +00:00
Bryan Boreham 71fc4f1516 promql: refactor: create query object before parsing
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-06-01 17:54:17 +00:00
Bryan Boreham 1f3821379c promql: refactor: extract fn to wait on concurrency limit
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-06-01 17:17:04 +00:00
zenador 191bf9055b
Handle more arithmetic operators for native histograms (#12262)
Handle more arithmetic operators and aggregators for native histograms

This includes operators for multiplication (formerly known as scaling), division, and subtraction. Plus aggregations for average and the avg_over_time function.

Stdvar and stddev will (for now) ignore histograms properly (rather than counting them but adding a 0 for them).

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2023-05-16 21:15:20 +02:00
beorn7 9e500345f3 textparse/scrape: Add option to scrape both classic and native histograms
So far, if a target exposes a histogram with both classic and native
buckets, a native-histogram enabled Prometheus would ignore the
classic buckets. With the new scrape config option
`scrape_classic_histograms` set, both buckets will be ingested,
creating all the series of a classic histogram in parallel to the
native histogram series. For example, a histogram `foo` would create a
native histogram series `foo` and classic series called `foo_sum`,
`foo_count`, and `foo_bucket`.

This feature can be used in a migration strategy from classic to
native histograms, where it is desired to have a transition period
during which both native and classic histograms are present.

Note that two bugs in classic histogram parsing were found and fixed
as a byproduct of testing the new feature:

1. Series created from classic _gauge_ histograms didn't get the
   _sum/_count/_bucket prefix set.
2. Values of classic _float_ histograms weren't parsed properly.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-05-13 01:32:25 +02:00
Justin Lei 7bbf24b707 Make MemoizedSeriesIterator not implement chunkenc.Iterator
Signed-off-by: Justin Lei <justin.lei@grafana.com>
2023-05-03 12:45:39 -07:00
Justin Lei 6985dcbe73 Optimize and test MemoizedSeriesIterator
Signed-off-by: Justin Lei <justin.lei@grafana.com>
2023-05-02 08:53:18 -07:00
Matthieu MOREL 7e9acc2e46
golangci-lint: remove skip-cache and restore singleCaseSwitch rule
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2023-04-20 18:43:51 +02:00
Julien Pivotto f7c6130ff2
Merge pull request #12251 from prymitive/query_samples_total
Add query_samples_total metric
2023-04-20 15:48:24 +02:00
Matthieu MOREL bae9a21200
Merge branch 'main' into linter/nilerr
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2023-04-19 19:56:39 +02:00
beorn7 5b53aa1108 style: Replace else if cascades with switch
Wiser coders than myself have come to the conclusion that a `switch`
statement is almost always superior to a statement that includes any
`else if`.

The exceptions that I have found in our codebase are just these two:

* The `if else` is followed by an additional statement before the next
  condition (separated by a `;`).
* The whole thing is within a `for` loop and `break` statements are
  used. In this case, using `switch` would require tagging the `for`
  loop, which probably tips the balance.

Why are `switch` statements more readable?

For one, fewer curly braces. But more importantly, the conditions all
have the same alignment, so the whole thing follows the natural flow
of going down a list of conditions. With `else if`, in contrast, all
conditions but the first are "hidden" behind `} else if `, harder to
spot and (for no good reason) presented differently from the first
condition.

I'm sure the aforemention wise coders can list even more reasons.

In any case, I like it so much that I have found myself recommending
it in code reviews. I would like to make it a habit in our code base,
without making it a hard requirement that we would test on the CI. But
for that, there has to be a role model, so this commit eliminates all
`if else` occurrences, unless it is autogenerated code or fits one of
the exceptions above.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-04-19 17:22:31 +02:00
beorn7 c3c7d44d84 lint: Adjust to the lint warnings raised by current versions of golint-ci
We haven't updated golint-ci in our CI yet, but this commit prepares
for that.

There are a lot of new warnings, and it is mostly because the "revive"
linter got updated. I agree with most of the new warnings, mostly
around not naming unused function parameters (although it is justified
in some cases for documentation purposes – while things like mocks are
a good example where not naming the parameter is clearer).

I'm pretty upset about the "empty block" warning to include `for`
loops. It's such a common pattern to do something in the head of the
`for` loop and then have an empty block. There is still an open issue
about this: https://github.com/mgechev/revive/issues/810 I have
disabled "revive" altogether in files where empty blocks are used
excessively, and I have made the effort to add individual
`// nolint:revive` where empty blocks are used just once or twice.
It's borderline noisy, though, but let's go with it for now.

I should mention that none of the "empty block" warnings for `for`
loop bodies were legitimate.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-04-19 17:10:10 +02:00
Ben Ye fd3630b9a3 add ctx to QueryEngine interface
Signed-off-by: Ben Ye <benye@amazon.com>
2023-04-17 21:32:38 -07:00
ianwoolf 79e4bdee8e add Close for ActiveQueryTracker to close the file.
Signed-off-by: ianwoolf <btw515wolf2@gmail.com>
2023-04-14 14:43:23 +08:00
Matthieu MOREL fb3eb21230 enable gocritic, unconvert and unused linters
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2023-04-13 19:20:22 +00:00
beorn7 551de0346f promql: Do not return nil slices to the pool
Signed-off-by: beorn7 <beorn@grafana.com>
2023-04-13 19:25:24 +02:00
beorn7 817a2396cb Name float values as "floats", not as "values"
In the past, every sample value was a float, so it was fine to call a
variable holding such a float "value" or "sample". With native
histograms, a sample might have a histogram value. And a histogram
value is still a value. Calling a float value just "value" or "sample"
or "V" is therefore misleading. Over the last few commits, I already
renamed many variables, but this cleans up a few more places where the
changes are more invasive.

Note that we do not to attempt naming in the JSON APIs or in the
protobufs. That would be quite a disruption. However, internally, we
can call variables as we want, and we should go with the option of
avoiding misunderstandings.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-04-13 19:25:24 +02:00
beorn7 c0879d64cf promql: Separate Point into FPoint and HPoint
In other words: Instead of having a “polymorphous” `Point` that can
either contain a float value or a histogram value, use an `FPoint` for
floats and an `HPoint` for histograms.

This seemingly small change has a _lot_ of repercussions throughout
the codebase.

The idea here is to avoid the increase in size of `Point` arrays that
happened after native histograms had been added.

The higher-level data structures (`Sample`, `Series`, etc.) are still
“polymorphous”. The same idea could be applied to them, but at each
step the trade-offs needed to be evaluated.

The idea with this change is to do the minimum necessary to get back
to pre-histogram performance for functions that do not touch
histograms. Here are comparisons for the `changes` function. The test
data doesn't include histograms yet. Ideally, there would be no change
in the benchmark result at all.

First runtime v2.39 compared to directly prior to this commit:

```
name                                                  old time/op    new time/op    delta
RangeQuery/expr=changes(a_one[1d]),steps=1-16            391µs ± 2%     542µs ± 1%  +38.58%  (p=0.000 n=9+8)
RangeQuery/expr=changes(a_one[1d]),steps=10-16           452µs ± 2%     617µs ± 2%  +36.48%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_one[1d]),steps=100-16         1.12ms ± 1%    1.36ms ± 2%  +21.58%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_one[1d]),steps=1000-16        7.83ms ± 1%    8.94ms ± 1%  +14.21%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_ten[1d]),steps=1-16           2.98ms ± 0%    3.30ms ± 1%  +10.67%  (p=0.000 n=9+10)
RangeQuery/expr=changes(a_ten[1d]),steps=10-16          3.66ms ± 1%    4.10ms ± 1%  +11.82%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_ten[1d]),steps=100-16         10.5ms ± 0%    11.8ms ± 1%  +12.50%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_ten[1d]),steps=1000-16        77.6ms ± 1%    87.4ms ± 1%  +12.63%  (p=0.000 n=9+9)
RangeQuery/expr=changes(a_hundred[1d]),steps=1-16       30.4ms ± 2%    32.8ms ± 1%   +8.01%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=10-16      37.1ms ± 2%    40.6ms ± 2%   +9.64%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=100-16      105ms ± 1%     117ms ± 1%  +11.69%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=1000-16     783ms ± 3%     876ms ± 1%  +11.83%  (p=0.000 n=9+10)
```

And then runtime v2.39 compared to after this commit:

```
name                                                  old time/op    new time/op    delta
RangeQuery/expr=changes(a_one[1d]),steps=1-16            391µs ± 2%     547µs ± 1%  +39.84%  (p=0.000 n=9+8)
RangeQuery/expr=changes(a_one[1d]),steps=10-16           452µs ± 2%     616µs ± 2%  +36.15%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_one[1d]),steps=100-16         1.12ms ± 1%    1.26ms ± 1%  +12.20%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_one[1d]),steps=1000-16        7.83ms ± 1%    7.95ms ± 1%   +1.59%  (p=0.000 n=10+8)
RangeQuery/expr=changes(a_ten[1d]),steps=1-16           2.98ms ± 0%    3.38ms ± 2%  +13.49%  (p=0.000 n=9+10)
RangeQuery/expr=changes(a_ten[1d]),steps=10-16          3.66ms ± 1%    4.02ms ± 1%   +9.80%  (p=0.000 n=10+9)
RangeQuery/expr=changes(a_ten[1d]),steps=100-16         10.5ms ± 0%    10.8ms ± 1%   +3.08%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_ten[1d]),steps=1000-16        77.6ms ± 1%    78.1ms ± 1%   +0.58%  (p=0.035 n=9+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=1-16       30.4ms ± 2%    33.5ms ± 4%  +10.18%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=10-16      37.1ms ± 2%    40.0ms ± 1%   +7.98%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=100-16      105ms ± 1%     107ms ± 1%   +1.92%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=1000-16     783ms ± 3%     775ms ± 1%   -1.02%  (p=0.019 n=9+9)
```

In summary, the runtime doesn't really improve with this change for
queries with just a few steps. For queries with many steps, this
commit essentially reinstates the old performance. This is good
because the many-step queries are the one that matter most (longest
absolute runtime).

In terms of allocations, though, this commit doesn't make a dent at
all (numbers not shown). The reason is that most of the allocations
happen in the sampleRingIterator (in the storage package), which has
to be addressed in a separate commit.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-04-13 19:25:16 +02:00
Łukasz Mierzwa b6573353c1 Add query_samples_total metric
query_samples_total is a counter that tracks the total number of samples loaded by all queries.

The goal with this metric is to be able to see the amount of 'work' done by Prometheus to service queries.
At the moment we have metrics with the number of queries, plus more detailed metrics showing how much time each step of a query takes.
While those metrics do help they don't show us the whole picture.
Queries that do load more samples are (in general) more expensive than queries that do load fewer samples.
This means that looking only at the number of queries doesn't tell us how much 'work' Prometheus received.
Adding a counter that tracks the total number of samples loaded allows us to see if there was a spike in the cost of queries, not just the number of them.

Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
2023-04-12 14:05:06 +01:00
Ganesh Vernekar 5588cab8b2
Merge pull request #12173 from bboreham/builder-no-empty-labels
labels: simplify call to get Labels from Builder
2023-04-04 12:02:55 +05:30
Bryan Boreham 1bb6b8b309
Merge pull request #12190 from bboreham/faster-topk
promql: use faster heap method for topk/bottomk
2023-03-30 14:05:53 +01:00