mirror of https://github.com/prometheus/prometheus.git synced 2025-03-05 20:59:13 -08:00


beorn7 6fcd225aee promql(native histograms): Introduce exponential interpolation

The linear interpolation (assuming that observations are uniformly distributed within a bucket) is a solid and simple assumption in the absence of any other information. However, the exponential bucketing used by standard schemas of native histograms has been chosen to cover the whole range of observations in a way that bucket populations are spread out over buckets in a reasonable way for typical distributions encountered in real-world scenarios.

This is the origin of the idea implemented here: If we divide a given bucket into two (or more) smaller exponential buckets, we "most naturally" expect that the samples in the original bucket will split among those smaller buckets in a more or less uniform fashion. With this assumption, we end up with an "exponential interpolation", which therefore appears to be a better match for histograms with exponential bucketing.

This commit leaves the linear interpolation in place for NHCB, but changes the interpolation for exponential native histograms to exponential. This affects `histogram_quantile` and `histogram_fraction` (because the latter is more or less the inverse of the former).

The zero bucket has to be treated specially because the assumption above would lead to an "interpolation to zero" (the bucket density approaches infinity around zero, and with the postulated uniform usage of buckets, we would end up with an estimate of zero for all quantiles ending up in the zero bucket). We simply fall back to linear interpolation within the zero bucket.

At the same time, this commit makes the call to stick with the assumption that the zero bucket only contains positive observations for native histograms without negative buckets (and vice versa). (This is an assumption relevant for interpolation. It is a mostly academic point, as the zero bucket is supposed to be very small anyway. However, in cases where it _is_ relevantly broad, the assumption helps a lot in practice.)

This commit also updates and completes the documentation to match both details about interpolation.

As a more high-level note: The approach here attempts to strike a balance between a more simplistic approach without any assumption, and a more involved approach with more sophisticated assumptions. I will briefly describe both for reference:

The "zero assumption" approach would be to not interpolate at all, but _always_ return the harmonic mean of the bucket boundaries of the bucket the quantile ends up in. This has the advantage of minimizing the maximum possible relative error of the quantile estimation. (Depending on the exact definition of the relative error of an estimation, there is also an argument to return the arithmetic mean of the bucket boundaries.) While limiting the maximum possible relative error is a good property, this approach would throw away the information about whether a quantile is closer to the upper or lower end of the population within a bucket. This can be valuable trending information in a dashboard. With any kind of interpolation, the maximum possible error of a quantile estimation increases to the full width of a bucket (i.e. it more than doubles for the harmonic mean approach, and precisely doubles for the arithmetic mean approach). However, in return the _expectation value_ of the error decreases. The increase of the theoretical maximum only has practical relevance for pathological distributions. For example, if there are a thousand observations within a bucket, they could _all_ be at the upper bound of the bucket. If the quantile calculation picks the 1st observation in the bucket as the relevant one, an interpolation will yield a value close to the lower bucket boundary, while the true quantile value is close to the upper boundary.

The "fancy interpolation" approach would be one that analyses the _actual_ distribution of samples in the histogram. A lot of statistics could be applied based on the information we have available in the histogram. This would include the population of neighboring (or even all) buckets in the histogram. In general, the resolution of a native histogram should be quite high, and therefore, those "fancy" approaches would increase the computational cost quite a bit with very little practical benefit (i.e. just tiny corrections of the estimated quantile value). The results are also much harder to reason about.

Signed-off-by: beorn7 <beorn@grafana.com>
2024-09-19 14:19:10 +02:00
testdata	promql(native histograms): Introduce exponential interpolation	2024-09-19 14:19:10 +02:00
README.md	remove eval_with_nhcb	2024-06-20 22:49:00 +08:00
test.go	Merge pull request #14064 from aknuds1/arve/close-engine	2024-09-04 12:07:35 +02:00
test_test.go	Merge branch 'main' into HEAD	2024-09-04 18:50:00 +02:00

README.md

The PromQL test scripting language

This package contains two things:

an implementation of a test scripting language for PromQL engines
a predefined set of tests written in that scripting language

The predefined set of tests can be run against any PromQL engine implementation by calling promqltest.RunBuiltinTests(). Any other test script can be run with promqltest.RunTest().

The rest of this document explains the test scripting language.

Each test script is written in plain text.

A comment is written by starting the line with a #, for example:

# This is a comment.

Each test file contains a series of commands. There are three kinds of commands:

load
clear
eval

Each command is executed in the order given in the file.
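
As a rough illustration of the structure described above, the following Go sketch groups the lines of a script into commands. It is not the parser used by this package: the `command` type and the `splitCommands` function are hypothetical names, and the sketch assumes that an unindented line starts a new command while indented lines belong to the command before them.

```go
package main

import "strings"

// command is a hypothetical container for one command of a test script.
type command struct {
	Kind string   // first word of the header line, e.g. "load", "clear" or "eval"
	Head string   // the full header line
	Body []string // indented lines following the header, with whitespace trimmed
}

// splitCommands groups script lines into commands in file order.
// Assumptions of this sketch: blank lines and lines starting with '#'
// are skipped, an unindented line starts a new command, and an
// indented line is attached to the most recent command.
func splitCommands(script string) []command {
	var cmds []command
	for _, line := range strings.Split(script, "\n") {
		trimmed := strings.TrimSpace(line)
		if trimmed == "" || strings.HasPrefix(trimmed, "#") {
			continue
		}
		indented := line[0] == ' ' || line[0] == '\t'
		if indented && len(cmds) > 0 {
			last := &cmds[len(cmds)-1]
			last.Body = append(last.Body, trimmed)
			continue
		}
		cmds = append(cmds, command{
			Kind: strings.Fields(trimmed)[0],
			Head: trimmed,
		})
	}
	return cmds
}
```

Because the commands are returned in file order, a caller can simply iterate over the slice and execute one command after the other.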

`load` command

load adds some data to the test environment.

The syntax is as follows:

load <interval>
    <series> <points>
    ...
    <series> <points>

<interval> is the step between points (e.g. 1m or 30s)
<series> is a Prometheus series name in the usual metric{label="value"} syntax
<points> is a specification of the points to add for that series, following the same expanding syntax as promtool unittest (see the promtool unit testing documentation)

For example:

load 1m
    my_metric{env="prod"} 5 2+3x2 _ stale {{schema:1 sum:3 count:22 buckets:[5 10 7]}}

...will create a single series with labels my_metric{env="prod"}, with the following points:

t=0: value is 5
t=1m: value is 2
t=2m: value is 5
t=3m: value is 8
t=4m: no point
t=5m: stale marker
t=6m: native histogram with schema 1, sum 3, count 22 and bucket counts 5, 10 and 7

Each load command is additive: it does not replace any data loaded in a previous load command. Use clear to remove all loaded data.
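
The expansion of float points in the example above can be sketched in Go as follows. This is a simplified illustration, not the code this package uses: `point`, `expand` and `splitStartStep` are hypothetical names, histogram literals (`{{...}}`) are left out, and only the `a+bxn`, `a-bxn` and `axn` forms are handled (a token `a+bxn` yields the n+1 values a, a+b, ..., a+n*b).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// point is a hypothetical representation of one slot in a series.
type point struct {
	Kind  string  // "value", "missing" or "stale"
	Value float64 // only meaningful when Kind == "value"
}

// expand turns tokens such as "5", "2+3x2", "_" and "stale" into points.
func expand(tokens []string) ([]point, error) {
	var out []point
	for _, tok := range tokens {
		switch tok {
		case "_":
			out = append(out, point{Kind: "missing"})
			continue
		case "stale":
			out = append(out, point{Kind: "stale"})
			continue
		}
		xi := strings.LastIndex(tok, "x")
		if xi < 0 { // a plain number
			v, err := strconv.ParseFloat(tok, 64)
			if err != nil {
				return nil, fmt.Errorf("bad value %q: %w", tok, err)
			}
			out = append(out, point{Kind: "value", Value: v})
			continue
		}
		n, err := strconv.Atoi(tok[xi+1:])
		if err != nil || n < 0 {
			return nil, fmt.Errorf("bad repeat count in %q", tok)
		}
		start, step, err := splitStartStep(tok[:xi])
		if err != nil {
			return nil, fmt.Errorf("bad expression %q: %w", tok, err)
		}
		for i := 0; i <= n; i++ { // n+1 points in total
			out = append(out, point{Kind: "value", Value: start + float64(i)*step})
		}
	}
	return out, nil
}

// splitStartStep parses "a+b", "a-b" or "a" (the latter with step 0).
func splitStartStep(expr string) (start, step float64, err error) {
	op := -1
	if len(expr) > 1 {
		// Skip index 0 so that a leading sign of the start value is kept.
		if i := strings.IndexAny(expr[1:], "+-"); i >= 0 {
			op = i + 1
		}
	}
	if op < 0 {
		start, err = strconv.ParseFloat(expr, 64)
		return start, 0, err
	}
	if start, err = strconv.ParseFloat(expr[:op], 64); err != nil {
		return 0, 0, err
	}
	if step, err = strconv.ParseFloat(expr[op+1:], 64); err != nil {
		return 0, 0, err
	}
	if expr[op] == '-' {
		step = -step
	}
	return start, step, nil
}
```

Called with the float tokens of the example (`5`, `2+3x2`, `_`, `stale`), this sketch produces the values 5, 2, 5 and 8, followed by a missing point and a stale marker.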

Native histograms with custom buckets (NHCB)

When loading a batch of classic histogram float series, you can optionally append the suffix _with_nhcb to the load command (i.e. use load_with_nhcb instead of load). The classic histograms are then converted to native histograms with custom buckets, and both the original float series and the new histogram series are loaded.
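
To give an idea of what such a conversion involves, the Go sketch below turns the cumulative `le` buckets of one classic histogram into custom bucket bounds with per-bucket counts. It is only an illustration under stated assumptions, not the conversion code of this package: `classicBucket` and `toCustomBuckets` are hypothetical names, and the sketch assumes that the +Inf bound is implicit and therefore not listed among the custom bounds.

```go
package main

import (
	"math"
	"sort"
)

// classicBucket is one `le` bucket of a classic histogram. Cumulative is
// the number of observations less than or equal to UpperBound.
type classicBucket struct {
	UpperBound float64
	Cumulative float64
}

// toCustomBuckets returns the finite upper bounds and the per-bucket
// (non-cumulative) counts for the given classic buckets.
// Assumption of this sketch: the +Inf bucket contributes a count but
// no explicit bound.
func toCustomBuckets(buckets []classicBucket) (bounds, counts []float64) {
	// Work on a sorted copy so that the caller's slice is left untouched.
	sorted := append([]classicBucket(nil), buckets...)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].UpperBound < sorted[j].UpperBound
	})
	prev := 0.0
	for _, b := range sorted {
		counts = append(counts, b.Cumulative-prev)
		prev = b.Cumulative
		if !math.IsInf(b.UpperBound, +1) {
			bounds = append(bounds, b.UpperBound)
		}
	}
	return bounds, counts
}
```

For cumulative counts of 5, 15 and 22 at the bounds 1, 2 and +Inf, the sketch yields the bounds 1 and 2 and the bucket counts 5, 10 and 7.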

`clear` command

clear removes all data previously loaded with load commands.

`eval` command

eval runs a query against the test environment and asserts that the result is as expected.

Both instant and range queries are supported.

The syntax is as follows:

# Instant query
eval instant at <time> <query>
    <series> <points>
    ...
    <series> <points>
    
# Range query
eval range from <start> to <end> step <step> <query>
    <series> <points>
    ...
    <series> <points>

<time> is the timestamp to evaluate the instant query at (e.g. 1m)
<start> and <end> specify the time range of the range query, and use the same syntax as <time>
<step> is the step of the range query, and uses the same syntax as <time> (e.g. 30s)
<series> and <points> specify the expected values, and follow the same syntax as for load above

For example:

eval instant at 1m sum by (env) (my_metric)
    {env="prod"} 5
    {env="test"} 20
    
eval range from 0 to 3m step 1m sum by (env) (my_metric)
    {env="prod"} 2 5 10 20
    {env="test"} 10 20 30 45

Instant queries also support asserting that the series are returned in exactly the order specified: use eval_ordered instant ... instead of eval instant .... This is not supported for range queries.
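
The difference between the two modes can be sketched in Go as follows. This is a minimal illustration and not the comparison logic of this package: `sample`, `sameSamples` and `sortedCopy` are hypothetical names, labels are simplified to a single string, and values are compared for exact equality.

```go
package main

import "sort"

// sample is a hypothetical, simplified result entry of an instant query.
type sample struct {
	Labels string // e.g. `{env="prod"}`
	Value  float64
}

// sameSamples reports whether got matches want. With ordered set to true
// the samples must appear in exactly the same order (as for eval_ordered),
// otherwise the order is ignored (as for eval).
func sameSamples(got, want []sample, ordered bool) bool {
	if len(got) != len(want) {
		return false
	}
	if !ordered {
		// Sorting both sides makes the comparison independent of order.
		got, want = sortedCopy(got), sortedCopy(want)
	}
	for i := range got {
		if got[i] != want[i] {
			return false
		}
	}
	return true
}

// sortedCopy returns a copy of s sorted by labels, then by value.
func sortedCopy(s []sample) []sample {
	c := append([]sample(nil), s...)
	sort.Slice(c, func(i, j int) bool {
		if c[i].Labels != c[j].Labels {
			return c[i].Labels < c[j].Labels
		}
		return c[i].Value < c[j].Value
	})
	return c
}
```

Ordered assertions are mainly useful for queries whose result order is part of their contract, for example queries using sort() or topk().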

It is also possible to test that queries fail: use eval_fail instant ... or eval_fail range .... eval_fail optionally takes an expected error message string or regexp to assert that the error message is as expected.

For example:

# Assert that the query fails for any reason without asserting on the error message.
eval_fail instant at 1m ceil({__name__=~'testmetric1|testmetric2'})

# Assert that the query fails with exactly the provided error message string.
eval_fail instant at 1m ceil({__name__=~'testmetric1|testmetric2'})
    expected_fail_message vector cannot contain metrics with the same labelset

# Assert that the query fails with an error message matching the regexp provided.
eval_fail instant at 1m ceil({__name__=~'testmetric1|testmetric2'})
    expected_fail_regexp (vector cannot contain metrics .*|something else went wrong)
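
The three kinds of failure assertions shown above can be sketched in Go like this. It is an illustration only, not the implementation in this package: `failExpectation`, `checkFailure` and `queryError` are hypothetical names, and the sketch assumes that the regular expression may match anywhere in the error message.

```go
package main

import "regexp"

// queryError is a hypothetical stand-in for an error returned by a query.
type queryError string

func (e queryError) Error() string { return string(e) }

// failExpectation mirrors the optional lines below an eval_fail command.
// At most one of the fields is expected to be set.
type failExpectation struct {
	Message string // from expected_fail_message
	Regexp  string // from expected_fail_regexp
}

// checkFailure reports whether err satisfies the expectation. A nil err
// never does, because eval_fail requires the query to fail.
func checkFailure(err error, exp failExpectation) (bool, error) {
	if err == nil {
		return false, nil
	}
	switch {
	case exp.Message != "":
		// The message must match exactly.
		return err.Error() == exp.Message, nil
	case exp.Regexp != "":
		re, cerr := regexp.Compile(exp.Regexp)
		if cerr != nil {
			return false, cerr
		}
		// Assumption of this sketch: an unanchored match is sufficient.
		return re.MatchString(err.Error()), nil
	default:
		// No expectation given: any failure is accepted.
		return true, nil
	}
}
```

Asserting on the exact message is the strictest check, while the regexp form tolerates variations in the error text, such as a message that embeds a label set.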
