prometheus

mirror of https://github.com/prometheus/prometheus.git synced 2024-11-15 10:04:07 -08:00

Author	SHA1	Message	Date
beorn7	6fcd225aee	promql(native histograms): Introduce exponential interpolation The linear interpolation (assuming that observations are uniformly distributed within a bucket) is a solid and simple assumption in the absence of any other information. However, the exponential bucketing used by standard schemas of native histograms has been chosen to cover the whole range of observations in a way that bucket populations are spread out over buckets in a reasonable way for typical distributions encountered in real-world scenarios. This is the origin of the idea implemented here: If we divide a given bucket into two (or more) smaller exponential buckets, we "most naturally" expect that the samples in the original bucket will split among those smaller buckets in a more or less uniform fashion. With this assumption, we end up with an "exponential interpolation", which therefore appears to be a better match for histograms with exponential bucketing. This commit leaves the linear interpolation in place for NHCB, but changes the interpolation for exponential native histograms to exponential. This affects `histogram_quantile` and `histogram_fraction` (because the latter is more or less the inverse of the former). The zero bucket has to be treated specially because the assumption above would lead to an "interpolation to zero" (the bucket density approaches infinity around zero, and with the postulated uniform usage of buckets, we would end up with an estimate of zero for all quantiles ending up in the zero bucket). We simply fall back to linear interpolation within the zero bucket. At the same time, this commit makes the call to stick with the assumption that the zero bucket only contains positive observations for native histograms without negative buckets (and vice versa). (This is an assumption relevant for interpolation. 
It is a mostly academic point, as the zero bucket is supposed to be very small anyway. However, in cases where it _is_ relevantly broad, the assumption helps a lot in practice.) This commit also updates and completes the documentation to match both details about interpolation. As a higher-level note: The approach here attempts to strike a balance between a more simplistic approach without any assumption, and a more involved approach with more sophisticated assumptions. I will briefly describe both for reference: The "zero assumption" approach would be to not interpolate at all, but _always_ return the harmonic mean of the bucket boundaries of the bucket the quantile ends up in. This has the advantage of minimizing the maximum possible relative error of the quantile estimation. (Depending on the exact definition of the relative error of an estimation, there is also an argument to return the arithmetic mean of the bucket boundaries.) While limiting the maximum possible relative error is a good property, this approach would throw away the information about whether a quantile is closer to the upper or lower end of the population within a bucket. This can be valuable trending information in a dashboard. With any kind of interpolation, the maximum possible error of a quantile estimation increases to the full width of a bucket (i.e. it more than doubles for the harmonic mean approach, and precisely doubles for the arithmetic mean approach). However, in return the _expectation value_ of the error decreases. The increase of the theoretical maximum only has practical relevance for pathological distributions. For example, if there are a thousand observations within a bucket, they could _all_ be at the upper bound of the bucket. If the quantile calculation picks the 1st observation in the bucket as the relevant one, an interpolation will yield a value close to the lower bucket boundary, while the true quantile value is close to the upper boundary. 
The "fancy interpolation" approach would be one that analyses the _actual_ distribution of samples in the histogram. A lot of statistics could be applied based on the information we have available in the histogram. This would include the population of neighboring (or even all) buckets in the histogram. In general, the resolution of a native histogram should be quite high, and therefore, those "fancy" approaches would increase the computational cost quite a bit with very little practical benefit (i.e. just tiny corrections of the estimated quantile value). The results are also much harder to reason about. Signed-off-by: beorn7 <beorn@grafana.com>	2024-09-19 14:19:10 +02:00
Jan Fajerski	91608c002f	Merge branch 'main' into release-3.0-beta.0 Conflicts: scrape/scrape_test.go Pick both changes.	2024-09-10 20:51:20 +02:00
Charles Korn	e8c7482137	Return negative counts when multiplied or divided by a negative value Signed-off-by: Charles Korn <charles.korn@grafana.com>	2024-09-09 14:37:59 +10:00
Charles Korn	113de6301c	Add failing test cases for unary negation and multiplication and division with negative scalars Signed-off-by: Charles Korn <charles.korn@grafana.com>	2024-09-04 16:20:28 +10:00
Charles Korn	9b451abec7	Make positive and negative bucket counts different in existing test cases Signed-off-by: Charles Korn <charles.korn@grafana.com>	2024-09-04 16:08:05 +10:00
Jan Fajerski	956245b25b	promqltest: adjust eval times and range selector In order to fix new tests for changes added in https://github.com/prometheus/prometheus/pull/13904. Signed-off-by: Jan Fajerski <jfajersk@redhat.com>	2024-09-02 11:27:39 +02:00
Jan Fajerski	00315ce15e	Merge branch 'main' into 3.0-main-sync-24-08-30 using -Xours Signed-off-by: Jan Fajerski <jfajersk@redhat.com>	2024-09-02 11:27:18 +02:00
Neeraj Gartia	8c7bf39d96	Moves TestNativeHistogram_MulDivOperator to promql testing framework (#14688) PromQL: add test for mul and div operator Also, remove the converted test from the engine_test.go file. This also includes an extension of the test framework to allow NaN/Inf in histogram buckets. --------- Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>	2024-08-29 16:42:35 +02:00
Jan Fajerski	7c8c748399	promql tests: adjust range query intervals Some test queries need their interval adjusted to account for https://github.com/prometheus/prometheus/pull/13904. Otherwise the queries don't return enough samples. promql/engine_test.go:TestHistogramCopyFromIteratorRegression needed the same, but also the result needed a fix since `increase` interpolates over the full range. Signed-off-by: Jan Fajerski <jfajersk@redhat.com>	2024-08-21 12:33:52 +02:00
Jan Fajerski	5138922b0d	Merge branch 'main' into 3.0-main-sync-24-08-21	2024-08-21 09:09:36 +02:00
Ziqi Zhao	8f828d45c1	convert TestNativeHistogram_Sum_Count_Add_AvgOperator into testing framework Signed-off-by: Ziqi Zhao <zhaoziqi9146@gmail.com>	2024-08-21 09:24:50 +08:00
Charles Korn	52818a97e2	Merge branch 'main' into sum-and-avg-over-mixed-custom-exponential-histograms # Conflicts: # promql/promqltest/testdata/native_histograms.test	2024-08-14 07:52:08 +10:00
György Krajcsovits	386fc8b9f6	Update from review comments. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>	2024-08-13 15:26:07 +02:00
György Krajcsovits	6aee5b4b38	fix typo Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>	2024-08-12 12:04:45 +02:00
György Krajcsovits	06a8886b94	Native histograms: define behavior when rate is null. Histogram quantile returns NaN in this case, which might be surprising, so add a unit test that clarifies that this is intentional. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>	2024-08-12 10:40:21 +02:00
Charles Korn	f992f81bd0	Merge branch 'main' into sum-and-avg-over-mixed-custom-exponential-histograms Signed-off-by: Charles Korn <charleskorn@users.noreply.github.com>	2024-08-09 13:58:54 +10:00
Charles Korn	5cfdde327c	Address PR feedback: add extra test case Signed-off-by: Charles Korn <charles.korn@grafana.com>	2024-08-09 13:57:37 +10:00
Charles Korn	82bb35fabb	Address PR feedback: fix typo and rename variable Signed-off-by: Charles Korn <charles.korn@grafana.com>	2024-08-09 13:51:31 +10:00
Charles Korn	f07b3ae67b	Fix issue where `avg` over mixed exponential and custom buckets, or incompatible custom buckets, produces incorrect results or panics Signed-off-by: Charles Korn <charles.korn@grafana.com>	2024-08-07 15:32:35 +10:00
Charles Korn	5ee94f49a2	Fix issue where `sum` over mixed exponential and custom buckets, or incompatible custom buckets, produces incorrect results Signed-off-by: Charles Korn <charles.korn@grafana.com>	2024-08-07 15:30:01 +10:00
Charles Korn	424cefcf5e	Fix "cannot reduce resolution to custom buckets schema" panic in `rate` over native histograms with mix of custom and exponential buckets Signed-off-by: Charles Korn <charles.korn@grafana.com>	2024-08-07 14:45:02 +10:00
Charles Korn	aadec25faf	promql: Fix issue where some native histogram-related annotations are not emitted by `rate` (#14575) Signed-off-by: Charles Korn <charles.korn@grafana.com>	2024-08-06 09:10:40 +01:00
Jan Fajerski	adf5d6bce1	Merge branch 'main' into 3.0-main-sync-24-07-18 Signed-off-by: Jan Fajerski <jfajersk@redhat.com> Conflicts: VERSION pick 3.0.0 promql/promqltest/testdata/histograms.test pick changes from `c39776c5b5`, but adjust 5m range selectors to 10m to account for https://github.com/prometheus/prometheus/pull/13904. Fixes: promql/promqltest/testdata/functions.test promql/promqltest/testdata/staleness.test Tests added in https://github.com/prometheus/prometheus/pull/9138 need to be adjusted to account for https://github.com/prometheus/prometheus/pull/13904.	2024-07-18 15:56:40 +02:00
Filip Petkovski	acb6c1ae4b	Fix decoding buckets for native histograms in binops The optimizer which detects cases where histogram buckets can be skipped does not take into account binary expressions. This can lead to buckets not being decoded if a metric is used with both histogram_fraction/quantile and histogram_sum/count in the same expression. Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>	2024-07-10 11:55:29 +02:00
Jan Fajerski	c0d67fd845	Merge branch 'main' into 3.0-main-sync-24-07-09	2024-07-09 16:51:07 +02:00
Jan Fajerski	b4152309a4	Merge branch 'main' into 3.0-main-sync Conflicts: promql/engine_test.go Resolved by picking main changes but adjusting total_samples for query "max_over_time(metricWith1HistogramEvery10Seconds[60s])[20s:5s]" to 312. Via https://github.com/prometheus/prometheus/pull/13662 this histogram now stores 13 values per timestamp, but via https://github.com/prometheus/prometheus/pull/13904 the range query is now left-open. promql/promqltest/testdata/functions.test Resolved by picking changes in main. See also https://github.com/prometheus/prometheus/pull/13662, but adjust some range selectors (`s/1m/2m/`) to account for https://github.com/prometheus/prometheus/pull/13904. promql/promqltest/testdata/histograms.test Resolved by picking changes in main. See also https://github.com/prometheus/prometheus/pull/13662, but adjust some range selectors (`s/5m/10m/`) to account for https://github.com/prometheus/prometheus/pull/13904. Signed-off-by: Jan Fajerski <jfajersk@redhat.com>	2024-07-03 09:45:25 +02:00
Ganesh Vernekar	3d54bcc018	Merge pull request #14362 from charleskorn/charleskorn/sum-infinity	2024-07-03 01:05:03 -04:00
Charles Korn	fd6bdf5230	Fix issue where summation of +/- infinity returns NaN instead of infinity Signed-off-by: Charles Korn <charles.korn@grafana.com>	2024-06-28 11:26:54 +10:00
Jeanette Tan	dda5f48c9e	Merge branch 'main' into nhcb-review-2	2024-06-20 22:50:00 +08:00
Zhang Zhanpeng	debbdb8608	make matrix selection and lookback left-open and right-closed Signed-off-by: Zhang Zhanpeng <zhangzhanpeng.zzp@alibaba-inc.com> Signed-off-by: beorn7 <beorn@grafana.com> Co-authored-by: beorn7 <beorn@grafana.com>	2024-06-20 22:05:40 +08:00
Jeanette Tan	14f8dded39	Merge branch 'main' into nhcb Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>	2024-06-07 19:17:14 +08:00
beorn7	c7fdfe8004	promql: Add tests for histogram counter reset only in bucket This also exercises the "fast path" (only decoding count and sum), i.e. where the counter reset isn't visible at all in the decoded data. Signed-off-by: beorn7 <beorn@grafana.com>	2024-06-06 17:47:38 +02:00
Jeanette Tan	f028496133	Merge branch 'main' into nhcb Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>	2024-05-14 16:20:15 +08:00
Neeraj Gartia	661856cb65	removes the added tests from engine_test.go Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>	2024-05-13 22:58:25 +05:30
Neeraj Gartia	6119124d0e	some nits Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>	2024-05-13 22:55:38 +05:30
Neeraj Gartia	adf5a36c1e	adds test for sum, count, stddev, stdvar, quantile and fraction func to promql testing framework Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>	2024-05-13 22:55:38 +05:30
Neeraj Gartia	8b838a05d9	adds test for native histogram rate func in promql testing framework Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>	2024-05-13 22:55:38 +05:30
Neeraj Gartia	548bd9d6fb	adds TestNativeHistogramRate func to promql test framework Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>	2024-05-13 22:55:38 +05:30
Bryan Boreham	11b27d5d22	test: move test files into new promqltest package So that promql package does not bring in test-only dependencies. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2024-05-08 13:42:55 +01:00

39 commits