Commit graph

14606 commits

Author SHA1 Message Date
Ben Ye ded35ef20d expose compactor metrics
Signed-off-by: Ben Ye <benye@amazon.com>
2024-03-31 15:10:29 -07:00
Arthur Silva Sens b51bbdd7ad
Merge pull request #13862 from nicolastakashi/refactor/moving-mergedOOOChunks-struct
[refactor] moving mergedOOOChunks to ooo_head_read
2024-03-29 20:51:38 -03:00
Nicolas Takashi 79d6750364
Merge branch 'prometheus:main' into refactor/moving-mergedOOOChunks-struct 2024-03-29 23:33:38 +00:00
Nicolas Takashi 0b762db154
[refactor] moving mergedOOOChunks to ooo_head_read
Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>
2024-03-29 23:33:15 +00:00
Bryan Boreham c97a066bbc
Merge pull request #13861 from simonpasquier/fix-sendall-deadlock
Notifier: fix deadlock when zero alerts
2024-03-29 19:37:51 +00:00
Simon Pasquier 8bd6ae1b20 Notifier: fix deadlock when zero alerts
When all alerts were dropped after alert relabeling, the `sendAll()`
function didn't release the lock properly which created a deadlock with
the Alertmanager target discovery.

In addition, the commit detects early when there are no Alertmanager
endpoints to notify, to avoid unnecessary work.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2024-03-29 15:44:05 +01:00
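
The deadlock described in 8bd6ae1b20 above is the kind of bug that appears when an early return skips an unlock. The following is a minimal sketch of that pattern, not the actual Prometheus code: the `notifier` type, `relabel`, `sendAll` signature, and `writerCanLock` are all hypothetical names chosen for illustration. It assumes a read lock taken in `sendAll()` and shows how a deferred unlock keeps every early exit safe.

```go
package main

import (
	"fmt"
	"sync"
)

// notifier is a hypothetical stand-in for the notifier manager; the field
// and method names are illustrative, not the real Prometheus types.
type notifier struct {
	mtx       sync.RWMutex
	endpoints []string
}

// relabel is a hypothetical relabeling step that keeps only alerts with a
// non-empty name, so it is able to drop every alert.
func relabel(alerts []string) []string {
	var kept []string
	for _, a := range alerts {
		if a != "" {
			kept = append(kept, a)
		}
	}
	return kept
}

// sendAll takes the read lock and defers the unlock, so every return path
// releases it, including the early exits for "no endpoints" and
// "all alerts dropped".
func (n *notifier) sendAll(alerts []string) int {
	n.mtx.RLock()
	defer n.mtx.RUnlock()

	if len(n.endpoints) == 0 {
		return 0 // nothing to notify, skip the remaining work
	}
	kept := relabel(alerts)
	if len(kept) == 0 {
		return 0 // an early return like this is where a lock can leak
	}
	return len(kept) * len(n.endpoints) // number of (alert, endpoint) sends
}

// writerCanLock reports whether a writer, such as target discovery updating
// the endpoint list, can acquire the lock right now.
func (n *notifier) writerCanLock() bool {
	if !n.mtx.TryLock() {
		return false
	}
	n.mtx.Unlock()
	return true
}

func main() {
	n := &notifier{endpoints: []string{"am-1"}}
	sent := n.sendAll([]string{"", ""}) // every alert is dropped
	fmt.Println(sent, n.writerCanLock())
}
```

If the unlock were written manually before the final return only, the "all alerts dropped" path would leave the read lock held and a later writer would block forever.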
Arthur Silva Sens 6a66f1f579
Merge pull request #13859 from prometheus/beorn7/release
Appoint release shepherds for v2.52 and v2.53
2024-03-28 19:09:57 -03:00
beorn7 fc3ad66539 Appoint release shepherds for v2.52 and v2.53
Note that we have delayed v2.52 by a week to avoid collisions with
events and travels.

Signed-off-by: beorn7 <beorn@grafana.com>
2024-03-28 18:33:39 +01:00
Björn Rabenstein d81e41d58e
Merge pull request #13854 from prometheus/beorn7/testing
promql: Fix histogram comparison in test framework
2024-03-28 13:47:28 +01:00
Bryan Boreham 255098e053 CI: Publish step should require all Go tests to pass
This was an unintentional effect of splitting out Go tests into multiple
parallel blocks.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-03-28 10:06:28 +00:00
Bryan Boreham e1a5886c88
Merge branch 'main' into merge-2.51-into-main
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-03-28 10:00:25 +00:00
Bryan Boreham 855b5ac4b8
Cut release 2.51.1 (#13853)
* Cut release 2.51.1

Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-03-28 09:23:45 +00:00
beorn7 65b4696b88 promql: Remove leftover debug output
Signed-off-by: beorn7 <beorn@grafana.com>
2024-03-27 19:02:27 +01:00
beorn7 2c1f9558b2 promql: Fix histogram comparison in test framework
The definition of histograms in the test framework may create
histograms in a non-compact form. Since histogram comparison relies on
exact equality of the bucket layout, we have to compact the histograms
created by the test framework language before comparing them to
histograms returned from the PromQL engine.

Signed-off-by: beorn7 <beorn@grafana.com>
2024-03-27 19:00:16 +01:00
György Krajcsovits 2a4aa085d2 Merge branch 'main' into nhcb 2024-03-27 18:42:10 +01:00
George Krajcsovits 4eab18abd6
[nhcb branch] Use single bit to differentiate between optimized bounds and floats (#13828)
* Use single bit to differentiate between optimized bounds and floats

Use one bit to decide what kind of data to read/write.
This reduces the storage needed for floats from 72 bits to 65 bits and
stores integers in 5 to 32 bits instead of 16.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>

Signed-off-by: George Krajcsovits <krajorama@users.noreply.github.com>
Co-authored-by: Jeanette Tan <jeanette.tan@grafana.com>
2024-03-27 18:40:59 +01:00
George Krajcsovits dc7b282d39
engine_test: adjust and comment histogram sample counts (#13841)
The size of histogram points is now bigger by 24 bytes due to the
custom values slice.

When histograms are loaded into partial results in vector selectors,
we use the HPoint type, where the size is calculated as
(size of histogram + 8 for timestamp)/16.
a3d1a46eda/promql/value.go (L176)

When histograms are put into the Sample type in range evaluations, the
Sample has more overhead and the size is calculated differently:
(size of histogram / 16) + 1 for timestamp.
a3d1a46eda/promql/engine.go (L1928)

When the size of the histogram is 16k, then the first calculation gives k
but the second gives k+1 for the sample count.
If the histogram size is 16k+8, then both would give k+1.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-03-27 18:19:14 +01:00
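
The arithmetic in dc7b282d39 above can be checked with a short worked example. This sketch only reproduces the two formulas quoted in the commit message using integer division; the function names `pointCount` and `sampleCount` are hypothetical and do not come from the Prometheus source.

```go
package main

import "fmt"

// pointCount follows the first formula in the commit message:
// (size of histogram + 8 for timestamp) / 16, with integer division.
func pointCount(histSize int) int {
	return (histSize + 8) / 16
}

// sampleCount follows the second formula:
// (size of histogram / 16) + 1 for the timestamp.
func sampleCount(histSize int) int {
	return histSize/16 + 1
}

func main() {
	// With k = 10: a size of 16k is 160 bytes, a size of 16k+8 is 168 bytes.
	for _, size := range []int{160, 168} {
		fmt.Printf("size=%d point=%d sample=%d\n",
			size, pointCount(size), sampleCount(size))
	}
}
```

For a size of 160 the two formulas give 10 and 11, and for 168 both give 11, which matches the k versus k+1 observation in the message.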
Bryan Boreham 2fc79839fd
Merge pull request #13851 from prometheus/krajo/pick-pr13846
Release 2.51: Cherry pick #13846 fix the bug of setting native histogram min bucket factor
2024-03-27 18:09:54 +01:00
Bryan Boreham ef7e9966d2
Merge pull request #13850 from prometheus/cherry-pick-13845
Release 2.51: Cherry-pick #13845 bugfix for DropMetricName
2024-03-27 18:06:44 +01:00
Björn Rabenstein b9a2a4e329
Merge pull request #13852 from prometheus/fix-hist-std-dev-var-negative
Fix hist std dev var negative
2024-03-27 17:58:03 +01:00
Jeanette Tan 4f2df329bd improve handling of empty buckets with infinite bounds in histogram std dev/var
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2024-03-27 17:06:12 +01:00
Jeanette Tan 22d0f4f114 improve handling of negative bounds in histogram std dev/var
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2024-03-27 17:06:12 +01:00
György Krajcsovits d64c6fe34f fix the bug of setting native histogram min bucket factor (#13846)
* fix the bug of setting native histogram min bucket factor

Signed-off-by: Ziqi Zhao <zhaoziqi9146@gmail.com>

* Add unit test for checking that min_bucket_factor is correctly applied

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

---------

Signed-off-by: Ziqi Zhao <zhaoziqi9146@gmail.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Co-authored-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-03-27 16:41:23 +01:00
Ziqi Zhao 64dfd8a158
fix the bug of setting native histogram min bucket factor (#13846)
* fix the bug of setting native histogram min bucket factor

Signed-off-by: Ziqi Zhao <zhaoziqi9146@gmail.com>

* Add unit test for checking that min_bucket_factor is correctly applied

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

---------

Signed-off-by: Ziqi Zhao <zhaoziqi9146@gmail.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Co-authored-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-03-27 16:32:37 +01:00
Domantas 3929d6500a [BUGFIX] labels: don't modify original labels in DropMetricName (#13845)
Restrict the capacity of first argument to `append()` to force an allocation.
This is for the slice implementation only.

Signed-off-by: Domantas Jadenkus <djadenkus@gmail.com>
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-03-27 15:32:03 +00:00
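
The fix in 3929d6500a above relies on a general Go idiom: capping the capacity of the first argument to `append()` with a full slice expression. The sketch below illustrates that idiom on plain string slices; `dropShared` and `dropCopy` are hypothetical helpers, not the `labels` package code.

```go
package main

import "fmt"

// dropShared removes element i by appending onto s[:i]. Because s[:i] keeps
// the full capacity of s, append writes into the original backing array.
func dropShared(s []string, i int) []string {
	return append(s[:i], s[i+1:]...)
}

// dropCopy caps the capacity of the first argument with the full slice
// expression s[:i:i]. Appending any element then exceeds the capacity, so
// append allocates a new array and the original stays untouched.
func dropCopy(s []string, i int) []string {
	return append(s[:i:i], s[i+1:]...)
}

func main() {
	a := []string{"__name__", "job", "instance"}
	fmt.Println(dropShared(a, 0), a) // a has been overwritten in place

	b := []string{"__name__", "job", "instance"}
	fmt.Println(dropCopy(b, 0), b) // b is unchanged
}
```

The trade-off is one extra allocation per call in exchange for never mutating the caller's slice.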
Bryan Boreham c6a42f8891
Merge pull request #13741 from bboreham/azure-test-labels
Azure Discovery tests: Add test for mapping VMs to labels
2024-03-27 11:36:17 +01:00
Domantas 435f330d0b
[BUGFIX] labels: don't modify original labels in DropMetricName (#13845)
Restrict the capacity of first argument to `append()` to force an allocation.
This is for the slice implementation only.

Signed-off-by: Domantas Jadenkus <djadenkus@gmail.com>
2024-03-27 10:35:17 +00:00
suntala 9a7c6a5cc4 Support native histogram values in template functions
Co-authored-by: Aleks Fazlieva <britishrum@users.noreply.github.com>
Signed-off-by: suntala <arati.rana@grafana.com>
2024-03-26 22:30:01 +01:00
suntala 44f385fd51 Support expansion of native histogram values in alert templates
Co-authored-by: Aleks Fazlieva <britishrum@users.noreply.github.com>
Signed-off-by: suntala <arati.rana@grafana.com>
2024-03-26 22:30:01 +01:00
Björn Rabenstein 25a8d57671
Merge pull request #13844 from aknuds1/bugfix/wlog-checkpoint-float-histogram
tsdb/wlog.Checkpoint: Handle also float histograms
2024-03-26 17:41:43 +01:00
Federico Leva 2aab70b839 Clarify batch_send_deadline docs
This is the time period covered by a batch of samples, when the
number of waiting samples is lower than max_samples_per_send.
It does not affect timeouts or retries.

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Federico Leva <federico.leva@relexsolutions.com>
2024-03-26 17:18:46 +02:00
Arve Knudsen 35aab01de0 tsdb/wlog.Checkpoint: Handle also float histograms
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-03-26 15:27:32 +01:00
Bartlomiej Plotka de05d5e11b
Merge pull request #13776 from aknuds1/arve/checkpoint-histogram-samples
tsdb/wlog.Checkpoint: Fix counting of histogram samples
2024-03-26 12:33:48 +01:00
Bartlomiej Plotka 25578f2b22
[test] Merge pull request #13790 from aknuds1/arve/retention-commit
tsdb.BeyondTimeRetention: Fix comment and test at retention duration
2024-03-26 12:26:32 +01:00
Charles Korn 5cc97a1820
[tests]: extend test scripting language to support range queries (#13825)
* Extract method to make it easier to test.

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Remove superfluous interface definition.

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Add test cases for existing instant query functionality.

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Add support for testing range queries

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Expand test coverage for instant queries and clarify error when a float is returned but a histogram is expected (or vice versa)

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Improve error message formatting

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Add test case for instant query command with invalid timestamp

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Fix linting warning.

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Remove superfluous print statement and expected result

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Fix linting warning.

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Add note about ordered range eval commands.

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Check that matrix results are always sorted by labels.

Signed-off-by: Charles Korn <charles.korn@grafana.com>

---------

Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-03-26 11:22:22 +00:00
Nick Pillitteri 481f14e1c0
TSDB: Don't rely on integer overflow in head compaction check (#13755)
* TSDB: Don't compact the head block when empty

Don't compact the Head block if there have not yet been any samples
appended.

Previously, the logic for determining if the head should be compacted
relied on the default values for min and max time and integer overflow
when they were checked in `Head.compactable()`. The check in
`Head.compactable()` effectively did `math.MinInt64 - math.MaxInt64`
which overflowed and wrapped to `1`. Since `1` is less than `1.5`
times the chunk range, compaction did not happen. This was the correct
behavior but relying on overflow wrapping is surprising.

This change adds a method for checking if the min and max time for the
head are unset and uses it to short-circuit compaction in that case.
It also replaces several explicit checks for the default value to
determine if the head has not yet had any samples added.

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
2024-03-26 12:17:38 +01:00
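
The wrap-around described in 481f14e1c0 above can be reproduced in a few lines. This is a sketch under assumptions: the constants `unsetMinT` and `unsetMaxT` and the helpers `span` and `isUnset` are hypothetical names, and the default values are inferred from the `math.MinInt64 - math.MaxInt64` expression in the commit message rather than taken from the TSDB source.

```go
package main

import (
	"fmt"
	"math"
)

// Assumed defaults for a head with no samples: min time starts at the
// largest int64 and max time at the smallest, so any real sample replaces
// both.
const (
	unsetMinT int64 = math.MaxInt64
	unsetMaxT int64 = math.MinInt64
)

// span computes maxT - minT at run time, where signed int64 arithmetic
// wraps around on overflow instead of failing.
func span(minT, maxT int64) int64 {
	return maxT - minT
}

// isUnset is the explicit check: it reports whether both bounds still hold
// their default values, without depending on overflow behaviour.
func isUnset(minT, maxT int64) bool {
	return minT == unsetMinT && maxT == unsetMaxT
}

func main() {
	fmt.Println(span(unsetMinT, unsetMaxT))    // wraps around to 1
	fmt.Println(isUnset(unsetMinT, unsetMaxT)) // true
	fmt.Println(isUnset(1000, 2000))           // false
}
```

An explicit predicate like `isUnset` states the intent directly, whereas the subtraction only happens to produce a small number for the default values.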
Bartlomiej Plotka d771cabc95
Merge pull request #13831 from prometheus/cherry-pick-13803
Release 2.51: Cherry-pick #13803 bugfix for label_join
2024-03-26 12:17:10 +01:00
Ben Ye ceca6c4716
[ENHANCEMENT] TSDB: Log more statistics during startup (#13838)
* log chunk snapshot and mmap chunks replay duration together with total replay duration

Signed-off-by: Ben Ye <benye@amazon.com>
2024-03-26 11:16:27 +00:00
Bryan Boreham 78c0fd2f4d
Merge pull request #13799 from machine424/wbl
chore(tsdb): set the wbl to nil as well in DBReadOnly.loadDataAsQuery…
2024-03-25 15:03:56 +01:00
Bryan Boreham f381ee3e55
Merge pull request #13834 from arturmelanchyk/lock-free-total
TSDB: lock-free total counter
2024-03-25 14:58:44 +01:00
Bryan Boreham 7d705ea9e8
Merge pull request #13830 from deterclosed/main
[DOCS] remove repeated word
2024-03-25 14:55:22 +01:00
Bryan Boreham b6c144ab2d
Merge pull request #13835 from sellskin/main
[STYLE] Discovery tests: remove code that will not be executed
2024-03-25 14:54:20 +01:00
Bryan Boreham 5540d34d94
Merge pull request #13461 from pracucci/upstream-fastregexmatcher
Further optimise FastRegexMatcher
2024-03-25 13:49:29 +01:00
Bryan Boreham 48786ad4e8 Use slices instead of exp/slices
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-03-25 12:20:18 +00:00
Bryan Boreham 080d440bf8 Merge remote-tracking branch 'origin/main' into pr/13461 2024-03-25 12:14:26 +00:00
sellskin ff15b17400 remove code that will not be executed
Signed-off-by: sellskin <mydesk@yeah.net>
2024-03-25 12:18:33 +08:00
Artur Melanchyk 44dcf02c69
TSDB: make total lock-free by using atomic
Signed-off-by: Artur Melanchyk <artur.melanchyk@gmail.com>
2024-03-23 19:51:29 +01:00
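
As a rough illustration of what "lock-free by using atomic" in 44dcf02c69 above can look like, the sketch below keeps a running total in an atomic integer instead of guarding it with a mutex. The `stats` type and its methods are hypothetical, and the standard library `sync/atomic` package is used here as an assumption; the actual change may use a different atomic package or field layout.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// stats is a hypothetical holder for a counter that many goroutines update.
type stats struct {
	total atomic.Uint64 // updated without taking any mutex
}

// add increments the total atomically.
func (s *stats) add(n uint64) { s.total.Add(n) }

// get reads the current total atomically.
func (s *stats) get() uint64 { return s.total.Load() }

// countConcurrently starts the given number of workers, each adding 1 to the
// shared total perWorker times, and returns the final total.
func countConcurrently(workers, perWorker int) uint64 {
	var s stats
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				s.add(1)
			}
		}()
	}
	wg.Wait()
	return s.get()
}

func main() {
	fmt.Println(countConcurrently(8, 1000))
}
```

Readers and writers never block each other in this version, which is the usual motivation for replacing a mutex around a single counter.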
Bryan Boreham 773170f372
Merge pull request #13822 from dgl/promtool-test-errors
promtool: Avoid using testify for user rule tests
2024-03-23 09:42:34 +01:00
tdakkota 17e2c30754 promql: validate label_join destination label
Signed-off-by: tdakkota <tanc13@yandex.ru>
2024-03-23 09:05:39 +01:00
Bryan Boreham bb62e3f808
Merge pull request #13803 from tdakkota/fix/validate-label-join-dest-label
promql: validate `label_join` destination label
2024-03-23 09:02:11 +01:00