Commit graph

13250 commits

Author SHA1 Message Date
George Krajcsovits 4eab18abd6
[nhcb branch] Use single bit to differentiate between optimized bounds and floats (#13828)
* Use single bit to differentiate between optimized bounds and floats

Use one bit to decide what kind of data to read/write.
This reduces the storage needed for floats from 72 bits to 65 bits and stores
integers in 5 to 32 bits instead of 16.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
Signed-off-by: George Krajcsovits <krajorama@users.noreply.github.com>
Co-authored-by: Jeanette Tan <jeanette.tan@grafana.com>
2024-03-27 18:40:59 +01:00
George Krajcsovits dc7b282d39
engine_test: adjust and comment histogram sample counts (#13841)
Histogram points are now 24 bytes bigger due to the
custom values slice.

When histograms are loaded into partial results in vector selectors
we use HPoint type where the size is calculated as
(size of histogram + 8 for timestamp)/16.
a3d1a46eda/promql/value.go (L176)

When histograms are put into Sample type in range evaluations, the
Sample has more overhead and the size is calculated differently:
(size of histogram / 16) + 1 for timestamp.
a3d1a46eda/promql/engine.go (L1928)

When the size of the histogram is 16k, then the first calculation gives k
but the second gives k+1 for the sample count.
If the histogram size is 16k+8, then both would give k+1.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-03-27 18:19:14 +01:00
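
The two size formulas quoted in the engine_test commit above can be checked with a short worked example. This is only a sketch: the function names `hPointSampleCount` and `sampleCount` are hypothetical and restate the arithmetic from the commit message, not the actual code in promql/value.go or promql/engine.go.

```go
package main

import "fmt"

// hPointSampleCount restates the first formula from the commit message:
// (size of histogram + 8 for timestamp) / 16, using integer division.
// The name is hypothetical, not a real Prometheus function.
func hPointSampleCount(histSize int) int {
	return (histSize + 8) / 16
}

// sampleCount restates the second formula:
// (size of histogram / 16) + 1 for the timestamp.
// The name is hypothetical as well.
func sampleCount(histSize int) int {
	return histSize/16 + 1
}

func main() {
	k := 10
	// Compare a histogram of exactly 16k bytes with one of 16k+8 bytes.
	for _, size := range []int{16 * k, 16*k + 8} {
		fmt.Printf("size=%d hpoint=%d sample=%d\n",
			size, hPointSampleCount(size), sampleCount(size))
	}
}
```

With k = 10, the 160-byte case yields 10 and 11, while the 168-byte case yields 11 for both, which matches the k versus k+1 behaviour described in the message.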
Bryan Boreham 2fc79839fd
Merge pull request #13851 from prometheus/krajo/pick-pr13846
Release 2.51: Cherry pick #13846 fix the bug of setting native histogram min bucket factor
2024-03-27 18:09:54 +01:00
Bryan Boreham ef7e9966d2
Merge pull request #13850 from prometheus/cherry-pick-13845
Release 2.51: Cherry-pick #13845 bugfix for DropMetricName
2024-03-27 18:06:44 +01:00
Björn Rabenstein b9a2a4e329
Merge pull request #13852 from prometheus/fix-hist-std-dev-var-negative
Fix hist std dev var negative
2024-03-27 17:58:03 +01:00
Jeanette Tan 4f2df329bd improve handling of empty buckets with infinite bounds in histogram std dev/var
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2024-03-27 17:06:12 +01:00
Jeanette Tan 22d0f4f114 improve handling of negative bounds in histogram std dev/var
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2024-03-27 17:06:12 +01:00
György Krajcsovits d64c6fe34f fix the bug of setting native histogram min bucket factor (#13846)
* fix the bug of setting native histogram min bucket factor

Signed-off-by: Ziqi Zhao <zhaoziqi9146@gmail.com>

* Add unit test for checking that min_bucket_factor is correctly applied

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

---------

Signed-off-by: Ziqi Zhao <zhaoziqi9146@gmail.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Co-authored-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-03-27 16:41:23 +01:00
Ziqi Zhao 64dfd8a158
fix the bug of setting native histogram min bucket factor (#13846)
* fix the bug of setting native histogram min bucket factor

Signed-off-by: Ziqi Zhao <zhaoziqi9146@gmail.com>

* Add unit test for checking that min_bucket_factor is correctly applied

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

---------

Signed-off-by: Ziqi Zhao <zhaoziqi9146@gmail.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Co-authored-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-03-27 16:32:37 +01:00
Domantas 3929d6500a [BUGFIX] labels: don't modify original labels in DropMetricName (#13845)
Restrict the capacity of first argument to `append()` to force an allocation.
This is for the slice implementation only.

Signed-off-by: Domantas Jadenkus <djadenkus@gmail.com>
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-03-27 15:32:03 +00:00
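
The DropMetricName fix above relies on a general property of Go slices: when the first argument to `append()` has spare capacity, the appended elements overwrite the caller's backing array. A minimal sketch of the technique follows, using plain strings rather than the real labels type. The helpers `dropShared` and `dropCopied` are hypothetical names for illustration, not the Prometheus implementation.

```go
package main

import "fmt"

// dropShared removes element i by appending the tail onto the head.
// s[:i] keeps the capacity of s, so append writes into the backing
// array of s and the caller's slice is modified.
func dropShared(s []string, i int) []string {
	return append(s[:i], s[i+1:]...)
}

// dropCopied uses the full slice expression s[:i:i] to restrict the
// capacity to i. append then has no room and must allocate a new
// backing array, leaving the caller's slice untouched.
func dropCopied(s []string, i int) []string {
	return append(s[:i:i], s[i+1:]...)
}

func main() {
	a := []string{"__name__", "job", "instance"}
	_ = dropShared(a, 0)
	fmt.Println(a) // the original has been overwritten in place

	b := []string{"__name__", "job", "instance"}
	out := dropCopied(b, 0)
	fmt.Println(b, out) // the original is intact, out is a fresh slice
}
```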
Bryan Boreham c6a42f8891
Merge pull request #13741 from bboreham/azure-test-labels
Azure Discovery tests: Add test for mapping VMs to labels
2024-03-27 11:36:17 +01:00
Domantas 435f330d0b
[BUGFIX] labels: don't modify original labels in DropMetricName (#13845)
Restrict the capacity of first argument to `append()` to force an allocation.
This is for the slice implementation only.

Signed-off-by: Domantas Jadenkus <djadenkus@gmail.com>
2024-03-27 10:35:17 +00:00
suntala 9a7c6a5cc4 Support native histogram values in template functions
Co-authored-by: Aleks Fazlieva <britishrum@users.noreply.github.com>
Signed-off-by: suntala <arati.rana@grafana.com>
2024-03-26 22:30:01 +01:00
suntala 44f385fd51 Support expansion of native histogram values in alert templates
Co-authored-by: Aleks Fazlieva <britishrum@users.noreply.github.com>
Signed-off-by: suntala <arati.rana@grafana.com>
2024-03-26 22:30:01 +01:00
Björn Rabenstein 25a8d57671
Merge pull request #13844 from aknuds1/bugfix/wlog-checkpoint-float-histogram
tsdb/wlog.Checkpoint: Handle also float histograms
2024-03-26 17:41:43 +01:00
Federico Leva 2aab70b839 Clarify batch_send_deadline docs
This is the time period covered by a batch of samples, when the
number of waiting samples is lower than max_samples_per_send.
It does not affect timeouts or retries.

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Federico Leva <federico.leva@relexsolutions.com>
2024-03-26 17:18:46 +02:00
Arve Knudsen 35aab01de0 tsdb/wlog.Checkpoint: Handle also float histograms
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-03-26 15:27:32 +01:00
Bartlomiej Plotka de05d5e11b
Merge pull request #13776 from aknuds1/arve/checkpoint-histogram-samples
tsdb/wlog.Checkpoint: Fix counting of histogram samples
2024-03-26 12:33:48 +01:00
Bartlomiej Plotka 25578f2b22
[test] Merge pull request #13790 from aknuds1/arve/retention-commit
tsdb.BeyondTimeRetention: Fix comment and test at retention duration
2024-03-26 12:26:32 +01:00
Charles Korn 5cc97a1820
[tests]: extend test scripting language to support range queries (#13825)
* Extract method to make it easier to test.

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Remove superfluous interface definition.

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Add test cases for existing instant query functionality.

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Add support for testing range queries

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Expand test coverage for instant queries and clarify error when a float is returned but a histogram is expected (or vice versa)

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Improve error message formatting

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Add test case for instant query command with invalid timestamp

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Fix linting warning.

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Remove superfluous print statement and expected result

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Fix linting warning.

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Add note about ordered range eval commands.

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Check that matrix results are always sorted by labels.

Signed-off-by: Charles Korn <charles.korn@grafana.com>

---------

Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-03-26 11:22:22 +00:00
Nick Pillitteri 481f14e1c0
TSDB: Don't rely on integer overflow in head compaction check (#13755)
* TSDB: Don't compact the head block when empty

Don't compact the Head block if there have not yet been any samples
appended.

Previously, the logic for determining if the head should be compacted
relied on the default values for min and max time and integer overflow
when they were checked in `Head.compactable()`. The check in
`Head.compactable()` effectively did `math.MinInt64 - math.MaxInt64`
which overflowed and wrapped to `1`. Since `1` is less than `1.5`
times the chunk range, compaction did not happen. This was the correct
behavior but relying on overflow wrapping is surprising.

This change adds a method for checking if the min and max time for the
head is unset and uses it to short-circuit compaction in that case.
It also replaces several explicit checks for the default value to
determine if the head has not yet had any samples added.

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
2024-03-26 12:17:38 +01:00
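
The wraparound described in the head compaction commit above can be reproduced in a few lines. This is a sketch under stated assumptions: the sentinel values follow the commit message, while `span`, `isUnset`, and `compactable` are hypothetical helpers, not the actual `Head` methods.

```go
package main

import (
	"fmt"
	"math"
)

// Sentinels assumed for a head with no samples, as the commit message
// implies: min time starts at MaxInt64 and max time at MinInt64.
const (
	unsetMin int64 = math.MaxInt64
	unsetMax int64 = math.MinInt64
)

// span computes maxT - minT. For the sentinels this overflows and,
// because Go defines signed overflow as wrapping, yields 1.
func span(minT, maxT int64) int64 {
	return maxT - minT
}

// isUnset is a hypothetical stand-in for the method the commit adds.
func isUnset(minT, maxT int64) bool {
	return minT == unsetMin && maxT == unsetMax
}

// compactable short-circuits on an empty head instead of relying on
// the wrapped value being smaller than 1.5 times the chunk range.
func compactable(minT, maxT, chunkRange int64) bool {
	if isUnset(minT, maxT) {
		return false
	}
	return span(minT, maxT) > chunkRange/2*3
}

func main() {
	const chunkRange = 2 * 3600 * 1000 // two hours in milliseconds, chosen for illustration
	fmt.Println(span(unsetMin, unsetMax))
	fmt.Println(compactable(unsetMin, unsetMax, chunkRange))
	fmt.Println(compactable(0, 2*chunkRange, chunkRange))
}
```

The explicit check gives the same answer as the wrapped subtraction for an empty head, but the intent is visible in the code.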
Bartlomiej Plotka d771cabc95
Merge pull request #13831 from prometheus/cherry-pick-13803
Release 2.51: Cherry-pick #13803 bugfix for label_join
2024-03-26 12:17:10 +01:00
Ben Ye ceca6c4716
[ENHANCEMENT] TSDB: Log more statistics during startup (#13838)
* log chunk snapshot and mmap chunks replay duration together with total replay duration

Signed-off-by: Ben Ye <benye@amazon.com>
2024-03-26 11:16:27 +00:00
Bryan Boreham 78c0fd2f4d
Merge pull request #13799 from machine424/wbl
chore(tsdb): set the wbl to nil as well in DBReadOnly.loadDataAsQuery…
2024-03-25 15:03:56 +01:00
Bryan Boreham f381ee3e55
Merge pull request #13834 from arturmelanchyk/lock-free-total
TSDB: lock-free total counter
2024-03-25 14:58:44 +01:00
Bryan Boreham 7d705ea9e8
Merge pull request #13830 from deterclosed/main
[DOCS] remove repeated word
2024-03-25 14:55:22 +01:00
Bryan Boreham b6c144ab2d
Merge pull request #13835 from sellskin/main
[STYLE] Discovery tests: remove code that will not be executed
2024-03-25 14:54:20 +01:00
Bryan Boreham 5540d34d94
Merge pull request #13461 from pracucci/upstream-fastregexmatcher
Further optimise FastRegexMatcher
2024-03-25 13:49:29 +01:00
Bryan Boreham 48786ad4e8 Use slices instead of exp/slices
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-03-25 12:20:18 +00:00
Bryan Boreham 080d440bf8 Merge remote-tracking branch 'origin/main' into pr/13461 2024-03-25 12:14:26 +00:00
sellskin ff15b17400 remove code that will not be executed
Signed-off-by: sellskin <mydesk@yeah.net>
2024-03-25 12:18:33 +08:00
Artur Melanchyk 44dcf02c69
TSDB: make total lock-free by using atomic
Signed-off-by: Artur Melanchyk <artur.melanchyk@gmail.com>
2024-03-23 19:51:29 +01:00
Bryan Boreham 773170f372
Merge pull request #13822 from dgl/promtool-test-errors
promtool: Avoid using testify for user rule tests
2024-03-23 09:42:34 +01:00
tdakkota 17e2c30754 promql: validate label_join destination label
Signed-off-by: tdakkota <tanc13@yandex.ru>
2024-03-23 09:05:39 +01:00
Bryan Boreham bb62e3f808
Merge pull request #13803 from tdakkota/fix/validate-label-join-dest-label
promql: validate `label_join` destination label
2024-03-23 09:02:11 +01:00
deterclosed fab6298550 chore: remove repetitive word
Signed-off-by: deterclosed <fliter@outlook.com>
2024-03-23 13:39:27 +08:00
György Krajcsovits a3d1a46eda Merge branch 'main' into nhcb 2024-03-22 14:51:48 +01:00
zenador 4acbb7dea6
Add custom buckets to native histogram chunks encoding (#13706)
* add custom bounds to chunks encoding
* change custom buckets schema number
* rename custom bounds to custom values

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2024-03-22 14:36:39 +01:00
Arve Knudsen bfaa0a319c
Merge pull request #13826 from aknuds1/arve/fix-mock-querier-typo
storage: Fix mockChunkQuerier type name
2024-03-22 13:03:21 +01:00
Julien 2ef07b850c
Merge pull request #13814 from prometheus/dependabot/go_modules/k8s-io-dcd36d7e14
build(deps): bump the k8s-io group with 3 updates
2024-03-22 12:18:07 +01:00
Julien 2f57a25433
Merge pull request #13815 from prometheus/dependabot/go_modules/github.com/linode/linodego-1.30.0
build(deps): bump github.com/linode/linodego from 1.29.0 to 1.30.0
2024-03-22 12:17:24 +01:00
Julien b76ccb735f
Merge pull request #13823 from prymitive/num_dropped
Use consistent keys for logs
2024-03-22 12:09:38 +01:00
Arve Knudsen d8e4230696 storage: Fix mockChunkQuerier type name
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-03-22 07:44:38 +01:00
Ben Kochie 69ce3780d7
Merge pull request #13807 from prometheus/dependabot/github_actions/scripts/actions/checkout-4.1.2
build(deps): bump actions/checkout from 4.1.1 to 4.1.2 in /scripts
2024-03-22 07:30:53 +01:00
Ben Kochie f96b794835
Merge pull request #13808 from prometheus/dependabot/github_actions/golangci/golangci-lint-action-4.0.0
build(deps): bump golangci/golangci-lint-action from 3.7.0 to 4.0.0
2024-03-22 07:29:58 +01:00
Ben Kochie 5dc3c1dc9b
Merge pull request #13809 from prometheus/dependabot/github_actions/bufbuild/buf-setup-action-1.30.0
build(deps): bump bufbuild/buf-setup-action from 1.28.1 to 1.30.0
2024-03-22 07:29:06 +01:00
Jeanette Tan 9d32754bc0 add unit tests with all negative values for histogram_stddev and var
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2024-03-22 03:42:50 +08:00
Łukasz Mierzwa 3bb27c33e9 Use consistent keys for logs
Rule warnings are logged with numDropped=N while every other component uses num_dropped=N:

```
notifier/notifier.go:		level.Warn(n.logger).Log("msg", "Alert batch larger than queue capacity, dropping alerts", "num_dropped", d)
notifier/notifier.go:		level.Warn(n.logger).Log("msg", "Alert notification queue full, dropping alerts", "num_dropped", d)
storage/remote/write_handler.go:		_ = level.Warn(h.logger).Log("msg", "Error on ingesting out-of-order exemplars", "num_dropped", outOfOrderExemplarErrs)
rules/group.go:				level.Warn(logger).Log("msg", "Error on ingesting out-of-order result from rule evaluation", "num_dropped", numOutOfOrder)
rules/group.go:				level.Warn(logger).Log("msg", "Error on ingesting too old result from rule evaluation", "num_dropped", numTooOld)
rules/group.go:				level.Warn(logger).Log("msg", "Error on ingesting results from rule evaluation with different value but same timestamp", "num_dropped", numDuplicates)
scrape/scrape.go:		level.Warn(sl.l).Log("msg", "Error on ingesting out-of-order samples", "num_dropped", appErrs.numOutOfOrder)
scrape/scrape.go:		level.Warn(sl.l).Log("msg", "Error on ingesting samples with different value but same timestamp", "num_dropped", appErrs.numDuplicates)
scrape/scrape.go:		level.Warn(sl.l).Log("msg", "Error on ingesting samples that are too old or are too far into the future", "num_dropped", appErrs.numOutOfBounds)
scrape/scrape.go:		level.Warn(sl.l).Log("msg", "Error on ingesting out-of-order exemplars", "num_dropped", appErrs.numExemplarOutOfOrder)
```

Rename numDropped to num_dropped for consistency.

Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
2024-03-21 15:59:20 +00:00
David Leadbeater 7ec4a11472 promtool: Avoid using testify for user rule tests
Using testify outside of unit tests results in panics rather than a
useful error for the user.

Fixes #13703

Signed-off-by: David Leadbeater <dgl@dgl.cx>
2024-03-21 22:08:10 +11:00
dependabot[bot] 191c467f16
build(deps): bump github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/compute/armcompute/v5
Bumps [github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/compute/armcompute/v5](https://github.com/Azure/azure-sdk-for-go) from 5.5.0 to 5.6.0.
- [Release notes](https://github.com/Azure/azure-sdk-for-go/releases)
- [Changelog](https://github.com/Azure/azure-sdk-for-go/blob/main/documentation/release.md)
- [Commits](https://github.com/Azure/azure-sdk-for-go/compare/sdk/resourcemanager/compute/armcompute/v5.5.0...sdk/resourcemanager/compute/armcompute/v5.6.0)

---
updated-dependencies:
- dependency-name: github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/compute/armcompute/v5
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-03-21 08:15:52 +00:00