Commit graph

12959 commits

Author SHA1 Message Date
Julien Pivotto 53091126c2 Chunked remote read: close the querier earlier
I have seen Prometheus instances misbehaving because of broken chunked remote
read requests.

In order to avoid OOMs when this happens, I propose to close the
queriers used by the streamed remote read requests earlier.

Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
2024-03-15 14:03:16 +01:00
Arve Knudsen cef1025ea8 tsdb/wlog.Checkpoint: Fix counting of histogram samples
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-03-15 10:23:59 +01:00
Bryan Boreham d45b5deb75 TSDB: move function only used in tests
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-03-15 08:54:47 +00:00
Bryan Boreham 3274cac0d3 TSDB: remove unused function
Was only used in old WAL implementation.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-03-15 08:51:57 +00:00
Goutham Veeramachaneni 15809223c7
Merge pull request #13759 from jcajka/main
otlptranslator: fix up import paths
2024-03-15 09:40:26 +01:00
Charles Korn dca47ce2c9
Only run on prometheus/prometheus repo
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-03-15 11:44:35 +11:00
Björn Rabenstein 0f70fc7687
Merge pull request #13772 from machine424/signals
adjust signal termination warning log
2024-03-14 19:34:37 +01:00
machine424 3eed6c760a
adjust signal termination warning log
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2024-03-14 18:45:45 +01:00
Ben Kochie 537ce87d6b
Merge pull request #13770 from prometheus/superq/sync_description
Add container_description.yml to repo sync
2024-03-14 12:11:42 +01:00
Arve Knudsen 1de49d5b69
Remove unused function tsdb/chunks.PopulatedChunk (#13763)
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-03-14 11:15:17 +01:00
SuperQ a0fbc75f34
Add container_description.yml to repo sync
Add the container_description.yml workflow to the repo file sync script.
* Skip sync if there is no Dockerfile.
* Fixup minor typo in container_description.yml.

Signed-off-by: SuperQ <superq@gmail.com>
2024-03-14 09:20:40 +01:00
Ben Kochie f252d8a9d1
Merge pull request #13761 from prometheus/superq/fixup_workflow
Fix container_description workflow
2024-03-14 08:42:15 +01:00
Bryan Boreham 87edf1f960 [Cleanup] TSDB: Remove old deprecated WAL implementation
Deprecated since 2018.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-03-13 15:57:23 +00:00
SuperQ 46401b988e
Normalize and fixup step names.
Signed-off-by: SuperQ <superq@gmail.com>
2024-03-13 15:56:37 +01:00
Jakub Čajka 505fd638be
otlptranslator: fix up import paths
Signed-off-by: Jakub Čajka <jcajka@redhat.com>
2024-03-13 15:56:14 +01:00
Bartlomiej Plotka 312e3fd728
Merge pull request #13713 from charleskorn/query-engine-interface
rules: allow using alternative PromQL engines for rule evaluation by callers using Prometheus as a lib.
2024-03-13 14:45:42 +01:00
SuperQ 3bff79451d
Fix container_description workflow
Fix yaml indentation. 🤦

Signed-off-by: SuperQ <superq@gmail.com>
2024-03-13 14:28:05 +01:00
Björn Rabenstein f1e9ec29f8
Merge pull request #13752 from prometheus/superq/publish_docker_readme
Add GitHub action to publish container README
2024-03-13 12:57:22 +01:00
dependabot[bot] cd3e0078f0
build(deps): bump github.com/prometheus/common (#13728)
Bumps [github.com/prometheus/common](https://github.com/prometheus/common) from 0.49.0 to 0.50.0.
- [Release notes](https://github.com/prometheus/common/releases)
- [Commits](https://github.com/prometheus/common/compare/v0.49.0...v0.50.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/common
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-03-12 20:07:03 +01:00
Björn Rabenstein af3618fd35
Merge pull request #13667 from prometheus/beorn7/promql
Improve TestQueryStatistics and fix bugs exposed by it
2024-03-12 16:17:11 +01:00
Bryan Boreham ab9c544ec7 Azure Discovery tests: Add test for VMToLabelSet
Test fails due to bug in code on main.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-03-12 15:04:09 +00:00
Bryan Boreham 5f2c0c5283 Azure Discovery tests: mock the azure client interface
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-03-12 15:04:09 +00:00
Bryan Boreham 4e24e5b1d1 Refactor: Azure Discovery: introduce an interface for the client
So we can mock it.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-03-12 15:04:09 +00:00
Bryan Boreham b8d428b753 Refactor: Azure Discovery: extract function to generate labelSet
This should make it easier to test.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-03-12 15:04:04 +00:00
SuperQ 2061eb0a6a
Add GitHub action to publish container README
Add a GitHub action to publish the README.md to Docker Hub and Quay.io.

Fixes: https://github.com/prometheus/prometheus/issues/5348

Signed-off-by: SuperQ <superq@gmail.com>
2024-03-12 14:18:52 +01:00
Bryan Boreham 0bb5588386
labels: optimize String method (#13673)
Use a stack buffer to reduce memory allocations.

`Write(AppendQuote(AvailableBuffer` does not allocate or copy when
the buffer has sufficient space.

Also add a benchmark, with some refactoring.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-03-12 11:34:03 +00:00
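
A minimal sketch of the `Write(AppendQuote(AvailableBuffer` idiom mentioned above, assuming Go 1.21 or later (where `bytes.Buffer.AvailableBuffer` was added). The `label` type, the `formatLabels` function, and the 256-byte buffer size are illustrative assumptions, not the actual `labels` package code.

```go
package main

import (
	"bytes"
	"fmt"
	"strconv"
)

// label is a hypothetical name/value pair used only for this sketch.
type label struct {
	name, value string
}

// formatLabels renders labels as {name="value", ...}.
func formatLabels(ls []label) string {
	// Fixed-size backing array; whether it really stays on the stack is
	// decided by the compiler's escape analysis.
	var backing [256]byte
	b := bytes.NewBuffer(backing[:0])

	b.WriteByte('{')
	for i, l := range ls {
		if i > 0 {
			b.WriteString(", ")
		}
		b.WriteString(l.name)
		b.WriteByte('=')
		// AvailableBuffer returns the unused capacity as an empty slice.
		// AppendQuote writes the quoted value into it, and Write then
		// accepts those bytes without an extra copy when they fit.
		b.Write(strconv.AppendQuote(b.AvailableBuffer(), l.value))
	}
	b.WriteByte('}')
	return b.String()
}

func main() {
	fmt.Println(formatLabels([]label{
		{name: "job", value: "api"},
		{name: "path", value: `a"b`},
	}))
}
```

If the quoted value does not fit into the remaining capacity, `AppendQuote` falls back to allocating a new slice and `Write` copies it, so the result stays correct and only the fast path is lost.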
Bryan Boreham d08f054950
[ENHANCEMENT] TSDB: Check CRC without allocating (#13742)
Use the existing utility function which does this.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-03-12 12:24:27 +01:00
Charles Korn 26262a1eb7
Remove unnecessary SetQueryLogger method on QueryEngine interface
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-03-12 22:01:01 +11:00
Julien ca57bd6352
Merge pull request #13743 from carrychair/main
fix function and struct names in comments
2024-03-12 11:50:25 +01:00
Julien 106d98b986
Merge pull request #13751 from testwill/http_statuscode
chore: use constant instead of numeric literal
2024-03-12 11:41:34 +01:00
guoguangwu 1cccdbaedb chore: use constant instead of numeric literal
Signed-off-by: guoguangwu <guoguangwug@gmail.com>
2024-03-12 17:19:50 +08:00
Bryan Boreham 8d53e7ba90
Cut v2.51.0-rc.0 (#13729)
* Cherry-pick [BUGFIX] Azure SD: Fix 'error: parameter virtualMachineScaleSetName cannot be empty' (#13702)

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-03-11 13:35:44 +00:00
Julien b4d4dcd9f6
Merge pull request #13739 from bboreham/no-race-prev-go
CI: don't run race-detector on tests with previous Go version
2024-03-11 13:55:19 +01:00
carrychair 856f6e49c8 fix function and struct name
Signed-off-by: carrychair <linghuchong404@gmail.com>
2024-03-09 17:53:17 +08:00
michaelact eea6ab1cdd
[BUGFIX] Azure SD: Fix 'error: parameter virtualMachineScaleSetName cannot be empty' (#13702)
Erroneous code was introduced during a merge-back-to-main at #13399.

Signed-off-by: michaelact <86778470+michaelact@users.noreply.github.com>
2024-03-08 15:19:39 +00:00
Bryan Boreham e8bf2ce4e1
Merge pull request #13735 from bboreham/fix-notifier-relabel
[BUGFIX] Alerts: don't reuse payload after relabeling.
2024-03-08 12:29:36 +00:00
Bryan Boreham 54f50e1498
Merge pull request #13737 from bboreham/fix-scrape-tolerance
[BUGFIX] Scraping: Tolerance should be max 1% of interval
2024-03-08 12:28:04 +00:00
Bryan Boreham e54082a621 CI: don't run race-detector on tests with previous Go version
The purpose of running with a previous Go version is to spot usage of
new language features; we don't need to intensively look for bugs.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-03-08 10:38:10 +00:00
Bryan Boreham 6c41ec984f [BUGFIX] Scraping: Tolerance should be max 1% of interval
Previous code set it at minimum 1%, which was not intended.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-03-08 10:18:18 +00:00
Bryan Boreham 8c4e4b72a8 Notifier: pass parameters to goroutine explicitly
Avoids possible false sharing between loops.

Plausibly there is no problem in the current code, but it's easy enough to write it more safely.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-03-08 09:20:36 +00:00
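
As a generic illustration of passing parameters to a goroutine explicitly (not the notifier code itself; `sendAll` and its arguments are hypothetical), each goroutine below receives its own copy of the loop value as a function argument instead of capturing the loop variable from the enclosing scope:

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// sendAll starts one goroutine per target and collects a message from each.
func sendAll(targets []string) []string {
	var (
		mu   sync.Mutex
		wg   sync.WaitGroup
		sent []string
	)
	for _, target := range targets {
		wg.Add(1)
		// target is passed as an argument, so the goroutine works on its
		// own copy no matter how the loop variable is scoped.
		go func(target string) {
			defer wg.Done()
			mu.Lock()
			defer mu.Unlock()
			sent = append(sent, "sent to "+target)
		}(target)
	}
	wg.Wait()
	sort.Strings(sent) // goroutines finish in arbitrary order
	return sent
}

func main() {
	fmt.Println(sendAll([]string{"am-1", "am-2", "am-3"}))
}
```

Before Go 1.22 a closure capturing the loop variable directly could observe a later iteration's value; passing it explicitly avoids having to reason about that.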
Bryan Boreham 57c799132b Notifier: don't reuse payload after relabeling
Also clarify why these variables are being cleared.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-03-08 09:16:43 +00:00
Björn Rabenstein 9acae57937
Merge pull request #13681 from krajorama/native-latency-histograms
Add native histograms to latency/duration metrics
2024-03-07 20:46:43 +01:00
Bryan Boreham 3d16d39881
Merge pull request #13716 from prometheus/update-go-deps
Update Go dependencies for v2.51
2024-03-07 12:05:58 +00:00
Björn Rabenstein b0c0961f9d
Merge pull request #13725 from prometheus/beorn7/promql2
promql: Fix limiting of extrapolation to negative values
2024-03-07 12:36:17 +01:00
Bryan Boreham 28f8a346bc Wind back gophercloud to 1.8.0
It triggers a failure in TestOpenstackSDInstanceRefresh.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-03-07 10:13:59 +00:00
Bryan Boreham 34a655716e Back off google.golang.org/protobuf to v1.32.0
Otherwise we get a compilation error.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-03-07 10:13:59 +00:00
Bryan Boreham 3c2c9ac067 Update Go dependencies for v2.51
Simply ran `make update-all-go-deps`.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-03-07 10:13:59 +00:00
Bryan Boreham e705d83e37
Merge pull request #13721 from bboreham/no-codeql-go
CI: stop running codeql for Go
2024-03-07 10:09:53 +00:00
Bryan Boreham b965aac82d
Merge pull request #13722 from bboreham/split-go-tests
CI: split Go tests into two parts, to run in parallel
2024-03-07 10:09:13 +00:00
beorn7 7f912db15a promql: Fix limiting of extrapolation to negative values
This is a bit tough to explain, but I'll try:

`rate` & friends have a sophisticated extrapolation algorithm.
Usually, we extrapolate the result to the total interval specified in
the range selector. However, if the first sample within the range is
too far away from the beginning of the interval, or if the last sample
within the range is too far away from the end of the interval, we
assume the series has just started half a sampling interval before the
first sample or after the last sample, respectively, and shorten the
extrapolation interval correspondingly. We calculate the sampling
interval by looking at the average time between samples within the
range, and we define "too far away" as "more than 110% of that
sampling interval".

However, if this algorithm leads to an extrapolated starting value
that is negative, we limit the start of the extrapolation interval to
the point where the extrapolated starting value is zero.

At least that was the intention.

What we actually implemented is the following: If extrapolating all
the way to the beginning of the total interval would lead to an
extrapolated negative value, we would only extrapolate to the zero
point as above, even if the algorithm above would have selected a
starting point that is just half a sampling interval before the first
sample and that starting point would not have an extrapolated negative
value. In other words: What was meant as a _limitation_ of the
extrapolation interval yielded a _longer_ extrapolation interval in
this case.

There is an exception to the case just described: If the increase of
the extrapolation interval is more than 110% of the sampling interval,
we suddenly drop back to only extrapolate to half a sampling interval.

This behavior can be nicely seen in the testcounter_zero_cutoff test,
where the rate goes up all the way to 0.7 and then jumps back to 0.6.

This commit changes the behavior to what was (presumably) intended
from the beginning: The extension of the extrapolation interval is
only limited if actually needed to prevent extrapolation to negative
values, but the "limitation" never leads to _more_ extrapolation
anymore.

The difference is subtle, and probably it never bothered anyone.
However, if you calculate the rate of a classic histogram, the old
behavior might create non-monotonic histograms as a result (because of
the jumps you can see nicely in the old version of the
testcounter_zero_cutoff test). With this fix, that doesn't happen
anymore.

Signed-off-by: beorn7 <beorn@grafana.com>
2024-03-07 01:20:33 +01:00
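
The intended behavior described in the commit message above can be sketched as follows. This is a simplified illustration and not the PromQL engine code: the function `startExtension` and all parameter names are hypothetical, and all durations are in seconds.

```go
package main

import "fmt"

// startExtension returns how far before the first sample the extrapolation
// should reach.
//
//	durationToStart: gap between the start of the range and the first sample
//	sampledInterval: time between the first and last sample in the range
//	avgInterval:     average time between samples in the range
//	firstValue:      value of the first sample
//	increase:        increase of the counter over the sampled interval
func startExtension(durationToStart, sampledInterval, avgInterval, firstValue, increase float64) float64 {
	// "Too far away" means more than 110% of the sampling interval. In that
	// case assume the series started half a sampling interval earlier.
	if durationToStart > avgInterval*1.1 {
		durationToStart = avgInterval / 2
	}
	// A counter cannot be negative. Only if the extrapolated zero point is
	// closer than the start chosen above, shorten the extension to it. The
	// zero point never lengthens the extension.
	if increase > 0 && firstValue >= 0 {
		durationToZero := sampledInterval * (firstValue / increase)
		if durationToZero < durationToStart {
			durationToStart = durationToZero
		}
	}
	return durationToStart
}

func main() {
	// Samples every 10s over a 40s sampled interval, counter increased by 4.
	// The first sample is 30s after the start of the range (too far away).
	fmt.Println(startExtension(30, 40, 10, 1, 4))    // zero point lies further back than half an interval
	fmt.Println(startExtension(30, 40, 10, 0.25, 4)) // zero point lies closer than half an interval
}
```

In the first call the zero point lies 10s before the first sample, which is further away than the half-interval extension, so it is ignored. Under the old behavior described in the message, the extension would have grown to reach that zero point instead.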