Commit graph

1478 commits

Author SHA1 Message Date
Martin Chodur 00b110c65c
Fix data corruption in remote write if max_sample_age is applied (#14078)
* fix: try to reproduce the bug from https://github.com/prometheus/prometheus/issues/13979 in a test case

Signed-off-by: David Vavra <sevenood@gmail.com>

* fix: data corruption in remote write if max_sample_age is applied

Signed-off-by: David Vavra <sevenood@gmail.com>

* add benchmark for buildTimeSeries which does the filtering

Signed-off-by: Callum Styan <callumstyan@gmail.com>

---------

Signed-off-by: David Vavra <sevenood@gmail.com>
Signed-off-by: Callum Styan <callumstyan@gmail.com>
Co-authored-by: David Vavra <sevenood@gmail.com>
Co-authored-by: Callum Styan <callumstyan@gmail.com>
2024-06-21 14:19:58 -07:00
Piotr d78253319d
queue_manager: add histogram info to error logs (#14326)
Signed-off-by: Piotr Gwizdala <17101802+thampiotr@users.noreply.github.com>
2024-06-20 16:45:13 -07:00
🌲 Harry 🌊 John 🏔 d5f6887294 Pass limit param as hint to storage.Querier
Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>
2024-06-20 09:47:38 -07:00
Marco Pracucci 35564c0cb0
Export remote.LabelsToLabelsProto() and remote.LabelProtosToLabels()
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2024-06-19 17:30:49 +02:00
Marco Pracucci 0fbf4a2529
Export remote.ToLabelMatchers()
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2024-06-17 10:40:45 +02:00
Bryan Boreham 42b546a43d
tsdb: add details to duplicate sample error (#13277)
Now the error will include the timestamp and the existing and new values.
When you are trying to track down the source of this error, it can be
useful to see that the values are close, or alternating, or something
else.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-06-04 08:54:09 +01:00
Jayapriya Pai 2d2b440304
fix: correct the typo in azuread sdk auth (#14106)
Signed-off-by: Jayapriya Pai <janantha@redhat.com>
2024-05-21 19:08:35 +02:00
Oleksandr Redko f10c3454e9 Enable perfsprint linter and fix up code
Signed-off-by: Oleksandr Redko <oleksandr.red+github@gmail.com>
2024-05-15 17:51:05 +03:00
Anthony Mirabella 3b8b57700c
otlp: Remove OTel feature gate registration from copied translation package (#13932)
Signed-off-by: Anthony J Mirabella <a9@aneurysm9.com>
Signed-off-by: Jesus Vazquez <jesusvzpg@gmail.com>
2024-05-10 10:41:21 +02:00
Bryan Boreham 3fd24d1cd7
Merge pull request #13999 from bboreham/extract-promqltest
[Test] Extract most PromQL test code into separate packages
2024-05-09 13:23:11 +01:00
Arve Knudsen d699dc3c77
Fix language in docs and comments (#14041)
Fix language in docs and comments

---------

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Björn Rabenstein <github@rabenste.in>
2024-05-08 17:57:09 +02:00
Bryan Boreham 8fd96241ab test: add promqltest package references
To packages outside of promql.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-05-08 16:08:04 +01:00
Arve Knudsen fc34570b06 prometheusremotewrite: Move TimeSeries method to timeseries.go
To facilitate generating OTel translation code for other Prometheus
compatible backends, modify the prometheusremotewrite sources slightly
so that the PrometheusConverter.TimeSeries method is in a file called
timeseries.go. The rationale is to allow other backends to define their
own implementation of this method.

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-05-01 13:02:10 +02:00
Arve Knudsen 9189507569 prometheusremotewrite: Add PrometheusConverter.FromMetrics benchmark
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-04-30 13:13:37 +02:00
Arve Knudsen 99f3051f45 OTLP: Use PrometheusConverter directly
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-04-30 13:10:27 +02:00
Arve Knudsen 759ca8b207
Merge branch 'main' into refactor/add_max_func_to_maxTimestamp
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-04-30 11:50:37 +02:00
Jesus Vazquez 7554384dac
otlp: Prometheus to own its own copy of the otlptranslator package (#13991)
After a lot of productive discussion between the Prometheus and
OpenTelemetry community we decided that it made sense for Prometheus to
own its own copy of the code in charge for handling OTLP ingestion
traffic.

This commit is removing the README and update-copy.sh files that had the
previous steps to update the code.

Also it is updating the licensing of all the files to make sure the
OpenTelemetry provenance is explicit and to state the new ownership.

Signed-off-by: Jesus Vazquez <jesusvzpg@gmail.com>
Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-04-30 11:29:52 +02:00
komisan19 b974a99279 fix
Signed-off-by: komisan19 <18901496+komisan19@users.noreply.github.com>
2024-04-30 10:45:50 +09:00
komisan19 3d84d4d6dc fix
Signed-off-by: komisan19 <18901496+komisan19@users.noreply.github.com>
2024-04-22 19:04:00 +09:00
komisan19 5ab24a06d0 refactor: add max func to maxTimestamp
Signed-off-by: komisan19 <18901496+komisan19@users.noreply.github.com>
2024-04-21 23:39:25 +09:00
Matthieu MOREL 6f595c6762
golangci-lint: enable whitespace linter (#13905)
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2024-04-11 09:27:54 +01:00
Matthieu MOREL d496687c8e golangci-lint: enable usestdlibvars linter
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2024-04-08 19:26:23 +00:00
carehabit a672662073
all: fix some typos (#13863)
Signed-off-by: carehabit <shenyuting@outlook.com>
2024-04-01 18:06:05 +02:00
Arve Knudsen d8e4230696 storage: Fix mockChunkQuerier type name
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-03-22 07:44:38 +01:00
Jan-Otto Kröpke 302e151de8
{discovery,remote_write}/azure: Support default SDK authentication (#13099)
* discovery/azure: Offer default SDK authentication

Signed-off-by: Jan-Otto Kröpke <mail@jkroepke.de>
2024-03-16 11:06:57 +00:00
Julien d1abc3f255
Merge pull request #13777 from roidelapluie/remoteread2
Chunked remote read: close the querier earlier
2024-03-15 14:42:30 +01:00
Julien Pivotto 53091126c2 Chunked remote read: close the querier earlier
I have seen prometheis instances misebehaving because of broken chinked remote
read requests.

In order to avoid OOM's when this happens, I propose to close the
queries used by the streamed remote read requests earlier.

Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
2024-03-15 14:03:16 +01:00
Jakub Čajka 505fd638be
otlptranslator: fix up import paths
Signed-off-by: Jakub Čajka <jcajka@redhat.com>
2024-03-13 15:56:14 +01:00
Björn Rabenstein 9acae57937
Merge pull request #13681 from krajorama/native-latency-histograms
Add native histograms to latency/duration metrics
2024-03-07 20:46:43 +01:00
beorn7 7df0b9b92d storage: simplify sampleRing fix
Signed-off-by: beorn7 <beorn@grafana.com>
2024-03-05 15:41:18 +01:00
Yuri Nikolic d5ab1851dc SampleRingIterator: add currType field
Signed-off-by: Yuri Nikolic <durica.nikolic@grafana.com>
2024-03-01 14:59:19 +01:00
György Krajcsovits 4d4d822c36 Add native histograms to latency/duration metrics
Dogfood native histograms.
Allow dependent projects to migrate to native histograms.

I took the defaults from client_golang.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-03-01 14:44:38 +01:00
Robert Fratto a09465baee
storage/remote: disable resharding during active retry backoffs (#13562)
* storage/remote: disable resharding during active retry backoffs

Today, remote_write reshards based on pure throughput. This is
problematic if throughput has been diminished because of HTTP 429s;
increasing the number of shards due to backpressure will only exacerbate
the problem.

This commit disables resharding for twice the retry backoff, ensuring
that resharding will never occur during an active backoff, and that
resharding does not become enabled again until enough time has elapsed
to allow any pending requests to be retried.

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* storage/remote: test that resharding is disabled on retry

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* storage/remote: address review feedback

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* storage/remote: track time where resharding initially got disabled

This change introduces a second atomic int64 to roughly track when
resharding got disabled. This int64 is only updated after updating the
disabled timestamp if resharding was previously enabled.

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

---------

Signed-off-by: Robert Fratto <robertfratto@gmail.com>
2024-02-28 14:28:39 -08:00
machine424 f477e0539a
Move from golang.org/x/exp/slices into slices now that we only support Go >= 1.21
Prevent adding back golang.org/x/exp/slices.

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2024-02-28 14:54:53 +01:00
Bryan Boreham 2ac1632eec storage/remote: improve symbol-table handling
On the incoming path, `writeHandler.write()` creates a new table for
each request.

`labelProtosToLabels` takes a `ScratchBuilder` now.

Call `NewScratchBuilder` as required in tests.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-02-23 13:50:27 +00:00
Bryan Boreham 8f525b4ba4 storage/remote tests: refactor: extract function newTestQueueManager
To reduce repetition.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-02-23 13:50:27 +00:00
Arve Knudsen bf5ca8cf38 otlptranslator: Upgrade to v0.95.0
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-02-22 09:12:07 +01:00
Bryan Boreham aba0071480
Merge pull request #13589 from bboreham/trace_id
Standardise exemplar label as "trace_id"
2024-02-19 09:34:04 +00:00
Owen Williams a28d7865ad UTF-8: Add support for parsing UTF8 metric and label names
This adds support for the new grammar of `{"metric_name", "l1"="val"}` to promql and some of the exposition formats.
This grammar will also be valid for non-UTF-8 names.
UTF-8 names will not be considered valid unless model.NameValidationScheme is changed.

This does not update the go expfmt parser in text_parse.go, which will be addressed by https://github.com/prometheus/common/issues/554/.

Part of https://github.com/prometheus/prometheus/issues/13095

Signed-off-by: Owen Williams <owen.williams@grafana.com>
2024-02-15 14:34:37 -05:00
Bryan Boreham c0e36e6bb3 Standardise exemplar label as "trace_id"
This is consistent with the OpenTelemetry standard, and an example in OpenMetrics.

https://github.com/open-telemetry/opentelemetry-specification/blob/89aa01348139/specification/metrics/data-model.md#exemplars
https://github.com/OpenObservability/OpenMetrics/blob/138654493130/specification/OpenMetrics.md#exemplars-1

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-02-15 14:20:08 +00:00
Bryan Boreham 17f48f2b3b Tests: use replacement DeepEquals in more places
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-02-08 19:32:33 +00:00
Bryan Boreham 8655fe5401
Merge pull request #13491 from bboreham/faster-store-series
storage/remote: speed up StoreSeries by re-using labels.Builder
2024-02-06 17:16:32 +01:00
Bryan Boreham 41f3eeb048
Merge pull request #13497 from captncraig/cmp_signedheaders
storage/remote: apply custom headers before sigv4 transport
2024-02-04 14:46:14 +01:00
Bryan Boreham fbca054af6 storage: don't wrap single querier in merge-queriers
If given a single querier, just return it instead of constructing a
complicated wrapper. The code in `mergeGenericQuerier` which skipped
merging when there was only one is not needed any more.

This change required a few tests to be tweaked, because they relied on
the specific behaviour of `mergeGenericQuerier.Select()`.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-01-31 12:14:22 +00:00
Bryan Boreham b9fdf3dad1 storage/remote: document why two benchmarks are skipped
One was silently doing nothing; one was doing something but the work
didn't go up linearly with iteration count.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-01-30 16:48:04 +00:00
Craig Peterson 5b5230deb7 remote client: apply custom headers before sigv4 transport
Signed-off-by: Craig Peterson <192540+captncraig@users.noreply.github.com>
2024-01-30 09:27:00 -05:00
Bryan Boreham dcd024a095 storage/remote: speed up StoreSeries by re-using labels.Builder
Relabeling can take a pre-populated `Builder` instead of making a new
one every time. This is much more efficient.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-01-29 18:49:55 +00:00
Bryan Boreham d9483bb77c storage/remote: add BenchmarkStoreSeries
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-01-29 16:54:12 +00:00
Marco Pracucci 501bc6419e
Add ShardedPostings() support to TSDB (#10421)
This PR is a reference implementation of the proposal described in #10420.

In addition to what described in #10420, in this PR I've introduced labels.StableHash(). The idea is to offer an hashing function which doesn't change over time, and that's used by query sharding in order to get a stable behaviour over time. The implementation of labels.StableHash() is the hashing function used by Prometheus before stringlabels, and what's used by Grafana Mimir for query sharding (because built before stringlabels was a thing).

Follow up work
As mentioned in #10420, if this PR is accepted I'm also open to upload another foundamental piece used by Grafana Mimir query sharding to accelerate the query execution: an optional, configurable and fast in-memory cache for the series hashes.

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2024-01-29 11:57:27 +00:00
Goutham Veeramachaneni fb552d290d
Merge pull request #13464 from aknuds1/arve/fix-update-copy
otlptranslator/update-copy.sh: Fix sed command lines
2024-01-26 17:52:49 +05:30
Arve Knudsen de28494434 Make update-copy.sh work for both OSX and GNU sed
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-01-25 15:11:03 +01:00
Arve Knudsen 660df3488d otlptranslator/update-copy.sh: Fix sed command lines
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-01-25 13:56:04 +01:00
Filip Petkovski 583f3e587c
Optimize histogram iterators (#13340)
Optimize histogram iterators

Histogram iterators allocate new objects in the AtHistogram and
AtFloatHistogram methods, which makes calculating rates over long
ranges expensive.

In #13215 we allowed an existing object to be reused
when converting an integer histogram to a float histogram. This commit follows
the same idea and allows injecting an existing object in the AtHistogram and
AtFloatHistogram methods. When the injected value is nil, iterators allocate
new histograms, otherwise they populate and return the injected object.

The commit also adds a CopyTo method to Histogram and FloatHistogram which
is used in the BufferedIterator to overwrite items in the ring instead of making
new copies.

Note that a specialized HPoint pool is needed for all of this to work 
(`matrixSelectorHPool`).

---------

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
2024-01-23 17:02:14 +01:00
Bryan Boreham 63cdd6dbe1 storage: skip merging when no remote storage configured
Prometheus is hard-coded to use a fanout storage between TSDB and
a remote storage which by default is empty.
This change detects the empty storage and skips merging between
result sets, which would make `Select()` sort results.

Bottom line: we skip a sort unless there really is some remote storage
configured.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-01-18 17:50:06 +00:00
Goutham aee6896c47
Minor fixes to otlp vendor update script
Signed-off-by: Goutham <gouthamve@gmail.com>
2024-01-18 15:32:06 +05:30
Marc Tudurí 78c5ce3196
Drop old inmemory samples (#13002)
* Drop old inmemory samples

Co-authored-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>
Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>
Signed-off-by: Marc Tuduri <marctc@protonmail.com>

* Avoid copying timeseries when the feature is disabled

Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>
Signed-off-by: Marc Tuduri <marctc@protonmail.com>

* Run gofmt

Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>
Signed-off-by: Marc Tuduri <marctc@protonmail.com>

* Clarify docs

Signed-off-by: Marc Tuduri <marctc@protonmail.com>

* Add more logging info

Signed-off-by: Marc Tuduri <marctc@protonmail.com>

* Remove loggers

Signed-off-by: Marc Tuduri <marctc@protonmail.com>

* optimize function and add tests

Signed-off-by: Marc Tuduri <marctc@protonmail.com>

* Simplify filter

Signed-off-by: Marc Tuduri <marctc@protonmail.com>

* rename var

Signed-off-by: Marc Tuduri <marctc@protonmail.com>

* Update help info from metrics

Signed-off-by: Marc Tuduri <marctc@protonmail.com>

* use metrics to keep track of drop elements during buildWriteRequest

Signed-off-by: Marc Tuduri <marctc@protonmail.com>

* rename var in tests

Signed-off-by: Marc Tuduri <marctc@protonmail.com>

* pass time.Now as parameter

Signed-off-by: Marc Tuduri <marctc@protonmail.com>

* Change buildwriterequest during retries

Signed-off-by: Marc Tuduri <marctc@protonmail.com>

* Revert "Remove loggers"

This reverts commit 54f91dfcae20488944162335ab4ad8be459df1ab.

Signed-off-by: Marc Tuduri <marctc@protonmail.com>

* use log level debug for loggers

Signed-off-by: Marc Tuduri <marctc@protonmail.com>

* Fix linter

Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>

* Remove noisy debug-level logs; add 'reason' label to drop metrics

Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>

* Remove accidentally committed files

Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>

* Propagate logger to buildWriteRequest to log dropped data

Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>

* Fix docs comment

Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>

* Make drop reason more specific

Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>

* Remove unnecessary pass of logger

Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>

* Use snake_case for reason label

Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>

* Fix dropped samples metric

Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>

---------

Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>
Signed-off-by: Marc Tuduri <marctc@protonmail.com>
Signed-off-by: Paschalis Tsilias <tpaschalis@users.noreply.github.com>
Co-authored-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>
Co-authored-by: Paschalis Tsilias <tpaschalis@users.noreply.github.com>
2024-01-05 10:40:30 -08:00
Daniel Kerbel b2185d96af
Consider storage.ErrTooOldSample as non-retryable
Signed-off-by: Daniel Kerbel <nmdanny@gmail.com>
2023-12-26 18:44:39 +02:00
Bryan Boreham 8065bef172 Move metric type definitions to common/model
They are used in multiple repos, so common is a better place for them.
Several packages now don't depend on `model/textparse`, e.g.
`storage/remote`.

Also remove `metadata` struct from `api.go`, since it was identical to
a struct in the `metadata` package.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-12-19 18:56:54 +00:00
Bryan Boreham 99c17b4319
Merge pull request #13177 from bboreham/less-madness
scrape: consistent function names for metadata
2023-12-19 17:51:52 +00:00
Björn Rabenstein 775de1a3bd
Merge pull request #13276 from fpetkovski/reuse-float-histograms
Reuse float histogram objects
2023-12-13 13:43:01 +01:00
Filip Petkovski ea356c472e
Add comment on SampleRingIterator methods
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2023-12-13 08:35:02 +01:00
Filip Petkovski bb8363dbb3
Add comment on SampleRingIterator
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2023-12-13 08:30:02 +01:00
Filip Petkovski 48df9fc020
Export SampleRingIterator
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2023-12-11 11:18:08 +01:00
Arthur Silva Sens 5082655392
Append Created Timestamps (#12733)
* Append created timestamps.

Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com>

* Log when created timestamps are ignored

Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com>

* Proposed changes to Append CT PR.

Changes:

* Changed textparse Parser interface for consistency and robustness.
* Changed CT interface to be more explicit and handle validation.
* Simplified test, change scrapeManager to allow testability.
* Added TODOs.

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Updates.

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Addressed comments.

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Refactor head_appender test

Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com>

* Fix linter issues

Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com>

* Use model.Sample in head appender test

Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com>

---------

Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com>
Signed-off-by: bwplotka <bwplotka@gmail.com>
Co-authored-by: bwplotka <bwplotka@gmail.com>
2023-12-11 08:43:42 +00:00
Filip Petkovski e2a9f8ac0f
Reuse float histogram objects
This commit reduces the memory needed to query native histogram objects
by reusing existing HPoint instances.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2023-12-11 08:24:58 +01:00
Filip Petkovski 10a82f87fd
Enable reusing memory when converting between histogram types
The 'ToFloat' method on integer histograms currently allocates new memory
each time it is called.

This commit adds an optional *FloatHistogram parameter that can be used
to reuse span and bucket slices. It is up to the caller to make sure the
input float histogram is not used anymore after the call.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2023-12-08 10:22:59 +01:00
Matthieu MOREL 9c4782f1cc
golangci-lint: enable testifylint linter (#13254)
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2023-12-07 11:35:01 +00:00
Oleksandr Redko 2a75604f8e
Enable default revive rules (#13068)
Signed-off-by: Oleksandr Redko <Oleksandr_Redko@epam.com>
2023-11-29 17:23:34 +00:00
Fiona Liao 5bee0cfce2
Change ChunkReader.Chunk() to ChunkOrIterable()
The ChunkReader interface's Chunk() has been changed to ChunkOrIterable(). 

This is a precursor to OOO native histogram support - with OOO native histograms, the chunks.Meta passed to Chunk() can result in multiple chunks being returned rather than just a single chunk (e.g. if oooMergedChunk has a counter reset in the middle). 

To support this, ChunkOrIterable() requires either a single chunk or an iterable to be returned. If an iterable is returned, the caller has the responsibility of converting the samples from the iterable into possibly multiple chunks. The OOOHeadChunkReader now returns an iterable rather than a chunk to prepare for the native histograms case. Also as a beneficial side effect, oooMergedChunk and boundedChunk has been simplified as they only need to implement the Iterable interface now, not the full Chunk interface.

---------

Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com>
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
2023-11-28 11:14:29 +01:00
Bryan Boreham 34676a240e scrape: consistent function names for metadata
Too confusing to have `MetadataList` and `ListMetadata`, etc.
I standardised on the ones which are in an interface.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-11-23 09:08:02 +00:00
Bryan Boreham a2d5c02298
Merge pull request #13084 from charleskorn/concatenatingchunkiterator
Fix issue where `concatenatingChunkIterator` can obscure errors
2023-11-22 08:16:39 +00:00
Goutham 3048a88ae7
Add suffixes
Older version already did that. This upgrade needed manual opt-in

Signed-off-by: Goutham <gouthamve@gmail.com>
2023-11-15 15:52:18 +01:00
Goutham a99f48cc9f
Bump OTel Collector dependency to v0.88.0
I initially didn't copy the otlptranslator/prometheus folder because I
assumed it wouldn't get changes. But it did. So this PR fixes that and
updates the Collector version.

Supersedes: https://github.com/prometheus/prometheus/pull/12809

Signed-off-by: Goutham <gouthamve@gmail.com>
2023-11-15 15:18:14 +01:00
machine424 413b713aa8
remote/storage.go: adjust Storage.Notify() to avoid a race condition with Storage.ApplyConfig()
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2023-11-14 10:07:45 +01:00
machine424 08c17df244
remote/storage.go: add a test to highlight a race condition
between Storage.Notify() and Storage.ApplyConfig()

see https://github.com/prometheus/prometheus/issues/12747

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2023-11-13 13:23:53 +01:00
machine424 0996b78326
remote_write: add a unit test to make sure the write client sends
the extra http headers as expected

This will help letting prometheus off the hook from situations like
https://github.com/prometheus/prometheus/issues/13030

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2023-11-09 15:56:48 +01:00
Julien Pivotto cf01ec2119
Merge pull request #13091 from mmorel-35/errorlint/util
util: use Go standard errors package
2023-11-07 21:07:25 -06:00
Linas Medziunas 1f8aea11d6 Move histogram validation code to model/histogram
Signed-off-by: Linas Medziunas <linas.medziunas@gmail.com>
2023-11-03 16:17:24 +02:00
Linas Medziunas 1cd6c1cde5 ValidateHistogram: strict Count check in absence of NaNs
Signed-off-by: Linas Medziunas <linas.medziunas@gmail.com>
2023-11-03 16:17:24 +02:00
Matthieu MOREL fe057fc60d use Go standard errors package
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2023-11-03 07:26:31 +00:00
Charles Korn 8274e248ad
Fix issue where concatenatingChunkIterator can obscure errors.
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2023-11-02 15:29:09 +11:00
Björn Rabenstein a43669e611
Merge pull request #12928 from alexandear/ci-enable-godot
ci(lint): enable godot; append dot at the end of comments
2023-11-01 17:15:41 +01:00
Charles Korn 5184368db6
Fix issue where chainSampleIterator can obscure errors (#13006)
* Fix issue where `chainSampleIterator` can obscure errors

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Address PR feedback.

Signed-off-by: Charles Korn <charles.korn@grafana.com>

---------

Signed-off-by: Charles Korn <charles.korn@grafana.com>
2023-10-31 22:35:17 +00:00
Julien Pivotto f568221610
Merge pull request #13057 from prometheus/release-2.48
Merge release-2.48 back into main
2023-10-31 15:24:39 -04:00
Oleksandr Redko fa90ca46e5 ci(lint): enable godot; append dot at the end of comments
Signed-off-by: Oleksandr Redko <Oleksandr_Redko@epam.com>
2023-10-31 19:53:38 +02:00
beorn7 4696b46dd5 storage: Fix mixed samples handling in sampleRing
Two issues are fixed here, that lead to the same problem:

1. If `newSampleRing` is called with an unknown ValueType including
   ValueNone, we have initialized the interface buffer (`iBuf`).
   However, we would still use a specialized buffer for the first
   sample, opportunistically assuming that we might still not
   encounter mixed samples and we should go down the more efficient
   road.

2. If the `sampleRing` is `reset`, we leave all buffers alone,
   including `iBuf`, which is generally fine, but not for `iBuf`, see
   below.

In both cases, `iBuf` already contains values, but we will fill one of
the specialized buffers first. Once we then actually encounter mixed
samples, the content of the specialized buffer is copied into `iBuf`
using `append`. That's by itself the right idea because `iBuf` might
be `nil`, and even if not, it might or might not have the right
capacity. However, this approach assumes that `iBuf` is empty, or more
precisely has a length of zero.

This commit makes sure that `iBuf` does not get needlessly initialized
in `newSampleRing` and that it is emptied upon `reset`.

A test case is added to demonstrate both issues above.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-10-31 16:18:09 +01:00
Oleksandr Redko 8e5f0387a2
ci(lint): enable nolintlint and remove redundant comments (#12926)
Signed-off-by: Oleksandr Redko <Oleksandr_Redko@epam.com>
2023-10-31 12:35:13 +01:00
Matthieu MOREL 1ec6e407d0
ci(lint): enable errorlint on storage (#12935)
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2023-10-31 12:15:30 +01:00
Levi Harrison 454a0a2c1b Update dependencies for 2.48 (#12964)
Signed-off-by: Levi Harrison <git@leviharrison.dev>
2023-10-15 13:47:42 -04:00
Levi Harrison dcaca86958
Update dependencies for 2.48 (#12964) 2023-10-15 10:53:59 -04:00
Bryan Boreham a5a4eab679
Storage: reduce memory allocations when merging series sets (#12938)
Instead of setting to nil and allocating a new slice every time the
merge is advanced, re-use the previous slice.
This is safe because the `currentSets` member is only used inside member
functions, and explicitly copied in `At()`, the only place it leaves the
struct.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-10-06 12:28:07 +01:00
rakshith210 cdad64002a
Added Azure OAuth support (#12572)
* Added Azure OAuth support

Signed-off-by: rakshith210 <rakshith.me@gmail.com>

* Added missing comment

Signed-off-by: rakshith210 <rakshith.me@gmail.com>

* Addressing comment

Signed-off-by: rakshith210 <rakshith.me@gmail.com>

* Fixed lint issue

Signed-off-by: rakshith210 <rakshith.me@gmail.com>

* Fix test

Signed-off-by: rakshith210 <rakshith.me@gmail.com>

* Addressing comments

Signed-off-by: rakshith210 <rakshith.me@gmail.com>

* Added documentation and updated unit tests

Signed-off-by: rakshith210 <rakshith.me@gmail.com>

* Addressing comments

Signed-off-by: rakshith210 <rakshith.me@gmail.com>

---------

Signed-off-by: rakshith210 <rakshith.me@gmail.com>
2023-10-04 22:16:36 -04:00
Goutham Veeramachaneni 86729d4d7b
Update exp package (#12650) 2023-09-21 22:53:51 +02:00
Björn Rabenstein ade2ee3fe4
Merge pull request #12716 from bboreham/simplify-seek
storage: simplify Seek on BufferedSeriesIterator
2023-09-21 13:56:29 +02:00
William Dumont ce6ad15422 remote-write: TestClientRetryAfter status code 500
and compare the retryAfter values.

Signed-off-by: William Dumont <william.dumont@grafana.com>
2023-09-20 10:25:43 +00:00
William Dumont febd62a23e remote-write: refactor TestClientRetryAfter
The new version features a set of test cases that simplify the addition
of new HTTP status codes.

Signed-off-by: William Dumont <william.dumont@grafana.com>
2023-09-20 10:24:52 +00:00
Bryan Boreham 9b85354acd remote-write: respect Retry-After header on 5xx errors
If the server sent it to us, we should assume it knows better than we
do and respect it.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-09-20 10:14:38 +00:00
Paschalis Tsilias c173cd57c9
Add a header to count retried remote write requests (#12729)
Header name is `Retry-Attempt`, only set when >0.

Signed-off-by: Marc Tuduri <marctc@protonmail.com>
Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>
2023-09-20 11:11:03 +01:00
George Krajcsovits 3512b2d678
storage: make histogram reset handling consistent in chainSampleIterator (#12779)
storage: make histogram reset handling consistent in chainSampleIterator

---------

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2023-09-19 17:06:46 +02:00
zenador 69edd8709b
Add warnings (and annotations) to PromQL query results (#12152)
Return annotations (warnings and infos) from PromQL queries

This generalizes the warnings we have already used before (but only for problems with remote read) as "annotations".

Annotations can be warnings or infos (the latter could be false positives). We do not treat them different in the API for now and return them all as "warnings". It would be easy to distinguish them and return infos separately, should that appear useful in the future.

The new annotations are then used to create a lot of warnings or infos during PromQL evaluations. Partially these are things we have wanted for a long time (e.g. inform the user that they have applied `rate` to a metric that doesn't look like a counter), but the new native histograms have created even more needs for those annotations (e.g. if a query tries to aggregate float numbers with histograms).

The annotations added here are not yet complete. A prominent example would be a warning about a range too short for a rate calculation. But such a warnings is more tricky to create with good fidelity and we will tackle it later.

Another TODO is to take annotations into account when evaluating recording rules.

---------

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2023-09-14 18:57:31 +02:00