prometheus

mirror of https://github.com/prometheus/prometheus.git synced 2024-11-10 07:34:04 -08:00

Author	SHA1	Message	Date
Jan-Otto Kröpke	302e151de8	{discovery,remote_write}/azure: Support default SDK authentication (#13099) * discovery/azure: Offer default SDK authentication Signed-off-by: Jan-Otto Kröpke <mail@jkroepke.de>	2024-03-16 11:06:57 +00:00
Julien	d1abc3f255	Merge pull request #13777 from roidelapluie/remoteread2 Chunked remote read: close the querier earlier	2024-03-15 14:42:30 +01:00
Julien Pivotto	53091126c2	Chunked remote read: close the querier earlier I have seen Prometheus instances misbehaving because of broken chunked remote read requests. In order to avoid OOMs when this happens, I propose to close the queriers used by the streamed remote read requests earlier. Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>	2024-03-15 14:03:16 +01:00
Jakub Čajka	505fd638be	otlptranslator: fix up import paths Signed-off-by: Jakub Čajka <jcajka@redhat.com>	2024-03-13 15:56:14 +01:00
György Krajcsovits	4d4d822c36	Add native histograms to latency/duration metrics Dogfood native histograms. Allow dependent projects to migrate to native histograms. I took the defaults from client_golang. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>	2024-03-01 14:44:38 +01:00
Robert Fratto	a09465baee	storage/remote: disable resharding during active retry backoffs (#13562) * storage/remote: disable resharding during active retry backoffs Today, remote_write reshards based on pure throughput. This is problematic if throughput has been diminished because of HTTP 429s; increasing the number of shards due to backpressure will only exacerbate the problem. This commit disables resharding for twice the retry backoff, ensuring that resharding will never occur during an active backoff, and that resharding does not become enabled again until enough time has elapsed to allow any pending requests to be retried. Signed-off-by: Robert Fratto <robertfratto@gmail.com> * storage/remote: test that resharding is disabled on retry Signed-off-by: Robert Fratto <robertfratto@gmail.com> * storage/remote: address review feedback Signed-off-by: Robert Fratto <robertfratto@gmail.com> * storage/remote: track time where resharding initially got disabled This change introduces a second atomic int64 to roughly track when resharding got disabled. This int64 is only updated after updating the disabled timestamp if resharding was previously enabled. Signed-off-by: Robert Fratto <robertfratto@gmail.com> --------- Signed-off-by: Robert Fratto <robertfratto@gmail.com>	2024-02-28 14:28:39 -08:00
machine424	f477e0539a	Move from golang.org/x/exp/slices into slices now that we only support Go >= 1.21 Prevent adding back golang.org/x/exp/slices. Signed-off-by: machine424 <ayoubmrini424@gmail.com>	2024-02-28 14:54:53 +01:00
Bryan Boreham	2ac1632eec	storage/remote: improve symbol-table handling On the incoming path, `writeHandler.write()` creates a new table for each request. `labelProtosToLabels` takes a `ScratchBuilder` now. Call `NewScratchBuilder` as required in tests. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2024-02-23 13:50:27 +00:00
Bryan Boreham	8f525b4ba4	storage/remote tests: refactor: extract function newTestQueueManager To reduce repetition. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2024-02-23 13:50:27 +00:00
Arve Knudsen	bf5ca8cf38	otlptranslator: Upgrade to v0.95.0 Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	2024-02-22 09:12:07 +01:00
Bryan Boreham	aba0071480	Merge pull request #13589 from bboreham/trace_id Standardise exemplar label as "trace_id"	2024-02-19 09:34:04 +00:00
Owen Williams	a28d7865ad	UTF-8: Add support for parsing UTF8 metric and label names This adds support for the new grammar of `{"metric_name", "l1"="val"}` to promql and some of the exposition formats. This grammar will also be valid for non-UTF-8 names. UTF-8 names will not be considered valid unless model.NameValidationScheme is changed. This does not update the go expfmt parser in text_parse.go, which will be addressed by https://github.com/prometheus/common/issues/554/. Part of https://github.com/prometheus/prometheus/issues/13095 Signed-off-by: Owen Williams <owen.williams@grafana.com>	2024-02-15 14:34:37 -05:00
Bryan Boreham	c0e36e6bb3	Standardise exemplar label as "trace_id" This is consistent with the OpenTelemetry standard, and an example in OpenMetrics. https://github.com/open-telemetry/opentelemetry-specification/blob/89aa01348139/specification/metrics/data-model.md#exemplars https://github.com/OpenObservability/OpenMetrics/blob/138654493130/specification/OpenMetrics.md#exemplars-1 Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2024-02-15 14:20:08 +00:00
Bryan Boreham	17f48f2b3b	Tests: use replacement DeepEquals in more places Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2024-02-08 19:32:33 +00:00
Bryan Boreham	8655fe5401	Merge pull request #13491 from bboreham/faster-store-series storage/remote: speed up StoreSeries by re-using labels.Builder	2024-02-06 17:16:32 +01:00
Bryan Boreham	41f3eeb048	Merge pull request #13497 from captncraig/cmp_signedheaders storage/remote: apply custom headers before sigv4 transport	2024-02-04 14:46:14 +01:00
Bryan Boreham	b9fdf3dad1	storage/remote: document why two benchmarks are skipped One was silently doing nothing; one was doing something but the work didn't go up linearly with iteration count. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2024-01-30 16:48:04 +00:00
Craig Peterson	5b5230deb7	remote client: apply custom headers before sigv4 transport Signed-off-by: Craig Peterson <192540+captncraig@users.noreply.github.com>	2024-01-30 09:27:00 -05:00
Bryan Boreham	dcd024a095	storage/remote: speed up StoreSeries by re-using labels.Builder Relabeling can take a pre-populated `Builder` instead of making a new one every time. This is much more efficient. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2024-01-29 18:49:55 +00:00
Bryan Boreham	d9483bb77c	storage/remote: add BenchmarkStoreSeries Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2024-01-29 16:54:12 +00:00
Goutham Veeramachaneni	fb552d290d	Merge pull request #13464 from aknuds1/arve/fix-update-copy otlptranslator/update-copy.sh: Fix sed command lines	2024-01-26 17:52:49 +05:30
Arve Knudsen	de28494434	Make update-copy.sh work for both OSX and GNU sed Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	2024-01-25 15:11:03 +01:00
Arve Knudsen	660df3488d	otlptranslator/update-copy.sh: Fix sed command lines Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	2024-01-25 13:56:04 +01:00
Filip Petkovski	583f3e587c	Optimize histogram iterators (#13340) Optimize histogram iterators Histogram iterators allocate new objects in the AtHistogram and AtFloatHistogram methods, which makes calculating rates over long ranges expensive. In #13215 we allowed an existing object to be reused when converting an integer histogram to a float histogram. This commit follows the same idea and allows injecting an existing object in the AtHistogram and AtFloatHistogram methods. When the injected value is nil, iterators allocate new histograms, otherwise they populate and return the injected object. The commit also adds a CopyTo method to Histogram and FloatHistogram which is used in the BufferedIterator to overwrite items in the ring instead of making new copies. Note that a specialized HPoint pool is needed for all of this to work (`matrixSelectorHPool`). --------- Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com> Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>	2024-01-23 17:02:14 +01:00
Goutham	aee6896c47	Minor fixes to otlp vendor update script Signed-off-by: Goutham <gouthamve@gmail.com>	2024-01-18 15:32:06 +05:30
Marc Tudurí	78c5ce3196	Drop old inmemory samples (#13002) * Drop old inmemory samples Co-authored-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> Signed-off-by: Marc Tuduri <marctc@protonmail.com> * Avoid copying timeseries when the feature is disabled Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> Signed-off-by: Marc Tuduri <marctc@protonmail.com> * Run gofmt Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> Signed-off-by: Marc Tuduri <marctc@protonmail.com> * Clarify docs Signed-off-by: Marc Tuduri <marctc@protonmail.com> * Add more logging info Signed-off-by: Marc Tuduri <marctc@protonmail.com> * Remove loggers Signed-off-by: Marc Tuduri <marctc@protonmail.com> * optimize function and add tests Signed-off-by: Marc Tuduri <marctc@protonmail.com> * Simplify filter Signed-off-by: Marc Tuduri <marctc@protonmail.com> * rename var Signed-off-by: Marc Tuduri <marctc@protonmail.com> * Update help info from metrics Signed-off-by: Marc Tuduri <marctc@protonmail.com> * use metrics to keep track of drop elements during buildWriteRequest Signed-off-by: Marc Tuduri <marctc@protonmail.com> * rename var in tests Signed-off-by: Marc Tuduri <marctc@protonmail.com> * pass time.Now as parameter Signed-off-by: Marc Tuduri <marctc@protonmail.com> * Change buildwriterequest during retries Signed-off-by: Marc Tuduri <marctc@protonmail.com> * Revert "Remove loggers" This reverts commit 54f91dfcae20488944162335ab4ad8be459df1ab. Signed-off-by: Marc Tuduri <marctc@protonmail.com> * use log level debug for loggers Signed-off-by: Marc Tuduri <marctc@protonmail.com> * Fix linter Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Remove noisy debug-level logs; add 'reason' label to drop metrics Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Remove accidentally committed files Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Propagate logger to buildWriteRequest to log dropped data Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Fix docs comment Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Make drop reason more specific Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Remove unnecessary pass of logger Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Use snake_case for reason label Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Fix dropped samples metric Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> --------- Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> Signed-off-by: Marc Tuduri <marctc@protonmail.com> Signed-off-by: Paschalis Tsilias <tpaschalis@users.noreply.github.com> Co-authored-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> Co-authored-by: Paschalis Tsilias <tpaschalis@users.noreply.github.com>	2024-01-05 10:40:30 -08:00
Daniel Kerbel	b2185d96af	Consider storage.ErrTooOldSample as non-retryable Signed-off-by: Daniel Kerbel <nmdanny@gmail.com>	2023-12-26 18:44:39 +02:00
Bryan Boreham	8065bef172	Move metric type definitions to common/model They are used in multiple repos, so common is a better place for them. Several packages now don't depend on `model/textparse`, e.g. `storage/remote`. Also remove `metadata` struct from `api.go`, since it was identical to a struct in the `metadata` package. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2023-12-19 18:56:54 +00:00
Bryan Boreham	99c17b4319	Merge pull request #13177 from bboreham/less-madness scrape: consistent function names for metadata	2023-12-19 17:51:52 +00:00
Arthur Silva Sens	5082655392	Append Created Timestamps (#12733) * Append created timestamps. Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com> * Log when created timestamps are ignored Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com> * Proposed changes to Append CT PR. Changes: * Changed textparse Parser interface for consistency and robustness. * Changed CT interface to be more explicit and handle validation. * Simplified test, change scrapeManager to allow testability. * Added TODOs. Signed-off-by: bwplotka <bwplotka@gmail.com> * Updates. Signed-off-by: bwplotka <bwplotka@gmail.com> * Addressed comments. Signed-off-by: bwplotka <bwplotka@gmail.com> * Refactor head_appender test Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com> * Fix linter issues Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com> * Use model.Sample in head appender test Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com> --------- Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com> Signed-off-by: bwplotka <bwplotka@gmail.com> Co-authored-by: bwplotka <bwplotka@gmail.com>	2023-12-11 08:43:42 +00:00
Filip Petkovski	10a82f87fd	Enable reusing memory when converting between histogram types The 'ToFloat' method on integer histograms currently allocates new memory each time it is called. This commit adds an optional *FloatHistogram parameter that can be used to reuse span and bucket slices. It is up to the caller to make sure the input float histogram is not used anymore after the call. Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>	2023-12-08 10:22:59 +01:00
Matthieu MOREL	9c4782f1cc	golangci-lint: enable testifylint linter (#13254) Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2023-12-07 11:35:01 +00:00
Oleksandr Redko	2a75604f8e	Enable default revive rules (#13068) Signed-off-by: Oleksandr Redko <Oleksandr_Redko@epam.com>	2023-11-29 17:23:34 +00:00
Bryan Boreham	34676a240e	scrape: consistent function names for metadata Too confusing to have `MetadataList` and `ListMetadata`, etc. I standardised on the ones which are in an interface. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2023-11-23 09:08:02 +00:00
Goutham	3048a88ae7	Add suffixes Older version already did that. This upgrade needed manual opt-in Signed-off-by: Goutham <gouthamve@gmail.com>	2023-11-15 15:52:18 +01:00
Goutham	a99f48cc9f	Bump OTel Collector dependency to v0.88.0 I initially didn't copy the otlptranslator/prometheus folder because I assumed it wouldn't get changes. But it did. So this PR fixes that and updates the Collector version. Supersedes: https://github.com/prometheus/prometheus/pull/12809 Signed-off-by: Goutham <gouthamve@gmail.com>	2023-11-15 15:18:14 +01:00
machine424	413b713aa8	remote/storage.go: adjust Storage.Notify() to avoid a race condition with Storage.ApplyConfig() Signed-off-by: machine424 <ayoubmrini424@gmail.com>	2023-11-14 10:07:45 +01:00
machine424	08c17df244	remote/storage.go: add a test to highlight a race condition between Storage.Notify() and Storage.ApplyConfig() see https://github.com/prometheus/prometheus/issues/12747 Signed-off-by: machine424 <ayoubmrini424@gmail.com>	2023-11-13 13:23:53 +01:00
machine424	0996b78326	remote_write: add a unit test to make sure the write client sends the extra http headers as expected This will help letting prometheus off the hook from situations like https://github.com/prometheus/prometheus/issues/13030 Signed-off-by: machine424 <ayoubmrini424@gmail.com>	2023-11-09 15:56:48 +01:00
Oleksandr Redko	fa90ca46e5	ci(lint): enable godot; append dot at the end of comments Signed-off-by: Oleksandr Redko <Oleksandr_Redko@epam.com>	2023-10-31 19:53:38 +02:00
Oleksandr Redko	8e5f0387a2	ci(lint): enable nolintlint and remove redundant comments (#12926) Signed-off-by: Oleksandr Redko <Oleksandr_Redko@epam.com>	2023-10-31 12:35:13 +01:00
Matthieu MOREL	1ec6e407d0	ci(lint): enable errorlint on storage (#12935) Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2023-10-31 12:15:30 +01:00
Levi Harrison	dcaca86958	Update dependencies for 2.48 (#12964)	2023-10-15 10:53:59 -04:00
rakshith210	cdad64002a	Added Azure OAuth support (#12572) * Added Azure OAuth support Signed-off-by: rakshith210 <rakshith.me@gmail.com> * Added missing comment Signed-off-by: rakshith210 <rakshith.me@gmail.com> * Addressing comment Signed-off-by: rakshith210 <rakshith.me@gmail.com> * Fixed lint issue Signed-off-by: rakshith210 <rakshith.me@gmail.com> * Fix test Signed-off-by: rakshith210 <rakshith.me@gmail.com> * Addressing comments Signed-off-by: rakshith210 <rakshith.me@gmail.com> * Added documentation and updated unit tests Signed-off-by: rakshith210 <rakshith.me@gmail.com> * Addressing comments Signed-off-by: rakshith210 <rakshith.me@gmail.com> --------- Signed-off-by: rakshith210 <rakshith.me@gmail.com>	2023-10-04 22:16:36 -04:00
Goutham Veeramachaneni	86729d4d7b	Update exp package (#12650)	2023-09-21 22:53:51 +02:00
William Dumont	ce6ad15422	remote-write: TestClientRetryAfter status code 500 and compare the retryAfter values. Signed-off-by: William Dumont <william.dumont@grafana.com>	2023-09-20 10:25:43 +00:00
William Dumont	febd62a23e	remote-write: refactor TestClientRetryAfter The new version features a set of test cases that simplify the addition of new HTTP status codes. Signed-off-by: William Dumont <william.dumont@grafana.com>	2023-09-20 10:24:52 +00:00
Bryan Boreham	9b85354acd	remote-write: respect Retry-After header on 5xx errors If the server sent it to us, we should assume it knows better than we do and respect it. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2023-09-20 10:14:38 +00:00
Paschalis Tsilias	c173cd57c9	Add a header to count retried remote write requests (#12729) Header name is `Retry-Attempt`, only set when >0. Signed-off-by: Marc Tuduri <marctc@protonmail.com> Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>	2023-09-20 11:11:03 +01:00
zenador	69edd8709b	Add warnings (and annotations) to PromQL query results (#12152) Return annotations (warnings and infos) from PromQL queries This generalizes the warnings we have already used before (but only for problems with remote read) as "annotations". Annotations can be warnings or infos (the latter could be false positives). We do not treat them differently in the API for now and return them all as "warnings". It would be easy to distinguish them and return infos separately, should that appear useful in the future. The new annotations are then used to create a lot of warnings or infos during PromQL evaluations. Partially these are things we have wanted for a long time (e.g. inform the user that they have applied `rate` to a metric that doesn't look like a counter), but the new native histograms have created even more needs for those annotations (e.g. if a query tries to aggregate float numbers with histograms). The annotations added here are not yet complete. A prominent example would be a warning about a range too short for a rate calculation. But such a warning is trickier to create with good fidelity and we will tackle it later. Another TODO is to take annotations into account when evaluating recording rules. --------- Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>	2023-09-14 18:57:31 +02:00
Arve Knudsen	156222cc50	Add context argument to LabelQuerier.LabelValues (#12665) Add context argument to LabelQuerier.LabelValues and LabelQuerier.SortedLabelValues. Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	2023-09-14 16:02:04 +02:00
Arve Knudsen	a964349e97	Add context argument to LabelQuerier.LabelNames (#12666) Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	2023-09-14 10:39:51 +02:00
Arve Knudsen	6daee89e5f	Add context argument to Querier.Select (#12660) Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	2023-09-12 12:37:38 +02:00
Gregor Zeitlinger	f01718262a	Unit tests for native histograms (#12668) promql: Extend testing framework to support native histograms This includes both the internal testing framework as well as the rules unit test feature of promtool. This also adds a bunch of basic tests. Many of the code level tests can now be converted to tests within the framework, and more tests can be added easily. --------- Signed-off-by: Harold Dost <h.dost@criteo.com> Signed-off-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com> Signed-off-by: Stephen Lang <stephen.lang@grafana.com> Co-authored-by: Harold Dost <h.dost@criteo.com> Co-authored-by: Stephen Lang <stephen.lang@grafana.com> Co-authored-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>	2023-08-25 23:35:42 +02:00
Justin Lei	8ef7dfdeeb	Add a chunk size limit in bytes (#12054) Add a chunk size limit in bytes This creates a hard cap for XOR chunks of 1024 bytes. The limit for histogram chunk is also 1024 bytes, but it is a soft limit as a histogram has a dynamic size, and even a single one could be larger than 1024 bytes. This also avoids cutting new histogram chunks if the existing chunk has fewer than 10 histograms yet. In that way, we are accepting "jumbo chunks" in order to have at least 10 histograms in a chunk, allowing compression to kick in. Signed-off-by: Justin Lei <justin.lei@grafana.com>	2023-08-24 15:21:17 +02:00
beorn7	aa82fe198f	tsdb: Fix histogram validation So far, `ValidateHistogram` would not detect if the count did not include the count in the zero bucket. This commit fixes the problem and updates all the tests that have been undetected offenders so far. Note that this problem would only ever create false negatives, so we never falsely refused to store a histogram because of it. On the other hand, `ValidateFloatHistogram` has been too strict with the count being at least as large as the sum of the counts in all the buckets. Float precision issues could create false positives here; see products of PromQL evaluations. It's actually quite hard to put an upper limit on the floating point imprecision. Users could produce the weirdest expressions, maxing out float precision problems. Therefore, this commit simply removes that particular check from `ValidateFloatHistogram`. Signed-off-by: beorn7 <beorn@grafana.com>	2023-08-22 23:04:01 +02:00
Michael Hoffmann	4d8e380269	promql: allow tests to be imported (#12050) Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>	2023-08-18 20:48:59 +02:00
Bryan Boreham	d2ae8dc3cb	remote-write: add http.resend_count tracing attribute As recommended by the OpenTelemetry semantic conventions. https://opentelemetry.io/docs/specs/otel/trace/semantic_conventions/http/#http-client Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2023-08-11 16:20:12 +00:00
Goutham Veeramachaneni	ad4f514e66	Add OTLP Ingestion endpoint (#12571) * Add OTLP Ingestion endpoint We copy files from the otel-collector-contrib. See the README in `storage/remote/otlptranslator/README.md`. This supersedes: https://github.com/prometheus/prometheus/pull/11965 Signed-off-by: gouthamve <gouthamve@gmail.com> * Return a 200 OK It is what the OTEL Golang SDK expects :( https://github.com/open-telemetry/opentelemetry-go/issues/4363 Signed-off-by: Goutham <gouthamve@gmail.com> --------- Signed-off-by: gouthamve <gouthamve@gmail.com> Signed-off-by: Goutham <gouthamve@gmail.com>	2023-07-28 12:35:28 +02:00
LHHDZ	7d8f9b0978	remote-write receiver: reuse 'ref' to optimize multiple samples for same series (#12580) reuse 'ref' to optimize multi samples processing efficiency Signed-off-by: changlin.shi <changlin.shi@ly.com>	2023-07-22 14:24:46 +01:00
Julien Pivotto	0f85e4f41d	Merge pull request #12539 from bboreham/slices-sorts Replace sort.Slice with faster slices.SortFunc	2023-07-11 13:09:02 +02:00
Bryan Boreham	ce153e3fff	Replace sort.Sort with faster slices.SortFunc The generic version is more efficient. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2023-07-10 09:43:45 +00:00
Julien Pivotto	986fde06b2	Merge pull request #11688 from damnever/fix/datamodelvalidation-remotewriteapi Validate the metric names and labels in the remote write handler	2023-07-04 13:52:02 +02:00
Bryan Boreham	5255bf06ad	Replace sort.Slice with faster slices.SortFunc The generic version is more efficient. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2023-07-02 22:17:08 +00:00
rakshith210	b1675e23af	Add Azure AD package for remote write (#11944) * Add Azure AD package for remote write * Made AzurePublic default and updated configuration.md * Updated config structure and removed getToken at initialization * Changed passing context from request Signed-off-by: Rakshith Padmanabha <rapadman@microsoft.com> Signed-off-by: rakshith210 <rakshith.me@gmail.com>	2023-06-01 15:20:10 -06:00
Callum Styan	0d2108ad79	[tsdb] re-implement WAL watcher to read via a "notification" channel (#11949) * WIP implement WAL watcher reading via notifications over a channel from the TSDB code Signed-off-by: Callum Styan <callumstyan@gmail.com> * Notify via head appenders Commit (finished all WAL logging) rather than on each WAL Log call Signed-off-by: Callum Styan <callumstyan@gmail.com> * Fix misspelled Notify plus add a metric for dropped Write notifications Signed-off-by: Callum Styan <callumstyan@gmail.com> * Update tests to handle new notification pattern Signed-off-by: Callum Styan <callumstyan@gmail.com> * this test maybe needs more time on windows? Signed-off-by: Callum Styan <callumstyan@gmail.com> * does this test need more time on windows as well? Signed-off-by: Callum Styan <callumstyan@gmail.com> * read timeout is already a time.Duration Signed-off-by: Callum Styan <callumstyan@gmail.com> * remove mistakenly committed benchmark data files Signed-off-by: Callum Styan <callumstyan@gmail.com> * address some review feedback Signed-off-by: Callum Styan <callumstyan@gmail.com> * fix missed changes from previous commit Signed-off-by: Callum Styan <callumstyan@gmail.com> * Fix issues from wrapper function Signed-off-by: Callum Styan <callumstyan@gmail.com> * try fixing race condition in test by allowing tests to overwrite the read ticker timeout instead of calling the Notify function Signed-off-by: Callum Styan <callumstyan@gmail.com> * fix linting Signed-off-by: Callum Styan <callumstyan@gmail.com> --------- Signed-off-by: Callum Styan <callumstyan@gmail.com>	2023-05-15 12:31:49 -07:00
Jeanette Tan	1102ffd188	Fix according to code review Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>	2023-04-22 02:27:15 +08:00
Jeanette Tan	e9a1e26ab7	Perform integer/float histogram type checking on conversions, and use a consistent method for determining integer vs float histogram Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>	2023-04-22 02:27:15 +08:00
Matthieu MOREL	bae9a21200	Merge branch 'main' into linter/nilerr Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2023-04-19 19:56:39 +02:00
beorn7	5b53aa1108	style: Replace `else if` cascades with `switch` Wiser coders than myself have come to the conclusion that a `switch` statement is almost always superior to a statement that includes any `else if`. The exceptions that I have found in our codebase are just these two: * The `else if` is followed by an additional statement before the next condition (separated by a `;`). * The whole thing is within a `for` loop and `break` statements are used. In this case, using `switch` would require tagging the `for` loop, which probably tips the balance. Why are `switch` statements more readable? For one, fewer curly braces. But more importantly, the conditions all have the same alignment, so the whole thing follows the natural flow of going down a list of conditions. With `else if`, in contrast, all conditions but the first are "hidden" behind `} else if `, harder to spot and (for no good reason) presented differently from the first condition. I'm sure the aforementioned wise coders can list even more reasons. In any case, I like it so much that I have found myself recommending it in code reviews. I would like to make it a habit in our code base, without making it a hard requirement that we would test on the CI. But for that, there has to be a role model, so this commit eliminates all `else if` occurrences, unless it is autogenerated code or fits one of the exceptions above. Signed-off-by: beorn7 <beorn@grafana.com>	2023-04-19 17:22:31 +02:00
beorn7	c3c7d44d84	lint: Adjust to the lint warnings raised by current versions of golangci-lint We haven't updated golangci-lint in our CI yet, but this commit prepares for that. There are a lot of new warnings, and it is mostly because the "revive" linter got updated. I agree with most of the new warnings, mostly around not naming unused function parameters (although it is justified in some cases for documentation purposes – while things like mocks are a good example where not naming the parameter is clearer). I'm pretty upset about the "empty block" warning to include `for` loops. It's such a common pattern to do something in the head of the `for` loop and then have an empty block. There is still an open issue about this: https://github.com/mgechev/revive/issues/810 I have disabled "revive" altogether in files where empty blocks are used excessively, and I have made the effort to add individual `// nolint:revive` where empty blocks are used just once or twice. It's borderline noisy, though, but let's go with it for now. I should mention that none of the "empty block" warnings for `for` loop bodies were legitimate. Signed-off-by: beorn7 <beorn@grafana.com>	2023-04-19 17:10:10 +02:00
Matthieu MOREL	fb3eb21230	enable gocritic, unconvert and unused linters Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2023-04-13 19:20:22 +00:00
Björn Rabenstein	6e0a46900b	Merge pull request #12192 from leizor/leizor/prometheus/issues/11204 Add support for native histograms to concreteSeriesIterator	2023-04-11 12:30:35 +02:00
Justin Lei	f90013a5a0	Update storage/remote/codec.go Co-authored-by: Björn Rabenstein <github@rabenste.in> Signed-off-by: Justin Lei <97976793+leizor@users.noreply.github.com>	2023-04-06 09:54:15 -07:00
Justin Lei	83f43982c9	Add support for native histograms to concreteSeriesIterator Signed-off-by: Justin Lei <justin.lei@grafana.com>	2023-04-06 09:54:15 -07:00
Xiaochao Dong (@damnever)	2b7202c4cc	Validate the metric names and labels in the remote write handler Signed-off-by: Xiaochao Dong (@damnever) <the.xcdong@gmail.com>	2023-04-05 19:09:05 +08:00
Bryan Boreham	b987afa7ef	labels: simplify call to get Labels from Builder It took a `Labels` where the memory could be re-used, but in practice this hardly ever benefitted. Especially after converting `relabel.Process` to `relabel.ProcessBuilder`. Comparing the parameter to `nil` was a bug; `EmptyLabels` is not `nil` so the slice was reallocated multiple times by `append`. Lastly `Builder.Labels()` now estimates that the final size will depend on labels added and deleted. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2023-03-22 17:05:20 +00:00
Björn Rabenstein	559adab471	Merge pull request #12085 from leizor/leizor/prometheus/issues/11204 Handle native histograms in remote read	2023-03-21 17:25:34 +01:00
Oleg Zaytsev	beb7d3b80f	remote.Client: store urlString During remote write, we call url.String() twice: - to add the Endpoint() to the span - to actually know where we should send the request This value does not change over time, and it's not really that lightweight to calculate. I wrote this simple benchmark: func BenchmarkURLString(b *testing.B) { u, err := url.Parse("https://remote.write.com/api/v1") require.NoError(b, err) b.Run("string", func(b *testing.B) { count := 0 for i := 0; i < b.N; i++ { count += len(u.String()) } }) } And the results are ~200ns/op, 80B/op, 3 allocs/op. Yes, we're going to go to the network here, which is a huge amount of resources compared to this, but still, on agents that send 500 requests per second, that is 1500 wasteful allocations per second. Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>	2023-03-16 09:53:10 +01:00
Justin Lei	60ad864667	Remove hacky promql.Test native histogram thing Signed-off-by: Justin Lei <justin.lei@grafana.com>	2023-03-09 11:05:53 -08:00
Justin Lei	c16b6a0185	Handle native histograms in remote read Signed-off-by: Justin Lei <justin.lei@grafana.com>	2023-03-09 09:13:53 -08:00
Arve Knudsen	bc9a82f5a1	remote: Improve some comments (#12102) Improve some comments in storage/remote/queue_manager.go, wrt. general language and a typo. Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	2023-03-09 11:05:24 +00:00
Arve Knudsen	435b500de7	remote: Convert to RecoverableError using errors.As (#12103) In storage/remote, try converting to RecoverableError using errors.As, instead of through direct casting. Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	2023-03-08 13:58:09 -07:00
Julien Pivotto	475f9984d0	Merge pull request #11787 from damnever/perf/avoid-alloc-if-no-externallabels Avoid allocation during remote write if external labels is empty	2023-02-22 23:38:21 +01:00
Julien Pivotto	dfd2b5340e	Merge pull request #11951 from Fish-pro/chore/httpvar Use http constants instead of string	2023-02-10 22:44:50 +01:00
Fish-pro	43d77f7c41	Use http constants instead of string Signed-off-by: Fish-pro <zechun.chen@daocloud.io>	2023-02-10 10:21:05 +08:00
Charles Korn	0a1de58f7e	Mark Histogram.(Positive|Negative)Spans as non-nullable. As far as I understand it, we'd never expect to receive a nil span, and remote.spansProtoToSpans would panic if we received a nil span. Marking the fields as non-nullable also means the generated Golang code doesn't use pointers for these fields, reducing allocations. Signed-off-by: Charles Korn <charles.korn@grafana.com>	2023-02-03 13:49:22 +11:00
György Krajcsovits	2d9a9cbc08	Fix storage/remote/codec ignoring histogram reset hint Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>	2023-01-24 12:56:30 +01:00
Jesus Vazquez	136956cca4	Attempt to append ooo sample at the end first (#11615 ) This is an optimization on the existing append in OOOChunk. What we've been doing so far is to find the place inside the out-of-order slice where the new sample should go, place it there, and move any samples to the right if necessary. This is OK but requires a binary search every time the slice is non-empty. The optimization is opinionated: although out-of-order samples can be out of order amongst themselves, they will probably arrive in order, so we can optimistically append at the end and fall back to the binary search only if that fails. OOOChunks are capped to 30 samples by default so this is a small optimization, but everything adds up, especially if you handle many active time series with out-of-order samples. Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> Signed-off-by: Jesus Vazquez <jesusvazquez@users.noreply.github.com> Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>	2023-01-13 19:00:50 +05:30
Marc Tudurí	721f33dbb0	histograms: Add remote-write support for Float Histograms (#11817 ) * adapt code.go and write_handler.go to support float histograms * adapt watcher.go to support float histograms * wip adapt queue_manager.go to support float histograms * address comments for metrics in queue_manager.go * set test cases for queue manager * use same counts for histograms and float histograms * refactor createHistograms tests * fix float histograms ref in watcher_test.go * address PR comments Signed-off-by: Marc Tuduri <marctc@protonmail.com>	2023-01-13 16:39:20 +05:30
Xiaochao Dong (@damnever)	2d61d012ff	Avoid copy during remote write if external labels is empty Signed-off-by: Xiaochao Dong (@damnever) <the.xcdong@gmail.com>	2022-12-30 19:18:30 +08:00
Fish-pro	6ed71a229e	Use errors.Is to check for a specific error Signed-off-by: Fish-pro <zechun.chen@daocloud.io>	2022-12-29 23:23:07 +08:00
Marc Tudurí	9474610baf	Support FloatHistogram in TSDB (#11522 ) Extends Appender.AppendHistogram function to accept the FloatHistogram. TSDB supports appending, querying, WAL replay, for this new type of histogram. Signed-off-by: Marc Tudurí <marctc@protonmail.com> Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>	2022-12-28 14:25:07 +05:30
Bryan Boreham	ccea61c7bf	Merge pull request #11717 from bboreham/labels-abstraction Add and use abstractions over labels.Labels	2022-12-20 17:23:39 +00:00
Sniper91	46fb802791	reset frameBytesLeft after writing (#11689 ) Signed-off-by: sniper91 <kevinzhao91@outlook.com> Signed-off-by: sniper91 <kevinzhao91@outlook.com>	2022-12-19 16:54:49 +01:00
Bryan Boreham	047585360b	Update package storage/remote tests for new labels.Labels type Use ScratchBuilder to create labels. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2022-12-19 15:22:09 +00:00
Bryan Boreham	abd9909595	Update package storage/remote for new labels.Labels type `QueueManager.externalLabels` becomes a slice rather than a `Labels` so we can index into it when doing the merge operation. Note we avoid calling `Labels.Len()` in `labelProtosToLabels()`. It isn't necessary - `append()` will enlarge the buffer and we're expecting to re-use it many times. Also, we now validate protobuf input before converting to Labels. This way we can detect errors first, and we don't place unnecessary requirements on the Labels structure. Re-do seriesFilter using labels.Builder (albeit N^2). Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2022-12-19 15:22:09 +00:00
Bryan Boreham	463f5cafdd	storage: re-use iterators to save garbage Re-use previous memory if it is already of the correct type. In `NewListSeries` we hoist the conversion to an interface value out so it only allocates once. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2022-12-15 18:32:45 +00:00
Bryan Boreham	3c7de69059	storage: allow re-use of iterators Patterned after `Chunk.Iterator()`: pass the old iterator in so it can be re-used to avoid allocating a new object. (This commit does not do any re-use; it is just changing all the method signatures so re-use is possible in later commits.) Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2022-12-15 18:32:45 +00:00
Julius Volz	1a2c645dfa	Correctly handle error unwrapping in rules and remote write receiver errors.Unwrap() actually dangerously returns nil if the error does not have an Unwrap() method, which is the case in at least one of these places, where I noticed that no error was being logged at all when one should have been. Signed-off-by: Julius Volz <julius.volz@gmail.com>	2022-12-15 12:50:55 +01:00


561 commits