prometheus/storage/remote
Robert Fratto a09465baee
storage/remote: disable resharding during active retry backoffs (#13562)
* storage/remote: disable resharding during active retry backoffs

Today, remote_write reshards based on pure throughput. This is
problematic if throughput has been diminished because of HTTP 429s;
increasing the number of shards due to backpressure will only exacerbate
the problem.

This commit disables resharding for twice the retry backoff, ensuring
that resharding will never occur during an active backoff, and that
resharding does not become enabled again until enough time has elapsed
to allow any pending requests to be retried.

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* storage/remote: test that resharding is disabled on retry

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* storage/remote: address review feedback

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* storage/remote: track time where resharding initially got disabled

This change introduces a second atomic int64 to roughly track when
resharding got disabled. The new int64 is updated only after the
disabled timestamp has been updated, and only if resharding was
previously enabled.

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

---------

Signed-off-by: Robert Fratto <robertfratto@gmail.com>
2024-02-28 14:28:39 -08:00
azuread golangci-lint: enable testifylint linter (#13254) 2023-12-07 11:35:01 +00:00
otlptranslator otlptranslator: Upgrade to v0.95.0 2024-02-22 09:12:07 +01:00
chunked.go (storage): move from github.com/pkg/errors to 'errors' and 'fmt' (#10946) 2022-07-01 18:59:50 +02:00
chunked_test.go golangci-lint: enable testifylint linter (#13254) 2023-12-07 11:35:01 +00:00
client.go remote client: apply custom headers before sigv4 transport 2024-01-30 09:27:00 -05:00
client_test.go remote_write: add a unit test to make sure the write client sends 2023-11-09 15:56:48 +01:00
codec.go Move from golang.org/x/exp/slices into slices now that we only support Go >= 1.21 2024-02-28 14:54:53 +01:00
codec_test.go storage/remote: improve symbol-table handling 2024-02-23 13:50:27 +00:00
ewma.go style: Replace else if cascades with switch 2023-04-19 17:22:31 +02:00
intern.go Move away from testutil, refactor imports (#8087) 2020-10-22 11:00:08 +02:00
intern_test.go golangci-lint: enable testifylint linter (#13254) 2023-12-07 11:35:01 +00:00
max_timestamp.go Remote: Do not collect non-initialized timestamp metrics (#8060) 2020-10-15 23:53:59 +02:00
metadata_watcher.go scrape: consistent function names for metadata 2023-11-23 09:08:02 +00:00
metadata_watcher_test.go Move metric type definitions to common/model 2023-12-19 18:56:54 +00:00
queue_manager.go storage/remote: disable resharding during active retry backoffs (#13562) 2024-02-28 14:28:39 -08:00
queue_manager_test.go storage/remote: disable resharding during active retry backoffs (#13562) 2024-02-28 14:28:39 -08:00
read.go Add warnings (and annotations) to PromQL query results (#12152) 2023-09-14 18:57:31 +02:00
read_handler.go Move from golang.org/x/exp/slices into slices now that we only support Go >= 1.21 2024-02-28 14:54:53 +01:00
read_handler_test.go golangci-lint: enable testifylint linter (#13254) 2023-12-07 11:35:01 +00:00
read_test.go storage/remote: improve symbol-table handling 2024-02-23 13:50:27 +00:00
storage.go remote/storage.go: adjust Storage.Notify() to avoid a race condition with Storage.ApplyConfig() 2023-11-14 10:07:45 +01:00
storage_test.go golangci-lint: enable testifylint linter (#13254) 2023-12-07 11:35:01 +00:00
write.go Append Created Timestamps (#12733) 2023-12-11 08:43:42 +00:00
write_handler.go storage/remote: improve symbol-table handling 2024-02-23 13:50:27 +00:00
write_handler_test.go storage/remote: improve symbol-table handling 2024-02-23 13:50:27 +00:00
write_test.go golangci-lint: enable testifylint linter (#13254) 2023-12-07 11:35:01 +00:00