prometheus/docs/configuration
Bryan Boreham 1ed94142fc
remote-write: slow down retries to avoid DDOS (#9634)
* remote-write: slow down retries to avoid DDOS

Increase the default maximum retry backoff (max_backoff) from 100ms to 5 seconds.

Remote write calls are retried after a recoverable error such as the
back-end returning 500. Prometheus waits the minimum time and retries,
then doubles the wait on each subsequent retry until the maximum is
reached.

If some data is still getting through, remote-write will also increase
shards, and the default maximum is 200. 200 shards each sending every
100ms is 2,000 calls per second, to a back-end that is already in trouble.

5 seconds was chosen to match the default BatchSendDeadline: if we can
afford to wait that long for no response, then we can wait the same time
to retry. We will reach 5 seconds after 9 successive failures.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Update config doc for max_backoff change

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2021-11-09 14:08:24 -08:00
alerting_rules.md Update alerting_rules.md (#7252) 2020-09-07 17:30:01 +01:00
configuration.md remote-write: slow down retries to avoid DDOS (#9634) 2021-11-09 14:08:24 -08:00
https.md Add support for security-related HTTP headers (#9546) 2021-10-19 21:26:52 +02:00
index.md Consolidate configuration and rules docs in docs/configuration/ 2017-10-27 09:54:02 +02:00
recording_rules.md Rule alerts/series limit updates (#9541) 2021-10-21 23:14:17 +02:00
template_examples.md format markdown code block (#5594) 2019-05-25 11:28:50 +01:00
template_reference.md Updated docs 2021-05-30 23:36:05 -04:00
unit_testing_rules.md docs: update unit_testing_rules to cover missing and stale samples (#9065) 2021-07-19 15:46:14 +05:30