* [ENHANCEMENT] Alerts: remove metrics for removed Alertmanagers
So they don't continue to report stale values.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Add draining of queued notifications to `notifier.Manager`
Signed-off-by: Charles Korn <charles.korn@grafana.com>
* Update docs
Signed-off-by: Charles Korn <charles.korn@grafana.com>
* Address PR feedback
Signed-off-by: Charles Korn <charles.korn@grafana.com>
* Add more logging
Signed-off-by: Charles Korn <charles.korn@grafana.com>
* Address offline feedback: remove timeout
Signed-off-by: Charles Korn <charles.korn@grafana.com>
* Ensure stopping takes priority over further processing, make tests more robust
Signed-off-by: Charles Korn <charles.korn@grafana.com>
* Make channel unbuffered
Signed-off-by: Charles Korn <charles.korn@grafana.com>
* Update docs
Signed-off-by: Charles Korn <charles.korn@grafana.com>
* Fix race in test
Signed-off-by: Charles Korn <charles.korn@grafana.com>
* Remove unnecessary context
Signed-off-by: Charles Korn <charles.korn@grafana.com>
* Make Stop safe to call multiple times
Signed-off-by: Charles Korn <charles.korn@grafana.com>
---------
Signed-off-by: Charles Korn <charles.korn@grafana.com>
This is done to prevent the latter operation from blocking/starving the former, as previously, the `tsets` channel was consumed by the same goroutine that consumes and feeds the buffered `n.more` channel, the `tsets` channel was less likely to be ready as it's unbuffered and only fed every `SDManager.updatert` seconds.
See https://github.com/prometheus/prometheus/issues/13676 and https://github.com/prometheus/prometheus/issues/8768
The synchronization with the sendLoop goroutine is managed through the n.mtx mutex.
This uses a similar approach than scrape manager's efbd6e41c5/scrape/manager.go (L115-L117)
The old TestHangingNotifier was replaced by the new one to more closely reflect reality.
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
When all alerts were dropped after alert relabeling, the `sendAll()`
function didn't release the lock properly which created a deadlock with
the Alertmanager target discovery.
In addition, the commit detects early when there are no Alertmanager
endpoint to notify to avoid unnecessary work.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Avoids possible false sharing between loops.
Plausibly there is no problem in the current code, but it's easy enough to write it more safely.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Addresses: #12536
This commit adds support for configuring sigv4 to an
`alertmanager_config`. Based heavily on the sigv4 work in the remote
write client.
Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
Use a label builder instead of a slice when creating labels for the
target alertmanagers. This can be passed directly to
`relabel.ProcessBuilder`, skipping a copy.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
It took a `Labels` where the memory could be re-used, but in practice
this hardly ever benefitted. Especially after converting `relabel.Process`
to `relabel.ProcessBuilder`.
Comparing the parameter to `nil` was a bug; `EmptyLabels` is not `nil`
so the slice was reallocated multiple times by `append`.
Lastly `Builder.Labels()` now estimates that the final size will depend
on labels added and deleted.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* model/relabel: Add benchmark
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* model/relabel: re-use Builder across relabels
Saves memory allocations.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* labels.Builder: allow re-use of result slice
This reduces memory allocations where the caller has a suitable slice available.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* model/relabel: re-use source values slice
To reduce memory allocations.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Unwind one change causing test failures
Restore original behaviour in PopulateLabels, where we must not overwrite the input set.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* relabel: simplify values optimisation
Use a stack-based array for up to 16 source labels, which will be the
vast majority of cases.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* lint
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* refactor: move from io/ioutil to io and os packages
* use fs.DirEntry instead of os.FileInfo after os.ReadDir
Signed-off-by: MOREL Matthieu <matthieu.morel@cnp.fr>
This creates a new `model` directory and moves all data-model related
packages over there:
exemplar labels relabel rulefmt textparse timestamp value
All the others are more or less utilities and have been moved to `util`:
gate logging modetimevfs pool runtime
Signed-off-by: beorn7 <beorn@grafana.com>
We are re-enabling HTTP 2 again. There has been a few bugfixes upstream
in go, and we have also enabled ReadIdleTimeout.
Fix#7588Fix#9068
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
This also fixes a bug in query_log_file, which now is relative to the config file like all other paths.
Signed-off-by: Andy Bursavich <abursavich@gmail.com>
* storage: Replace usage of sync/atomic with uber-go/atomic
Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
* tsdb: Replace usage of sync/atomic with uber-go/atomic
Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
* web: Replace usage of sync/atomic with uber-go/atomic
Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
* notifier: Replace usage of sync/atomic with uber-go/atomic
Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
* cmd: Replace usage of sync/atomic with uber-go/atomic
Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
* scripts: Verify that we are not using restricted packages
It checks that we are not directly importing 'sync/atomic'.
Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
* Reorganise imports in blocks
Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
* notifier/test: Apply PR suggestions
Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
* storage/remote: avoid storing references on newEntry
Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
* Revert "scripts: Verify that we are not using restricted packages"
This reverts commit 278d32748e.
Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
* web: Group imports accordingly
Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
This change makes sure that nearly-identical Alertmanager configurations
aren't merged together.
The config's identifier was the MD5 hash of the configuration serialized
to JSON but because `relabel.Regexp` has no public field and doesn't
implement the JSON.Marshaler interface, it was always serialized to
"{}".
In practice, the identifier can be based on the index of the
configuration in the list.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Update go.mod dependencies before release
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Add issue for showing query warnings in promtool
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Revert json-iterator back to 1.1.6
It produced errors when marshaling Point values with special float
values.
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Fix expected step values in promtool tests after client_golang update
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Update generated protobuf code after proto dep updates
Signed-off-by: Julius Volz <julius.volz@gmail.com>
With v0.16.0 Alertmanager introduced a new API (v2). This patch adds a
configuration option for Prometheus to send alerts to the v2 endpoint
instead of the defautl v1 endpoint.
Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
With the next release of client_golang, Summaries will not have
objectives by default. To not lose the objectives we have right now,
explicitly state the current default objectives.
Signed-off-by: beorn7 <beorn@grafana.com>
1. If alerts is empty after `relabelAlerts`, just return to avoid
subsequent unnecessary operations
2. minor fix in notifier's test case
3. minor fix in comment
Signed-off-by: YaoZengzeng <yaozengzeng@zju.edu.cn>
From the documentation:
> The default HTTP client's Transport may not
> reuse HTTP/1.x "keep-alive" TCP connections if the Body is
> not read to completion and closed.
This effectively enable keep-alive for the fixed requests.
Signed-off-by: Romain Baugue <romain.baugue@elwinar.com>
i) Uses the more idiomatic Wrap and Wrapf methods for creating nested errors.
ii) Fixes some incorrect usages of fmt.Errorf where the error messages don't have any formatting directives.
iii) Does away with the use of fmt package for errors in favour of pkg/errors
Signed-off-by: tariqibrahim <tariq181290@gmail.com>
- Unmarshall external_labels config as labels.Labels, add tests.
- Convert some more uses of model.LabelSet to labels.Labels.
- Remove old relabel pkg (fixes#3647).
- Validate external label names.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>