Commit graph

51 commits

Author SHA1 Message Date
beorn7 638e99c814 prometheus-mixin: Make PrometheusRemoteWriteBehind more generic
Currently, it relies on `job, instance` being the labels completely
identifying a Prometheus instance. However, what's intended is to
simply not match on `remote_name, url`.

Signed-off-by: beorn7 <beorn@grafana.com>
2020-11-17 13:29:49 +01:00
beorn7 371ca9ff46 prometheus-mixin: add HA-group aware alerts
There is certainly a potential to add more of these. This is mostly
meant to introduce the concept and cover a few critical parts.

Signed-off-by: beorn7 <beorn@grafana.com>
2020-11-11 19:45:34 +01:00
Matthias Loibl 13ba013a24
Use absolute jsonnet import paths
This should be the way forward when importing libraries in jsonnet. It's
closer to how Go imports look and makes it more obvious where packages
live.

This is not breaking anything, as the old imports were already symlinks
to the now directly used directories.

Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>
2020-10-20 11:42:30 +02:00
Björn Rabenstein d49f267f76
Merge pull request #8054 from simonpasquier/improve-not-ingesting-samples-alert
documentation/prometheus-mixin: improve PrometheusNotIngestingSamples
2020-10-15 12:29:39 +02:00
Simon Pasquier f381d8a9bd documentation/prometheus-mixin: improve PrometheusNotIngestingSamples
The alert shouldn't fire when there's no target and no rule configured.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-10-15 11:13:17 +02:00
Julien Pivotto 4596abee4d
Mixin: Ignore unset remote write timestamp (#8046)
* Mixin: Ignore unset remote write timestamp

This pull request ignores the zero value of highest_sent_timestamp_seconds
in Highest Timestamp In vs. Highest Timestamp Sent which just show that
remote write has not been successful yet.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-10-15 09:15:59 +02:00
Simon Pasquier e693af6c01
.circleci/config.yml: check mixins (#6895)
* .circleci/config.yml: check mixins

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Run jsonnetfmt

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Install tools in the image instead of using coreos/jsonnet-ci

The latter is deprecated

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Update jsonnetfile.json

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-08-25 15:59:41 +02:00
Julien Pivotto f482c7bdd7
Add per scrape-config targets limit (#7554)
* Add per scrape-config targets limit

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-07-30 14:20:24 +02:00
Tom Wilkie 27b1009acd
Rename the dashboard in the mixin to 'Prometheus Overview'. (#7489)
Due to https://github.com/grafana/grafana/issues/15642, this prevents users putting this dashboard in a Grafana folder called 'Prometheus'.

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2020-06-30 15:45:44 +01:00
Manuel Fontan 6e7554639b Update Readme since jsonnetfmt is available in the jsonnet go implementation since v0.16.0
Signed-off-by: Manuel Fontan <mfontangarcia@slack-corp.com>
2020-06-16 10:41:58 +01:00
Callum Styan 5400e71b91 Update mixin dashboards and alerts for new remote write label names.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2020-04-08 12:56:00 -07:00
Marco Pracucci 1e1785690a
Fix queue in alerts annotation
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2020-02-12 12:48:13 +01:00
paulfantom 7321f1d227
documentation/prometheus-mixin: add dependency on grafonnet
Signed-off-by: paulfantom <pawel@krupa.net.pl>
2020-01-11 23:18:04 +01:00
Callum Styan f4fb6dc208 Simplify remote write dashboard in mixin.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2019-11-18 19:58:07 -08:00
beorn7 9c8f9bfa63 Fix the description template for PrometheusRemoteWriteDesiredShards
Signed-off-by: beorn7 <beorn@grafana.com>
2019-10-30 13:27:37 +01:00
beorn7 61617eb2d9 Fix PrometheusRemoteWriteDesiredShards
This rule has the same labels on both sides. We don't want
`group_right` and `on`, we want nothing.

Signed-off-by: beorn7 <beorn@grafana.com>
2019-10-29 00:23:39 +01:00
Callum Styan da6d46625f Repeat shards panels on the queue label.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2019-10-21 11:03:50 -07:00
Callum Styan 818974ff8f Rewrite remote write dashboard using base grafonnet.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2019-10-17 15:40:58 -07:00
Callum Styan 81fa63006c Add additional shards/segment graphs to remote write dashboard.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2019-10-09 09:59:02 -07:00
Simon Pasquier e36ab7e192
prometheus-mixin: improve description of sample alerts (#6050)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-09-24 17:44:27 +02:00
Björn Rabenstein 3b3eaf3496
Merge pull request #5787 from cstyan/reshard-max-logging
Add metrics for max/min/desired shards to queue manager.
2019-09-09 22:32:54 +02:00
Callum Styan a98599bea8 Update remote write max shards alert; properly template/query for max
shards in description.

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2019-09-09 12:01:11 -07:00
Callum Styan 3b75614892 Add a warning alert, since the remote write behind alert will probably
already be going off, about desired shards being higher than max shards.

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2019-08-08 06:45:46 -07:00
Simon Pasquier dd174963a2 prometheus-mixin: remove PrometheusTSDBWALCorruptions
The counter is only increased when tsdb.Open() is called which
Prometheus does only once in its lifetime (when it initializes). If the
corruption can't be recovered, tsdb.Open() returns an error and
Prometheus exits. Hence the metric is either 0 (no corruption) or 1
(corruption detected and repaired). If the latter, the alert isn't
actionable and the only way to resolve it is to restart Prometheus which
would reset the counter.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-08-06 14:36:56 +02:00
Matthias Loibl 20d12ff1c7
Fix prometheus-mixin dashboards to use grafanaDashboards
Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>
2019-07-11 15:40:26 +02:00
beorn7 4825585834 Tweak tenses
Signed-off-by: beorn7 <beorn@grafana.com>
2019-06-28 17:37:49 +02:00
beorn7 9a2177949d Protect gauge-based alerts against failed scrapes
Signed-off-by: beorn7 <beorn@grafana.com>
2019-06-28 16:46:19 +02:00
beorn7 52707535b8 Remove/improve unused variables and weird doc comments
Signed-off-by: beorn7 <beorn@grafana.com>
2019-06-28 15:41:31 +02:00
beorn7 7a25a2586d Sync with alerts from kube-prometheus
While doing so, re-introduce the summary/description
annotations. Also, add a few more rules and tweak a few of the
existing ones.

Signed-off-by: beorn7 <beorn@grafana.com>
2019-06-27 23:50:26 +02:00
beorn7 ded0705bdc Update remote repo for grafana-builder dependency
Signed-off-by: beorn7 <beorn@grafana.com>
2019-06-27 14:39:38 +02:00
beorn7 1336a28848 Use a config variable for the Prometheus name
Signed-off-by: beorn7 <beorn@grafana.com>
2019-06-27 14:34:11 +02:00
beorn7 613cb5430c Add a "work in progress" disclaimer.
Signed-off-by: beorn7 <beorn@grafana.com>
2019-06-26 23:24:22 +02:00
beorn7 e34af6d4d3 Address various comments from the review
Signed-off-by: beorn7 <beorn@grafana.com>
2019-06-26 23:22:16 +02:00
beorn7 23c03207e9 Fixed indentation
Signed-off-by: beorn7 <beorn@grafana.com>
2019-06-26 20:31:05 +02:00
beorn7 d5845ad05b Fix formatting
This is the outcome of `make fmt`.

Signed-off-by: beorn7 <beorn@grafana.com>
2019-06-26 16:23:25 +02:00
beorn7 d45e8a0f61 Adjust to jsonnet v0.13
Signed-off-by: beorn7 <beorn@grafana.com>
2019-06-26 16:22:21 +02:00
beorn7 5c04ef3935 Make README.md immediately useful
Signed-off-by: beorn7 <beorn@grafana.com>
2019-06-26 16:12:59 +02:00
beorn7 ddfabda152 Add Makefile and suitable jsonnet files
This makes the mixins usable as abvertised.

Signed-off-by: beorn7 <beorn@grafana.com>
2019-06-26 15:30:55 +02:00
beorn7 e943803a3c Add .gitignore file
Signed-off-by: beorn7 <beorn@grafana.com>
2019-06-26 15:22:23 +02:00
Callum Styan a5762f3681 Add dashboard for remote write to prometheus-mixin.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2019-06-17 15:02:42 -07:00
Tom Wilkie 38a9bbbec2 Loosen off PrometheusRemoteWriteBehind alert.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2019-03-04 12:47:24 +00:00
Tom Wilkie b615069289 Update metric names.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2019-03-01 07:39:48 -08:00
Tom Wilkie e248ffb220 Add alert for WAL remote write falling behind.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2019-02-12 15:22:58 +00:00
Tom Wilkie 638204c775 Typo
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-11-19 12:23:42 +00:00
Tom Wilkie 8f42192e52 Add Prometheus alerts from kube-prometheus, remove the alertmanager alerts.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-11-19 11:22:55 +00:00
Tom Wilkie dfbdf8d3bb Add a basic readme with link to the mixin docs.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-11-16 17:23:14 +00:00
Tom Wilkie 5fd712b210 copypasta.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-11-16 17:17:47 +00:00
Tom Wilkie 50861d586a Alert if more than 1% of alerts fail for a given integration.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-11-16 17:17:47 +00:00
Tom Wilkie 266ba185fe Remove PromScrapeFailed alert.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-11-16 17:17:47 +00:00
Tom Wilkie e8a8ce5654 Basic Prometheus dashboard.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-11-16 17:17:47 +00:00