80 commits

Author SHA1 Message Date
Chris Marchbanks 63ed9d1b70 Send EndsAt along with alerts (#4550)
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
2018-08-28 16:05:00 +01:00
Chris Marchbanks 87f1dad16d throttle resends of alerts to 1 minute by default (#4538)
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
2018-08-27 17:41:42 +01:00
Goutham Veeramachaneni f3b7c22827 rules: add comment about lock taking (#4525)
Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
2018-08-21 21:30:08 +02:00
Chris Marchbanks 11155c7028 Existing alert labels will update based on templates (#4500)
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
2018-08-15 08:52:08 +01:00
Fabian Reinartz b7e2f407de rules: Fix double-locking of mutex
Signed-off-by: Fabian Reinartz <freinartz@google.com>
2018-08-07 07:33:39 -04:00
Benji Visser 8bb6e0dd6e Show rule evaluation errors on rules page (#4457)
* adding information about the health and errors for Rules

adding Health() and LastError() to the Rule interface. This will allow
us to easily surface information about rules.

Signed-off-by: noqcks <benny@noqcks.io>

* updating rules.html with fields for Rule errors and health state

Signed-off-by: noqcks <benny@noqcks.io>

* fix code comment grammar & access Rule health/error info using a mutex

Signed-off-by: noqcks <benny@noqcks.io>

* s/Errors/Error/ in rules.html to remain consistent with targets.html

Signed-off-by: noqcks <benny@noqcks.io>

* adding periods to code comments in reporting/alerting

Signed-off-by: noqcks <benny@noqcks.io>

* putting health/error below mutex in struct field

Signed-off-by: noqcks <benny@noqcks.io>
2018-08-07 00:33:45 +02:00
Julius Volz 2b8fc062a8 rules: HTML-escape rule YAML marshal errors (#4464)
This was pointed out by `gosec`.

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-08-05 14:01:51 +02:00
Ganesh Vernekar f1db699dff Persist alert 'for' state across restarts (#4061)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2018-08-02 11:18:24 +01:00
Max Leonard Inden 71fafad099 api/v1: Continue work exposing rules and alerts
Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
2018-07-30 15:31:51 +02:00
mg03 31f8ca0dfb api v1 alerts/rules json endpoint
Signed-off-by: mg03 <mgeng03@gmail.com>
2018-07-30 15:29:44 +02:00
Julius Volz 9e3171f6e3 rules: Minor naming/comment cleanups (#4328)
Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-07-18 04:54:33 +01:00
ferhat elmas ec8e4d8a7c all: remove unnecessary type conversions (#3992)
except promql, to avoid creating a conflict with #3966.
2018-03-21 09:25:22 +00:00
Simon Pasquier 81c0ab69e0 Don't reset FiredAt for inactive alerts
Otherwise AlertManager receives resolved alerts where StartsAt is zero which
fails the validation.
2018-01-22 17:17:33 +01:00
Fabian Reinartz 62461379b7 rules: decouple notifier packages
The dependency on the notifier packages caused a transitive dependency
on discovery and with that all client libraries our service discovery
uses.
2017-11-27 16:38:14 +01:00
Fabian Reinartz 2d0e3746ac rules: remove dependency on promql.Engine
2017-11-24 07:57:54 +01:00
Goutham Veeramachaneni a880c86375 Fix unexported method on exported interface.
Also move to model.Duration

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-11-23 19:13:57 +05:30
conorbroderick 55aaece116 Add rule evaluation time
2017-11-22 15:22:02 +00:00
Julius Volz 099df0c5f0 Migrate "golang.org/x/net/context" -> "context" (#3333)
In some places, where ctxhttp or gRPC are concerned, we still need to use the
old contexts.
2017-10-24 21:21:42 -07:00
Fabian Reinartz 2d0b8e8b94 Merge branch 'master' into dev-2.0
2017-10-05 13:09:18 +02:00
beorn7 c2e9a151ab Make all rule links link to the "Console" tab rather than "Graph"
Clicking on a rule, either the name or the expression, opens the rule
result (or the corresponding expression, respectively) in the
expression browser. This should by default happen in the console tab,
as, more often than not, displaying it in the graph tab runs into a
timeout.
2017-09-21 18:28:00 +02:00
Fabian Reinartz d21f149745 *: migrate to go-kit/log
2017-09-08 22:01:51 +05:30
Goutham Veeramachaneni e1fc9dc78d Move /rules to new format (#2901)
Fixes #2891

Signed-off-by: Goutham Veeramachaneni <goutham@boomerangcommerce.com>
2017-07-08 11:38:02 +02:00
Goutham Veeramachaneni 507790a357 Rework logging to use explicitly passed logger
Mostly cleaned up the global logger use. Still some uses in discovery
package.

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-16 15:52:44 +05:30
Brian Brazil dcea3e4773 Don't append a 0 when alert is no longer pending/firing
With staleness we no longer need this behaviour.
2017-05-24 13:52:45 +01:00
Fabian Reinartz 06c2b76cd4 Merge branch 'master' into uptsdb
2017-05-16 16:48:37 +02:00
Julius Volz ac203ef0ee Add externalURL template function (#2716)
This allows users to e.g. add links back to the generating Prometheus
right in their alert templates.
2017-05-13 15:47:04 +02:00
Julius Volz fe11c5933a Fix mutation of active alert elements by notifier (#2656)
This caused the external label application in the notifier to bleed back
into the rule manager's active alerting elements.
2017-04-26 10:29:42 -05:00
Fabian Reinartz e94b0899ee rules: fix tests, remove model types
2016-12-29 17:31:14 +01:00
Fabian Reinartz 5817cb5bde *: migrate from model.* to promql.* types
2016-12-25 00:37:46 +01:00
Fabian Reinartz e68a3cf21f rules: update annotations on each iteration
2016-11-22 15:43:07 +01:00
Jonathan Lange 2a2da40223 Make rule evaluation publicly available
Means that a third-party can parse rules and run them with their own
execution model.
2016-11-18 16:12:50 +00:00
Julius Volz c187308366 storage: Contextify storage interfaces.
This is based on https://github.com/prometheus/prometheus/pull/1997.

This adds contexts to the relevant Storage methods and already passes
PromQL's new per-query context into the storage's query methods.
The immediate motivation is supporting multi-tenancy in Frankenstein, but
this could also be used by Prometheus's normal local storage to support
cancellations and timeouts at some point.
2016-09-19 16:29:07 +02:00
Julius Volz ed5a0f0abe promql: Allow per-query contexts.
For Weaveworks' Frankenstein, we need to support multitenancy. In
Frankenstein, we initially solved this without modifying the promql
package at all: we constructed a new promql.Engine for every
query and injected a storage implementation into that engine which would
be primed to only collect data for a given user.

This is problematic to upstream, however. Prometheus assumes that there
is only one engine: the query concurrency gate is part of the engine,
and the engine contains one central cancellable context to shut down all
queries. Also, creating a new engine for every query seems like overkill.

Thus, we want to be able to pass per-query contexts into a single engine.

This change gets rid of the promql.Engine's built-in base context and
allows passing in a per-query context instead. Central cancellation of
all queries is still possible by deriving all passed-in contexts from
one central one, but this is now the responsibility of the caller. The
central query context is now created in main() and passed into the
relevant components (web handler / API, rule manager).

In a next step, the per-query context would have to be passed to the
storage implementation, so that the storage can implement multi-tenancy
or other features based on the contextual information.
2016-09-19 15:38:17 +02:00
Julius Volz da7206ec29 Fix rule HTML escaping issues
This was mentioned as part of https://github.com/prometheus/alertmanager/issues/452
2016-08-12 02:59:41 +02:00
Brian Brazil 6fc88d4b4d Remove __name__ from alerts sent to AM.
Fixes #1861
2016-08-01 23:32:41 +01:00
Brian Brazil 0509b0f2db Expand alert templates at eval time.
Fixes #1678 #1677
2016-07-12 17:13:55 +01:00
beorn7 b95c096a45 Fix style issues in rules/...
2016-05-19 16:59:53 +02:00
Fabian Reinartz d89c254849 Make copying alerting state safer.
This considers static labels in the equality of alerts to
avoid falsely copying state from a different alert definition with
the same name across reloads.

To be safe, it also copies the state map rather than just its pointer
so that remaining collisions disappear after one evaluation interval.
2016-03-02 12:21:54 +01:00
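
The two safeguards described in d89c254849 can be sketched as follows. This is an illustration under assumptions, not the code from the commit: the `alertingRule` and `alert` types, the `sameDefinition` and `copyState` helpers, and the fingerprint-keyed map are all hypothetical. The sketch treats two rules as the same definition only if both name and static labels match, and it copies the state map entry by entry (shallowly) instead of sharing the map itself.

```go
package main

import "fmt"

// alert is a hypothetical, minimal alert state.
type alert struct {
	State string
}

// alertingRule is a hypothetical rule with a name, static labels, and
// active alert state keyed by a label fingerprint.
type alertingRule struct {
	name   string
	labels map[string]string
	active map[uint64]*alert
}

// sameDefinition reports whether two rules share both name and static labels.
func sameDefinition(a, b *alertingRule) bool {
	if a.name != b.name || len(a.labels) != len(b.labels) {
		return false
	}
	for k, v := range a.labels {
		if bv, ok := b.labels[k]; !ok || bv != v {
			return false
		}
	}
	return true
}

// copyState copies src's active alerts into a fresh map on dst, but only if
// both rules are the same definition. It reports whether state was copied.
func copyState(dst, src *alertingRule) bool {
	if !sameDefinition(dst, src) {
		return false
	}
	dst.active = make(map[uint64]*alert, len(src.active))
	for fp, a := range src.active {
		dst.active[fp] = a
	}
	return true
}

func main() {
	old := &alertingRule{
		name:   "HighLatency",
		labels: map[string]string{"severity": "page"},
		active: map[uint64]*alert{1: {State: "firing"}},
	}
	reloaded := &alertingRule{name: "HighLatency", labels: map[string]string{"severity": "page"}}
	other := &alertingRule{name: "HighLatency", labels: map[string]string{"severity": "warn"}}

	fmt.Println(copyState(reloaded, old)) // true: same name and static labels
	fmt.Println(copyState(other, old))    // false: static labels differ

	// Removing an entry from the copy leaves the old rule's map untouched.
	delete(reloaded.active, 1)
	fmt.Println(len(old.active)) // 1
}
```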
Tobias Schmidt f1f8317fa5 Fix detection of flapping alerts
Alerts in the resolve retention period must be transitioned to the
active state again when their condition is met.
2016-02-04 23:55:12 -05:00
beorn7 a7408bfb47 Unify duration parsing
It's actually happening in several places (and for flags, we use the
standard Go time.Duration...). This at least reduces all our
home-grown parsing to one place (in model).
2016-01-29 15:41:50 +01:00
Fabian Reinartz a6935024e1 Remove old WITH clause in alert printing
2016-01-26 15:45:27 +01:00
Fabian Reinartz 0cf3c6a9ef Add comments, rename a method
2015-12-23 12:29:28 +01:00
Fabian Reinartz bf6abac8f4 Send resolved notifications
2015-12-17 15:42:26 +01:00
Fabian Reinartz 52e5224f5a Refactor rules/ package
2015-12-17 15:42:25 +01:00
Fabian Reinartz 7c90db22ed Use annotation based alerts in rules/
This commit breaks the previously used alert format.
2015-12-14 10:16:07 +01:00
Julius Volz 995d3b831d Fix most golint warnings.
This is with `golint -min_confidence=0.5`.

I left several lint warnings untouched because they were either
incorrect or I felt it was better not to change them at the moment.
2015-08-26 12:44:46 +02:00
Fabian Reinartz d6b8da8d43 Switch promql types to common/model
2015-08-25 13:49:14 +02:00
Fabian Reinartz 306e8468a0 Switch from client_golang/model to common/model
2015-08-21 13:33:38 +02:00
Fabian Reinartz 749ae450c5 promql: add runbook to alert statement.
This commit adds the RUNBOOK keyword to alert statements. The field
is optional and expected to be a link.
2015-06-25 13:00:52 +02:00
Julius Volz d868264bb8 Improve UI of /alerts page.
Changes to the UI:
- "Active Since" timestamps are now human-readable.
- Alerting rules are now pretty-printed better.
- Labels are no longer just strings, but alert bubbles (like we do on
  the status page for base labels).
- Alert states and target health states are now capitalized in the
  presentation layer rather than at the source.
2015-06-23 18:48:45 +02:00