Ganesh Vernekar
f1db699dff
Persist alert 'for' state across restarts ( #4061 )
...
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2018-08-02 11:18:24 +01:00
Max Leonard Inden
71fafad099
api/v1: Continue work exposing rules and alerts
...
Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
2018-07-30 15:31:51 +02:00
mg03
31f8ca0dfb
api v1 alerts/rules json endpoint
...
Signed-off-by: mg03 <mgeng03@gmail.com>
2018-07-30 15:29:44 +02:00
Bryan Boreham
afdb66dfac
Expose Group.CopyState() ( #4304 )
...
This makes the `rules` package more useful to projects that use
Prometheus as a library.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2018-07-18 15:14:38 +02:00
Julius Volz
9e3171f6e3
rules: Minor naming/comment cleanups ( #4328 )
...
Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-07-18 04:54:33 +01:00
Bryan Boreham
2bd510a63e
Make TestUpdate() do some work ( #4306 )
...
Previously it would set no preconditions and check no postconditions,
as the `groups` member was empty.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2018-06-22 15:21:04 +01:00
Alin Sinpalean
9dc763cc03
Run rule evaluation with timestamps precisely evaluation_interval apart ( #4201 )
...
* Run rule evaluation with timestamps precisely evaluation_interval apart from one another.
Signed-off-by: Alin Sinpalean <alin.sinpalean@gmail.com>
2018-06-01 15:23:07 +01:00
Mario Trangoni
464e747f1e
fix some comment typos ( #4059 )
2018-04-08 10:51:54 +01:00
Bryan Boreham
93494d8b7e
Add an OpenTracing span for each rule ( #4027 )
...
* Add an OpenTracing span for each rule
So that tags and child spans can be traced back to the rule that they
refer to.
2018-03-30 21:29:19 +01:00
ferhat elmas
ec8e4d8a7c
all: remove unnecessary type conversions ( #3992 )
...
except promql, so as not to create a conflict with #3966.
2018-03-21 09:25:22 +00:00
Warren Fernandes
58e2a31db8
Cleans up test by removing unused function ( #3969 )
2018-03-15 08:59:19 +00:00
ferhat elmas
ffa673f7d8
General simplifications ( #3887 )
...
Another try as in #1516
2018-02-26 07:58:10 +00:00
Fabian Reinartz
7ccd4b39b8
*: implement query params
...
This adds a parameter to the storage selection interface which allows
query engine(s) to pass information about the operations surrounding a
data selection.
This can for example be used by remote storage backends to infer the
correct downsampling aggregates that need to be provided.
2018-02-13 12:17:22 +01:00
Simon Pasquier
81c0ab69e0
Don't reset FiredAt for inactive alerts
...
Otherwise AlertManager receives resolved alerts where StartsAt is zero which
fails the validation.
2018-01-22 17:17:33 +01:00
Brian Brazil
30b4439bbd
Remove rule_type label from rule metrics.
...
This is not really needed now that we have rule groups
to distinguish rules.
2017-12-04 11:44:38 +00:00
Brian Brazil
b97f4cf48c
Add metrics for rule group interval and last duration.
2017-12-04 11:44:38 +00:00
Brian Brazil
0a42a9fc8f
Copy over rule group duration on reload.
...
This is currently getting lost, this will soon be in a metric and we
don't want it dropping to 0 on every reload.
2017-12-04 11:44:38 +00:00
Brian Brazil
aa370fa568
Clarify metric names around rule groups.
...
Make it clear they're about overall rule groups.
2017-12-04 11:44:38 +00:00
Fabian Reinartz
62461379b7
rules: decouple notifier packages
...
The dependency on the notifier packages caused a transitive dependency
on discovery and with that all client libraries our service discovery
uses.
2017-11-27 16:38:14 +01:00
Fabian Reinartz
4d964a0a0d
rules: make glob expansion a concern of main
2017-11-24 08:22:57 +01:00
Fabian Reinartz
bd9f7460eb
rules: remove config package dependency
2017-11-24 07:57:54 +01:00
Fabian Reinartz
2d0e3746ac
rules: remove dependency on promql.Engine
2017-11-24 07:57:54 +01:00
Fabian Reinartz
2ec5965b75
Merge pull request #3508 from prometheus/uptsdb
...
update TSDB
2017-11-23 19:11:54 +01:00
Fabian Reinartz
83cd270ea4
*: adapt to storage interface changes
2017-11-23 19:05:04 +01:00
Goutham Veeramachaneni
a880c86375
Fix unexported method on exported interface.
...
Also move to model.Duration
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-11-23 19:13:57 +05:30
conorbroderick
55aaece116
Add rule evaluation time
2017-11-22 15:22:02 +00:00
Goutham Veeramachaneni
e1117715fe
rules: remove skipped iterations cuz no throttling
...
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-11-14 17:33:00 +05:30
Jorge Hernández
6cd0f63eb1
Use testutil in rules subpackage ( #3278 )
...
* Use testutil in rules subpackage
* Fix manager test
* Use testutil in rules subpackage
* Fix manager test
* Fix rebase
* Change to testutil for applyConfig tests
2017-11-11 11:29:47 +01:00
Krasi Georgiev
e86d82ad2d
Fix regression of alert rules state loss on config reload. ( #3382 )
...
* incorrect map name for the group prevented copying state from existing alert rules on config reload
* applyConfig test
* few nits
* nits 2
2017-11-01 12:58:00 +01:00
Julius Volz
099df0c5f0
Migrate "golang.org/x/net/context" -> "context" ( #3333 )
...
In some places, where ctxhttp or gRPC are concerned, we still need to use the
old contexts.
2017-10-24 21:21:42 -07:00
Brian Brazil
cc5499fcad
Only close after checking for err.
2017-10-09 19:44:03 +01:00
Brian Brazil
ee88f0d222
Ensure all values are used or _
2017-10-09 19:44:03 +01:00
Fabian Reinartz
2d0b8e8b94
Merge branch 'master' into dev-2.0
2017-10-05 13:09:18 +02:00
Julius Volz
f7e8348a88
Re-add contexts to storage.Storage.Querier() ( #3230 )
...
* Re-add contexts to storage.Storage.Querier()
These are needed when replacing the storage by a multi-tenant
implementation where the tenant is stored in the context.
The 1.x query interfaces already had contexts, but they got lost in 2.x.
* Convert promql.Engine to use native contexts
2017-10-04 21:04:15 +02:00
beorn7
c2e9a151ab
Make all rule links link to the "Console" tab rather than "Graph"
...
Clicking on a rule, either the name or the expression, opens the rule
result (or the corresponding expression, respectively) in the
expression browser. This should by default happen in the console tab,
as, more often than not, displaying it in the graph tab runs into a
timeout.
2017-09-21 18:28:00 +02:00
Fabian Reinartz
d21f149745
*: migrate to go-kit/log
2017-09-08 22:01:51 +05:30
Goutham Veeramachaneni
e1fc9dc78d
Move /rules to new format ( #2901 )
...
Fixes #2891
Signed-off-by: Goutham Veeramachaneni <goutham@boomerangcommerce.com>
2017-07-08 11:38:02 +02:00
Goutham Veeramachaneni
37e7b69f56
Merge remote-tracking branch 'upstream/dev-2.0' into rulegroups
...
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-19 16:34:55 +05:30
Goutham Veeramachaneni
c472316fb3
Check done before every rule evaluation.
...
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-16 16:57:22 +05:30
Goutham Veeramachaneni
6b70a4d850
Incorporate PR feedback
...
* Move fingerprint to Hash()
* Move away from tsdb.MultiError
* 0777 -> 0666 for files
* checkOverflow of extra fields
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-16 16:44:33 +05:30
Goutham Veeramachaneni
507790a357
Rework logging to use explicitly passed logger
...
Mostly cleaned up the global logger use. Still some uses in discovery
package.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-16 15:52:44 +05:30
Goutham Veeramachaneni
dc69645e92
Move back to go-yaml
...
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-16 10:46:21 +05:30
Goutham Veeramachaneni
5ff283a7b7
Reflect the grouping in the UI
...
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-14 16:09:14 +05:30
Goutham Veeramachaneni
8cca666cf2
Add file name to group.
...
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-14 15:18:39 +05:30
Goutham Veeramachaneni
e893c89333
Validate labels and annotations
...
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-14 15:07:54 +05:30
Goutham Veeramachaneni
a48a018368
Make sure groups are unique in a single file
...
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-14 12:19:21 +05:30
Goutham Veeramachaneni
cea1e99f78
Add update-rules command to promtool
...
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-14 11:38:54 +05:30
Goutham Veeramachaneni
e8f55669ea
Move rules to new format
...
Signed-off-by: Goutham Veeramachaneni <goutham@boomerangcommerce.com>
2017-06-12 18:14:39 +05:30
Brian Brazil
dcea3e4773
Don't append a 0 when alert is no longer pending/firing
...
With staleness we no longer need this behaviour.
2017-05-24 13:52:45 +01:00
Brian Brazil
cc867dae60
Copy previous series and alert state more intelligently.
...
Usually rules don't move around, and if they do it's likely
that rules/alerts with the same name stay in the same order.
If rules/alerts with the same name are added/removed this
could cause a blip for one cycle, but this is unavoidable
without requiring rule and alert names to be unique - which we don't
want to do.
2017-05-24 13:52:45 +01:00
Brian Brazil
9bc68db7e6
Track staleness per rule rather than per group.
2017-05-24 13:52:45 +01:00
Brian Brazil
0451d6d31b
Add unittest for rule staleness, and rules generally.
2017-05-24 13:52:45 +01:00
Brian Brazil
0400f3cfd2
Very basic staleness handling for rules.
2017-05-24 13:52:45 +01:00
Fabian Reinartz
06c2b76cd4
Merge branch 'master' into uptsdb
2017-05-16 16:48:37 +02:00
Alexey Palazhchenko
b0e1ea7c6c
Simplify code, fix typos. ( #2719 )
2017-05-15 09:56:09 +01:00
Julius Volz
ac203ef0ee
Add externalURL template function ( #2716 )
...
This allows users to e.g. add links back to the generating Prometheus
right in their alert templates.
2017-05-13 15:47:04 +02:00
Julius Volz
fe11c5933a
Fix mutation of active alert elements by notifier ( #2656 )
...
This caused the external label application in the notifier to bleed back
into the rule manager's active alerting elements.
2017-04-26 10:29:42 -05:00
Fabian Reinartz
8ffc851147
Merge branch 'master' into dev-2.0
2017-04-04 15:17:56 +02:00
Tobias Schmidt
eaf33759fb
Register forgotten prometheus_evaluator_iterations_total metric
2017-04-02 20:32:56 -03:00
Tobias Schmidt
aaaba57184
Export number of missed rule evaluations
...
In case the execution of all rules takes longer than the configured rule
evaluation interval, one or more iterations will be skipped. This needs
to be visible to the operator.
2017-04-02 20:03:28 -03:00
Fabian Reinartz
5772f1a7ba
retrieval/storage: adapt to new interface
...
This simplifies the interface to two add methods for
appends with labels or faster reference numbers.
2017-02-02 13:05:46 +01:00
Fabian Reinartz
ad9bc62e4c
storage: extend appender and adapt it
2017-01-13 14:48:01 +01:00
Fabian Reinartz
e94b0899ee
rules: fix tests, remove model types
2016-12-29 17:31:14 +01:00
Fabian Reinartz
f8fc1f5bb2
*: migrate ingestion to new batch Appender
2016-12-29 11:03:56 +01:00
Fabian Reinartz
fecf9532b9
*: fix misc compile errors
2016-12-25 11:42:57 +01:00
Fabian Reinartz
622ece6273
*: fix recording tests, migrate matcher types
2016-12-25 11:12:57 +01:00
Fabian Reinartz
5817cb5bde
*: migrate from model.* to promql.* types
2016-12-25 00:37:46 +01:00
Fabian Reinartz
e68a3cf21f
rules: update annotations on each iteration
2016-11-22 15:43:07 +01:00
Jonathan Lange
d78dd3593d
Set evaluation interval on Group construction
...
Prevents having object in invalid state, and allows users of public API
to construct valid Groups.
2016-11-18 16:32:30 +00:00
Jonathan Lange
31fc357cd8
Make NewGroup and Group.Eval public
...
Allows callers to evaluate lists of rules without first writing
them to disk.
2016-11-18 16:25:58 +00:00
Jonathan Lange
2a2da40223
Make rule evaluation publicly available
...
Means that a third-party can parse rules and run them with their own
execution model.
2016-11-18 16:12:50 +00:00
Matt Bostock
926a5ab3dd
rules/manager.go: Fix race between reload and stop
...
On one relatively large Prometheus instance (1.7M series), I noticed
that upgrades were frequently resulting in Prometheus undergoing crash
recovery on start-up.
On closer examination, I found that Prometheus was panicking on
shutdown.
It seems that our configuration management (or misconfiguration thereof)
is reloading Prometheus then immediately restarting it, which I suspect
is causing this race:
Sep 21 15:12:42 host systemd[1]: Reloading prometheus monitoring system.
Sep 21 15:12:42 host prometheus[18734]: time="2016-09-21T15:12:42Z" level=info msg="Loading configuration file /etc/prometheus/config.yaml" source="main.go:221"
Sep 21 15:12:42 host systemd[1]: Reloaded prometheus monitoring system.
Sep 21 15:12:44 host systemd[1]: Stopping prometheus monitoring system...
Sep 21 15:12:44 host prometheus[18734]: time="2016-09-21T15:12:44Z" level=warning msg="Received SIGTERM, exiting gracefully..." source="main.go:203"
Sep 21 15:12:44 host prometheus[18734]: time="2016-09-21T15:12:44Z" level=info msg="See you next time!" source="main.go:210"
Sep 21 15:12:44 host prometheus[18734]: time="2016-09-21T15:12:44Z" level=info msg="Stopping target manager..." source="targetmanager.go:90"
Sep 21 15:12:52 host prometheus[18734]: time="2016-09-21T15:12:52Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:548"
Sep 21 15:12:56 host prometheus[18734]: time="2016-09-21T15:12:56Z" level=warning msg="Error on ingesting out-of-order samples" numDropped=1 source="scrape.go:467"
Sep 21 15:12:56 host prometheus[18734]: time="2016-09-21T15:12:56Z" level=error msg="Error adding file watch for \"/etc/prometheus/targets\": no such file or directory" source="file.go:84"
Sep 21 15:12:56 host prometheus[18734]: time="2016-09-21T15:12:56Z" level=error msg="Error adding file watch for \"/etc/prometheus/targets\": no such file or directory" source="file.go:84"
Sep 21 15:13:01 host prometheus[18734]: time="2016-09-21T15:13:01Z" level=info msg="Stopping rule manager..." source="manager.go:366"
Sep 21 15:13:01 host prometheus[18734]: time="2016-09-21T15:13:01Z" level=info msg="Rule manager stopped." source="manager.go:372"
Sep 21 15:13:01 host prometheus[18734]: time="2016-09-21T15:13:01Z" level=info msg="Stopping notification handler..." source="notifier.go:325"
Sep 21 15:13:01 host prometheus[18734]: time="2016-09-21T15:13:01Z" level=info msg="Stopping local storage..." source="storage.go:381"
Sep 21 15:13:01 host prometheus[18734]: time="2016-09-21T15:13:01Z" level=info msg="Stopping maintenance loop..." source="storage.go:383"
Sep 21 15:13:01 host prometheus[18734]: panic: close of closed channel
Sep 21 15:13:01 host prometheus[18734]: goroutine 7686074 [running]:
Sep 21 15:13:01 host prometheus[18734]: panic(0xba57a0, 0xc60c42b500)
Sep 21 15:13:01 host prometheus[18734]: /usr/local/go/src/runtime/panic.go:500 +0x1a1
Sep 21 15:13:01 host prometheus[18734]: github.com/prometheus/prometheus/rules.(*Manager).ApplyConfig.func1(0xc6645a9901, 0xc420271ef0, 0xc420338ed0, 0xc60c42b4f0, 0xc6645a9900)
Sep 21 15:13:01 host prometheus[18734]: /home/build/packages/prometheus/tmp/build/gopath/src/github.com/prometheus/prometheus/rules/manager.go:412 +0x3c
Sep 21 15:13:01 host prometheus[18734]: created by github.com/prometheus/prometheus/rules.(*Manager).ApplyConfig
Sep 21 15:13:01 host prometheus[18734]: /home/build/packages/prometheus/tmp/build/gopath/src/github.com/prometheus/prometheus/rules/manager.go:423 +0x56b
Sep 21 15:13:03 host systemd[1]: prometheus.service: main process exited, code=exited, status=2/INVALIDARGUMENT
2016-09-21 22:03:02 +01:00
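The panic in the log above, `close of closed channel`, is what Go raises when two code paths close the same channel. One common way to make a stop operation safe to call from both a reload and a shutdown is to guard the close with `sync.Once`. The sketch below shows that general technique only; the `group` type and its fields are hypothetical, and this is not necessarily how the commit fixed the race.

```go
package main

import (
	"fmt"
	"sync"
)

// group is a hypothetical stand-in for a rule group with a stop channel.
type group struct {
	done     chan struct{}
	stopOnce sync.Once
}

func newGroup() *group { return &group{done: make(chan struct{})} }

// stop closes done at most once, so concurrent callers cannot trigger
// "panic: close of closed channel".
func (g *group) stop() {
	g.stopOnce.Do(func() { close(g.done) })
}

func main() {
	g := newGroup()
	var wg sync.WaitGroup
	// Simulate a reload and a shutdown both stopping the same group.
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			g.stop()
		}()
	}
	wg.Wait()
	_, open := <-g.done
	fmt.Println("done channel open:", open)
}
```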
Julius Volz
c187308366
storage: Contextify storage interfaces.
...
This is based on https://github.com/prometheus/prometheus/pull/1997 .
This adds contexts to the relevant Storage methods and already passes
PromQL's new per-query context into the storage's query methods.
The immediate motivation is supporting multi-tenancy in Frankenstein, but
this could also be used by Prometheus's normal local storage to support
cancellations and timeouts at some point.
2016-09-19 16:29:07 +02:00
Julius Volz
ed5a0f0abe
promql: Allow per-query contexts.
...
For Weaveworks' Frankenstein, we need to support multitenancy. In
Frankenstein, we initially solved this without modifying the promql
package at all: we constructed a new promql.Engine for every
query and injected a storage implementation into that engine which would
be primed to only collect data for a given user.
This is problematic to upstream, however. Prometheus assumes that there
is only one engine: the query concurrency gate is part of the engine,
and the engine contains one central cancellable context to shut down all
queries. Also, creating a new engine for every query seems like overkill.
Thus, we want to be able to pass per-query contexts into a single engine.
This change gets rid of the promql.Engine's built-in base context and
allows passing in a per-query context instead. Central cancellation of
all queries is still possible by deriving all passed-in contexts from
one central one, but this is now the responsibility of the caller. The
central query context is now created in main() and passed into the
relevant components (web handler / API, rule manager).
In a next step, the per-query context would have to be passed to the
storage implementation, so that the storage can implement multi-tenancy
or other features based on the contextual information.
2016-09-19 15:38:17 +02:00
beorn7
75bae065fd
Revert "Modify tests to adjust to reverting the /graph changes"
...
This reverts commit f1ea5bf232.
Part two necessary for reverting the /graph revert.
2016-09-03 21:08:33 +02:00
beorn7
f1ea5bf232
Modify tests to adjust to reverting the /graph changes
...
These tests have been added after the /graph changes and therefore
already test the new syntax.
This commit has to be reverted together with the previous one to get
back to the old new state. *sigh*
2016-09-02 14:12:31 +02:00
Julius Volz
fe7b8b7fd1
Add missing license header to alerting_test.go
2016-08-13 00:11:52 +02:00
Julius Volz
da7206ec29
Fix rule HTML escaping issues
...
This was mentioned as part of https://github.com/prometheus/alertmanager/issues/452
2016-08-12 02:59:41 +02:00
Brian Brazil
6fc88d4b4d
Remove __name__ from alerts sent to AM.
...
Fixes #1861
2016-08-01 23:32:41 +01:00
Dmitry Vorobev
273e457da4
web: return status code and error message for config resource
2016-07-15 10:15:24 +02:00
Brian Brazil
0509b0f2db
Expand alert templates at eval time.
...
Fixes #1678 #1677
2016-07-12 17:13:55 +01:00
beorn7
064b57858e
Consistently use the `Seconds()` method for conversion of durations
...
This also fixes one remaining case of recording integral numbers
of seconds only for a metric, i.e. this will probably fix #1796 .
2016-07-07 15:24:35 +02:00
Fabian Reinartz
f7ed2ff706
Merge pull request #1644 from prometheus/beorn7/logging
...
Add missing logging of out-of-order samples
2016-05-20 05:52:00 -07:00
beorn7
b95c096a45
Fix style issues in rules/...
2016-05-19 16:59:53 +02:00
beorn7
45e5775f9b
Add missing logging of out-of-order samples
...
So far, out-of-order samples during rule evaluation were not logged,
and neither were scrape health samples. The latter are unlikely to cause
any errors. That's why I'm logging them always now. (It's always highly
irregular should it happen.) For rules, I have used the same plumbing
as for samples, just with a different wording in the message to mark
them as a result of rule evaluation.
2016-05-19 16:22:53 +02:00
beorn7
4b574e8a61
Switch chunk encoding to type 2 where it was hardcoded type 1 before
...
The chunk encoding was hardcoded there because it mostly doesn't
matter what encoding is chosen in that test. Since type 1 is
battle-hardened enough, I'm switching to type 2 here so that we can
catch unexpected problems as a byproduct. My expectation is that the
chunk encoding doesn't matter anyway, as said, but then "unexpected
problems" contains the word "unexpected".
2016-03-20 23:32:20 +01:00
Fabian Reinartz
d89c254849
Make copying alerting state safer.
...
This considers static labels in the equality of alerts to
avoid falsely copying state from a different alert definition with
the same name across reloads.
To be safe, it also copies the state map rather than just its pointer
so that remaining collisions disappear after one evaluation interval.
2016-03-02 12:21:54 +01:00
Fabian Reinartz
bfa8aaa017
Rename notification to notifier
2016-03-01 12:39:08 +01:00
beorn7
663a1550d0
Fix the instrumentation fixes
2016-02-17 15:50:55 +01:00
Tobias Schmidt
f1f8317fa5
Fix detection of flapping alerts
...
Alerts in the resolve retention period must be transitioned to the
active state again when their condition is met.
2016-02-04 23:55:12 -05:00
Björn Rabenstein
9ea3897ea7
Merge pull request #1354 from prometheus/beorn7/storage
...
Rework the way to communicate backpressure (AKA suspended ingestion)
2016-02-01 15:10:13 +01:00
beorn7
ec08c9a391
Rework the way to communicate backpressure (AKA suspended ingestion)
...
This gives up on the idea to communicate through the Append() call (by
either not returning as it is now or returning an error as
suggested/explored elsewhere). Here I have added a Throttled() call,
which has the advantage that it can be called before a whole _batch_
of Append()'s. Scrapes will happen completely or not at all. Same for
rule group evaluations. That's a highly desired behavior (as discussed
elsewhere). The code is even simpler now as the whole ingestion buffer
could be removed.
Logging of throttled mode has been streamlined and will create at most
one message per minute.
2016-02-01 14:45:44 +01:00
beorn7
a7408bfb47
Unify duration parsing
...
It's actually happening in several places (and for flags, we use the
standard Go time.Duration...). This at least reduces all our
home-grown parsing to one place (in model).
2016-01-29 15:41:50 +01:00
Fabian Reinartz
a6935024e1
Remove old WITH clause in alert printing
2016-01-26 15:45:27 +01:00
Fabian Reinartz
b0adfea8d5
Fix swapped constants, improve instrumentation
2016-01-21 12:15:29 +01:00
Fabian Reinartz
a8c38c3ac5
Don't log rule evaluation failure on shutdown
2016-01-18 17:34:25 +01:00
Fabian Reinartz
6eee86dce8
Terminate rule groups during initial sleep
...
When an evaluation group runs initially, it waits a deterministic
amount of time. During that time it also has to accept
a termination signal so shutdown doesn't hang during the first
evaluation iteration after a configuration reload.
Fixes #1307
2016-01-12 10:54:09 +01:00
Fabian Reinartz
26eb3ac2f8
Don't skip recording rule errors
2016-01-12 10:26:06 +01:00
Fabian Reinartz
37d80c4b25
Fix premature rule evaluation
...
This commit prevents rule evaluation from starting until after
the storage is ready.
2016-01-08 17:51:22 +01:00
Fabian Reinartz
0cf3c6a9ef
Add comments, rename a method
2015-12-23 12:29:28 +01:00
Fabian Reinartz
bf6abac8f4
Send resolved notifications
2015-12-17 15:42:26 +01:00
Fabian Reinartz
f69e668fc4
Improve rules/ instrumentation
...
This commit adds a counter for the total number of rule evaluations
and standardizes the units to seconds.
2015-12-17 15:42:26 +01:00
Fabian Reinartz
62075aa037
Reduce noisy no-alertmanager warning
2015-12-17 15:42:26 +01:00
Fabian Reinartz
52e5224f5a
Refactor rules/ package
2015-12-17 15:42:25 +01:00
Fabian Reinartz
e4fabe135a
Set StartsAt to time of first firing state
2015-12-17 11:36:58 +01:00
Fabian Reinartz
7c90db22ed
Use annotation based alerts in rules/
...
This commit breaks the previously used alert format.
2015-12-14 10:16:07 +01:00
Fabian Reinartz
e114ce0ff7
Refactor notification handler
2015-12-11 15:17:32 +01:00
Fabian Reinartz
e3b6ec9784
Switch to common/log
2015-10-03 10:21:43 +02:00
Fabian Reinartz
171f50706a
Fix unkeyed field errors.
2015-09-18 17:00:08 +02:00
Brian Brazil
4d196fea6b
Merge pull request #1032 from prometheus/scalar-metric
...
rules: Allow for setting labels on LHS on scalars
2015-08-26 16:56:16 +01:00
Brian Brazil
3bcdb2bbba
rules: Allow for setting labels on LHS on scalars
2015-08-26 16:54:28 +01:00
Julius Volz
995d3b831d
Fix most golint warnings.
...
This is with `golint -min_confidence=0.5`.
I left several lint warnings untouched because they were either
incorrect or I felt it was better not to change them at the moment.
2015-08-26 12:44:46 +02:00
Fabian Reinartz
d6b8da8d43
Switch promql types to common/model
2015-08-25 13:49:14 +02:00
Brian Brazil
fdf0d0642e
Cast value to float, as that's what the console templates expect.
2015-08-24 16:59:08 +01:00
Fabian Reinartz
438e232c9b
Fix grouping of import blocks
2015-08-22 09:42:45 +02:00
Fabian Reinartz
306e8468a0
Switch from client_golang/model to common/model
2015-08-21 13:33:38 +02:00
Brian Brazil
e6a67476c2
rules: Allow recorded rules expressions to be scalars.
...
This is useful if you want to build up a constant metric,
such as a set of alert thresholds that vary by label value.
2015-08-19 21:09:00 +01:00
Fabian Reinartz
7a67472fc1
Resolve relative paths on configuration loading
...
This moves the concern of resolving the files relative to the config
file into the configuration loading itself.
It also fixes #921 which did not load the cert and token files relatively.
2015-08-05 18:08:04 +02:00
Fabian Reinartz
feb8a03503
rules: load rule files relative to a base dir
2015-07-03 15:10:37 +02:00
Julius Volz
fcff35b43e
Consolidate external reachability flags into one.
...
Besides fixing https://github.com/prometheus/prometheus/issues/805 by
making the entire externally reachable server URL configurable, this
adds tests for the "globalURL" template function and makes it easier to
test other such functions in the future.
This breaks the `web.Hostname` flag (and introduces `web.external-url`).
This flag is likely only used by few users, so I hope that's
justifiable.
Fixes https://github.com/prometheus/prometheus/issues/805
2015-07-03 13:39:10 +02:00
Fabian Reinartz
f06cf664e1
rules: cleanup alerting test
2015-06-30 18:22:24 +02:00
Fabian Reinartz
9bd4f6d017
rules: preserve alert state across reloads.
2015-06-30 11:32:07 +02:00
Fabian Reinartz
4625485b84
rules: move rules*.go contents to manager*.go
2015-06-30 11:32:07 +02:00
Fabian Reinartz
749ae450c5
promql: add runbook to alert statement.
...
This commit adds the RUNBOOK keyword to alert statements. The field
is optional and expected to be a link.
2015-06-25 13:00:52 +02:00
Julius Volz
d868264bb8
Improve UI of /alerts page.
...
Changes to the UI:
- "Active Since" timestamps are now human-readable.
- Alerting rules are now pretty-printed better.
- Labels are no longer just strings, but alert bubbles (like we do on
the status page for base labels).
- Alert states and target health states are now capitalized in the
presentation layer rather than at the source.
2015-06-23 18:48:45 +02:00
Fabian Reinartz
fe301d7946
promql: remove global flags
2015-06-15 19:01:06 +02:00
Fabian Reinartz
5e13880201
General cleanup of rules.
2015-06-06 21:40:52 +02:00
Fabian Reinartz
75c920c95e
Remove DotGraph method from Rule interface
2015-06-06 21:35:59 +02:00
Fabian Reinartz
83d07516e8
Remove EvalRaw methods from Rule interface
2015-06-06 21:34:09 +02:00
Fabian Reinartz
280d11dca8
main: exit on invalid rule files on startup.
2015-06-02 18:44:41 +02:00
Fabian Reinartz
0de6edbdfc
Move pkg/ to util/
2015-06-01 21:12:32 +02:00
Fabian Reinartz
02717e6fde
Remove generic set type
2015-06-01 21:12:32 +02:00
Fabian Reinartz
dbc0d30e3e
Move string functionality to pkg/strutil
2015-06-01 21:12:32 +02:00
Fabian Reinartz
f45a5cab60
Move templates package to pkg/template
2015-06-01 21:12:31 +02:00
Fabian Reinartz
c44ac7bc26
Load rule files from entire directories
2015-06-01 21:12:31 +02:00
Julius Volz
d7c015c149
Convert pathPrefix to not have trailing slash.
2015-06-01 12:43:17 +02:00
Julius Volz
ff53d10849
Fix double slash in GeneratorURL sent to alertmanager.
...
Fixes https://github.com/prometheus/prometheus/issues/722
2015-05-23 19:16:57 +02:00
Julius Volz
267fd34156
Switch Prometheus to use github.com/prometheus/log.
...
This change is conceptually very simple, although the diff is large. It
switches logging from "github.com/golang/glog" to
"github.com/prometheus/log", while not actually changing any log
messages. V(1)-style logging has been changed to be log.Debug*().
2015-05-20 18:19:32 +02:00
Fabian Reinartz
e2ed921505
Merge branch 'master' into fabxc/servdisc
2015-05-20 14:13:08 +02:00
Mitsuhiro Tanda
3e914a8cb1
fix graph links with path prefix
2015-05-19 02:45:05 +09:00
Fabian Reinartz
bb540fd9fd
Implement config reloading on SIGHUP.
...
With this commit, sending SIGHUP to the Prometheus process will reload
and apply the configuration file. The different components attempt
to handle failing changes gracefully.
2015-05-13 16:49:46 +02:00
Fabian Reinartz
fe935179cd
Stop routing rule statements through the engine.
2015-04-29 18:01:43 +02:00
Fabian Reinartz
8d7c479fed
Merge pull request #658 from prometheus/fabxc/pql/rules-manager
...
Rename RuleManager to Manager, remove interface.
2015-04-29 16:54:21 +02:00
Fabian Reinartz
479891c9be
Rename RuleManager to Manager, remove interface.
...
This commit renames the RuleManager to Manager as the package
name is 'rules' now. The unused layer of abstraction of the
RuleManager interface is removed.
2015-04-29 16:42:10 +02:00
Fabian Reinartz
25cdff3527
Remove `name` arg from `Parse*` functions, enhance parsing errors.
2015-04-29 16:38:41 +02:00
Fabian Reinartz
3ca11bcaf5
Switch Prometheus to promql package.
...
This commit removes all functionality from rules/ that is now handled in
promql/.
All parts of Prometheus are changed to use the promql/ package.
2015-04-28 16:19:23 +02:00
Ceesjan Luiten
0e18784c64
Make all paths absolute to support proxies
2015-04-02 20:36:47 +02:00
Brian Brazil
941f585164
Avoid +InfYs and similar, just display +Inf.
2015-03-28 18:51:41 +00:00
beorn7
a075900f9a
Merge branch 'beorn7/persistence' into beorn7/ingestion-tweaks
2015-03-18 19:09:31 +01:00
Fabian Reinartz
624f27f4b6
Add ln, log2, log10 and exp functions to the query language.
2015-03-16 18:26:19 +01:00