prometheus

mirror of https://github.com/prometheus/prometheus.git synced 2024-12-26 06:04:05 -08:00

Author	SHA1	Message	Date
Julius Volz	fe7b8b7fd1	Add missing license header to alerting_test.go	2016-08-13 00:11:52 +02:00
Julius Volz	da7206ec29	Fix rule HTML escaping issues This was mentioned as part of https://github.com/prometheus/alertmanager/issues/452	2016-08-12 02:59:41 +02:00
Brian Brazil	6fc88d4b4d	Remove __name__ from alerts sent to AM. Fixes #1861	2016-08-01 23:32:41 +01:00
Dmitry Vorobev	273e457da4	web: return status code and error message for config resource	2016-07-15 10:15:24 +02:00
Brian Brazil	0509b0f2db	Expand alert templates at eval time. Fixes #1678 #1677	2016-07-12 17:13:55 +01:00
beorn7	064b57858e	Consistently use the `Seconds()` method for conversion of durations This also fixes one remaining case of recording integral numbers of seconds only for a metric, i.e. this will probably fix #1796.	2016-07-07 15:24:35 +02:00
Fabian Reinartz	f7ed2ff706	Merge pull request #1644 from prometheus/beorn7/logging Add missing logging of out-of-order samples	2016-05-20 05:52:00 -07:00
beorn7	b95c096a45	Fix style issues in rules/...	2016-05-19 16:59:53 +02:00
beorn7	45e5775f9b	Add missing logging of out-of-order samples So far, out-of-order samples during rule evaluation were not logged, and neither were scrape health samples. The latter are unlikely to cause any errors. That's why I'm logging them always now. (It's always highly irregular should it happen.) For rules, I have used the same plumbing as for samples, just with a different wording in the message to mark them as a result of rule evaluation.	2016-05-19 16:22:53 +02:00
beorn7	4b574e8a61	Switch chunk encoding to type 2 where it was hardcoded type 1 before The chunk encoding was hardcoded there because it mostly doesn't matter what encoding is chosen in that test. Since type 1 is battle-hardened enough, I'm switching to type 2 here so that we can catch unexpected problems as a byproduct. My expectation is that the chunk encoding doesn't matter anyway, as said, but then "unexpected problems" contains the word "unexpected".	2016-03-20 23:32:20 +01:00
Fabian Reinartz	d89c254849	Make copying alerting state safer. This considers static labels in the equality of alerts to avoid falsely copying state from a different alert definition with the same name across reloads. To be safe, it also copies the state map rather than just its pointer so that remaining collisions disappear after one evaluation interval.	2016-03-02 12:21:54 +01:00
Fabian Reinartz	bfa8aaa017	Rename notification to notifier	2016-03-01 12:39:08 +01:00
beorn7	663a1550d0	Fix the instrumentation fixes	2016-02-17 15:50:55 +01:00
Tobias Schmidt	f1f8317fa5	Fix detection of flapping alerts Alerts in the resolve retention period must be transitioned to the active state again when their condition is met.	2016-02-04 23:55:12 -05:00
Björn Rabenstein	9ea3897ea7	Merge pull request #1354 from prometheus/beorn7/storage Rework the way to communicate backpressure (AKA suspended ingestion)	2016-02-01 15:10:13 +01:00
beorn7	ec08c9a391	Rework the way to communicate backpressure (AKA suspended ingestion) This gives up on the idea to communicate through the Append() call (by either not returning as it is now or returning an error as suggested/explored elsewhere). Here I have added a Throttled() call, which has the advantage that it can be called before a whole _batch_ of Append()'s. Scrapes will happen completely or not at all. Same for rule group evaluations. That's a highly desired behavior (as discussed elsewhere). The code is even simpler now as the whole ingestion buffer could be removed. Logging of throttled mode has been streamlined and will create at most one message per minute.	2016-02-01 14:45:44 +01:00
beorn7	a7408bfb47	Unify duration parsing It's actually happening in several places (and for flags, we use the standard Go time.Duration...). This at least reduces all our home-grown parsing to one place (in model).	2016-01-29 15:41:50 +01:00
Fabian Reinartz	a6935024e1	Remove old WITH clause in alert printing	2016-01-26 15:45:27 +01:00
Fabian Reinartz	b0adfea8d5	Fix swapped constants, improve instrumentation	2016-01-21 12:15:29 +01:00
Fabian Reinartz	a8c38c3ac5	Don't log rule evaluation failure on shutdown	2016-01-18 17:34:25 +01:00
Fabian Reinartz	6eee86dce8	Terminate rule groups during initial sleep When an evaluation group runs initially, it waits a deterministic amount of time. During that time it also has to accept a termination signal so shutdown doesn't hang during the first evaluation iteration after a configuration reload. Fixes #1307	2016-01-12 10:54:09 +01:00
Fabian Reinartz	26eb3ac2f8	Don't skip recording rule errors	2016-01-12 10:26:06 +01:00
Fabian Reinartz	37d80c4b25	Fix premature rule evaluation This commit prevents rule evaluation from starting until after the storage is ready.	2016-01-08 17:51:22 +01:00
Fabian Reinartz	0cf3c6a9ef	Add comments, rename a method	2015-12-23 12:29:28 +01:00
Fabian Reinartz	bf6abac8f4	Send resolved notifications	2015-12-17 15:42:26 +01:00
Fabian Reinartz	f69e668fc4	Improve rules/ instrumentation This commit adds a counter for the total number of rule evaluations and standardizes the units to seconds.	2015-12-17 15:42:26 +01:00
Fabian Reinartz	62075aa037	Reduce noisy no-alertmanager warning	2015-12-17 15:42:26 +01:00
Fabian Reinartz	52e5224f5a	Refactor rules/ package	2015-12-17 15:42:25 +01:00
Fabian Reinartz	e4fabe135a	Set StartsAt to time of first firing state	2015-12-17 11:36:58 +01:00
Fabian Reinartz	7c90db22ed	Use annotation based alerts in rules/ This commit breaks the previously used alert format.	2015-12-14 10:16:07 +01:00
Fabian Reinartz	e114ce0ff7	Refactor notification handler	2015-12-11 15:17:32 +01:00
Fabian Reinartz	e3b6ec9784	Switch to common/log	2015-10-03 10:21:43 +02:00
Fabian Reinartz	171f50706a	Fix unkeyed field errors.	2015-09-18 17:00:08 +02:00
Brian Brazil	4d196fea6b	Merge pull request #1032 from prometheus/scalar-metric rules: Allow for setting labels on LHS on scalars	2015-08-26 16:56:16 +01:00
Brian Brazil	3bcdb2bbba	rules: Allow for setting labels on LHS on scalars	2015-08-26 16:54:28 +01:00
Julius Volz	995d3b831d	Fix most golint warnings. This is with `golint -min_confidence=0.5`. I left several lint warnings untouched because they were either incorrect or I felt it was better not to change them at the moment.	2015-08-26 12:44:46 +02:00
Fabian Reinartz	d6b8da8d43	Switch promql types to common/model	2015-08-25 13:49:14 +02:00
Brian Brazil	fdf0d0642e	Cast value to float, as that's what the console templates expect.	2015-08-24 16:59:08 +01:00
Fabian Reinartz	438e232c9b	Fix grouping of import blocks	2015-08-22 09:42:45 +02:00
Fabian Reinartz	306e8468a0	Switch from client_golang/model to common/model	2015-08-21 13:33:38 +02:00
Brian Brazil	e6a67476c2	rules: Allow recorded rules expressions to be scalars. This is useful if you want to build up a constant metric, such as a set of alert thresholds that vary by label value.	2015-08-19 21:09:00 +01:00
Fabian Reinartz	7a67472fc1	Resolve relative paths on configuration loading This moves the concern of resolving the files relative to the config file into the configuration loading itself. It also fixes #921 which did not load the cert and token files relatively.	2015-08-05 18:08:04 +02:00
Fabian Reinartz	feb8a03503	rules: load rule files relative to a base dir	2015-07-03 15:10:37 +02:00
Julius Volz	fcff35b43e	Consolidate external reachability flags into one. Besides fixing https://github.com/prometheus/prometheus/issues/805 by making the entire externally reachable server URL configurable, this adds tests for the "globalURL" template function and makes it easier to test other such functions in the future. This breaks the `web.Hostname` flag (and introduces `web.external-url`). This flag is likely only used by few users, so I hope that's justifiable. Fixes https://github.com/prometheus/prometheus/issues/805	2015-07-03 13:39:10 +02:00
Fabian Reinartz	f06cf664e1	rules: cleanup alerting test	2015-06-30 18:22:24 +02:00
Fabian Reinartz	9bd4f6d017	rules: preserve alert state across reloads.	2015-06-30 11:32:07 +02:00
Fabian Reinartz	4625485b84	rules: move rules.go contents to manager.go	2015-06-30 11:32:07 +02:00
Fabian Reinartz	749ae450c5	promql: add runbook to alert statement. This commit adds the RUNBOOK keyword to alert statements. The field is optional and expected to be a link.	2015-06-25 13:00:52 +02:00
Julius Volz	d868264bb8	Improve UI of /alerts page. Changes to the UI: - "Active Since" timestamps are now human-readable. - Alerting rules are now pretty-printed better. - Labels are no longer just strings, but alert bubbles (like we do on the status page for base labels). - Alert states and target health states are now capitalized in the presentation layer rather than at the source.	2015-06-23 18:48:45 +02:00
Fabian Reinartz	fe301d7946	promql: remove global flags	2015-06-15 19:01:06 +02:00
Fabian Reinartz	5e13880201	General cleanup of rules.	2015-06-06 21:40:52 +02:00
Fabian Reinartz	75c920c95e	Remove DotGraph method from Rule interface	2015-06-06 21:35:59 +02:00
Fabian Reinartz	83d07516e8	Remove EvalRaw methods from Rule interface	2015-06-06 21:34:09 +02:00
Fabian Reinartz	280d11dca8	main: exit on invalid rule files on startup.	2015-06-02 18:44:41 +02:00
Fabian Reinartz	0de6edbdfc	Move pkg/ to util/	2015-06-01 21:12:32 +02:00
Fabian Reinartz	02717e6fde	Remove generic set type	2015-06-01 21:12:32 +02:00
Fabian Reinartz	dbc0d30e3e	Move string functionality to pkg/strutil	2015-06-01 21:12:32 +02:00
Fabian Reinartz	f45a5cab60	Move templates package to pkg/template	2015-06-01 21:12:31 +02:00
Fabian Reinartz	c44ac7bc26	Load rule files from entire directories	2015-06-01 21:12:31 +02:00
Julius Volz	d7c015c149	Convert pathPrefix to not have trailing slash.	2015-06-01 12:43:17 +02:00
Julius Volz	ff53d10849	Fix double slash in GeneratorURL sent to alertmanager. Fixes https://github.com/prometheus/prometheus/issues/722	2015-05-23 19:16:57 +02:00
Julius Volz	267fd34156	Switch Prometheus to use github.com/prometheus/log. This change is conceptually very simple, although the diff is large. It switches logging from "github.com/golang/glog" to "github.com/prometheus/log", while not actually changing any log messages. V(1)-style logging has been changed to be log.Debug*().	2015-05-20 18:19:32 +02:00
Fabian Reinartz	e2ed921505	Merge branch 'master' into fabxc/servdisc	2015-05-20 14:13:08 +02:00
Mitsuhiro Tanda	3e914a8cb1	fix graph links with path prefix	2015-05-19 02:45:05 +09:00
Fabian Reinartz	bb540fd9fd	Implement config reloading on SIGHUP. With this commit, sending SIGHUP to the Prometheus process will reload and apply the configuration file. The different components attempt to handle failing changes gracefully.	2015-05-13 16:49:46 +02:00
Fabian Reinartz	fe935179cd	Stop routing rule statements through the engine.	2015-04-29 18:01:43 +02:00
Fabian Reinartz	8d7c479fed	Merge pull request #658 from prometheus/fabxc/pql/rules-manager Rename RuleManager to Manager, remove interface.	2015-04-29 16:54:21 +02:00
Fabian Reinartz	479891c9be	Rename RuleManager to Manager, remove interface. This commit renames the RuleManager to Manager as the package name is 'rules' now. The unused layer of abstraction of the RuleManager interface is removed.	2015-04-29 16:42:10 +02:00
Fabian Reinartz	25cdff3527	Remove `name` arg from `Parse*` functions, enhance parsing errors.	2015-04-29 16:38:41 +02:00
Fabian Reinartz	3ca11bcaf5	Switch Prometheus to promql package. This commit removes all functionality from rules/ that is now handled in promql/. All parts of Prometheus are changed to use the promql/ package.	2015-04-28 16:19:23 +02:00
Ceesjan Luiten	0e18784c64	Make all paths absolute to support proxies	2015-04-02 20:36:47 +02:00
Brian Brazil	941f585164	Avoid +InfYs and similar, just display +Inf.	2015-03-28 18:51:41 +00:00
beorn7	a075900f9a	Merge branch 'beorn7/persistence' into beorn7/ingestion-tweaks	2015-03-18 19:09:31 +01:00
Fabian Reinartz	624f27f4b6	Add ln, log2, log10 and exp functions to the query language.	2015-03-16 18:26:19 +01:00
Julius Volz	b2651027fc	Fix special value handling in division and modulo. This fixes https://github.com/prometheus/prometheus/issues/597	2015-03-16 14:23:40 +01:00
beorn7	be11cb2b07	Remove the sample ingestion channel. The one central sample ingestion channel has caused a variety of trouble. This commit removes it. Targets and rule evaluation call an Append method directly now. To incorporate multiple storage backends (like OpenTSDB), storage.Tee forks the Append into two different appenders. Note that the tsdb queue manager had its own queue anyway. It was a queue after a queue... Much queue, so overhead... Targets have their own little buffer (implemented as a channel) to avoid stalling during an http scrape. But a new scrape will only be started once the old one is fully ingested. The contraption of three pipelined ingesters was removed. A Target is an ingester itself now. Despite more logic in Target, things should be less confusing now. Also, remove lint and vet warnings in ast.go.	2015-03-15 14:08:22 +01:00
beorn7	13fcf1ddbc	Implement double-delta encoded chunks.	2015-03-05 20:33:26 +01:00
beorn7	9e85ab0eef	Apply the new signature/fingerprinting functions from client_golang. This requires the new version of client_golang (vendoring will follow in the next commit), which changes the fingerprinting for clientmodel.Metric.	2015-03-03 18:34:01 +01:00
Fabian Reinartz	182de6b99f	Fix unary +/- expressions. Unary expressions cause parsing errors if they are done in the lexer by tokenizing them into the number. This fix moves unary expressions to the parser.	2015-03-03 13:30:08 +01:00
Fabian Reinartz	6f754073d5	Add OR operation and vector matching options. This commit implements the OR operation between two vectors. Vector matching using the ON clause is added to limit the set of labels that define a match between two elements. Group modifiers (GROUP_LEFT/GROUP_RIGHT) to request many-to-one matching are added.	2015-03-03 11:35:10 +01:00
Julius Volz	0ac931aed1	Also support parsing float formats like "2.".	2015-03-02 12:58:05 +01:00
Julius Volz	c2ab54e9a6	Support scientific notation and special float values. This adds support for scientific notation in the expression language, as well as for all possible literal forms of +Inf/-Inf/NaN. TODO: Keep enough state in the parser/lexer to distinguish contexts in which "Inf", "NaN", etc. should be parsed as a number vs. parsed as a label name. Currently, foo{nan="bar"} would be a syntax error. However, that is an existing bug for all our reserved words. E.g. foo{sum="bar"} is a syntax error as well. This should be fixed separately.	2015-03-01 19:31:16 +01:00
beorn7	1a61bcae07	Fix plural of 'histogram'. Actually, 'histogram' is Ancient Greek and 3rd declension... ;-)	2015-02-23 15:29:26 +01:00
beorn7	17443d288b	Avoid copying of the COWMetric if we already have the metric available.	2015-02-22 01:04:52 +01:00
beorn7	9e7c3e3bcd	Add the histogram_quantile function. Since we are now getting really deep into floating point calculation, the tests had to take into account the precision loss. Since the rule tests are based on direct line matching in the output, implementing the "almost equal" semantics was pretty cumbersome, but here we are.	2015-02-22 01:04:51 +01:00
Julius Volz	42601acfde	Replace labelsToKey() with metric Fingerprint (fixes grouping bug).	2015-02-21 17:45:47 +01:00
Julius Volz	7fefccd929	Write() directly into hash and use model.SeparatorByte.	2015-02-21 17:19:13 +01:00
Julius Volz	645cf57bed	Fix aggregation grouping key calculation.	2015-02-21 14:05:50 +01:00
Julius Volz	15b2b5aa66	Add tests for invalid uses of "offset".	2015-02-18 02:56:40 +01:00
Julius Volz	67e20acc6c	Lower-case some package-internal names.	2015-02-18 02:45:54 +01:00
Julius Volz	72d7b325a1	Implement offset operator. This allows changing the time offset for individual instant and range vectors in a query. For example, this returns the value of `foo` 5 minutes in the past relative to the current query evaluation time: foo offset 5m Note that the `offset` modifier always needs to follow the selector immediately. I.e. the following would be correct: sum(foo offset 5m) // GOOD. While the following would be incorrect: sum(foo) offset 5m // INVALID. The same works for range vectors. This returns the 5-minutes-rate that `foo` had a week ago: rate(foo[5m] offset 1w) This change touches the following components: * Lexer/parser: additions to correctly parse the new `offset`/`OFFSET` keyword. * AST: vector and matrix nodes now have an additional `offset` field. This is used during their evaluation to adjust query and result times appropriately. * Query analyzer: now works on separate sets of ranges and instants per offset. Isolating different offsets from each other completely in this way keeps the preloading code relatively simple. No storage engine changes were needed by this change. The rules tests have been changed to not probe the internal implementation details of the query analyzer anymore (how many instants and ranges have been preloaded). This would also become too cumbersome to test with the new model, and measuring the result of the query should be sufficient. This fixes https://github.com/prometheus/prometheus/issues/529 This fixed https://github.com/prometheus/promdash/issues/201	2015-02-18 02:41:27 +01:00
Brian Brazil	60271d58bf	Change the 2nd argument of round to toNearest. This is more useful if you want to get a multiple of 2 or 5, while still working for .001.	2015-02-05 16:13:40 +00:00
Julius Volz	82613527f3	Remove unnecessary float64() conversion in round().	2015-02-05 15:14:05 +01:00
Marko Mikulicic	8fdacbdf17	Add floor, ceil and round functions. Closes #402	2015-02-04 17:20:56 +01:00
Fabian Reinartz	fa1e90003b	Query timeout added. This is related to #454. Queries now timeout after a duration set by the -query.timeout flag. The TotalEvalTimer is now started/stopped inside any of the ast.Eval* functions.	2015-02-03 08:04:27 +01:00
Bjoern Rabenstein	26e22e6ad6	Fix rule manager shutdown.	2015-01-29 15:05:10 +01:00
Julius Volz	d4374a9265	More efficient JSON query result format. This depends on https://github.com/prometheus/client_golang/pull/51. For vectors, the result format looks like this: ```json { "version": 1, "type" : "vector", "value" : [ { "timestamp" : 1421765411.045, "value" : "65.475000", "metric" : { "quantile" : "0.5", "instance" : "http://localhost:9090/metrics", "job" : "prometheus", "__name__" : "http_request_duration_microseconds", "handler" : "/static/", "method" : "get", "code" : "304" } }, { "timestamp" : 1421765411.045, "value" : "5826.339000", "metric" : { "quantile" : "0.9", "instance" : "http://localhost:9090/metrics", "job" : "prometheus", "__name__" : "http_request_duration_microseconds", "handler" : "prometheus", "method" : "get", "code" : "200" } }, /* ... */ ] } ``` For matrices, it looks like this: ```json { "version": 1, "type" : "matrix", "value" : [ { "metric" : { "quantile" : "0.99", "instance" : "http://localhost:9090/metrics", "job" : "prometheus", "__name__" : "http_request_duration_microseconds", "handler" : "/static/", "method" : "get", "code" : "200" }, "values" : [ [ 1421765547.659, "29162.953000" ], [ 1421765548.659, "29162.953000" ], [ 1421765549.659, "29162.953000" ], /* ... */ ] } ] } ```	2015-01-26 13:06:22 +01:00
Brian Brazil	a31730e88b	Make 2nd arg to delta optional. Add a deriv() function. The 2nd isCounter argument to delta is ugly, make it optional as the first step of deprecating it. This will make delta only ever applied to gauges. Add a deriv function to calculate the least squares slope of a gauge. This is more useful for prediction than delta, as it isn't as heavily influenced by outliers at the boundaries.	2015-01-23 14:50:27 +00:00
Bjoern Rabenstein	5859b74f1b	Clean up license issues. - Move CONTRIBUTORS.md to the more common AUTHORS. - Added the required NOTICE file. - Changed "Prometheus Team" to "The Prometheus Authors". - Reverted the erroneous changes to the Apache License.	2015-01-21 20:07:45 +01:00
Bjoern Rabenstein	b09453af1d	Adjust to new client_golang API.	2015-01-21 15:42:25 +01:00
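
The vector result format quoted in commit d4374a9265 above can be read with a few lines of code. The following is only a minimal sketch in Python, not code from the repository: the helper name `parse_vector` and the trimmed `SAMPLE` payload are assumptions made for illustration, and the field names (`version`, `type`, `value`, `timestamp`, `metric`) are taken from the commit message as listed here. Note that the format encodes each sample value as a string, so it has to be converted explicitly.

```python
import json

# Illustrative payload shaped like the vector example in commit d4374a9265,
# trimmed to a single element with two labels (assumption: real responses carry more).
SAMPLE = """
{
  "version": 1,
  "type": "vector",
  "value": [
    {
      "timestamp": 1421765411.045,
      "value": "65.475000",
      "metric": {
        "__name__": "http_request_duration_microseconds",
        "quantile": "0.5"
      }
    }
  ]
}
"""


def parse_vector(payload):
    """Hypothetical helper: return (labels, timestamp, value) tuples from a vector result."""
    doc = json.loads(payload)
    if doc.get("type") != "vector":
        raise ValueError("expected a vector result, got %r" % doc.get("type"))
    # Sample values arrive as strings in this format, so convert them to floats.
    return [(e["metric"], e["timestamp"], float(e["value"])) for e in doc["value"]]


samples = parse_vector(SAMPLE)
for labels, ts, value in samples:
    print(labels["__name__"], ts, value)
```

A matrix result, as described in the same commit message, would need a different accessor, since each element there carries a `values` list of `[timestamp, value]` pairs instead of a single `timestamp` and `value`.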


320 commits