prometheus

mirror of https://github.com/prometheus/prometheus.git synced 2024-11-14 09:34:05 -08:00

Author	SHA1	Message	Date
beorn7	c2e9a151ab	Make all rule links link to the "Console" tab rather than "Graph" Clicking on a rule, either the name or the expression, opens the rule result (or the corresponding expression, repsectively) in the expression browser. This should by default happen in the console tab, as, more often than not, displaying it in the graph tab runs into a timeout.	2017-09-21 18:28:00 +02:00
Julius Volz	ac203ef0ee	Add externalURL template function (#2716 ) This allows users to e.g. add links back to the generating Prometheus right in their alert templates.	2017-05-13 15:47:04 +02:00
Julius Volz	fe11c5933a	Fix mutation of active alert elements by notifier (#2656 ) This caused the external label application in the notifier to bleed back into the rule manager's active alerting elements.	2017-04-26 10:29:42 -05:00
Fabian Reinartz	e68a3cf21f	rules: update annotations on each iteration	2016-11-22 15:43:07 +01:00
Jonathan Lange	2a2da40223	Make rule evaluation publicly available Means that a third-party can parse rules and run them with their own execution model.	2016-11-18 16:12:50 +00:00
Julius Volz	c187308366	storage: Contextify storage interfaces. This is based on https://github.com/prometheus/prometheus/pull/1997. This adds contexts to the relevant Storage methods and already passes PromQL's new per-query context into the storage's query methods. The immediate motivation supporting multi-tenancy in Frankenstein, but this could also be used by Prometheus's normal local storage to support cancellations and timeouts at some point.	2016-09-19 16:29:07 +02:00
Julius Volz	ed5a0f0abe	promql: Allow per-query contexts. For Weaveworks' Frankenstein, we need to support multitenancy. In Frankenstein, we initially solved this without modifying the promql package at all: we constructed a new promql.Engine for every query and injected a storage implementation into that engine which would be primed to only collect data for a given user. This is problematic to upstream, however. Prometheus assumes that there is only one engine: the query concurrency gate is part of the engine, and the engine contains one central cancellable context to shut down all queries. Also, creating a new engine for every query seems like overkill. Thus, we want to be able to pass per-query contexts into a single engine. This change gets rid of the promql.Engine's built-in base context and allows passing in a per-query context instead. Central cancellation of all queries is still possible by deriving all passed-in contexts from one central one, but this is now the responsibility of the caller. The central query context is now created in main() and passed into the relevant components (web handler / API, rule manager). In a next step, the per-query context would have to be passed to the storage implementation, so that the storage can implement multi-tenancy or other features based on the contextual information.	2016-09-19 15:38:17 +02:00
Julius Volz	da7206ec29	Fix rule HTML escaping issues This was mentioned as part of https://github.com/prometheus/alertmanager/issues/452	2016-08-12 02:59:41 +02:00
Brian Brazil	6fc88d4b4d	Remove __name__ from alerts sent to AM. Fixes #1861	2016-08-01 23:32:41 +01:00
Brian Brazil	0509b0f2db	Expand alert templates at eval time. Fixes #1678 #1677	2016-07-12 17:13:55 +01:00
beorn7	b95c096a45	Fix style issues in rules/...	2016-05-19 16:59:53 +02:00
Fabian Reinartz	d89c254849	Make copying alerting state safer. This considers static labels in the equality of alerts to avoid falsely copying state from a different alert definition with the same name across reloads. To be safe, it also copies the state map rather than just its pointer so that remaining collisions disappear after one evaluation interval.	2016-03-02 12:21:54 +01:00
Tobias Schmidt	f1f8317fa5	Fix detection of flapping alerts Alerts in the resolve retention period must be transitioned to the active state again when their condition is met.	2016-02-04 23:55:12 -05:00
beorn7	a7408bfb47	Unify duration parsing It's actually happening in several places (and for flags, we use the standard Go time.Duration...). This at least reduces all our home-grown parsing to one place (in model).	2016-01-29 15:41:50 +01:00
Fabian Reinartz	a6935024e1	Remove old WITH clause in alert printing	2016-01-26 15:45:27 +01:00
Fabian Reinartz	0cf3c6a9ef	Add comments, rename a method	2015-12-23 12:29:28 +01:00
Fabian Reinartz	bf6abac8f4	Send resolved notifications	2015-12-17 15:42:26 +01:00
Fabian Reinartz	52e5224f5a	Refactor rules/ package	2015-12-17 15:42:25 +01:00
Fabian Reinartz	7c90db22ed	Use annotation based alerts in rules/ This commit breaks the previously used alert format.	2015-12-14 10:16:07 +01:00
Julius Volz	995d3b831d	Fix most golint warnings. This is with `golint -min_confidence=0.5`. I left several lint warnings untouched because they were either incorrect or I felt it was better not to change them at the moment.	2015-08-26 12:44:46 +02:00
Fabian Reinartz	d6b8da8d43	Switch promql types to common/model	2015-08-25 13:49:14 +02:00
Fabian Reinartz	306e8468a0	Switch from client_golang/model to common/model	2015-08-21 13:33:38 +02:00
Fabian Reinartz	749ae450c5	promql: add runbook to alert statement. This commit adds the RUNBOOK keyword to alert statements. The field is optional and expected to be a link.	2015-06-25 13:00:52 +02:00
Julius Volz	d868264bb8	Improve UI of /alerts page. Changes to the UI: - "Active Since" timestamps are now human-readable. - Alerting rules are now pretty-printed better. - Labels are no longer just strings, but alert bubbles (like we do on the status page for base labels). - Alert states and target health states are now capitalized in the presentation layer rather than at the source.	2015-06-23 18:48:45 +02:00
Fabian Reinartz	5e13880201	General cleanup of rules.	2015-06-06 21:40:52 +02:00
Fabian Reinartz	75c920c95e	Remove DotGraph method from Rule interface	2015-06-06 21:35:59 +02:00
Fabian Reinartz	83d07516e8	Remove EvalRaw methods from Rule interface	2015-06-06 21:34:09 +02:00
Fabian Reinartz	0de6edbdfc	Move pkg/ to util/	2015-06-01 21:12:32 +02:00
Fabian Reinartz	02717e6fde	Remove generic set type	2015-06-01 21:12:32 +02:00
Fabian Reinartz	dbc0d30e3e	Move string functionality to pkg/strutil	2015-06-01 21:12:32 +02:00
Julius Volz	d7c015c149	Convert pathPrefix to not have trailing slash.	2015-06-01 12:43:17 +02:00
Mitsuhiro Tanda	3e914a8cb1	fix graph links with path prefix	2015-05-19 02:45:05 +09:00
Fabian Reinartz	3ca11bcaf5	Switch Prometheus to promql package. This commit removes all functionality from rules/ that is now handled in promql/. All parts of Prometheus are changed to use the promql/ package.	2015-04-28 16:19:23 +02:00
Bjoern Rabenstein	5859b74f1b	Clean up license issues. - Move CONTRIBUTORS.md to the more common AUTHORS. - Added the required NOTICE file. - Changed "Prometheus Team" to "The Prometheus Authors". - Reverted the erroneous changes to the Apache License.	2015-01-21 20:07:45 +01:00
Julius Volz	c9618d11e8	Introduce copy-on-write for metrics in AST. This depends on changes in: https://github.com/prometheus/client_golang/tree/cow-metrics. Change-Id: I80b94833a60ddf954c7cd92fd2cfbebd8dd46142	2014-12-12 20:34:55 +01:00
Bjoern Rabenstein	b1e4956142	Apply a giant code cleanup. Essentially: - Remove unused code. - Make it 'go vet' clean. The only remaining warnings are in generated code. - Make it 'golint' clean. The only remaining warnings are in gerenated code. - Smoothed out same minor things. Change-Id: I3fe5c1fbead27b0e7a9c247fee2f5a45bc2d42c6	2014-12-10 16:16:49 +01:00
Bjoern Rabenstein	f5f9f3514a	Major code cleanup. - Make it go-vet and golint clean. - Add comments, TODOs, etc. Change-Id: If1392d96f3d5b4cdde597b10c8dff1769fcfabe2	2014-11-25 17:02:53 +01:00
Julius Volz	e7ed39c9a6	Initial experimental snapshot of next-gen storage. Change-Id: Ifb8709960dbedd1d9f5efd88cdd359ee9fa9d26d	2014-11-25 17:02:00 +01:00
Brian Brazil	f525ca5d9e	Let consoles get graph links from experssions. Rename ConsoleLinkFromExpression, as we now have consoles. Change-Id: I7ed2c9c83863adb390b51121dd9736845f7bcdfc	2014-11-25 17:01:59 +01:00
Brian Brazil	e041c0cd46	Add console and alert templates with access to all data. Move rulemanager to it's own package to break cicrular dependency. Make NewTestTieredStorage available to tests, remove duplication. Change-Id: I33b321245a44aa727bfc3614a7c9ae5005b34e03	2014-05-30 16:24:56 +01:00
Julius Volz	01f652cb4c	Separate storage implementation from interfaces. This was initially motivated by wanting to distribute the rule checker tool under `tools/rule_checker`. However, this was not possible without also distributing the LevelDB dynamic libraries because the tool transitively depended on Levigo: rule checker -> query layer -> tiered storage layer -> leveldb This change separates external storage interfaces from the implementation (tiered storage, leveldb storage, memory storage) by putting them into separate packages: - storage/metric: public, implementation-agnostic interfaces - storage/metric/tiered: tiered storage implementation, including memory and LevelDB storage. I initially also considered splitting up the implementation into separate packages for tiered storage, memory storage, and LevelDB storage, but these are currently so intertwined that it would be another major project in itself. The query layers and most other parts of Prometheus now have notion of the storage implementation anymore and just use whatever implementation they get passed in via interfaces. The rule_checker is now a static binary :) Change-Id: I793bbf631a8648ca31790e7e772ecf9c2b92f7a0	2014-04-16 13:30:19 +02:00
Julius Volz	740d448983	Use custom timestamp type for sample timestamps and related code. So far we've been using Go's native time.Time for anything related to sample timestamps. Since the range of time.Time is much bigger than what we need, this has created two problems: - there could be time.Time values which were out of the range/precision of the time type that we persist to disk, therefore causing incorrectly ordered keys. One bug caused by this was: https://github.com/prometheus/prometheus/issues/367 It would be good to use a timestamp type that's more closely aligned with what the underlying storage supports. - sizeof(time.Time) is 192, while Prometheus should be ok with a single 64-bit Unix timestamp (possibly even a 32-bit one). Since we store samples in large numbers, this seriously affects memory usage. Furthermore, copying/working with the data will be faster if it's smaller. MEMORY USAGE RESULTS Initial memory usage comparisons for a running Prometheus with 1 timeseries and 100,000 samples show roughly a 13% decrease in total (VIRT) memory usage. In my tests, this advantage for some reason decreased a bit the more samples the timeseries had (to 5-7% for millions of samples). This I can't fully explain, but perhaps garbage collection issues were involved. WHEN TO USE THE NEW TIMESTAMP TYPE The new clientmodel.Timestamp type should be used whenever time calculations are either directly or indirectly related to sample timestamps. For example: - the timestamp of a sample itself - all kinds of watermarks - anything that may become or is compared to a sample timestamp (like the timestamp passed into Target.Scrape()). When to still use time.Time: - for measuring durations/times not related to sample timestamps, like duration telemetry exporting, timers that indicate how frequently to execute some action, etc. NOTE ON OPERATOR OPTIMIZATION TESTS We don't use operator optimization code anymore, but it still lives in the code as dead code. It still has tests, but I couldn't get all of them to pass with the new timestamp format. I commented out the failing cases for now, but we should probably remove the dead code soon. I just didn't want to do that in the same change as this. Change-Id: I821787414b0debe85c9fffaeb57abd453727af0f	2013-12-03 09:11:28 +01:00
Julius Volz	35ee2cd3cb	Add alertmanager notification support to Prometheus. Alert definitions now also have mandatory SUMMARY and DESCRIPTION fields that get sent along a firing alert to the alert manager.	2013-07-30 17:23:41 +02:00
Matt T. Proud	30b1cf80b5	WIP - Snapshot of Moving to Client Model.	2013-06-25 15:52:42 +02:00
Julius Volz	8ee7947b1e	Ensure metric name is dropped correctly from alert labels in UI.	2013-06-14 13:03:19 +02:00
Julius Volz	0226d1ac7a	Implement alerts dashboard and expression console links.	2013-06-13 22:35:40 +02:00
Julius Volz	74cb676537	Implement Stringer interface for rules and all their children.	2013-06-07 15:54:32 +02:00
Julius Volz	51689d965d	Add debug timers to instant and range queries. This adds timers around several query-relevant code blocks. For now, the query timer stats are only logged for queries initiated through the UI. In other cases (rule evaluations), the stats are simply thrown away. My hope is that this helps us understand where queries spend time, especially in cases where they sometimes hang for unusual amounts of time.	2013-06-05 18:32:54 +02:00
Julius Volz	5b105c77fc	Repointerize fingerprints.	2013-05-21 14:28:14 +02:00
Matt T. Proud	8f4c7ece92	Destroy naked returns in half of corpus. The use of naked return values is frowned upon. This is the first of two bulk updates to remove them.	2013-05-16 10:53:25 +03:00

1 2

54 commits