prometheus

mirror of https://github.com/prometheus/prometheus.git synced 2024-11-15 10:04:07 -08:00

Author	SHA1	Message	Date
Peter Štibraný	ae49ab5ea8	Merge remote-tracking branch 'upstream/main' into update-upstream-prometheus	2022-07-13 10:18:09 +02:00
Matthieu MOREL	ddfa9a7cc5	refactor (rules): move from github.com/pkg/errors to 'errors' and 'fmt' (#10855 ) * refactor (rules): move from github.com/pkg/errors to 'errors' and 'fmt' Signed-off-by: Matthieu MOREL <mmorel-35@users.noreply.github.com>	2022-06-17 09:54:25 +02:00
Peter Štibraný	9d51bf50db	Merge upstream Prometheus	2022-06-09 11:29:19 +02:00
Julien Pivotto	3a56817a30	Rules: set otel status to ERROR when a rule fails (#10745 ) Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>	2022-05-25 10:06:17 +02:00
Julien Pivotto	0d94cdf107	rules: remove classic UI code (#10730 ) Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>	2022-05-23 16:21:50 +02:00
Łukasz Mierzwa	d3c9c4f574	Stop rule manager before TSDB is stopped (#10680 ) During shutdown TSDB is stopped before rule manager is stopped. Since TSDB shutdown can take a long time (minutes or 10s of minutes) it keeps rule manager running while parts of Prometheus are already stopped (most notably scrape manager). This can cause false positive alerts to fire, mostly those that rely on absent() calls since new sample appends will stop while alert queries are still evaluated. Stop rules before stopping TSDB and scrape manager to avoid this problem. Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>	2022-05-20 23:26:06 +02:00
Jesus Vazquez	48aa5cd096	Merge remote-tracking branch 'upstream/main' into jvp/merge-prometheus-main	2022-04-12 16:40:00 +02:00
Wilbert Guo	83a2e52bc2	Add SyncForState Implementation for Ruler HA (#10070 ) * continuously syncing activeAt for alerts Signed-off-by: Yijie Qin <qinyijie@amazon.com> Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> * add import Signed-off-by: Yijie Qin <qinyijie@amazon.com> Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> * Refactor SyncForState and add unit tests Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> * Format code Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> * Add hook for syncForState Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> Fix go lint Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> Refactor syncForState override implementation Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> Add syncForState override func as argument to Update() Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> Fix go formatting Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> Fix circleci test errors Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> Remove overrideFunc as argument to run() Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> * remove the syncForState Signed-off-by: Yijie Qin <qinyijie@amazon.com> * use the override function to decide if need to replace the activeAt or not Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fix test case Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fix format Signed-off-by: Yijie Qin <qinyijie@amazon.com> * Trigger build Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fixing comments Signed-off-by: Yijie Qin <qinyijie@amazon.com> * return the result of map of alerts instead of single one Signed-off-by: Yijie Qin <qinyijie@amazon.com> * upper case the QueryforStateSeries Signed-off-by: Yijie Qin <qinyijie@amazon.com> * use a more generic rule group post process function type Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fix indentation Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fix gofmt Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fix lint Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fixing naming 
Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fix comments Signed-off-by: Yijie Qin <qinyijie@amazon.com> * add the lastEvalTimestamp as parameter Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fmt Signed-off-by: Yijie Qin <qinyijie@amazon.com> * change funcType to func Signed-off-by: Yijie Qin <qinyijie@amazon.com> Co-authored-by: Yijie Qin <qinyijie@amazon.com> Co-authored-by: Yijie Qin <63399121+qinxx108@users.noreply.github.com>	2022-03-29 02:16:46 +02:00
Alan Protasio	606ef33d91	Track and report Samples Queried per query We always track total samples queried and add those to the standard set of stats queries can report. We also allow optionally tracking per-step samples queried. This must be enabled both at the engine and query level to be tracked and rendered. The engine flag is exposed via a Prometheus feature flag, while the query flag is set when stats=all. Co-authored-by: Alan Protasio <approtas@amazon.com> Co-authored-by: Andrew Bloomgarden <blmgrdn@amazon.com> Co-authored-by: Harkishen Singh <harkishensingh@hotmail.com> Signed-off-by: Andrew Bloomgarden <blmgrdn@amazon.com>	2022-03-21 23:49:17 +01:00
Alvin Lin	cd739214dd	Log rule name when evaluating rule groups' Eval function logs anything (#10454 ) * Add benchmark test for rule group eval Signed-off-by: Alvin Lin <alvinlin@amazon.com>	2022-03-21 19:52:20 +01:00
Ganesh Vernekar	23ce9ad9f0	Introduce evaluation delay for rule groups (#155 ) * Allow having evaluation delay for rule groups Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Fix lint Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Move the option to ManagerOptions Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Include evaluation_delay in the group config Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Fix comments Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2022-03-14 13:20:07 +00:00
Matej Gera	2c61d29b2a	Tracing: Migrate to OpenTelemetry library (#9724 ) Signed-off-by: Matej Gera <matejgera@gmail.com>	2022-01-25 11:08:04 +01:00
Dimitar Dimitrov	16faee8b78	Account for repeating tenants when comparing rules	2022-01-21 14:21:50 +01:00
Dimitar Dimitrov	f17d3a71aa	Improve godoc of Group.SourceTenants()	2021-11-26 14:08:21 +01:00
Dimitar Dimitrov	a97576fc00	Ignore order when comparing the source tenants of two rule groups	2021-11-26 14:07:20 +01:00
Dimitar Dimitrov	75d3c11278	Repurpose FederatedContextFunc into GroupEvaluationContextFunc	2021-11-26 14:07:19 +01:00
Dimitar Dimitrov	42a7f1e210	Add some godocs to ManagerOptions	2021-11-25 13:47:32 +01:00
Dimitar Dimitrov	6ffb81244f	Add source_tenants fields to RuleGroup	2021-11-25 13:44:29 +01:00
Björn Rabenstein	4c56a193c5	Merge pull request #9478 from prometheus/beorn7/pkg-deprecation Move packages out of deprecated pkg directory	2021-11-09 11:09:16 +01:00
beorn7	c954cd9d1d	Move packages out of deprecated pkg directory This creates a new `model` directory and moves all data-model related packages over there: exemplar labels relabel rulefmt textparse timestamp value All the others are more or less utilities and have been moved to `util`: gate logging modetimevfs pool runtime Signed-off-by: beorn7 <beorn@grafana.com>	2021-11-09 08:03:10 +01:00
Bryan Boreham	26d8ae0e41	Rules: simplify map key for stale series detection The rules manager keeps a note of which series were generated by the last run, so it can write a stale marker to those that disappeared. Since the keys are not for human eyes, we can use a simpler format and save the effort of quoting label values. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2021-11-08 22:18:48 +01:00
Mateusz Gozdek	1a6c2283a3	Format Go source files using 'gofumpt -w -s -extra' Part of #9557 Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>	2021-11-02 19:52:34 +01:00
Levi Harrison	dc2f1993d8	Limit number of alerts or series produced by a rule (#9260 ) * Add limit to rules Signed-off-by: Levi Harrison <git@leviharrison.dev>	2021-09-15 09:48:26 +02:00
Levi Harrison	8c29046ab2	Remove unneeded state modifications Signed-off-by: Levi Harrison <git@leviharrison.dev>	2021-08-20 16:42:31 -04:00
Levi Harrison	b5f6f8fb36	Switched to go-kit/log Signed-off-by: Levi Harrison <git@leviharrison.dev>	2021-06-11 12:28:36 -04:00
Levi Harrison	17ea8d006a	Added external URL access Signed-off-by: Levi Harrison <git@leviharrison.dev>	2021-05-30 23:35:26 -04:00
Owen Diehl	23999df27c	expose rule metrics fields Signed-off-by: Owen Diehl <ow.diehl@gmail.com>	2021-04-30 13:36:44 -04:00
Goutham Veeramachaneni	2efdf660b1	Increase evaluation failures on Commit() (#8770 ) I think we should increment the metric here, we're setting the rule health anyways. This means even if the "evaluation" succeeded, none of the samples made it to storage. This is a simplified solution to: https://github.com/prometheus/prometheus/pull/8410/ Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>	2021-04-29 14:28:48 +02:00
Björn Rabenstein	9549a15c6f	Merge pull request #7675 from JessicaGreben/jg/11-retroactive-rule-eval Add rule importer to backfill	2021-03-29 19:09:21 +02:00
Goutham Veeramachaneni	4b5ab80ca6	[rule] Update rule health for append/commit fails (#8619 ) * [rule] Update rule health for append/commit fails Similar to https://github.com/prometheus/prometheus/pull/8410 will provide more context. Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com> * Add test for updating health on append fails Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>	2021-03-18 15:44:33 +01:00
jessicagreben	78e84aed89	resolve merge conflict Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2021-02-24 09:47:29 -08:00
Tom Wilkie	7369561305	Combine Appender.Add and AddFast into a single Append method. (#8489 ) This moves the label lookup into TSDB, whilst still keeping the cached-ref optimisation for repeated Appends. This makes the API easier to consume and implement. In particular this change is motivated by the scrape-time-aggregation work, which I don't think is possible to implement without it as it needs access to label values. Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>	2021-02-18 17:37:00 +05:30
jessicagreben	75654715d3	fix panics Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2020-11-01 07:54:04 -08:00
jessicagreben	6980bcf671	unexport backfiller Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2020-10-31 06:40:56 -07:00
jessicagreben	36ac0b68f1	merge master, fix conflicts	2020-10-17 08:20:21 -07:00
Łukasz Mierzwa	19c190b406	Add a rule_group_samples metric (#7977 ) This new metric allows tracking how many samples did each rule group generate. Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>	2020-09-25 16:48:38 +01:00
李国忠	4a52faf2ae	Unnecessary go routine spawn. (#7951 ) Signed-off-by: fuling <fuling.lgz@alibaba-inc.com>	2020-09-21 11:29:03 +01:00
jessicagreben	dfa510086b	add alignment, mv rule importer to promtool dir, add queryRange Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	2020-09-13 08:07:59 -07:00
Anonymous	8219b442c8	rules: Remove redundant RLock to avoid double RLock (#7183 ) Signed-off-by: BurtonQin <bobbqqin@gmail.com>	2020-09-07 18:58:21 +01:00
johncming	2f2a51a43a	web/api/v1: make names consistent. (#7841 ) Signed-off-by: johncming <johncming@yahoo.com>	2020-08-25 11:38:06 +01:00
Goutham Veeramachaneni	cb830b0a9c	Label rule_group_iterations metric with group name (#7823 ) * Label rule_group_iterations metric with group name evalTotal and evalFailures having the label but iterations not having it is an odd mismatch. Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com> * Remove the metrics when a group is deleted. Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com> * Initialise the metrics Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>	2020-08-19 15:29:13 +02:00
johncming	362080ba28	rules: add evaluationTimestamp when copy state. (#7775 ) Signed-off-by: johncming <johncming@yahoo.com>	2020-08-14 09:42:13 +01:00
Callum Styan	c7a17f6491	Add additional tag/log to rules Manager Eval trace span. (#7708 ) Signed-off-by: Callum Styan <callumstyan@gmail.com>	2020-08-06 08:42:20 -07:00
Annanay	7f98a744e5	Add context to Appender interface Signed-off-by: Annanay <annanayagarwal@gmail.com>	2020-07-24 19:40:51 +05:30
Owen Diehl	9ccedc0407	Arbitrary rule & group loading (#7569 ) * allows loading rule groups via an interface Signed-off-by: Owen Diehl <ow.diehl@gmail.com>	2020-07-22 15:19:34 +01:00
Julien Pivotto	b83cbacbdd	Rule manager: remove blocking channel in mail (#7631 ) Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-07-22 00:13:24 +02:00
Frederic Branczyk	d17d88935c	rules: Use narrower interface for rule manager loading of for state (#7472 ) To load ALERT_FOR_STATE only `storage.Queryable` interface is required, so this patch uses this narrower interface to perform this. Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com>	2020-06-26 19:06:36 +01:00
Kemal Akkoyun	66dfb951c4	: Consistent Error/Warning handling for SeriesSet iterator: Allowing Async Select (#7251 ) Add errors and Warnings to SeriesSet Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com> * Change Querier interface and refactor accordingly Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com> * Refactor promql/engine to propagate warnings at eval stage Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com> * Address review issues Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com> * Make sure all the series from all Selects are pre-advanced Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com> * Address review issues Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com> * Separate merge series sets Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com> * Clean Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com> * Refactor merge querier failure handling Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com> * Refactored and simplified fanout with improvements from incoming chunk iterator PRs. * Secondary logic is hidden, instead of weird failed series set logic we had. * Fanout is well commented * Fanout closing record all errors * MergeQuerier improved API (clearer) * deferredGenericMergeSeriesSet is not needed as we return no samples anyway for failed series sets (next = false). Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> * Fix formatting Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com> * Fix CI issues Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com> * Added final tests for error handling. Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> * Addressed Brian's comments. * Moved hints in populate to be allocated only when needed. * Used sync.Once in secondary Querier to achieve all-or-nothing partial response logic. * Select after first Next is done will panic. NOTE: in lazySeriesSet in theory we could just panic, I think however we can totally just return error, it will panic in expand anyway. 
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> * Utilize errWithWarnings Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com> * Fix recently introduced expansion issue Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com> * Add tests for secondary querier error handling Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com> * Implement lazy merge Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com> * Add name to test cases Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com> * Reorganize Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com> * Address review comments Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com> * Address review comments Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com> * Remove redundant warnings Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com> * Fix rebase mistake Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com> Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>	2020-06-09 17:57:31 +01:00
Julien Pivotto	fc3fb3265a	Merge pull request #7145 from prometheus/release-2.17 Backport release 2.17 into master	2020-04-20 14:08:12 +02:00
Chris Marchbanks	a7b449320d	Fix updating rule manager never finishing (#7138 ) Rather than sending a value to the done channel on a group to indicate whether or not to add stale markers to a closing rule group use an explicit boolean. This allows more functions than just run() to read from the done channel and fixes an issue where Eval() could consume the channel during an update, causing run() to never return. Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>	2020-04-18 14:32:18 +02:00

226 commits