This fixes#6992, which was introduced by #6777. There was an
intermediate component which translated TSDB errors into storage errors,
but that component was deleted and this bug went unnoticed, until we
were watching at the Prombench results. Without this, scrape will fail
instead of dropping samples or using "Add" when the series have been
garbage collected.
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
With defer having less of a performance penalty, there is no reason
not to do those crucial operations via defer.
Context: With isolation in place, if we forget to Commit/Rollback, the
low watermark will get stuck forever.
The current code should not have any bugs, but moving to defer helps
to avoid future bugs.
This is also moving the `closeAppend` in the `Commit` implementation
itself to defer. If logging to the WAL fails, we would have missed the
`closeAppend`.
Signed-off-by: beorn7 <beorn@grafana.com>
This is technically BREAKING CHANGE, but it was like this from the beginning: I just notice that we rely in
Prometheus on remote read being sorted. This is because we use selected data from remote reads in MergeSeriesSet
which rely on sorting.
I found during work on https://github.com/prometheus/prometheus/pull/5882 that
we do so many repetitions because of this, for not good reason. I think
I found a good balance between convenience and readability with just one method.
Smaller the interface = better.
Also I don't know what TestSelectSorted was testing, but now it's testing sorting.
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
This is part of https://github.com/prometheus/prometheus/pull/5882 that can be done to simplify things.
All todos I added will be fixed in follow up PRs.
* querier.Querier, querier.Appender, querier.SeriesSet, and querier.Series interfaces merged
with storage interface.go. All imports that.
* querier.SeriesIterator replaced by chunkenc.Iterator
* Added chunkenc.Iterator.Seek method and tests for xor implementation (?)
* Since we properly handle SelectParams for Select methods I adjusted min max
based on that. This should help in terms of performance for queries with functions like offset.
* added Seek to deletedIterator and test.
* storage/tsdb was removed as it was only a unnecessary glue with incompatible structs.
No logic was changed, only different source of abstractions, so no need for benchmarks.
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
The function HoldDuration and Duration did the exact same thing.
Let's only keep HoldDuration() as Duration() is more confusing.
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
Since we use ActiveQueryTracker to check for concurrency in
d992c36b3a it does not make sense to keep
the MaxConcurrent value as an option of the PromQL engine.
This pull request removes it from the PromQL engine options, sets the
max concurrent metric to -1 if there is no active query tracker, and use
the value of the active query tracker otherwise.
It removes dead code and also will inform people who import the promql
package that we made that change, as it breaks the EngineOpts struct.
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
* Query Log: add origin of the rules
We don't set rule name and rule kind because the added value would be
quite low, given we have now the file, the group and the query.
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
Fixes https://github.com/prometheus/common/issues/36
Move logic handling this into the labels package,
so all the cases are handled in one place and we're
less likely to have this come up again.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
With the next release of client_golang, Summaries will not have
objectives by default. To not lose the objectives we have right now,
explicitly state the current default objectives.
Signed-off-by: beorn7 <beorn@grafana.com>
* Working group name
Signed-off-by: Pritam Bhudia <pritam.bhudia@baesystems.com>
* Working categorised by group name
Signed-off-by: Pritam Bhudia <pritam.bhudia@baesystems.com>
* Changed group sorting in web
Signed-off-by: Pritam Bhudia <pritam.bhudia@baesystems.com>
* Fixed group sorting and comments
Signed-off-by: Pritam Bhudia <pritam.bhudia@baesystems.com>
* Fixed group sorting and comments with gofmt
Signed-off-by: Pritam Bhudia <pritam.bhudia@baesystems.com>
* Added file and group name
Signed-off-by: Pritam Bhudia <pritam.bhudia@baesystems.com>
* reverted back to full path to yml file
Signed-off-by: Pritam Bhudia <pritam.bhudia@baesystems.com>
i) Uses the more idiomatic Wrap and Wrapf methods for creating nested errors.
ii) Fixes some incorrect usages of fmt.Errorf where the error messages don't have any formatting directives.
iii) Does away with the use of fmt package for errors in favour of pkg/errors
Signed-off-by: tariqibrahim <tariq181290@gmail.com>
* reload: copy state on both name and labels
Fix https://github.com/prometheus/prometheus/issues/5193
Using just name causes the linked issue - if new rules are inserted with
the same name (but different labels), the reordering will cause stale
markers to be inserted in the next eval for all shifted rules, despite
them not being stale.
Ideally we want to avoid stale markers for time series that still exist
in the new rules, with name and labels being the unique identifer.
This change adds labels to the internal map when copying the old rule
data to the new rule data. This prevents the problem of staling rules
that simply shifted order.
If labels change, it is a new time series and the old series will stale
regardless. So it should be safe to always match on name and labels when
copying state.
Signed-off-by: James Ravn <james@r-vn.org>
The previous code was defective in that it never sorted groups within a
file due to doing a multi-key sort incorrectly.
Signed-off-by: David Symonds <dsymonds@gmail.com>
* *: use latest release of staticcheck
It also fixes a couple of things in the code flagged by the additional
checks.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Use official release of staticcheck
Also run 'go list' before staticcheck to avoid failures when downloading packages.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Add evaluationTimestamp (Last Evaluation) column to display on /rules
Signed-off-by: Will Hegedus <wbhegedus@liberty.edu>
* Add lastScrapeDuration ("Scrape Duration") to display on /targets
Signed-off-by: Will Hegedus <wbhegedus@liberty.edu>
* Updates based on Julius' feedback
Signed-off-by: Will Hegedus <wbhegedus@liberty.edu>
* Update to set timestamp to when eval started (after eval completes)
Signed-off-by: Will Hegedus <wbhegedus@liberty.edu>
* Update /rules to display time since last evaluation
Signed-off-by: Will Hegedus <wbhegedus@liberty.edu>
* Re-order Last Eval/Eval Time to be consistent with targets page
Signed-off-by: Will Hegedus <wbhegedus@liberty.edu>
There are many more (mostly finalizers like Close/Stop/etc.), but most of
the others seemed like one couldn't do much about them anyway.
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* adding information about the health and errors for Rules
adding Health() and LastError() to the Rule interface. This will allow
us to easily surface information about rules.
Signed-off-by: noqcks <benny@noqcks.io>
* updating rules.html with fields for Rule errors and health state
Signed-off-by: noqcks <benny@noqcks.io>
* fix code comment grammar & access Rule health/error info using a mutex
Signed-off-by: noqcks <benny@noqcks.io>
* s/Errors/Error/ in rules.html to remain consistent with targets.html
Signed-off-by: noqcks <benny@noqcks.io>
* adding periods to code comments in reporting/alerting
Signed-off-by: noqcks <benny@noqcks.io>
* putting health/error below mutex in struct field
Signed-off-by: noqcks <benny@noqcks.io>
Previously it would set no preconditions and check no postconditions,
as the `groups` member was empty.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
This adds a parameter to the storage selection interface which allows
query engine(s) to pass information about the operations surrounding a
data selection.
This can for example be used by remote storage backends to infer the
correct downsampling aggregates that need to be provided.
* Use testutil in rules subpackage
* Fix manager test
* Use testutil in rules subpackage
* Fix manager test
* Fix rebase
* Change to testutil for applyConfig tests
* Re-add contexts to storage.Storage.Querier()
These are needed when replacing the storage by a multi-tenant
implementation where the tenant is stored in the context.
The 1.x query interfaces already had contexts, but they got lost in 2.x.
* Convert promql.Engine to use native contexts
Clicking on a rule, either the name or the expression, opens the rule
result (or the corresponding expression, repsectively) in the
expression browser. This should by default happen in the console tab,
as, more often than not, displaying it in the graph tab runs into a
timeout.