* Remove NewPossibleNonCounterInfo until it can be made more efficient, and avoid creating empty annotations as much as possible
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
When reading the WAL this method is called with buffers from a pool, on
multiple goroutines. Pre-allocating sufficient size avoids slow growth
and many reallocations in `append`.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
On a 32 bit architecture the size of int is 32 bits. Thus converting from
int64, uint64 can overflow it and flip the sign.
Try for yourself in playground:
package main
import "fmt"
func main() {
x := int64(0x1F0000001)
y := int64(1)
z := int32(x - y) // numerically this is 0x1F0000000
fmt.Printf("%v\n", z)
}
Prints -268435456 as if x was smaller.
Followup to #12650
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
* Additionally wrap WBL replay error
Although WBL replay is already wrapped with errLoadWbl,
there are other errors that can happen during a WBL replay.
We should not try to repair WAL in those cases.
This commit additionally wraps the final error in Head.Init again
with errLoadWbl so that WBL replay errors can be identified properly.
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Signed-off-by: Jesus Vazquez <jesusvzpg@gmail.com>
Co-authored-by: Jesus Vazquez <jesusvzpg@gmail.com>
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Additionally wrap WBL replay error
Although WBL replay is already wrapped with errLoadWbl,
there are other errors that can happen during a WBL replay.
We should not try to repair WAL in those cases.
This commit additionally wraps the final error in Head.Init again
with errLoadWbl so that WBL replay errors can be identified properly.
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Signed-off-by: Jesus Vazquez <jesusvzpg@gmail.com>
Co-authored-by: Jesus Vazquez <jesusvzpg@gmail.com>
Reverts change from https://github.com/prometheus/prometheus/pull/12906
The benchmarks show that it's slower when intersecting, which is a
common usage for ListPostings (when intersecting matchers from Head)
(old is before #12906, new is #12906):
│ old │ new │
│ sec/op │ sec/op vs base │
Intersect/LongPostings1-16 20.54µ ± 1% 21.11µ ± 1% +2.76% (p=0.000 n=20)
Intersect/LongPostings2-16 51.03m ± 1% 52.40m ± 2% +2.69% (p=0.000 n=20)
Intersect/ManyPostings-16 194.2m ± 3% 332.1m ± 1% +71.00% (p=0.000 n=20)
geomean 5.882m 7.161m +21.74%
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
It's implicit, but should be explicit. It is invalid to call At() after
a failed call to Next() or Seek().
Following up on https://github.com/prometheus/prometheus/pull/12906
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
The Next() call of ListPostings() was updating two values, while we can
just update the position. This is up to 30% faster for high number of
Postings.
goos: linux
goarch: amd64
pkg: github.com/prometheus/prometheus/tsdb/index
cpu: 11th Gen Intel(R) Core(TM) i7-11700K @ 3.60GHz
│ old │ new │
│ sec/op │ sec/op vs base │
ListPostings/count=100-16 819.2n ± 0% 732.6n ± 0% -10.58% (p=0.000 n=20)
ListPostings/count=1000-16 2.685µ ± 1% 2.017µ ± 0% -24.88% (p=0.000 n=20)
ListPostings/count=10000-16 21.43µ ± 1% 14.81µ ± 0% -30.91% (p=0.000 n=20)
ListPostings/count=100000-16 209.4µ ± 1% 143.3µ ± 0% -31.55% (p=0.000 n=20)
ListPostings/count=1000000-16 2.086m ± 1% 1.436m ± 1% -31.18% (p=0.000 n=20)
geomean 29.02µ 21.41µ -26.22%
We're talking about microseconds here, but they just keep adding.
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
This avoids situations where metrics are scraped before the data they
are trying to look at is initialized.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
The test was introduced in # but was changed during the code review and not reran with the faulty code since then.
Closes #
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Before cutting a new XOR chunk in case the chunk goes over the size
limit, check that the timestamp is in order and not equal or older
than the latest sample in the old chunk.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
TestHeadDetectsDuplcateSampleAtSizeLimit tests a regression where a
duplicate sample,is appended to the head, right when the head chunk is
at the size limit. The test adds all samples as duplicate, thus
expecting that the result has exactly half of the samples.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Return annotations (warnings and infos) from PromQL queries
This generalizes the warnings we have already used before (but only for problems with remote read) as "annotations".
Annotations can be warnings or infos (the latter could be false positives). We do not treat them different in the API for now and return them all as "warnings". It would be easy to distinguish them and return infos separately, should that appear useful in the future.
The new annotations are then used to create a lot of warnings or infos during PromQL evaluations. Partially these are things we have wanted for a long time (e.g. inform the user that they have applied `rate` to a metric that doesn't look like a counter), but the new native histograms have created even more needs for those annotations (e.g. if a query tries to aggregate float numbers with histograms).
The annotations added here are not yet complete. A prominent example would be a warning about a range too short for a rate calculation. But such a warnings is more tricky to create with good fidelity and we will tackle it later.
Another TODO is to take annotations into account when evaluating recording rules.
---------
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
Case a) empty span is at the beginning of the spans.
Case b) two consequtive empty spans with positive offsets.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
* Benchmark WBL
Extended WAL benchmark test with WBL parts too - added basic cases for
OOO handling - a percentage of series have a percentage of samples set
as OOO ones.
Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com>
* Fix handling of explicit counter reset header in histograms.
Explicit counter reset were being ignored.
Also there was no unit test coverage.
Add test case for the first sample in a chunk.
Add test case for non first sample in chunk.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
---------
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Otherwise we have a highly unusual situation of over 100 chunks
in the headChunks list of each series, which heavily skews
performance.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
promql: Extend testing framework to support native histograms
This includes both the internal testing framework as well as the rules unit test feature of promtool.
This also adds a bunch of basic tests. Many of the code level tests can now be converted to tests within the framework, and more tests can be added easily.
---------
Signed-off-by: Harold Dost <h.dost@criteo.com>
Signed-off-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>
Signed-off-by: Stephen Lang <stephen.lang@grafana.com>
Co-authored-by: Harold Dost <h.dost@criteo.com>
Co-authored-by: Stephen Lang <stephen.lang@grafana.com>
Co-authored-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>