The current implementation of `InternStrings` will only save memory
when the whole set of labels is identical to one already seen, and this
cannot happen in the one place it is called from in Prometheus,
remote-write, which already detects identical series.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
This PR is a reference implementation of the proposal described in #10420.
In addition to what described in #10420, in this PR I've introduced labels.StableHash(). The idea is to offer an hashing function which doesn't change over time, and that's used by query sharding in order to get a stable behaviour over time. The implementation of labels.StableHash() is the hashing function used by Prometheus before stringlabels, and what's used by Grafana Mimir for query sharding (because built before stringlabels was a thing).
Follow up work
As mentioned in #10420, if this PR is accepted I'm also open to upload another foundamental piece used by Grafana Mimir query sharding to accelerate the query execution: an optional, configurable and fast in-memory cache for the series hashes.
Signed-off-by: Marco Pracucci <marco@pracucci.com>
This function is called very frequently when executing PromQL functions,
and we can do it much more efficiently inside Labels.
In the common case that `__name__` comes first in the labels, we simply
re-point to start at the next label, which is nearly free.
`DropMetricName` is now so cheap I removed the cache - benchmarks show
everything still goes faster.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Optimize histogram iterators
Histogram iterators allocate new objects in the AtHistogram and
AtFloatHistogram methods, which makes calculating rates over long
ranges expensive.
In #13215 we allowed an existing object to be reused
when converting an integer histogram to a float histogram. This commit follows
the same idea and allows injecting an existing object in the AtHistogram and
AtFloatHistogram methods. When the injected value is nil, iterators allocate
new histograms, otherwise they populate and return the injected object.
The commit also adds a CopyTo method to Histogram and FloatHistogram which
is used in the BufferedIterator to overwrite items in the ring instead of making
new copies.
Note that a specialized HPoint pool is needed for all of this to work
(`matrixSelectorHPool`).
---------
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Add warnings for histogramRate applied with isCounter not matching counter/gauge histogram
---------
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
The slices package is added to the standard library in Go 1.21;
we need to import from the exp area to maintain compatibility with Go 1.20.
Signed-off-by: tyltr <tylitianrui@126.com>
They are used in multiple repos, so common is a better place for them.
Several packages now don't depend on `model/textparse`, e.g.
`storage/remote`.
Also remove `metadata` struct from `api.go`, since it was identical to
a struct in the `metadata` package.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
If `cfg.TargetLabel` is a template like `$1`, it won't match any labels,
so no point calling `lb.Del` with it.
Similarly if `target` is not a valid label name, it won't match any
labels, so don't call with that either.
The intention seems to have been that a blank _value_ would delete the
target, so change that code to use `target` instead of `cfg.TargetLabel`.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Thought this would be a nice check on the `Validate()` function, but
some of the test cases needed tweaking to pass.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
For `Lowercase`, `KeepEqual`, etc., we do not expand a regexp, so
the target label name must not contain anything like `${1}`.
Also for the common case that the `Replace` target does not require any
template expansion, check that the entire string passes label name
validity rules.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Append created timestamps.
Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com>
* Log when created timestamps are ignored
Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com>
* Proposed changes to Append CT PR.
Changes:
* Changed textparse Parser interface for consistency and robustness.
* Changed CT interface to be more explicit and handle validation.
* Simplified test, change scrapeManager to allow testability.
* Added TODOs.
Signed-off-by: bwplotka <bwplotka@gmail.com>
* Updates.
Signed-off-by: bwplotka <bwplotka@gmail.com>
* Addressed comments.
Signed-off-by: bwplotka <bwplotka@gmail.com>
* Refactor head_appender test
Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com>
* Fix linter issues
Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com>
* Use model.Sample in head appender test
Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com>
---------
Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com>
Signed-off-by: bwplotka <bwplotka@gmail.com>
Co-authored-by: bwplotka <bwplotka@gmail.com>
The 'ToFloat' method on integer histograms currently allocates new memory
each time it is called.
This commit adds an optional *FloatHistogram parameter that can be used
to reuse span and bucket slices. It is up to the caller to make sure the
input float histogram is not used anymore after the call.
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
This reduces bulk and should avoid issues if a fix is made in one file
and not the other.
A few methods now call `Range()` instead of `range`, but nothing
performance-sensitive.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Fix and improve ingesting exemplars for native histograms.
See code comment for a detailed explanation of the algorithm.
Note that this changes the current behavior for all kind of samples slightly: We now allow exemplars with the same timestamp as during the last scrape if the value or the labels have changed.
Also note that we now do not ingest exemplars without timestamps for native histograms anymore.
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Co-authored-by: Björn Rabenstein <github@rabenste.in>
---------
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: zenador <zenador@users.noreply.github.com>
Co-authored-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Co-authored-by: Björn Rabenstein <github@rabenste.in>
* Labels: reduce allocations when creating from TSDB
When reading the WAL, by passing references into the buffer we can avoid
copying strings under `-tags stringlabels`.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Add `ReduceResolution` method to `Histogram` and `FloatHistogram`
This takes the original `mergeToSchema` function and turns it into a more generic `reduceResolution` function, which is the building block for the new methods.
The methods will help with addressing #12864.
---------
Signed-off-by: Ziqi Zhao <zhaoziqi9146@gmail.com>
promql: Extend testing framework to support native histograms
This includes both the internal testing framework as well as the rules unit test feature of promtool.
This also adds a bunch of basic tests. Many of the code level tests can now be converted to tests within the framework, and more tests can be added easily.
---------
Signed-off-by: Harold Dost <h.dost@criteo.com>
Signed-off-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>
Signed-off-by: Stephen Lang <stephen.lang@grafana.com>
Co-authored-by: Harold Dost <h.dost@criteo.com>
Co-authored-by: Stephen Lang <stephen.lang@grafana.com>
Co-authored-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>
PR #12557 introduced the possibility of parsing multiple exemplars per
native histograms. It did so by requiring the `Exemplar` method of the
parser to be called repeatedly until it returns false. However, the
protobuf parser code wasn't correctly updated for the old case of a
single exemplar for a classic bucket (if actually parsed as a classic
bucket) and a single exemplar on a counter. In those cases, the method
would return `true` forever, yielding the same exemplar again and
again, leading to an endless loop.
With this fix, the state is now tracked and the single exemplar is
only returned once.
Signed-off-by: beorn7 <beorn@grafana.com>
Native histograms without observations and with a zero threshold of
zero look the same as classic histograms in the protobuf exposition
format. According to
https://github.com/prometheus/client_golang/issues/1127 , the idea is
to add a no-op span to those histograms to mark them as native
histograms. This commit enables Prometheus to detect that no-op span
and adds a doc comment to the proto spec describing the behavior.
Signed-off-by: beorn7 <beorn@grafana.com>
The bounds weren't really used so far, so no actual bug in the code so
far. But it's obviously confusing if the bounds returned by a
floatBucketIterator with a target schema different from the original
schema are wrong.
Signed-off-by: beorn7 <beorn@grafana.com>
If a float histogram has a zero bucket with a threshold of zero _and_
an empty zero bucket, it wasn't identified as a native histogram
because the `isNativeHistogram` helper function only looked at integer
buckets.
Signed-off-by: beorn7 <beorn@grafana.com>
Native histograms without a zero threshold aren't federated properly.
This adds a test to prove the specific failure mode, which is that
histograms with a zero threshold of zero are federated as classic
histograms.
The underlying reason is that the protobuf parser identifies a native
histogram by detecting a zero bucket or by detecting integer buckets.
Therefore, a float histogram with a zero threshold of zero and an
unpopulated zero bucket falls through the cracks (no integer buckets,
no zero bucket).
This commit also addse a test case for the latter.
Signed-off-by: beorn7 <beorn@grafana.com>
This has become a requirement for native histograms, as a single
histogram sample commonly has many buckets, so that providing many
exemplars makes sense.
Since OM text doesn't support native histograms yet, the test had to
be expanded to also support protobuf test cases.
Signed-off-by: beorn7 <beorn@grafana.com>
The problem was the following:
When trying to parse native histograms and classic histograms in
parallel, the parser would first parse the histogram proto messages as
a native histogram and then parse the same message again, but now as a
classic histogram. Afterwards, it would forget that it was dealing
with a metric family that contains native histograms and would parse
the rest of the metric family as classic histograms only. The fix is
to check again after being done with a classic histogram.
Signed-off-by: beorn7 <beorn@grafana.com>
Inline one call to `decodeString`, and skip decoding the value string
until we find a match for the name.
Do a quick check on the first character in each string,
and exit early if we've gone past - labels are sorted in order.
Also improve tests and benchmark:
* labels: test Get with varying lengths - it's not typical for Prometheus labels to all be the same length.
* extend benchmark with label not found
---------
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Instead of unpacking every individual string, we skip to the point
where there is a difference, going 8 bytes at a time where possible.
Add benchmark for Compare; extend tests too.
---------
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com>
This is a minor cosmetical change, but my IDE (and I guess many of them)
nests `labels_string.go` under `labels.go` because it assumes it's the
file generated by the `stringer` tool, which follows that naming
pattern.
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
labels: dont compile regex matcher if we know its a literal
Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
Co-authored-by: Sharad <sharadgaur@gmail.com>
Handle more arithmetic operators and aggregators for native histograms
This includes operators for multiplication (formerly known as scaling), division, and subtraction. Plus aggregations for average and the avg_over_time function.
Stdvar and stddev will (for now) ignore histograms properly (rather than counting them but adding a 0 for them).
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
So far, if a target exposes a histogram with both classic and native
buckets, a native-histogram enabled Prometheus would ignore the
classic buckets. With the new scrape config option
`scrape_classic_histograms` set, both buckets will be ingested,
creating all the series of a classic histogram in parallel to the
native histogram series. For example, a histogram `foo` would create a
native histogram series `foo` and classic series called `foo_sum`,
`foo_count`, and `foo_bucket`.
This feature can be used in a migration strategy from classic to
native histograms, where it is desired to have a transition period
during which both native and classic histograms are present.
Note that two bugs in classic histogram parsing were found and fixed
as a byproduct of testing the new feature:
1. Series created from classic _gauge_ histograms didn't get the
_sum/_count/_bucket prefix set.
2. Values of classic _float_ histograms weren't parsed properly.
Signed-off-by: beorn7 <beorn@grafana.com>
* labels: respect Set after Del in Builder
The implementations are not symmetric between `Set()` and `Del()`, so
we must be careful. Add tests for this, both in labels and in relabel
where the issue was reported.
Also make the slice implementation consistent re `slices.Contains`.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Wiser coders than myself have come to the conclusion that a `switch`
statement is almost always superior to a statement that includes any
`else if`.
The exceptions that I have found in our codebase are just these two:
* The `if else` is followed by an additional statement before the next
condition (separated by a `;`).
* The whole thing is within a `for` loop and `break` statements are
used. In this case, using `switch` would require tagging the `for`
loop, which probably tips the balance.
Why are `switch` statements more readable?
For one, fewer curly braces. But more importantly, the conditions all
have the same alignment, so the whole thing follows the natural flow
of going down a list of conditions. With `else if`, in contrast, all
conditions but the first are "hidden" behind `} else if `, harder to
spot and (for no good reason) presented differently from the first
condition.
I'm sure the aforemention wise coders can list even more reasons.
In any case, I like it so much that I have found myself recommending
it in code reviews. I would like to make it a habit in our code base,
without making it a hard requirement that we would test on the CI. But
for that, there has to be a role model, so this commit eliminates all
`if else` occurrences, unless it is autogenerated code or fits one of
the exceptions above.
Signed-off-by: beorn7 <beorn@grafana.com>
We haven't updated golint-ci in our CI yet, but this commit prepares
for that.
There are a lot of new warnings, and it is mostly because the "revive"
linter got updated. I agree with most of the new warnings, mostly
around not naming unused function parameters (although it is justified
in some cases for documentation purposes – while things like mocks are
a good example where not naming the parameter is clearer).
I'm pretty upset about the "empty block" warning to include `for`
loops. It's such a common pattern to do something in the head of the
`for` loop and then have an empty block. There is still an open issue
about this: https://github.com/mgechev/revive/issues/810 I have
disabled "revive" altogether in files where empty blocks are used
excessively, and I have made the effort to add individual
`// nolint:revive` where empty blocks are used just once or twice.
It's borderline noisy, though, but let's go with it for now.
I should mention that none of the "empty block" warnings for `for`
loop bodies were legitimate.
Signed-off-by: beorn7 <beorn@grafana.com>
Add a fast path for the common case that a string is less than 127 bytes
long, to skip a shift and the loop.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
This is a method used by some downstream projects; it was created to
optimize the implementation in `labels_string.go` but we should have one
for both implementations so the same code works with either.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Deleted labels are remembered, even if they were not in `base` or were
removed from `add`, so `base+add-del` could go negative.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Go spends some time initializing all the elements of these arrays to
zero, so reduce the size from 1024 to 128. This is still much bigger
than we ever expect for a set of labels.
(If someone does have more than 128 labels it will still work, but via
heap allocation.)
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
It took a `Labels` where the memory could be re-used, but in practice
this hardly ever benefitted. Especially after converting `relabel.Process`
to `relabel.ProcessBuilder`.
Comparing the parameter to `nil` was a bug; `EmptyLabels` is not `nil`
so the slice was reallocated multiple times by `append`.
Lastly `Builder.Labels()` now estimates that the final size will depend
on labels added and deleted.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Although we had a different slice, the underlying memory was the same so
any changes meant we could skip some values.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Save work converting between Builder and Labels.
Also expose ProcessBuilder, so callers can supply a Builder.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
This lets relabelling work on a `Builder` rather than converting to and
from `Labels` on every rule.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
The difference is modest, but we've used `slices.Sort` in lots of other
places so why not here.
name old time/op new time/op delta
Builder 1.04µs ± 3% 0.95µs ± 3% -8.27% (p=0.008 n=5+5)
name old alloc/op new alloc/op delta
Builder 312B ± 0% 288B ± 0% -7.69% (p=0.008 n=5+5)
name old allocs/op new allocs/op delta
Builder 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.008 n=5+5)
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
This makes the buffer the correct size for the common case that labels
have only been added. It will be too large for the case that labels are
changed, but the current buffer resize logic in `appendLabelTo` doubles
the buffer, so a small over-estimate is better.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
This change removes restrictions to allow adding exemplars
to all time series. It also contains some improvements in test values
so that it is easier to track what is tested.
The advantage of doing this is having a little less error-prone tests:
"yy" is not really descriptive but "counter-test" can give people
a better idea about what is tested so it is harder to make mistakes.
Closes gh-11982
Signed-off-by: Jonatan Ivanov <jonatan.ivanov@gmail.com>
Previous code was effectively doing BigEndian.Uint64, so call that and save time.
An md5.Sum result is always 16 bytes. The first 8 are not used in the result, just as before.
Signed-off-by: Renning Bruns <ren@renmail.net>
These benchmarks were testing things related to what Prometheus does, but not testing actual Prometheus code.
Moved the label-copying benchmark into the labels package.
This commit adds an alternate implementation for `labels.Labels`, behind
a build tag `stringlabels`.
Instead of storing label names and values as individual strings, they
are all concatenated into one string in this format:
[len][name0][len][value0][len][name1][len][value1]...
The lengths are varint encoded so usually a single byte.
The previous `[]string` had 24 bytes of overhead for the slice and 16
for each label name and value; this one has 16 bytes overhead plus 1
for each name and value.
In `ScratchBuilder.Overwrite` and `Labels.Hash` we use an unsafe
conversion from string to byte slice. `Overwrite` is explicitly unsafe,
but for `Hash` this is a pure performance hack.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Parsing errors in the Prometheus HTTP format parser are very hard to
investigate since they only approximately indicate what is going wrong
in the parser and don't provide any information about the incorrect
input. As such it is very hard to tell what is wrong in the format
exposed by the application.
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
This commit adds a new 'keep_firing_for' field to Prometheus alerting
rules. The 'resolve_delay' field specifies the minimum amount of time
that an alert should remain firing, even if the expression does not
return any results.
This feature was discussed at a previous dev summit, and it was
determined that a feature like this would be useful in order to allow
the expression time to stabilize and prevent confusing resolved messages
from being propagated through Alertmanager.
This approach is simpler than having two PromQL queries, as was
sometimes discussed, and it should be easy to implement.
This commit does not include tests for the 'resolve_delay' field. This
is intentional, as the purpose of this commit is to gather comments on
the proposed design of the 'resolve_delay' field before implementing
tests. Once the design of the 'resolve_delay' field has been finalized,
a follow-up commit will be submitted with tests."
See https://github.com/prometheus/prometheus/issues/11570
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
If a (float or integer) histogram is a gauge histogram, set the
CounterResetHint accordingly. (The default value is fine for the
normal counter histograms.)
Signed-off-by: beorn7 <beorn@grafana.com>