Handle more arithmetic operators and aggregators for native histograms
This includes operators for multiplication (formerly known as scaling), division, and subtraction. Plus aggregations for average and the avg_over_time function.
Stdvar and stddev will (for now) ignore histograms properly (rather than counting them but adding a 0 for them).
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
Introduces support for a new query parameter in the `/rules` API endpoint that allows filtering by rule names.
If all the rules of a group are filtered, we skip the group entirely.
Signed-off-by: gotjosh <josue.abreu@gmail.com>
In other words: Instead of having a “polymorphous” `Point` that can
either contain a float value or a histogram value, use an `FPoint` for
floats and an `HPoint` for histograms.
This seemingly small change has a _lot_ of repercussions throughout
the codebase.
The idea here is to avoid the increase in size of `Point` arrays that
happened after native histograms had been added.
The higher-level data structures (`Sample`, `Series`, etc.) are still
“polymorphous”. The same idea could be applied to them, but at each
step the trade-offs needed to be evaluated.
The idea with this change is to do the minimum necessary to get back
to pre-histogram performance for functions that do not touch
histograms. Here are comparisons for the `changes` function. The test
data doesn't include histograms yet. Ideally, there would be no change
in the benchmark result at all.
First runtime v2.39 compared to directly prior to this commit:
```
name old time/op new time/op delta
RangeQuery/expr=changes(a_one[1d]),steps=1-16 391µs ± 2% 542µs ± 1% +38.58% (p=0.000 n=9+8)
RangeQuery/expr=changes(a_one[1d]),steps=10-16 452µs ± 2% 617µs ± 2% +36.48% (p=0.000 n=10+10)
RangeQuery/expr=changes(a_one[1d]),steps=100-16 1.12ms ± 1% 1.36ms ± 2% +21.58% (p=0.000 n=8+10)
RangeQuery/expr=changes(a_one[1d]),steps=1000-16 7.83ms ± 1% 8.94ms ± 1% +14.21% (p=0.000 n=10+10)
RangeQuery/expr=changes(a_ten[1d]),steps=1-16 2.98ms ± 0% 3.30ms ± 1% +10.67% (p=0.000 n=9+10)
RangeQuery/expr=changes(a_ten[1d]),steps=10-16 3.66ms ± 1% 4.10ms ± 1% +11.82% (p=0.000 n=10+10)
RangeQuery/expr=changes(a_ten[1d]),steps=100-16 10.5ms ± 0% 11.8ms ± 1% +12.50% (p=0.000 n=8+10)
RangeQuery/expr=changes(a_ten[1d]),steps=1000-16 77.6ms ± 1% 87.4ms ± 1% +12.63% (p=0.000 n=9+9)
RangeQuery/expr=changes(a_hundred[1d]),steps=1-16 30.4ms ± 2% 32.8ms ± 1% +8.01% (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=10-16 37.1ms ± 2% 40.6ms ± 2% +9.64% (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=100-16 105ms ± 1% 117ms ± 1% +11.69% (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=1000-16 783ms ± 3% 876ms ± 1% +11.83% (p=0.000 n=9+10)
```
And then runtime v2.39 compared to after this commit:
```
name old time/op new time/op delta
RangeQuery/expr=changes(a_one[1d]),steps=1-16 391µs ± 2% 547µs ± 1% +39.84% (p=0.000 n=9+8)
RangeQuery/expr=changes(a_one[1d]),steps=10-16 452µs ± 2% 616µs ± 2% +36.15% (p=0.000 n=10+10)
RangeQuery/expr=changes(a_one[1d]),steps=100-16 1.12ms ± 1% 1.26ms ± 1% +12.20% (p=0.000 n=8+10)
RangeQuery/expr=changes(a_one[1d]),steps=1000-16 7.83ms ± 1% 7.95ms ± 1% +1.59% (p=0.000 n=10+8)
RangeQuery/expr=changes(a_ten[1d]),steps=1-16 2.98ms ± 0% 3.38ms ± 2% +13.49% (p=0.000 n=9+10)
RangeQuery/expr=changes(a_ten[1d]),steps=10-16 3.66ms ± 1% 4.02ms ± 1% +9.80% (p=0.000 n=10+9)
RangeQuery/expr=changes(a_ten[1d]),steps=100-16 10.5ms ± 0% 10.8ms ± 1% +3.08% (p=0.000 n=8+10)
RangeQuery/expr=changes(a_ten[1d]),steps=1000-16 77.6ms ± 1% 78.1ms ± 1% +0.58% (p=0.035 n=9+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=1-16 30.4ms ± 2% 33.5ms ± 4% +10.18% (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=10-16 37.1ms ± 2% 40.0ms ± 1% +7.98% (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=100-16 105ms ± 1% 107ms ± 1% +1.92% (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=1000-16 783ms ± 3% 775ms ± 1% -1.02% (p=0.019 n=9+9)
```
In summary, the runtime doesn't really improve with this change for
queries with just a few steps. For queries with many steps, this
commit essentially reinstates the old performance. This is good
because the many-step queries are the one that matter most (longest
absolute runtime).
In terms of allocations, though, this commit doesn't make a dent at
all (numbers not shown). The reason is that most of the allocations
happen in the sampleRingIterator (in the storage package), which has
to be addressed in a separate commit.
Signed-off-by: beorn7 <beorn@grafana.com>
* Correct statement in docs about query results returning either floats or histograms but not both.
* Move documentation for range and instant vectors under their corresponding headings.
Signed-off-by: Charles Korn <charles.korn@grafana.com>
* Add API endpoints for getting scrape pool names
This adds api/v1/scrape_pools endpoint that returns the list of *names* of all the scrape pools configured.
Having it allows to find out what scrape pools are defined without having to list and parse all targets.
The second change is adding scrapePool query parameter support in api/v1/targets endpoint, that allows to
filter returned targets by only finding ones for passed scrape pool name.
Both changes allow to query for a specific scrape pool data, rather than getting all the targets for all possible scrape pools.
The problem with api/v1/targets endpoint is that it returns huge amount of data if you configure a lot of scrape pools.
Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
* Add a scrape pool selector on /targets page
Current targets page lists all possible targets. This works great if you only have a few scrape pools configured,
but for systems with a lot of scrape pools and targets this slow things down a lot.
Not only does the /targets page load very slowly in such case (waiting for huge API response) but it also take
a long time to render, due to huge number of elements.
This change adds a dropdown selector so it's possible to select only intersting scrape pool to view.
There's also scrapePool query param that will open selected pool automatically.
Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
Illustrate use of named capturing group syntax available from https://github.com/google/re2/wiki/Syntax and their usage in the replacement field
Signed-off-by: Guillaume Berche <guillaume.berche@orange.com>
* Add /api/v1/format_query API endpoint for formatting queries
This uses the formatting functionality introduced in
https://github.com/prometheus/prometheus/pull/10544.
I've chosen "query" instead of "expr" in both the endpoint and parameter
names to stay consistent with the existing API endpoints. Otherwise, I
would have preferred to use the term "expr".
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Add docs for /api/v1/format_query endpoint
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Add note that formatting expressions removes comments
Signed-off-by: Julius Volz <julius.volz@gmail.com>
- For now this is relatively simplistic, but at least acknowledges some
of the basics, and points out some parts that might not be obvious at first.
Relates to #7192
Signed-off-by: Harold Dost <h.dost@criteo.com>
This follow a simple function-based approach to access the count and
sum fields of a native Histogram. It might be more elegant to
implement “accessors” via the dot operator, as considered in the
brainstorming doc [1]. However, that would require the introduction of
a whole new concept in PromQL. For the PoC, we should be fine with the
function-based approch. Even the obvious inefficiencies (rate'ing a
whole histogram twice when we only want to rate each the count and the
sum once) could be optimized behind the scenes.
Note that the function-based approach elegantly solves the problem of
detecting counter resets in the sum of observations in the case of
negative observations. (Since the whole native Histogram is rate'd,
the counter reset is detected for the Histogram as a whole.)
We will decide later if an “accessor” approach is really needed. It
would change the example expression for average duration in
functions.md from
histogram_sum(rate(http_request_duration_seconds[10m]))
/
histogram_count(rate(http_request_duration_seconds[10m]))
to
rate(http_request_duration_seconds.sum[10m])
/
rate(http_request_duration_seconds.count[10m])
[1]: https://docs.google.com/document/d/1ch6ru8GKg03N02jRjYriurt-CZqUVY09evPg6yKTA1s/edit
Signed-off-by: beorn7 <beorn@grafana.com>
Since Prometheus documentation is versioned, do not write down that a
specific function was added in Prom 2.0, for consistency.
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
* Provide a callout about the vector matching keywords and the group
modifiers.
Relates prometheus/docs#2106
Signed-off-by: Harold Dost <h.dost@criteo.com>
Split out the note about regex-matching into a separate paragraph to
make it more obvious. Move it up, closer to the definition.
Signed-off-by: SuperQ <superq@gmail.com>
This follows the line of argument that the invariant of not looking
ahead of the query time was merely emerging behavior and not a
documented stable feature. Any query that looks ahead of the query
time was simply invalid before the introduction of the negative offset
and the @ modifier.
Signed-off-by: beorn7 <beorn@grafana.com>
Since `/api/v1/write` is a mutating endpoint, we should still activate
the remote-write-receiver explicitly. But we should do it in the same
way as the other mutating endpoints, i.e. via a flag
`--web.enable-remote-write-receiver`.
This commit marks the feature flag as deprecated, i.e. it still works
but logs a warning on startup. This enables users to seamlessly
migrate. With the next minor release, we can start ignoring the
feature flag (but still warn a user that is trying to use it).
Signed-off-by: beorn7 <beorn@grafana.com>
* Added walreplay API endpoint
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Added starting page to react-ui
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Documented the new endpoint
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Fixed typos
Signed-off-by: Levi Harrison <git@leviharrison.dev>
Co-authored-by: Julius Volz <julius.volz@gmail.com>
* Removed logo
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Changed isResponding to isUnexpected
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Changed width of progress bar
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Changed width of progress bar
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Added DB stats object
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Updated starting page to work with new fields
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Passing nil
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Passing nil (pt. 2)
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Passing nil (pt. 3)
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Passing nil (and also implementing a method this time) (pt. 4)
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Passing nil (and also implementing a method this time) (pt. 5)
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Changed const to let
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Passing nil (pt. 6)
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Remove SetStats method
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Added comma
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Changed api
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Changed to triple equals
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Fixed data response types
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Don't return pointer
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Changed version
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Fixed interface issue
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Fixed pointer
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Fixed copying lock value error
Signed-off-by: Levi Harrison <git@leviharrison.dev>
Co-authored-by: Julius Volz <julius.volz@gmail.com>
Clarify documentation for label_replace() because of ambiguities
between label keys and label values.
Signed-off-by: Matthew Smedberg <matthew.smedberg@gmail.com>
* Document lack of metadata deletion in deletion API
See also https://github.com/prometheus/prometheus/issues/7932
I also fixed the other "NOTE" callouts to be formatted in such a way that our
docs system formats them as callout boxes.
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* WIP
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Improve wordings
Signed-off-by: Julius Volz <julius.volz@gmail.com>
This commit adds `@ <timestamp>` modifier as per this design doc: https://docs.google.com/document/d/1uSbD3T2beM-iX4-Hp7V074bzBRiRNlqUdcWP6JTDQSs/edit.
An example query:
```
rate(process_cpu_seconds_total[1m])
and
topk(7, rate(process_cpu_seconds_total[1h] @ 1234))
```
which ranks based on last 1h rate and w.r.t. unix timestamp 1234 but actually plots the 1m rate.
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Sadly, just linking to the Histogram best practice document, as done
for `histogram_quantile`, would be confusing here because the best
practice document only deals with quantiles in the context of
Histograms and Summaries, which is very different from the context of
the `quantile` aggregator and `quantile_over_time` function, which is
already a source of a lot of confusion.
Thus, I think the least bad solution is to add a short explanation in
this section directly. There isn't even a good resource on the
internet we can link to. A lot of statisticians use φ-quantiles, but
they don't have a generally accepted name for it.
I have added the explanation after the other detailed explanations of
`count_values`, `topk` and `bottomk`. I think that fits quite nicely
into the flow.
Signed-off-by: beorn7 <beorn@grafana.com>
* add time range params to labelNames api
Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>
* evaluate min/max time range when reading labels from the head
Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>
* add time range params to labelValues api
Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>
* fix test, add docs
Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>
* add a test for head min max range
Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>
* fix test to match comment
Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>
* address CR comments
Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>
* combine vars only used once
Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>
* add time range params to labelNames api
Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>
* evaluate min/max time range when reading labels from the head
Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>
* add time range params to labelValues api
Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>
* fix test, add docs
Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>
* add a test for head min max range
Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>
* fix test to match comment
Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>
* address CR comments
Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>
* combine vars only used once
Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>
* fix test
Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>
* restart ci
Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>
* use range expectedLabelNames instead of range actualLabelNames in test
Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>
* Added the version to tsdb stats api methods
* Updated changelog.md with references to the status page PRs
Signed-off-by: Joaquin Fernandez Campo <jfcampo@gmail.com>