---
title: Querying examples
nav_title: Examples
sort_rank: 4
---

# Query examples

## Simple time series selection

Return all time series with the metric `http_requests_total`:

    http_requests_total

Return all time series with the metric `http_requests_total` and the given
`job` and `handler` labels:

    http_requests_total{job="apiserver", handler="/api/comments"}

Return a whole range of time (in this case 5 minutes up to the query time)
for the same vector, making it a [range vector](../basics/#range-vector-selectors):

    http_requests_total{job="apiserver", handler="/api/comments"}[5m]

Note that an expression resulting in a range vector cannot be graphed directly,
but it can be viewed in the tabular ("Console") view of the expression browser.

Using regular expressions, you could select time series only for jobs whose
name matches a certain pattern, in this case, all jobs that end with `server`:

    http_requests_total{job=~".*server"}

All regular expressions in Prometheus use [RE2
syntax](https://github.com/google/re2/wiki/Syntax).

To select all HTTP status codes except 4xx ones, you could run:

    http_requests_total{status!~"4.."}

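Regex matchers are fully anchored, so the pattern `4..` matches only three-character values that start with `4`. As a further sketch, assuming the same `job` and `status` labels used in the examples above, several matchers can be combined and alternation can be used inside a pattern. The following would select only 4xx and 5xx responses for jobs whose name ends with `server`:

```promql
http_requests_total{job=~".*server", status=~"4..|5.."}
```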
## Subquery

Return the 5-minute [rate](./functions.md#rate) of the `http_requests_total` metric for the past 30 minutes, with a resolution of 1 minute.

    rate(http_requests_total[5m])[30m:1m]

This is an example of a nested subquery. The subquery for the `deriv` function uses the default resolution. Note that using subqueries unnecessarily is unwise.

    max_over_time(deriv(rate(distance_covered_total[5s])[30s:5s])[10m:])

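As a simpler sketch that reuses the first subquery in this section, an `_over_time` function can consume a subquery result directly. Assuming the same `http_requests_total` metric, the following would return the highest 5-minute request rate observed during the past 30 minutes, evaluated at 1-minute steps:

```promql
max_over_time(rate(http_requests_total[5m])[30m:1m])
```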
## Using functions, operators, etc.

Return the per-second rate for all time series with the `http_requests_total`
metric name, as measured over the last 5 minutes:

    rate(http_requests_total[5m])

Assuming that the `http_requests_total` time series all have the labels `job`
(fanout by job name) and `instance` (fanout by instance of the job), we might
want to sum over the rate of all instances, so we get fewer output time series,
but still preserve the `job` dimension:

    sum by (job) (
      rate(http_requests_total[5m])
    )

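A closely related aggregation names the label to drop instead of the labels to keep. This is a minimal sketch, assuming `instance` is the only label that should be aggregated away. Unlike `by (job)`, it keeps every other label the series carry, not just `job`:

```promql
sum without (instance) (
  rate(http_requests_total[5m])
)
```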
If we have two different metrics with the same dimensional labels, we can apply
binary operators to them and elements on both sides with the same label set
will get matched and propagated to the output. For example, this expression
returns the unused memory in MiB for every instance (on a fictional cluster
scheduler exposing these metrics about the instances it runs):

    (instance_memory_limit_bytes - instance_memory_usage_bytes) / 1024 / 1024

The same expression, but summed by application (`app`) and process type
(`proc`), could be written like this:

    sum by (app, proc) (
      instance_memory_limit_bytes - instance_memory_usage_bytes
    ) / 1024 / 1024

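A ratio is often easier to compare across applications than an absolute value. As an illustrative sketch built on the same fictional metrics (the query itself is an assumption, not something the scheduler provides), the fraction of the memory limit currently in use per application and process type could be computed as follows. Both sides are aggregated to the same `app` and `proc` labels, so their elements match one to one:

```promql
sum by (app, proc) (instance_memory_usage_bytes)
/
sum by (app, proc) (instance_memory_limit_bytes)
```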
If the same fictional cluster scheduler exposed CPU usage metrics like the
following for every instance:

    instance_cpu_time_ns{app="lion", proc="web", rev="34d0f99", env="prod", job="cluster-manager"}
    instance_cpu_time_ns{app="elephant", proc="worker", rev="34d0f99", env="prod", job="cluster-manager"}
    instance_cpu_time_ns{app="turtle", proc="api", rev="4d3a513", env="prod", job="cluster-manager"}
    instance_cpu_time_ns{app="fox", proc="widget", rev="4d3a513", env="prod", job="cluster-manager"}
    ...

...we could get the top 3 CPU users grouped by application (`app`) and process
type (`proc`) like this:

    topk(3, sum by (app, proc) (rate(instance_cpu_time_ns[5m])))

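Assuming `instance_cpu_time_ns` is a counter of consumed CPU time in nanoseconds, as its name suggests, the rate in the query above is expressed in nanoseconds of CPU time per second. A hedged variation divides by `1e9` so that the same top 3 is reported in CPU cores (CPU-seconds per second):

```promql
topk(3, sum by (app, proc) (rate(instance_cpu_time_ns[5m])) / 1e9)
```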
Assuming this metric contains one time series per running instance, you could
count the number of running instances per application like this:

    count by (app) (instance_cpu_time_ns)

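Aggregations can also be nested. As a small sketch under the same one-series-per-instance assumption, counting the per-application result gives the number of distinct applications that currently have at least one running instance:

```promql
count(count by (app) (instance_cpu_time_ns))
```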
If we are exploring a metric to see which labels it carries, for example to
decide which of them to aggregate over, we could return a sample of at most 10
of its time series with the following:

    limitk(10, app_foo_metric_bar)

Alternatively, if we wanted the returned time series to be more evenly sampled,
we could use the following to get approximately 10% of them:

    limit_ratio(0.1, app_foo_metric_bar)

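Both `limitk()` and `limit_ratio()` are experimental and have to be enabled with `--enable-feature=promql-experimental-functions`. A negative ratio selects the complement of the corresponding positive one. As a sketch on the same hypothetical `app_foo_metric_bar` metric, the following would return the roughly 90% of time series that `limit_ratio(0.1, ...)` leaves out:

```promql
limit_ratio(-0.9, app_foo_metric_bar)
```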