mirror of
https://github.com/prometheus/prometheus.git
synced 2024-11-09 23:24:05 -08:00
Import querying documentation from prometheus/docs
parent
299802dfd0
commit
e6cdc2d355
|
@@ -1,5 +1,6 @@
|
|||
---
|
||||
title: Configuration
|
||||
sort_rank: 3
|
||||
---
|
||||
|
||||
# Configuration
|
||||
|
|
|
@@ -1,6 +1,6 @@
|
|||
---
|
||||
title: Getting started
|
||||
sort_rank: 10
|
||||
sort_rank: 1
|
||||
---
|
||||
|
||||
# Getting started
|
||||
|
|
|
@@ -14,3 +14,4 @@ The documentation is available alongside all the project documentation at
|
|||
- [Installing](install.md)
|
||||
- [Getting started](getting_started.md)
|
||||
- [Configuration](configuration.md)
|
||||
- [Querying](querying/basics.md)
|
||||
|
|
|
@@ -1,8 +1,9 @@
|
|||
---
|
||||
title: Installing
|
||||
title: Installation
|
||||
sort_rank: 2
|
||||
---
|
||||
|
||||
# Installing
|
||||
# Installation
|
||||
|
||||
## Using pre-compiled binaries
|
||||
|
||||
|
|
417
docs/querying/api.md
Normal file
|
@@ -0,0 +1,417 @@
|
|||
---
|
||||
title: HTTP API
|
||||
sort_rank: 7
|
||||
---
|
||||
|
||||
# HTTP API
|
||||
|
||||
The current stable HTTP API is reachable under `/api/v1` on a Prometheus
|
||||
server. Any non-breaking additions will be added under that endpoint.
|
||||
|
||||
## Format overview
|
||||
|
||||
The API response format is JSON. Every successful API request returns a `2xx`
|
||||
status code.
|
||||
|
||||
Invalid requests that reach the API handlers return a JSON error object
|
||||
and one of the following HTTP response codes:
|
||||
|
||||
- `400 Bad Request` when parameters are missing or incorrect.
|
||||
- `422 Unprocessable Entity` when an expression can't be executed
|
||||
([RFC4918](http://tools.ietf.org/html/rfc4918#page-78)).
|
||||
- `503 Service Unavailable` when queries time out or abort.
|
||||
|
||||
Other non-`2xx` codes may be returned for errors occurring before the API
|
||||
endpoint is reached.
|
||||
|
||||
The JSON response envelope format is as follows:
|
||||
|
||||
```
|
||||
{
|
||||
"status": "success" | "error",
|
||||
"data": <data>,
|
||||
|
||||
// Only set if status is "error". The data field may still hold
|
||||
// additional data.
|
||||
"errorType": "<string>",
|
||||
"error": "<string>"
|
||||
}
|
||||
```
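A client can treat this envelope uniformly for every endpoint. The following is a minimal Python sketch, not part of Prometheus itself; the `unwrap` helper and the `PrometheusAPIError` class are hypothetical names chosen for illustration.

```python
import json


class PrometheusAPIError(Exception):
    """Raised when the envelope reports status "error" (hypothetical helper)."""


def unwrap(body: str):
    """Return the `data` field of a response envelope, or raise on error."""
    envelope = json.loads(body)
    if envelope.get("status") == "success":
        return envelope.get("data")
    # On errors, `errorType` and `error` describe what went wrong.
    raise PrometheusAPIError(
        f'{envelope.get("errorType")}: {envelope.get("error")}'
    )
```

Raising an exception keeps the success path free of status checks; a caller that wants the partial `data` of an error response would need to read the envelope directly instead.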
|
||||
|
||||
Input timestamps may be provided either in
|
||||
[RFC3339](https://www.ietf.org/rfc/rfc3339.txt) format or as a Unix timestamp
|
||||
in seconds, with optional decimal places for sub-second precision. Output
|
||||
timestamps are always represented as Unix timestamps in seconds.
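To illustrate how the two input formats relate, here is a small Python sketch that converts an RFC3339 timestamp to Unix seconds. The function name `rfc3339_to_unix` is hypothetical, and the sketch only handles the UTC `...Z` form with a fractional part.

```python
from datetime import datetime, timezone


def rfc3339_to_unix(ts: str) -> float:
    """Convert a UTC RFC3339 timestamp with fractional seconds to Unix seconds.

    Only the `...Z` form with a fractional part is handled here (an assumption
    made to keep the sketch short).
    """
    parsed = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc).timestamp()


print(rfc3339_to_unix("2015-07-01T20:10:51.781Z"))
```

Either representation can be passed as an input timestamp, so a client only needs such a conversion when comparing its own inputs with the Unix timestamps in the output.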
|
||||
|
||||
Names of query parameters that may be repeated end with `[]`.
|
||||
|
||||
`<series_selector>` placeholders refer to Prometheus [time series
|
||||
selectors](basics.md#time-series-selectors) like `http_requests_total` or
|
||||
`http_requests_total{method=~"^GET|POST$"}` and need to be URL-encoded.
|
||||
|
||||
`<duration>` placeholders refer to Prometheus duration strings of the form
|
||||
`[0-9]+[smhdwy]`. For example, `5m` refers to a duration of 5 minutes.
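As a rough illustration of the duration format, the following Python sketch converts a duration string to seconds. The name `duration_to_seconds` is hypothetical, and treating `y` as 365 days is an assumption of this sketch.

```python
import re

# Seconds per unit. Treating `y` as 365 days is an assumption of this sketch.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}


def duration_to_seconds(duration: str) -> int:
    """Parse a duration string of the form [0-9]+[smhdwy] into seconds."""
    match = re.fullmatch(r"([0-9]+)([smhdwy])", duration)
    if match is None:
        raise ValueError(f"not a valid duration: {duration!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


print(duration_to_seconds("5m"))
```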
|
||||
|
||||
## Expression queries
|
||||
|
||||
Query language expressions may be evaluated at a single instant or over a range
|
||||
of time. The sections below describe the API endpoints for each type of
|
||||
expression query.
|
||||
|
||||
### Instant queries
|
||||
|
||||
The following endpoint evaluates an instant query at a single point in time:
|
||||
|
||||
```
|
||||
GET /api/v1/query
|
||||
```
|
||||
|
||||
URL query parameters:
|
||||
|
||||
- `query=<string>`: Prometheus expression query string.
|
||||
- `time=<rfc3339 | unix_timestamp>`: Evaluation timestamp. Optional.
|
||||
- `timeout=<duration>`: Evaluation timeout. Optional. Defaults to and
|
||||
is capped by the value of the `-query.timeout` flag.
|
||||
|
||||
The current server time is used if the `time` parameter is omitted.
|
||||
|
||||
The `data` section of the query result has the following format:
|
||||
|
||||
```
|
||||
{
|
||||
"resultType": "matrix" | "vector" | "scalar" | "string",
|
||||
"result": <value>
|
||||
}
|
||||
```
|
||||
|
||||
`<value>` refers to the query result data, which has varying formats
|
||||
depending on the `resultType`. See the [expression query result
|
||||
formats](#expression-query-result-formats).
|
||||
|
||||
The following example evaluates the expression `up` at the time
|
||||
`2015-07-01T20:10:51.781Z`:
|
||||
|
||||
```json
|
||||
$ curl 'http://localhost:9090/api/v1/query?query=up&time=2015-07-01T20:10:51.781Z'
|
||||
{
|
||||
"status" : "success",
|
||||
"data" : {
|
||||
"resultType" : "vector",
|
||||
"result" : [
|
||||
{
|
||||
"metric" : {
|
||||
"__name__" : "up",
|
||||
"job" : "prometheus",
|
||||
"instance" : "localhost:9090"
|
||||
},
|
||||
"value": [ 1435781451.781, "1" ]
|
||||
},
|
||||
{
|
||||
"metric" : {
|
||||
"__name__" : "up",
|
||||
"job" : "node",
|
||||
"instance" : "localhost:9100"
|
||||
},
|
||||
"value" : [ 1435781451.781, "0" ]
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
```
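The same request can be issued from a program. The sketch below uses only the Python standard library; the helper names `instant_query_url` and `instant_query` are hypothetical, and the second function assumes a Prometheus server is reachable at the given base URL.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def instant_query_url(base_url: str, query: str, time: str = "") -> str:
    """Build the URL for GET /api/v1/query; `time` is optional."""
    params = {"query": query}
    if time:
        params["time"] = time
    return f"{base_url}/api/v1/query?{urlencode(params)}"


def instant_query(base_url: str, query: str, time: str = "") -> dict:
    """Run the query and return the `data` section (needs a reachable server)."""
    with urlopen(instant_query_url(base_url, query, time)) as response:
        return json.load(response)["data"]


print(instant_query_url("http://localhost:9090", "up", "2015-07-01T20:10:51.781Z"))
```

Building the URL with `urlencode` takes care of percent-encoding the expression and the timestamp, which matters as soon as the expression contains braces or quotes.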
|
||||
|
||||
### Range queries
|
||||
|
||||
The following endpoint evaluates an expression query over a range of time:
|
||||
|
||||
```
|
||||
GET /api/v1/query_range
|
||||
```
|
||||
|
||||
URL query parameters:
|
||||
|
||||
- `query=<string>`: Prometheus expression query string.
|
||||
- `start=<rfc3339 | unix_timestamp>`: Start timestamp.
|
||||
- `end=<rfc3339 | unix_timestamp>`: End timestamp.
|
||||
- `step=<duration>`: Query resolution step width.
|
||||
- `timeout=<duration>`: Evaluation timeout. Optional. Defaults to and
|
||||
is capped by the value of the `-query.timeout` flag.
|
||||
|
||||
The `data` section of the query result has the following format:
|
||||
|
||||
```
|
||||
{
|
||||
"resultType": "matrix",
|
||||
"result": <value>
|
||||
}
|
||||
```
|
||||
|
||||
For the format of the `<value>` placeholder, see the [range-vector result
|
||||
format](#range-vectors).
|
||||
|
||||
The following example evaluates the expression `up` over a 30-second range with
|
||||
a query resolution of 15 seconds.
|
||||
|
||||
```json
|
||||
$ curl 'http://localhost:9090/api/v1/query_range?query=up&start=2015-07-01T20:10:30.781Z&end=2015-07-01T20:11:00.781Z&step=15s'
|
||||
{
|
||||
"status" : "success",
|
||||
"data" : {
|
||||
"resultType" : "matrix",
|
||||
"result" : [
|
||||
{
|
||||
"metric" : {
|
||||
"__name__" : "up",
|
||||
"job" : "prometheus",
|
||||
"instance" : "localhost:9090"
|
||||
},
|
||||
"values" : [
|
||||
[ 1435781430.781, "1" ],
|
||||
[ 1435781445.781, "1" ],
|
||||
[ 1435781460.781, "1" ]
|
||||
]
|
||||
},
|
||||
{
|
||||
"metric" : {
|
||||
"__name__" : "up",
|
||||
"job" : "node",
|
||||
"instance" : "localhost:9091"
|
||||
},
|
||||
"values" : [
|
||||
[ 1435781430.781, "0" ],
|
||||
[ 1435781445.781, "0" ],
|
||||
[ 1435781460.781, "1" ]
|
||||
]
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
```
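The three samples per series in the example above follow from the `start`, `end` and `step` parameters. A minimal Python sketch of that relationship, assuming the expression is evaluated at `start`, `start + step`, and so on up to and including `end`; the function name is hypothetical and timestamps are in integer milliseconds to avoid floating-point drift:

```python
def evaluation_timestamps(start_ms: int, end_ms: int, step_ms: int) -> list:
    """List the evaluation timestamps from start to end (inclusive), in ms."""
    if step_ms <= 0:
        raise ValueError("step must be positive")
    return list(range(start_ms, end_ms + 1, step_ms))


# The 30-second range from the example above, with a 15-second step.
print(evaluation_timestamps(1435781430781, 1435781460781, 15000))
```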
|
||||
|
||||
## Querying metadata
|
||||
|
||||
### Finding series by label matchers
|
||||
|
||||
The following endpoint returns the list of time series that match a certain label set.
|
||||
|
||||
```
|
||||
GET /api/v1/series
|
||||
```
|
||||
|
||||
URL query parameters:
|
||||
|
||||
- `match[]=<series_selector>`: Repeated series selector argument that selects the
|
||||
series to return. At least one `match[]` argument must be provided.
|
||||
- `start=<rfc3339 | unix_timestamp>`: Start timestamp.
|
||||
- `end=<rfc3339 | unix_timestamp>`: End timestamp.
|
||||
|
||||
The `data` section of the query result consists of a list of objects that
|
||||
contain the label name/value pairs which identify each series.
|
||||
|
||||
The following example returns all series that match either of the selectors
|
||||
`up` or `process_start_time_seconds{job="prometheus"}`:
|
||||
|
||||
```json
|
||||
$ curl -g 'http://localhost:9090/api/v1/series?match[]=up&match[]=process_start_time_seconds{job="prometheus"}'
|
||||
{
|
||||
"status" : "success",
|
||||
"data" : [
|
||||
{
|
||||
"__name__" : "up",
|
||||
"job" : "prometheus",
|
||||
"instance" : "localhost:9090"
|
||||
},
|
||||
{
|
||||
"__name__" : "up",
|
||||
"job" : "node",
|
||||
"instance" : "localhost:9091"
|
||||
},
|
||||
{
|
||||
"__name__" : "process_start_time_seconds",
|
||||
"job" : "prometheus",
|
||||
"instance" : "localhost:9090"
|
||||
}
|
||||
]
|
||||
}
|
||||
```
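When the request is built programmatically rather than with `curl -g`, the repeated `match[]` parameters and the selectors have to be URL-encoded. A small Python sketch of one way to do this; the function name `series_query_string` is hypothetical:

```python
from urllib.parse import urlencode


def series_query_string(selectors: list) -> str:
    """Encode each selector as a repeated, URL-encoded match[] parameter."""
    return urlencode([("match[]", selector) for selector in selectors])


print(series_query_string(["up", 'process_start_time_seconds{job="prometheus"}']))
```

Passing a list of pairs to `urlencode` keeps the parameter order and allows the same key to appear more than once.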
|
||||
|
||||
### Querying label values
|
||||
|
||||
The following endpoint returns a list of label values for a provided label name:
|
||||
|
||||
```
|
||||
GET /api/v1/label/<label_name>/values
|
||||
```
|
||||
|
||||
The `data` section of the JSON response is a list of string label values.
|
||||
|
||||
This example queries for all label values for the `job` label:
|
||||
|
||||
```json
|
||||
$ curl http://localhost:9090/api/v1/label/job/values
|
||||
{
|
||||
"status" : "success",
|
||||
"data" : [
|
||||
"node",
|
||||
"prometheus"
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
## Deleting series
|
||||
|
||||
The following endpoint deletes matched series entirely from a Prometheus server:
|
||||
|
||||
```
|
||||
DELETE /api/v1/series
|
||||
```
|
||||
|
||||
URL query parameters:
|
||||
|
||||
- `match[]=<series_selector>`: Repeated label matcher argument that selects the
|
||||
series to delete. At least one `match[]` argument must be provided.
|
||||
|
||||
The `data` section of the JSON response has the following format:
|
||||
|
||||
```
|
||||
{
|
||||
"numDeleted": <number of deleted series>
|
||||
}
|
||||
```
|
||||
|
||||
The following example deletes all series that match either of the selectors
|
||||
`up` or `process_start_time_seconds{job="prometheus"}`:
|
||||
|
||||
```json
|
||||
$ curl -XDELETE -g 'http://localhost:9090/api/v1/series?match[]=up&match[]=process_start_time_seconds{job="prometheus"}'
|
||||
{
|
||||
"status" : "success",
|
||||
"data" : {
|
||||
"numDeleted" : 3
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Expression query result formats
|
||||
|
||||
Expression queries may return the following response values in the `result`
|
||||
property of the `data` section. `<sample_value>` placeholders are numeric
|
||||
sample values. JSON does not support special float values such as `NaN`, `Inf`,
|
||||
and `-Inf`, so sample values are transferred as quoted JSON strings rather than
|
||||
raw numbers.
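Because sample values arrive as strings, a client has to convert them explicitly. A minimal Python sketch, with the hypothetical helper name `parse_sample`, assuming the special values are spelled `NaN`, `+Inf` and `-Inf`:

```python
import json
import math


def parse_sample(pair_json: str) -> tuple:
    """Decode one [unix_time, sample_value] pair into two floats."""
    unix_time, sample_value = json.loads(pair_json)
    # float() accepts "NaN", "Inf", "+Inf" and "-Inf" as well as plain numbers.
    return float(unix_time), float(sample_value)


print(math.isinf(parse_sample('[1435781451.781, "+Inf"]')[1]))
```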
|
||||
|
||||
### Range vectors
|
||||
|
||||
Range vectors are returned as result type `matrix`. The corresponding
|
||||
`result` property has the following format:
|
||||
|
||||
```
|
||||
[
|
||||
{
|
||||
"metric": { "<label_name>": "<label_value>", ... },
|
||||
"values": [ [ <unix_time>, "<sample_value>" ], ... ]
|
||||
},
|
||||
...
|
||||
]
|
||||
```
|
||||
|
||||
### Instant vectors
|
||||
|
||||
Instant vectors are returned as result type `vector`. The corresponding
|
||||
`result` property has the following format:
|
||||
|
||||
```
|
||||
[
|
||||
{
|
||||
"metric": { "<label_name>": "<label_value>", ... },
|
||||
"value": [ <unix_time>, "<sample_value>" ]
|
||||
},
|
||||
...
|
||||
]
|
||||
```
|
||||
|
||||
### Scalars
|
||||
|
||||
Scalar results are returned as result type `scalar`. The corresponding
|
||||
`result` property has the following format:
|
||||
|
||||
```
|
||||
[ <unix_time>, "<scalar_value>" ]
|
||||
```
|
||||
|
||||
### Strings
|
||||
|
||||
String results are returned as result type `string`. The corresponding
|
||||
`result` property has the following format:
|
||||
|
||||
```
|
||||
[ <unix_time>, "<string_value>" ]
|
||||
```
|
||||
|
||||
## Targets
|
||||
|
||||
> This API is experimental as it is intended to be extended with targets
|
||||
> dropped due to relabelling in the future.
|
||||
|
||||
The following endpoint returns an overview of the current state of the
|
||||
Prometheus target discovery:
|
||||
|
||||
```
|
||||
GET /api/v1/targets
|
||||
```
|
||||
|
||||
Currently only the active targets are part of the response.
|
||||
|
||||
```json
|
||||
$ curl http://localhost:9090/api/v1/targets
|
||||
{
"status": "success",
|
||||
"data": {
|
||||
"activeTargets": [
|
||||
{
|
||||
"discoveredLabels": {
|
||||
"__address__": "127.0.0.1:9090",
|
||||
"__metrics_path__": "/metrics",
|
||||
"__scheme__": "http",
|
||||
"job": "prometheus"
|
||||
},
|
||||
"labels": {
|
||||
"instance": "127.0.0.1:9090",
|
||||
"job": "prometheus"
|
||||
},
|
||||
"scrapeUrl": "http://127.0.0.1:9090/metrics",
|
||||
"lastError": "",
|
||||
"lastScrape": "2017-01-17T15:07:44.723715405+01:00",
|
||||
"health": "up"
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Alertmanagers
|
||||
|
||||
> This API is experimental as it is intended to be extended with Alertmanagers
|
||||
> dropped due to relabelling in the future.
|
||||
|
||||
The following endpoint returns an overview of the current state of the
|
||||
Prometheus Alertmanager discovery:
|
||||
|
||||
```
|
||||
GET /api/v1/alertmanagers
|
||||
```
|
||||
|
||||
Currently only the active Alertmanagers are part of the response.
|
||||
|
||||
```json
|
||||
$ curl http://localhost:9090/api/v1/alertmanagers
|
||||
{
|
||||
"status": "success",
|
||||
"data": {
|
||||
"activeAlertmanagers": [
|
||||
{
|
||||
"url": "http://127.0.0.1:9090/api/v1/alerts"
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
215
docs/querying/basics.md
Normal file
|
@@ -0,0 +1,215 @@
|
|||
---
|
||||
title: Querying basics
|
||||
nav_title: Basics
|
||||
sort_rank: 1
|
||||
---
|
||||
|
||||
# Querying Prometheus
|
||||
|
||||
Prometheus provides a functional expression language that lets the user select
|
||||
and aggregate time series data in real time. The result of an expression can
|
||||
either be shown as a graph, viewed as tabular data in Prometheus's expression
|
||||
browser, or consumed by external systems via the [HTTP API](api.md).
|
||||
|
||||
## Examples
|
||||
|
||||
This document is meant as a reference. For learning, it might be easier to
|
||||
start with a couple of [examples](examples.md).
|
||||
|
||||
## Expression language data types
|
||||
|
||||
In Prometheus's expression language, an expression or sub-expression can
|
||||
evaluate to one of four types:
|
||||
|
||||
* **Instant vector** - a set of time series containing a single sample for each time series, all sharing the same timestamp
|
||||
* **Range vector** - a set of time series containing a range of data points over time for each time series
|
||||
* **Scalar** - a simple numeric floating point value
|
||||
* **String** - a simple string value; currently unused
|
||||
|
||||
Depending on the use-case (e.g. when graphing vs. displaying the output of an
|
||||
expression), only some of these types are legal as the result from a
|
||||
user-specified expression. For example, an expression that returns an instant
|
||||
vector is the only type that can be directly graphed.
|
||||
|
||||
## Literals
|
||||
|
||||
### String literals
|
||||
|
||||
Strings may be specified as literals in single quotes, double quotes or
|
||||
backticks.
|
||||
|
||||
PromQL follows the same [escaping rules as
|
||||
Go](https://golang.org/ref/spec#String_literals). In single or double quotes a
|
||||
backslash begins an escape sequence, which may be followed by `a`, `b`, `f`,
|
||||
`n`, `r`, `t`, `v` or `\`. Specific characters can be provided using octal
|
||||
(`\nnn`) or hexadecimal (`\xnn`, `\unnnn` and `\Unnnnnnnn`).
|
||||
|
||||
No escaping is processed inside backticks. Unlike Go, Prometheus does not discard newlines inside backticks.
|
||||
|
||||
Example:
|
||||
|
||||
"this is a string"
|
||||
'these are unescaped: \n \\ \t'
|
||||
`these are not unescaped: \n ' " \t`
|
||||
|
||||
### Float literals
|
||||
|
||||
Scalar float values can be literally written as numbers of the form
|
||||
`[-](digits)[.(digits)]`.
|
||||
|
||||
-2.43
|
||||
|
||||
## Time series selectors
|
||||
|
||||
### Instant vector selectors
|
||||
|
||||
Instant vector selectors allow the selection of a set of time series and a
|
||||
single sample value for each at a given timestamp (instant): in the simplest
|
||||
form, only a metric name is specified. This results in an instant vector
|
||||
containing elements for all time series that have this metric name.
|
||||
|
||||
This example selects all time series that have the `http_requests_total` metric
|
||||
name:
|
||||
|
||||
http_requests_total
|
||||
|
||||
It is possible to filter these time series further by appending a set of labels
|
||||
to match in curly braces (`{}`).
|
||||
|
||||
This example selects only those time series with the `http_requests_total`
|
||||
metric name that also have the `job` label set to `prometheus` and their
|
||||
`group` label set to `canary`:
|
||||
|
||||
http_requests_total{job="prometheus",group="canary"}
|
||||
|
||||
It is also possible to negatively match a label value, or to match label values
|
||||
against regular expressions. The following label matching operators exist:
|
||||
|
||||
* `=`: Select labels that are exactly equal to the provided string.
|
||||
* `!=`: Select labels that are not equal to the provided string.
|
||||
* `=~`: Select labels that regex-match the provided string.
|
||||
* `!~`: Select labels that do not regex-match the provided string.
|
||||
|
||||
For example, this selects all `http_requests_total` time series for `staging`,
|
||||
`testing`, and `development` environments and HTTP methods other than `GET`.
|
||||
|
||||
http_requests_total{environment=~"staging|testing|development",method!="GET"}
|
||||
|
||||
Label matchers that match empty label values also select all time series that do
|
||||
not have the specific label set at all. Regex-matches are fully anchored.
|
||||
|
||||
Vector selectors must either specify a name or at least one label matcher
|
||||
that does not match the empty string. The following expression is illegal:
|
||||
|
||||
{job=~".*"} # Bad!
|
||||
|
||||
In contrast, these expressions are valid as they both have a selector that does not
|
||||
match empty label values.
|
||||
|
||||
{job=~".+"} # Good!
|
||||
{job=~".*",method="get"} # Good!
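The matching rules above can be summarized in a short Python sketch. It is only a model of the described behavior: the function names are hypothetical, Python's `re` module stands in for the regular expression engine Prometheus uses (the two differ in some syntax details), and a selector is represented as a list of `(name, operator, pattern)` triples, with a metric name written as an `=` matcher on `__name__`.

```python
import re


def matches(labels: dict, name: str, op: str, pattern: str) -> bool:
    """Apply one label matcher; a missing label is treated as the empty string."""
    value = labels.get(name, "")
    if op == "=":
        return value == pattern
    if op == "!=":
        return value != pattern
    # re.fullmatch models full anchoring: the pattern must cover the whole value.
    regex_hit = re.fullmatch(pattern, value) is not None
    if op == "=~":
        return regex_hit
    if op == "!~":
        return not regex_hit
    raise ValueError(f"unknown operator: {op}")


def selector_is_valid(matchers: list) -> bool:
    """A selector needs at least one matcher that rejects the empty string."""
    return any(not matches({}, name, op, pattern) for name, op, pattern in matchers)


print(selector_is_valid([("job", "=~", ".*")]))
print(selector_is_valid([("job", "=~", ".*"), ("method", "=", "get")]))
```

Testing each matcher against an empty label set is what makes `{job=~".*"}` invalid while `{job=~".+"}` is accepted.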
|
||||
|
||||
Label matchers can also be applied to metric names by matching against the internal
|
||||
`__name__` label. For example, the expression `http_requests_total` is equivalent to
|
||||
`{__name__="http_requests_total"}`. Matchers other than `=` (`!=`, `=~`, `!~`) may also be used.
|
||||
The following expression selects all metrics that have a name starting with `job:`:
|
||||
|
||||
{__name__=~"^job:.*"}
|
||||
|
||||
### Range vector selectors
|
||||
|
||||
Range vector selectors work like instant vector selectors, except that they
|
||||
select a range of samples back from the current instant. Syntactically, a range
|
||||
duration is appended in square brackets (`[]`) at the end of a vector selector
|
||||
to specify how far back in time values should be fetched for each resulting
|
||||
range vector element.
|
||||
|
||||
Time durations are specified as a number, followed immediately by one of the
|
||||
following units:
|
||||
|
||||
* `s` - seconds
|
||||
* `m` - minutes
|
||||
* `h` - hours
|
||||
* `d` - days
|
||||
* `w` - weeks
|
||||
* `y` - years
|
||||
|
||||
In this example, we select all the values we have recorded within the last 5
|
||||
minutes for all time series that have the metric name `http_requests_total` and
|
||||
a `job` label set to `prometheus`:
|
||||
|
||||
http_requests_total{job="prometheus"}[5m]
|
||||
|
||||
### Offset modifier
|
||||
|
||||
The `offset` modifier allows changing the time offset for individual
|
||||
instant and range vectors in a query.
|
||||
|
||||
For example, the following expression returns the value of
|
||||
`http_requests_total` 5 minutes in the past relative to the current
|
||||
query evaluation time:
|
||||
|
||||
http_requests_total offset 5m
|
||||
|
||||
Note that the `offset` modifier always needs to follow the selector
|
||||
immediately, i.e. the following would be correct:
|
||||
|
||||
sum(http_requests_total{method="GET"} offset 5m) # GOOD.
|
||||
|
||||
While the following would be *incorrect*:
|
||||
|
||||
sum(http_requests_total{method="GET"}) offset 5m # INVALID.
|
||||
|
||||
The same works for range vectors. This returns the 5-minute rate that
|
||||
`http_requests_total` had a week ago:
|
||||
|
||||
rate(http_requests_total[5m] offset 1w)
|
||||
|
||||
## Operators
|
||||
|
||||
Prometheus supports many binary and aggregation operators. These are described
|
||||
in detail in the [expression language operators](operators.md) page.
|
||||
|
||||
## Functions
|
||||
|
||||
Prometheus supports several functions to operate on data. These are described
|
||||
in detail in the [expression language functions](functions.md) page.
|
||||
|
||||
## Gotchas
|
||||
|
||||
### Interpolation and staleness
|
||||
|
||||
When queries are run, timestamps at which to sample data are selected
|
||||
independently of the actual present time series data. This is mainly to support
|
||||
cases like aggregation (`sum`, `avg`, and so on), where multiple aggregated
|
||||
time series do not exactly align in time. Because of their independence,
|
||||
Prometheus needs to assign a value at those timestamps for each relevant time
|
||||
series. It does so by simply taking the newest sample before this timestamp.
|
||||
|
||||
If no stored sample is found in the (by default) 5 minutes before a sampling timestamp,
|
||||
no value is assigned for this time series at this point in time. This
|
||||
effectively means that time series "disappear" from graphs at times where their
|
||||
latest collected sample is older than 5 minutes.
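The lookup described above can be sketched in a few lines of Python. This is a simplified model, not the actual implementation: the names `value_at` and `LOOKBACK_SECONDS` are hypothetical, and the treatment of samples exactly at the boundaries is an assumption.

```python
from bisect import bisect_right

LOOKBACK_SECONDS = 300  # the default of 5 minutes mentioned above


def value_at(samples: list, timestamp: float):
    """Return the newest sample value at or before `timestamp`, or None if stale.

    `samples` is a list of (unix_time, value) pairs sorted by time.
    """
    times = [t for t, _ in samples]
    index = bisect_right(times, timestamp) - 1
    if index < 0 or timestamp - times[index] > LOOKBACK_SECONDS:
        return None
    return samples[index][1]


series = [(0, 1.0), (15, 2.0)]
print(value_at(series, 20), value_at(series, 400))
```

In this model the series yields a value at timestamp 20 but "disappears" at timestamp 400, because its newest sample is then older than the lookback window.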
|
||||
|
||||
NOTE: Staleness and interpolation handling might change. See
|
||||
https://github.com/prometheus/prometheus/issues/398 and
|
||||
https://github.com/prometheus/prometheus/issues/581.
|
||||
|
||||
### Avoiding slow queries and overloads
|
||||
|
||||
If a query needs to operate on a very large amount of data, graphing it might
|
||||
time out or overload the server or browser. Thus, when constructing queries
|
||||
over unknown data, always start building the query in the tabular view of
|
||||
Prometheus's expression browser until the result set seems reasonable
|
||||
(hundreds, not thousands, of time series at most). Only when you have filtered
|
||||
or aggregated your data sufficiently, switch to graph mode. If the expression
|
||||
still takes too long to graph ad-hoc, pre-record it via a [recording
|
||||
rule](rules.md#recording-rules).
|
||||
|
||||
This is especially relevant for Prometheus's query language, where a bare
|
||||
metric name selector like `api_http_requests_total` could expand to thousands
|
||||
of time series with different labels. Also keep in mind that expressions which
|
||||
aggregate over many time series will generate load on the server even if the
|
||||
output is only a small number of time series. This is similar to how it would
|
||||
be slow to sum all values of a column in a relational database, even if the
|
||||
output value is only a single number.
|
83
docs/querying/examples.md
Normal file
|
@@ -0,0 +1,83 @@
|
|||
---
|
||||
title: Querying examples
|
||||
nav_title: Examples
|
||||
sort_rank: 4
|
||||
---
|
||||
|
||||
# Query examples
|
||||
|
||||
## Simple time series selection
|
||||
|
||||
Return all time series with the metric `http_requests_total`:
|
||||
|
||||
http_requests_total
|
||||
|
||||
Return all time series with the metric `http_requests_total` and the given
|
||||
`job` and `handler` labels:
|
||||
|
||||
http_requests_total{job="apiserver", handler="/api/comments"}
|
||||
|
||||
Return a whole range of time (in this case 5 minutes) for the same vector,
|
||||
making it a range vector:
|
||||
|
||||
http_requests_total{job="apiserver", handler="/api/comments"}[5m]
|
||||
|
||||
Note that an expression resulting in a range vector cannot be graphed directly,
|
||||
but can be viewed in the tabular ("Console") view of the expression browser.
|
||||
|
||||
Using regular expressions, you could select time series only for jobs whose
|
||||
names match a certain pattern, in this case, all jobs that end with `server`.
|
||||
Note that regex matches are fully anchored, so the pattern must cover the whole label value:

http_requests_total{job=~".*server"}
|
||||
|
||||
To select all HTTP status codes except 4xx ones, you could run:
|
||||
|
||||
http_requests_total{status!~"^4..$"}
|
||||
|
||||
## Using functions, operators, etc.
|
||||
|
||||
Return the per-second rate for all time series with the `http_requests_total`
|
||||
metric name, as measured over the last 5 minutes:
|
||||
|
||||
rate(http_requests_total[5m])
|
||||
|
||||
Assuming that the `http_requests_total` time series all have the labels `job`
|
||||
(fanout by job name) and `instance` (fanout by instance of the job), we might
|
||||
want to sum over the rate of all instances, so we get fewer output time series,
|
||||
but still preserve the `job` dimension:
|
||||
|
||||
sum(rate(http_requests_total[5m])) by (job)
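To make the effect of the `by` clause concrete, here is a small Python model of a grouped sum. The function `sum_by` and the representation of a vector as a list of `(labels, value)` pairs are assumptions of this sketch, not Prometheus internals.

```python
from collections import defaultdict


def sum_by(vector: list, by: tuple) -> dict:
    """Sum sample values per group; `vector` holds (labels, value) pairs."""
    totals = defaultdict(float)
    for labels, value in vector:
        # Only the labels named in `by` survive; all others are aggregated away.
        key = tuple((name, labels.get(name, "")) for name in by)
        totals[key] += value
    return dict(totals)


rates = [
    ({"job": "api", "instance": "a"}, 1.5),
    ({"job": "api", "instance": "b"}, 2.5),
    ({"job": "db", "instance": "c"}, 4.0),
]
print(sum_by(rates, ("job",)))
```

Three input series collapse into two output series, one per distinct `job` value.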
|
||||
|
||||
If we have two different metrics with the same dimensional labels, we can apply
|
||||
binary operators to them and elements on both sides with the same label set
|
||||
will get matched and propagated to the output. For example, this expression
|
||||
returns the unused memory in MiB for every instance (on a fictional cluster
|
||||
scheduler exposing these metrics about the instances it runs):
|
||||
|
||||
(instance_memory_limit_bytes - instance_memory_usage_bytes) / 1024 / 1024
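The matching of elements with the same label set can be modeled as follows. This Python sketch is a simplification under stated assumptions: the function name `subtract` is hypothetical, vectors are lists of `(labels, value)` pairs, and the metric name is ignored for matching and dropped from the result.

```python
def subtract(lhs: list, rhs: list) -> list:
    """Subtract vectors element-wise, matching on labels other than __name__."""

    def key(labels: dict) -> frozenset:
        return frozenset((k, v) for k, v in labels.items() if k != "__name__")

    rhs_by_key = {key(labels): value for labels, value in rhs}
    result = []
    for labels, value in lhs:
        k = key(labels)
        if k in rhs_by_key:  # elements without a partner are dropped
            result.append((dict(k), value - rhs_by_key[k]))
    return result


limit = [({"__name__": "instance_memory_limit_bytes", "instance": "a"}, 4194304)]
usage = [({"__name__": "instance_memory_usage_bytes", "instance": "a"}, 1048576)]
print(subtract(limit, usage))
```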
|
||||
|
||||
The same expression, but summed by application and process type, could be written like this:
|
||||
|
||||
sum(
|
||||
instance_memory_limit_bytes - instance_memory_usage_bytes
|
||||
) by (app, proc) / 1024 / 1024
|
||||
|
||||
If the same fictional cluster scheduler exposed CPU usage metrics like the
|
||||
following for every instance:
|
||||
|
||||
instance_cpu_time_ns{app="lion", proc="web", rev="34d0f99", env="prod", job="cluster-manager"}
|
||||
instance_cpu_time_ns{app="elephant", proc="worker", rev="34d0f99", env="prod", job="cluster-manager"}
|
||||
instance_cpu_time_ns{app="turtle", proc="api", rev="4d3a513", env="prod", job="cluster-manager"}
|
||||
instance_cpu_time_ns{app="fox", proc="widget", rev="4d3a513", env="prod", job="cluster-manager"}
|
||||
...
|
||||
|
||||
...we could get the top 3 CPU users grouped by application (`app`) and process
|
||||
type (`proc`) like this:
|
||||
|
||||
topk(3, sum(rate(instance_cpu_time_ns[5m])) by (app, proc))
|
||||
|
||||
Assuming this metric contains one time series per running instance, you could
|
||||
count the number of running instances per application like this:
|
||||
|
||||
count(instance_cpu_time_ns) by (app)
|
408
docs/querying/functions.md
Normal file
|
@@ -0,0 +1,408 @@
|
|||
---
|
||||
title: Query functions
|
||||
nav_title: Functions
|
||||
sort_rank: 3
|
||||
---
|
||||
|
||||
# Functions
|
||||
|
||||
Some functions have default arguments, e.g. `year(v=vector(time())
|
||||
instant-vector)`. This means that there is one argument `v` which is an instant
|
||||
vector; if it is not provided, it defaults to the value of the expression
|
||||
`vector(time())`.
|
||||
|
||||
## `abs()`
|
||||
|
||||
`abs(v instant-vector)` returns the input vector with all sample values converted to
|
||||
their absolute value.
|
||||
|
||||
## `absent()`
|
||||
|
||||
`absent(v instant-vector)` returns an empty vector if the vector passed to it
|
||||
has any elements and a 1-element vector with the value 1 if the vector passed to
|
||||
it has no elements.
|
||||
|
||||
This is useful for alerting on when no time series exist for a given metric name
|
||||
and label combination.
|
||||
|
||||
```
|
||||
absent(nonexistent{job="myjob"})
|
||||
# => {job="myjob"}
|
||||
|
||||
absent(nonexistent{job="myjob",instance=~".*"})
|
||||
# => {job="myjob"}
|
||||
|
||||
absent(sum(nonexistent{job="myjob"}))
|
||||
# => {}
|
||||
```
|
||||
|
||||
In the second example, `absent()` tries to be smart about deriving labels of the
|
||||
1-element output vector from the input vector.
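The behavior shown in the examples above can be approximated with the following Python sketch. It is a model only: the function signature is hypothetical, and the rule that only equality matchers contribute labels to the output is an assumption inferred from the examples.

```python
def absent(vector: list, matchers: list) -> list:
    """Return [] if `vector` has elements, else one element with value 1.

    `matchers` is a list of (name, op, value) triples from the selector.
    """
    if vector:
        return []
    # Assumption: only equality matchers contribute labels to the output.
    labels = {name: value for name, op, value in matchers
              if op == "=" and name != "__name__"}
    return [(labels, 1.0)]


selector = [("__name__", "=", "nonexistent"), ("job", "=", "myjob"), ("instance", "=~", ".*")]
print(absent([], selector))
```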
|
||||
|
||||
## `ceil()`
|
||||
|
||||
`ceil(v instant-vector)` rounds the sample values of all elements in `v` up to
|
||||
the nearest integer.
|
||||
|
||||
## `changes()`
|
||||
|
||||
For each input time series, `changes(v range-vector)` returns the number of
|
||||
times its value has changed within the provided time range as an instant
|
||||
vector.
|
||||
|
||||
## `clamp_max()`
|
||||
|
||||
`clamp_max(v instant-vector, max scalar)` clamps the sample values of all
|
||||
elements in `v` to have an upper limit of `max`.
|
||||
|
||||
## `clamp_min()`
|
||||
|
||||
`clamp_min(v instant-vector, min scalar)` clamps the sample values of all
|
||||
elements in `v` to have a lower limit of `min`.
|
||||
|
||||
## `count_scalar()`
|
||||
|
||||
`count_scalar(v instant-vector)` returns the number of elements in a time series
|
||||
vector as a scalar. This is in contrast to the `count()`
|
||||
[aggregation operator](operators.md#aggregation-operators), which
|
||||
always returns a vector (an empty one if the input vector is empty) and allows
|
||||
grouping by labels via a `by` clause.
|
||||
|
||||
## `day_of_month()`
|
||||
|
||||
`day_of_month(v=vector(time()) instant-vector)` returns the day of the month
|
||||
for each of the given times in UTC. Returned values are from 1 to 31.
|
||||
|
||||
## `day_of_week()`
|
||||
|
||||
`day_of_week(v=vector(time()) instant-vector)` returns the day of the week for
|
||||
each of the given times in UTC. Returned values are from 0 to 6, where 0 means
|
||||
Sunday etc.
|
||||
|
||||
## `days_in_month()`
|
||||
|
||||
`days_in_month(v=vector(time()) instant-vector)` returns the number of days in the
|
||||
month for each of the given times in UTC. Returned values are from 28 to 31.
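For comparison, the three calendar functions above can be reproduced for a single timestamp with the Python standard library. The helper name `date_parts` is hypothetical; note the shift needed because Python counts weekdays from Monday.

```python
import calendar
from datetime import datetime, timezone


def date_parts(unix_time: float) -> dict:
    """Compute the three calendar values for a Unix timestamp, in UTC."""
    moment = datetime.fromtimestamp(unix_time, tz=timezone.utc)
    return {
        "day_of_month": moment.day,
        # Python's weekday() counts from Monday = 0; shift so that Sunday = 0.
        "day_of_week": (moment.weekday() + 1) % 7,
        "days_in_month": calendar.monthrange(moment.year, moment.month)[1],
    }


print(date_parts(1435781451))  # a timestamp on 2015-07-01 (UTC)
```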
|
||||
|
||||
## `delta()`
|
||||
|
||||
`delta(v range-vector)` calculates the difference between the
|
||||
first and last value of each time series element in a range vector `v`,
|
||||
returning an instant vector with the given deltas and equivalent labels.
|
||||
The delta is extrapolated to cover the full time range as specified in
|
||||
the range vector selector, so that it is possible to get a non-integer
|
||||
result even if the sample values are all integers.
|
||||
|
||||
The following example expression returns the difference in CPU temperature
|
||||
between now and 2 hours ago:
|
||||
|
||||
```
|
||||
delta(cpu_temp_celsius{host="zeus"}[2h])
|
||||
```
|
||||
|
||||
`delta` should only be used with gauges.
|
||||
|
||||
## `deriv()`
|
||||
|
||||
`deriv(v range-vector)` calculates the per-second derivative of the time series in a range
|
||||
vector `v`, using [simple linear regression](http://en.wikipedia.org/wiki/Simple_linear_regression).
|
||||
|
||||
`deriv` should only be used with gauges.
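Simple linear regression reduces to a least-squares slope over the samples in the range. A minimal Python sketch of that calculation, not the actual implementation; the input is assumed to be a list of `(unix_time, value)` pairs for one series:

```python
def deriv(samples: list) -> float:
    """Least-squares slope of (unix_time, value) pairs, in value units per second."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    covariance = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    variance = sum((t - mean_t) ** 2 for t, _ in samples)
    return covariance / variance


# A gauge rising by 2 every 10 seconds.
print(deriv([(0, 1.0), (10, 3.0), (20, 5.0)]))
```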
|
||||
|
||||
## `drop_common_labels()`
|
||||
|
||||
`drop_common_labels(instant-vector)` drops all labels that have the same name
|
||||
and value across all series in the input vector.
|
||||
|
||||
## `exp()`
|
||||
|
||||
`exp(v instant-vector)` calculates the exponential function for all elements in `v`.
|
||||
Special cases are:
|
||||
|
||||
* `Exp(+Inf) = +Inf`
|
||||
* `Exp(NaN) = NaN`
|
||||
|
||||
## `floor()`
|
||||
|
||||
`floor(v instant-vector)` rounds the sample values of all elements in `v` down
|
||||
to the nearest integer.
|
||||
|
||||
## `histogram_quantile()`
|
||||
|
||||
`histogram_quantile(φ float, b instant-vector)` calculates the φ-quantile (0 ≤ φ
|
||||
≤ 1) from the buckets `b` of a
|
||||
[histogram](https://prometheus.io/docs/concepts/metric_types/#histogram). (See
|
||||
[histograms and summaries](https://prometheus.io/docs/practices/histograms) for
|
||||
a detailed explanation of φ-quantiles and the usage of the histogram metric type
|
||||
in general.) The samples in `b` are the counts of observations in each bucket.
|
||||
Each sample must have a label `le` where the label value denotes the inclusive
|
||||
upper bound of the bucket. (Samples without such a label are silently ignored.)
|
||||
The [histogram metric type](https://prometheus.io/docs/concepts/metric_types/#histogram)
|
||||
automatically provides time series with the `_bucket` suffix and the appropriate
|
||||
labels.
|
||||
|
||||
Use the `rate()` function to specify the time window for the quantile
|
||||
calculation.
|
||||
|
||||
Example: A histogram metric is called `http_request_duration_seconds`. To
|
||||
calculate the 90th percentile of request durations over the last 10m, use the
|
||||
following expression:
|
||||
|
||||
histogram_quantile(0.9, rate(http_request_duration_seconds_bucket[10m]))
|
||||
|
||||
The quantile is calculated for each label combination in
|
||||
`http_request_duration_seconds`. To aggregate, use the `sum()` aggregator
|
||||
around the `rate()` function. Since the `le` label is required by
|
||||
`histogram_quantile()`, it has to be included in the `by` clause. The following
|
||||
expression aggregates the 90th percentile by `job`:
|
||||
|
||||
histogram_quantile(0.9, sum(rate(http_request_duration_seconds_bucket[10m])) by (job, le))
|
||||
|
||||
To aggregate everything, specify only the `le` label:
|
||||
|
||||
histogram_quantile(0.9, sum(rate(http_request_duration_seconds_bucket[10m])) by (le))
|
||||
|
||||
The `histogram_quantile()` function interpolates quantile values by
|
||||
assuming a linear distribution within a bucket. The highest bucket
|
||||
must have an upper bound of `+Inf`. (Otherwise, `NaN` is returned.) If
|
||||
a quantile is located in the highest bucket, the upper bound of the
|
||||
second highest bucket is returned. A lower limit of the lowest bucket
|
||||
is assumed to be 0 if the upper bound of that bucket is greater than
|
||||
0. In that case, the usual linear interpolation is applied within that
|
||||
bucket. Otherwise, the upper bound of the lowest bucket is returned
|
||||
for quantiles located in the lowest bucket.
|
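To illustrate the interpolation, consider a hypothetical histogram with three buckets. The bucket values below are made up for this example, and the arithmetic in the comments is only a sketch of the linear interpolation described above:

```
# Hypothetical cumulative per-bucket values passed in as `b`:
#   {le="0.1"}   20
#   {le="0.5"}   80
#   {le="+Inf"}  100
#
# For φ = 0.5 the target rank is 0.5 * 100 = 50, which falls into the
# bucket with the bounds (0.1, 0.5]. Interpolating linearly within it:
#   0.1 + (0.5 - 0.1) * (50 - 20) / (80 - 20) = 0.3
histogram_quantile(0.5, sum(rate(http_request_duration_seconds_bucket[10m])) by (le))
```

With the same buckets, φ = 0.9 gives a rank of 90, which lies in the `+Inf` bucket, so the upper bound of the second highest bucket (0.5) would be returned.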
||||
|
||||
If `b` contains fewer than two buckets, `NaN` is returned. For φ < 0, `-Inf` is
|
||||
returned. For φ > 1, `+Inf` is returned.
|
||||
|
||||
## `holt_winters()`
|
||||
|
||||
`holt_winters(v range-vector, sf scalar, tf scalar)` produces a smoothed value
|
||||
for time series based on the range in `v`. The lower the smoothing factor `sf`,
|
||||
the more importance is given to old data. The higher the trend factor `tf`, the
|
||||
more trends in the data are considered. Both `sf` and `tf` must be between 0 and
|
||||
1.
|
||||
|
||||
`holt_winters` should only be used with gauges.
|
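A sketch of a call, reusing the hypothetical `cpu_temp_celsius` gauge from the `delta()` example. The factor values `0.3` and `0.1` are arbitrary illustrations, not recommendations:

```
# Smooth the last hour of samples with sf = 0.3 and tf = 0.1.
holt_winters(cpu_temp_celsius{host="zeus"}[1h], 0.3, 0.1)
```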
||||
|
||||
## `hour()`
|
||||
|
||||
`hour(v=vector(time()) instant-vector)` returns the hour of the day
|
||||
for each of the given times in UTC. Returned values are from 0 to 23.
|
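Two short sketches of how `hour()` might be called. The second one assumes that the targets expose a `process_start_time_seconds` gauge holding a Unix timestamp, which is an assumption about the monitored jobs and not something this page guarantees:

```
# Hour (UTC) of the evaluation time; no argument is needed.
hour()

# Hour (UTC) at which each matching process was started.
hour(process_start_time_seconds{job="api-server"})
```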
||||
|
||||
## `idelta()`
|
||||
|
||||
`idelta(v range-vector)` calculates the difference between the last two samples
|
||||
in the range vector `v`, returning an instant vector with the given deltas and
|
||||
equivalent labels.
|
||||
|
||||
`idelta` should only be used with gauges.
|
||||
|
||||
## `increase()`
|
||||
|
||||
`increase(v range-vector)` calculates the increase in the
|
||||
time series in the range vector. Breaks in monotonicity (such as counter
|
||||
resets due to target restarts) are automatically adjusted for. The
|
||||
increase is extrapolated to cover the full time range as specified
|
||||
in the range vector selector, so that it is possible to get a
|
||||
non-integer result even if a counter increases only by integer
|
||||
increments.
|
||||
|
||||
The following example expression returns the number of HTTP requests as measured
|
||||
over the last 5 minutes, per time series in the range vector:
|
||||
|
||||
```
|
||||
increase(http_requests_total{job="api-server"}[5m])
|
||||
```
|
||||
|
||||
`increase` should only be used with counters. It is syntactic sugar
|
||||
for `rate(v)` multiplied by the number of seconds under the specified
|
||||
time range window, and should be used primarily for human readability.
|
||||
Use `rate` in recording rules so that increases are tracked consistently
|
||||
on a per-second basis.
|
||||
|
||||
## `irate()`
|
||||
|
||||
`irate(v range-vector)` calculates the per-second instant rate of increase of
|
||||
the time series in the range vector. This is based on the last two data points.
|
||||
Breaks in monotonicity (such as counter resets due to target restarts) are
|
||||
automatically adjusted for.
|
||||
|
||||
The following example expression returns the per-second rate of HTTP requests
|
||||
looking up to 5 minutes back for the two most recent data points, per time
|
||||
series in the range vector:
|
||||
|
||||
```
|
||||
irate(http_requests_total{job="api-server"}[5m])
|
||||
```
|
||||
|
||||
`irate` should only be used when graphing volatile, fast-moving counters.
|
||||
Use `rate` for alerts and slow-moving counters, as brief changes
|
||||
in the rate can reset the `FOR` clause and graphs consisting entirely of rare
|
||||
spikes are hard to read.
|
||||
|
||||
Note that when combining `irate()` with an
|
||||
[aggregation operator](operators.md#aggregation-operators) (e.g. `sum()`)
|
||||
or a function aggregating over time (any function ending in `_over_time`),
|
||||
always take an `irate()` first, then aggregate. Otherwise `irate()` cannot detect
|
||||
counter resets when your target restarts.
|
||||
|
||||
## `label_join()`
|
||||
|
||||
For each timeseries in `v`, `label_join(v instant-vector, dst_label string, separator string, src_label_1 string, src_label_2 string, ...)` joins all the values of all the `src_labels`
|
||||
using `separator` and returns the timeseries with the label `dst_label` containing the joined value.
|
||||
There can be any number of `src_labels` in this function.
|
||||
|
||||
This example will return a vector with each time series having a `foo` label with the value `a,b,c` added to it:
|
||||
|
||||
```
|
||||
label_join(up{job="api-server",src1="a",src2="b",src3="c"}, "foo", ",", "src1", "src2", "src3")
|
||||
```
|
||||
|
||||
## `label_replace()`
|
||||
|
||||
For each timeseries in `v`, `label_replace(v instant-vector, dst_label string,
|
||||
replacement string, src_label string, regex string)` matches the regular
|
||||
expression `regex` against the label `src_label`. If it matches, then the
|
||||
timeseries is returned with the label `dst_label` replaced by the expansion of
|
||||
`replacement`. `$1` is replaced with the first matching subgroup, `$2` with the
|
||||
second etc. If the regular expression doesn't match then the timeseries is
|
||||
returned unchanged.
|
||||
|
||||
This example will return a vector with each time series having a `foo`
|
||||
label with the value `a` added to it:
|
||||
|
||||
```
|
||||
label_replace(up{job="api-server",service="a:c"}, "foo", "$1", "service", "(.*):.*")
|
||||
```
|
||||
|
||||
## `ln()`
|
||||
|
||||
`ln(v instant-vector)` calculates the natural logarithm for all elements in `v`.
|
||||
Special cases are:
|
||||
|
||||
* `ln(+Inf) = +Inf`
|
||||
* `ln(0) = -Inf`
|
||||
* `ln(x < 0) = NaN`
|
||||
* `ln(NaN) = NaN`
|
||||
|
||||
## `log2()`
|
||||
|
||||
`log2(v instant-vector)` calculates the binary logarithm for all elements in `v`.
|
||||
The special cases are equivalent to those in `ln`.
|
||||
|
||||
## `log10()`
|
||||
|
||||
`log10(v instant-vector)` calculates the decimal logarithm for all elements in `v`.
|
||||
The special cases are equivalent to those in `ln`.
|
||||
|
||||
## `minute()`
|
||||
|
||||
`minute(v=vector(time()) instant-vector)` returns the minute of the hour for each
|
||||
of the given times in UTC. Returned values are from 0 to 59.
|
||||
|
||||
## `month()`
|
||||
|
||||
`month(v=vector(time()) instant-vector)` returns the month of the year for each
|
||||
of the given times in UTC. Returned values are from 1 to 12, where 1 means
|
||||
January etc.
|
||||
|
||||
## `predict_linear()`
|
||||
|
||||
`predict_linear(v range-vector, t scalar)` predicts the value of time series
|
||||
`t` seconds from now, based on the range vector `v`, using [simple linear
|
||||
regression](http://en.wikipedia.org/wiki/Simple_linear_regression).
|
||||
|
||||
`predict_linear` should only be used with gauges.
|
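As a sketch, assume a hypothetical gauge `node_filesystem_free` that reports the free bytes of each filesystem. The following expression would select the filesystems that are predicted to run out of space within four hours, based on the trend of the last hour:

```
# 4 * 3600 seconds = 4 hours into the future.
predict_linear(node_filesystem_free[1h], 4 * 3600) < 0
```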
||||
|
||||
## `rate()`
|
||||
|
||||
`rate(v range-vector)` calculates the per-second average rate of increase of the
|
||||
time series in the range vector. Breaks in monotonicity (such as counter
|
||||
resets due to target restarts) are automatically adjusted for. Also, the
|
||||
calculation extrapolates to the ends of the time range, allowing for missed
|
||||
scrapes or imperfect alignment of scrape cycles with the range's time period.
|
||||
|
||||
The following example expression returns the per-second rate of HTTP requests as measured
|
||||
over the last 5 minutes, per time series in the range vector:
|
||||
|
||||
```
|
||||
rate(http_requests_total{job="api-server"}[5m])
|
||||
```
|
||||
|
||||
`rate` should only be used with counters. It is best suited for alerting,
|
||||
and for graphing of slow-moving counters.
|
||||
|
||||
Note that when combining `rate()` with an aggregation operator (e.g. `sum()`)
|
||||
or a function aggregating over time (any function ending in `_over_time`),
|
||||
always take a `rate()` first, then aggregate. Otherwise `rate()` cannot detect
|
||||
counter resets when your target restarts.
|
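A minimal sketch of the recommended order, using the `http_requests_total` counter from the example above: the rate is taken per series first, and only the resulting rates are summed.

```
# rate() runs on each raw counter series, sum() then aggregates the rates.
sum(rate(http_requests_total{job="api-server"}[5m])) by (job)
```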
||||
|
||||
## `resets()`
|
||||
|
||||
For each input time series, `resets(v range-vector)` returns the number of
|
||||
counter resets within the provided time range as an instant vector. Any
|
||||
decrease in the value between two consecutive samples is interpreted as a
|
||||
counter reset.
|
||||
|
||||
`resets` should only be used with counters.
|
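For illustration, using the `http_requests_total` counter from the `rate()` example, the number of resets seen per series over the last hour could be queried like this:

```
resets(http_requests_total{job="api-server"}[1h])
```

A result of `0` for a series means that no decrease was observed between consecutive samples in that window.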
||||
|
||||
## `round()`
|
||||
|
||||
`round(v instant-vector, to_nearest=1 scalar)` rounds the sample values of all
|
||||
elements in `v` to the nearest integer. Ties are resolved by rounding up. The
|
||||
optional `to_nearest` argument allows specifying the nearest multiple to which
|
||||
the sample values should be rounded. This multiple may also be a fraction.
|
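Two sketches, reusing the hypothetical `cpu_temp_celsius` gauge from the `delta()` example:

```
# Round to the nearest integer (to_nearest defaults to 1).
round(cpu_temp_celsius{host="zeus"})

# Round to the nearest multiple of 0.5, so a sample value of 21.3 becomes 21.5.
round(cpu_temp_celsius{host="zeus"}, 0.5)
```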
||||
|
||||
## `scalar()`
|
||||
|
||||
Given a single-element input vector, `scalar(v instant-vector)` returns the
|
||||
sample value of that single element as a scalar. If the input vector does not
|
||||
have exactly one element, `scalar` will return `NaN`.
|
||||
|
||||
## `sort()`
|
||||
|
||||
`sort(v instant-vector)` returns vector elements sorted by their sample values,
|
||||
in ascending order.
|
||||
|
||||
## `sort_desc()`
|
||||
|
||||
Same as `sort`, but sorts in descending order.
|
||||
|
||||
## `sqrt()`
|
||||
|
||||
`sqrt(v instant-vector)` calculates the square root of all elements in `v`.
|
||||
|
||||
## `time()`
|
||||
|
||||
`time()` returns the number of seconds since January 1, 1970 UTC. Note that
|
||||
this does not actually return the current time, but the time at which the
|
||||
expression is to be evaluated.
|
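A common use is to compute an age from a timestamp-valued metric. The sketch below assumes that the targets expose a `process_start_time_seconds` gauge holding a Unix timestamp; that metric name is an assumption about the monitored jobs:

```
# Seconds elapsed between process start and the evaluation time.
time() - process_start_time_seconds{job="api-server"}
```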
||||
|
||||
## `vector()`
|
||||
|
||||
`vector(s scalar)` returns the scalar `s` as a vector with no labels.
|
||||
|
||||
## `year()`
|
||||
|
||||
`year(v=vector(time()) instant-vector)` returns the year
|
||||
for each of the given times in UTC.
|
||||
|
||||
## `<aggregation>_over_time()`
|
||||
|
||||
The following functions allow aggregating each series of a given range vector
|
||||
over time and return an instant vector with per-series aggregation results:
|
||||
|
||||
* `avg_over_time(range-vector)`: the average value of all points in the specified interval.
|
||||
* `min_over_time(range-vector)`: the minimum value of all points in the specified interval.
|
||||
* `max_over_time(range-vector)`: the maximum value of all points in the specified interval.
|
||||
* `sum_over_time(range-vector)`: the sum of all values in the specified interval.
|
||||
* `count_over_time(range-vector)`: the count of all values in the specified interval.
|
||||
* `quantile_over_time(scalar, range-vector)`: the φ-quantile (0 ≤ φ ≤ 1) of the values in the specified interval.
|
||||
* `stddev_over_time(range-vector)`: the population standard deviation of the values in the specified interval.
|
||||
* `stdvar_over_time(range-vector)`: the population standard variance of the values in the specified interval.
|
||||
|
||||
Note that all values in the specified interval have the same weight in the
|
||||
aggregation even if the values are not equally spaced throughout the interval.
|
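A few sketches, reusing the hypothetical `cpu_temp_celsius` gauge from the `delta()` example:

```
# Average temperature over the last hour.
avg_over_time(cpu_temp_celsius{host="zeus"}[1h])

# Highest temperature seen over the last hour.
max_over_time(cpu_temp_celsius{host="zeus"}[1h])

# 90th percentile of the temperature samples of the last hour.
quantile_over_time(0.9, cpu_temp_celsius{host="zeus"}[1h])
```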
4
docs/querying/index.md
Normal file
|
@ -0,0 +1,4 @@
|
|||
---
|
||||
title: Querying
|
||||
sort_rank: 4
|
||||
---
|
250
docs/querying/operators.md
Normal file
|
@ -0,0 +1,250 @@
|
|||
---
|
||||
title: Operators
|
||||
sort_rank: 2
|
||||
---
|
||||
|
||||
# Operators
|
||||
|
||||
## Binary operators
|
||||
|
||||
Prometheus's query language supports basic logical and arithmetic operators.
|
||||
For operations between two instant vectors, the [matching behavior](#vector-matching)
|
||||
can be modified.
|
||||
|
||||
### Arithmetic binary operators
|
||||
|
||||
The following binary arithmetic operators exist in Prometheus:
|
||||
|
||||
* `+` (addition)
|
||||
* `-` (subtraction)
|
||||
* `*` (multiplication)
|
||||
* `/` (division)
|
||||
* `%` (modulo)
|
||||
* `^` (power/exponentiation)
|
||||
|
||||
Binary arithmetic operators are defined between scalar/scalar, vector/scalar,
|
||||
and vector/vector value pairs.
|
||||
|
||||
**Between two scalars**, the behavior is obvious: they evaluate to another
|
||||
scalar that is the result of the operator applied to both scalar operands.
|
||||
|
||||
**Between an instant vector and a scalar**, the operator is applied to the
|
||||
value of every data sample in the vector. E.g. if a time series instant vector
|
||||
is multiplied by 2, the result is another vector in which every sample value of
|
||||
the original vector is multiplied by 2.
|
||||
|
||||
**Between two instant vectors**, a binary arithmetic operator is applied to
|
||||
each entry in the left-hand-side vector and its [matching element](#vector-matching)
|
||||
in the right-hand-side vector. The result is propagated into the result vector and the metric
|
||||
name is dropped. Entries for which no matching entry in the right-hand vector can be
|
||||
found are not part of the result.
|
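The three cases can be sketched as follows. The metric names `cpu_temp_celsius`, `memory_used_bytes` and `memory_total_bytes` are hypothetical gauges chosen for illustration:

```
# Scalar and scalar: evaluates to the scalar 1024.
2 ^ 10

# Vector and scalar: convert every sample from Celsius to Fahrenheit.
cpu_temp_celsius * 9 / 5 + 32

# Vector and vector: elements with matching label sets are paired up.
memory_used_bytes / memory_total_bytes
```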
||||
|
||||
### Comparison binary operators
|
||||
|
||||
The following binary comparison operators exist in Prometheus:
|
||||
|
||||
* `==` (equal)
|
||||
* `!=` (not-equal)
|
||||
* `>` (greater-than)
|
||||
* `<` (less-than)
|
||||
* `>=` (greater-or-equal)
|
||||
* `<=` (less-or-equal)
|
||||
|
||||
Comparison operators are defined between scalar/scalar, vector/scalar,
|
||||
and vector/vector value pairs. By default they filter. Their behavior can be
|
||||
modified by providing `bool` after the operator, which will return `0` or `1`
|
||||
for the value rather than filtering.
|
||||
|
||||
**Between two scalars**, the `bool` modifier must be provided and these
|
||||
operators result in another scalar that is either `0` (`false`) or `1`
|
||||
(`true`), depending on the comparison result.
|
||||
|
||||
**Between an instant vector and a scalar**, these operators are applied to the
|
||||
value of every data sample in the vector, and vector elements between which the
|
||||
comparison result is `false` get dropped from the result vector. If the `bool`
|
||||
modifier is provided, vector elements that would be dropped instead have the value
|
||||
`0` and vector elements that would be kept have the value `1`.
|
||||
|
||||
**Between two instant vectors**, these operators behave as a filter by default,
|
||||
applied to matching entries. Vector elements for which the expression is not
|
||||
true or which do not find a match on the other side of the expression get
|
||||
dropped from the result, while the others are propagated into a result vector
|
||||
with their original (left-hand-side) metric names and label values.
|
||||
If the `bool` modifier is provided, vector elements that would have been
|
||||
dropped instead have the value `0` and vector elements that would be kept have
|
||||
the value `1` with the left-hand-side metric names and label values.
|
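The difference between filtering and the `bool` modifier can be sketched with the `http_requests_total` metric; the threshold of 100 is arbitrary:

```
# Filter: only series whose current value is above 100 are returned.
http_requests_total > 100

# With bool: no series is dropped; the value is 1 if above 100 and 0 otherwise.
http_requests_total > bool 100

# Between two scalars, bool is mandatory; this evaluates to the scalar 1.
1 == bool 1
```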
||||
|
||||
### Logical/set binary operators
|
||||
|
||||
These logical/set binary operators are only defined between instant vectors:
|
||||
|
||||
* `and` (intersection)
|
||||
* `or` (union)
|
||||
* `unless` (complement)
|
||||
|
||||
`vector1 and vector2` results in a vector consisting of the elements of
|
||||
`vector1` for which there are elements in `vector2` with exactly matching
|
||||
label sets. Other elements are dropped. The metric name and values are carried
|
||||
over from the left-hand-side vector.
|
||||
|
||||
`vector1 or vector2` results in a vector that contains all original elements
|
||||
(label sets + values) of `vector1` and additionally all elements of `vector2`
|
||||
which do not have matching label sets in `vector1`.
|
||||
|
||||
`vector1 unless vector2` results in a vector consisting of the elements of
|
||||
`vector1` for which there are no elements in `vector2` with exactly matching
|
||||
label sets. All matching elements in both vectors are dropped.
|
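A sketch of the three operators on made-up input. The series names and values are hypothetical, and the label sets are compared without regard to the metric name:

```
# Hypothetical input:
#   errors:rate5m{method="get"}     24
#   errors:rate5m{method="post"}    6
#   requests:rate5m{method="get"}   600
#   requests:rate5m{method="del"}   34
#   requests:rate5m{method="post"}  120

# Request series that also have an error series: get and post.
requests:rate5m and errors:rate5m

# Request series without an error series: only del.
requests:rate5m unless errors:rate5m

# All error series, plus the request series without an error series (del).
errors:rate5m or requests:rate5m
```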
||||
|
||||
## Vector matching
|
||||
|
||||
Operations between vectors attempt to find a matching element in the right-hand-side
|
||||
vector for each entry in the left-hand side. There are two basic types of
|
||||
matching behavior:
|
||||
|
||||
**One-to-one** finds a unique pair of entries from each side of the operation.
|
||||
In the default case, that is an operation following the format `vector1 <operator> vector2`.
|
||||
Two entries match if they have the exact same set of labels and corresponding values.
|
||||
The `ignoring` keyword allows ignoring certain labels when matching, while the
|
||||
`on` keyword allows reducing the set of considered labels to a provided list:
|
||||
|
||||
<vector expr> <bin-op> ignoring(<label list>) <vector expr>
|
||||
<vector expr> <bin-op> on(<label list>) <vector expr>
|
||||
|
||||
Example input:
|
||||
|
||||
method_code:http_errors:rate5m{method="get", code="500"} 24
|
||||
method_code:http_errors:rate5m{method="get", code="404"} 30
|
||||
method_code:http_errors:rate5m{method="put", code="501"} 3
|
||||
method_code:http_errors:rate5m{method="post", code="500"} 6
|
||||
method_code:http_errors:rate5m{method="post", code="404"} 21
|
||||
|
||||
method:http_requests:rate5m{method="get"} 600
|
||||
method:http_requests:rate5m{method="del"} 34
|
||||
method:http_requests:rate5m{method="post"} 120
|
||||
|
||||
Example query:
|
||||
|
||||
method_code:http_errors:rate5m{code="500"} / ignoring(code) method:http_requests:rate5m
|
||||
|
||||
This returns a result vector containing the fraction of HTTP requests with status code
|
||||
of 500 for each method, as measured over the last 5 minutes. Without `ignoring(code)` there
|
||||
would have been no match as the metrics do not share the same set of labels.
|
||||
The entries with methods `put` and `del` have no match and will not show up in the result:
|
||||
|
||||
{method="get"} 0.04 // 24 / 600
|
||||
{method="post"} 0.05 // 6 / 120
|
||||
|
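For the example input above, the same pairs could also be selected by naming the label to match on instead of the label to ignore. As a sketch:

```
method_code:http_errors:rate5m{code="500"} / on(method) method:http_requests:rate5m
```

This should yield the same two result elements, since `method` is the only label that both sides share.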
||||
**Many-to-one** and **one-to-many** matchings refer to the case where each vector element on
|
||||
the "one"-side can match with multiple elements on the "many"-side. This has to
|
||||
be explicitly requested using the `group_left` or `group_right` modifier, where
|
||||
left/right determines which vector has the higher cardinality.
|
||||
|
||||
<vector expr> <bin-op> ignoring(<label list>) group_left(<label list>) <vector expr>
|
||||
<vector expr> <bin-op> ignoring(<label list>) group_right(<label list>) <vector expr>
|
||||
<vector expr> <bin-op> on(<label list>) group_left(<label list>) <vector expr>
|
||||
<vector expr> <bin-op> on(<label list>) group_right(<label list>) <vector expr>
|
||||
|
||||
The label list provided with the group modifier contains additional labels from
|
||||
the "one"-side to be included in the result metrics. For `on` a label can only
|
||||
appear in one of the lists. Every time series of the result vector must be
|
||||
uniquely identifiable.
|
||||
|
||||
_Grouping modifiers can only be used for
|
||||
[comparison](#comparison-binary-operators) and
|
||||
[arithmetic](#arithmetic-binary-operators) operations. The set operations `and`, `unless` and
|
||||
`or` match with all possible entries in the right vector by
|
||||
default._
|
||||
|
||||
Example query:
|
||||
|
||||
method_code:http_errors:rate5m / ignoring(code) group_left method:http_requests:rate5m
|
||||
|
||||
In this case the left vector contains more than one entry per `method` label
|
||||
value. Thus, we indicate this using `group_left`. The elements from the right
|
||||
side are now matched with multiple elements with the same `method` label on the
|
||||
left:
|
||||
|
||||
{method="get", code="500"} 0.04 // 24 / 600
|
||||
{method="get", code="404"} 0.05 // 30 / 600
|
||||
{method="post", code="500"} 0.05 // 6 / 120
|
||||
{method="post", code="404"} 0.175 // 21 / 120
|
||||
|
||||
_Many-to-one and one-to-many matching are advanced use cases that should be carefully considered.
|
||||
Often a proper use of `ignoring(<labels>)` provides the desired outcome._
|
||||
|
||||
## Aggregation operators
|
||||
|
||||
Prometheus supports the following built-in aggregation operators that can be
|
||||
used to aggregate the elements of a single instant vector, resulting in a new
|
||||
vector of fewer elements with aggregated values:
|
||||
|
||||
* `sum` (calculate sum over dimensions)
|
||||
* `min` (select minimum over dimensions)
|
||||
* `max` (select maximum over dimensions)
|
||||
* `avg` (calculate the average over dimensions)
|
||||
* `stddev` (calculate population standard deviation over dimensions)
|
||||
* `stdvar` (calculate population standard variance over dimensions)
|
||||
* `count` (count number of elements in the vector)
|
||||
* `count_values` (count number of elements with the same value)
|
||||
* `bottomk` (smallest k elements by sample value)
|
||||
* `topk` (largest k elements by sample value)
|
||||
* `quantile` (calculate φ-quantile (0 ≤ φ ≤ 1) over dimensions)
|
||||
|
||||
These operators can either be used to aggregate over **all** label dimensions
|
||||
or preserve distinct dimensions by including a `without` or `by` clause.
|
||||
|
||||
<aggr-op>([parameter,] <vector expression>) [without|by (<label list>)] [keep_common]
|
||||
|
||||
`parameter` is only required for `count_values`, `quantile`, `topk` and
|
||||
`bottomk`. `without` removes the listed labels from the result vector, while
|
||||
all other labels are preserved in the output. `by` does the opposite and drops
|
||||
labels that are not listed in the `by` clause, even if their label values are
|
||||
identical between all elements of the vector. The `keep_common` clause allows
|
||||
keeping those extra labels (labels that are identical between elements, but not
|
||||
in the `by` clause).
|
||||
|
||||
`count_values` outputs one time series per unique sample value. Each series has
|
||||
an additional label. The name of that label is given by the aggregation
|
||||
parameter, and the label value is the unique sample value. The value of each
|
||||
time series is the number of times that sample value was present.
|
||||
|
||||
`topk` and `bottomk` are different from other aggregators in that a subset of
|
||||
the input samples, including the original labels, are returned in the result
|
||||
vector. `by` and `without` are only used to bucket the input vector.
|
||||
|
||||
Example:
|
||||
|
||||
If the metric `http_requests_total` had time series that fan out by
|
||||
`application`, `instance`, and `group` labels, we could calculate the total
|
||||
number of seen HTTP requests per application and group over all instances via:
|
||||
|
||||
sum(http_requests_total) without (instance)
|
||||
|
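Assuming that `application`, `instance` and `group` are the only labels on the metric, the same aggregation could be expressed with a `by` clause. A sketch:

```
sum(http_requests_total) by (application, group)
```

The `without` form keeps working unchanged if further labels are added to the metric later, whereas the `by` form would aggregate them away.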
||||
If we are just interested in the total of HTTP requests we have seen in **all**
|
||||
applications, we could simply write:
|
||||
|
||||
sum(http_requests_total)
|
||||
|
||||
To count the number of binaries running each build version we could write:
|
||||
|
||||
count_values("version", build_version)
|
||||
|
||||
To get the 5 largest HTTP request counts across all instances we could write:
|
||||
|
||||
topk(5, http_requests_total)
|
||||
|
||||
## Binary operator precedence
|
||||
|
||||
The following list shows the precedence of binary operators in Prometheus, from
|
||||
highest to lowest.
|
||||
|
||||
1. `^`
|
||||
2. `*`, `/`, `%`
|
||||
3. `+`, `-`
|
||||
4. `==`, `!=`, `<=`, `<`, `>=`, `>`
|
||||
5. `and`, `unless`
|
||||
6. `or`
|
||||
|
||||
Operators on the same precedence level are left-associative. For example,
|
||||
`2 * 3 % 2` is equivalent to `(2 * 3) % 2`. However, `^` is right-associative,
|
||||
so `2 ^ 3 ^ 2` is equivalent to `2 ^ (3 ^ 2)`.
|
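The rules can be checked on plain scalar expressions. The values in the comments follow from the precedence list above:

```
# * and % share a level and associate to the left: (2 * 3) % 2 = 0
2 * 3 % 2

# ^ associates to the right: 2 ^ (3 ^ 2) = 2 ^ 9 = 512
2 ^ 3 ^ 2

# * binds tighter than +: 1 + (2 * 3) = 7
1 + 2 * 3
```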
66
docs/querying/rules.md
Normal file
|
@ -0,0 +1,66 @@
|
|||
---
|
||||
title: Recording rules
|
||||
sort_rank: 6
|
||||
---
|
||||
|
||||
# Defining recording rules
|
||||
|
||||
## Configuring rules
|
||||
|
||||
Prometheus supports two types of rules which may be configured and then
|
||||
evaluated at regular intervals: recording rules and [alerting
|
||||
rules](https://prometheus.io/docs/alerting/rules/). To include rules in
|
||||
Prometheus, create a file containing the necessary rule statements and have
|
||||
Prometheus load the file via the `rule_files` field in the [Prometheus
|
||||
configuration](../configuration.md).
|
||||
|
||||
The rule files can be reloaded at runtime by sending `SIGHUP` to the Prometheus
|
||||
process. The changes are only applied if all rule files are well-formatted.
|
||||
|
||||
## Syntax-checking rules
|
||||
|
||||
To quickly check whether a rule file is syntactically correct without starting
|
||||
a Prometheus server, install and run Prometheus's `promtool` command-line
|
||||
utility:
|
||||
|
||||
```bash
|
||||
go get github.com/prometheus/prometheus/cmd/promtool
|
||||
promtool check-rules /path/to/example.rules
|
||||
```
|
||||
|
||||
When the file is syntactically valid, the checker prints a textual
|
||||
representation of the parsed rules to standard output and then exits with
|
||||
a `0` return status.
|
||||
|
||||
If there are any syntax errors, it prints an error message to standard error
|
||||
and exits with a `1` return status. On invalid input arguments the exit status
|
||||
is `2`.
|
||||
|
||||
## Recording rules
|
||||
|
||||
Recording rules allow you to precompute frequently needed or computationally
|
||||
expensive expressions and save their result as a new set of time series.
|
||||
Querying the precomputed result will then often be much faster than executing
|
||||
the original expression every time it is needed. This is especially useful for
|
||||
dashboards, which need to query the same expression repeatedly every time they
|
||||
refresh.
|
||||
|
||||
To add a new recording rule, add a line of the following syntax to your rule
|
||||
file:
|
||||
|
||||
<new time series name>[{<label overrides>}] = <expression to record>
|
||||
|
||||
Some examples:
|
||||
|
||||
# Saving the per-job HTTP in-progress request count as a new set of time series:
|
||||
job:http_inprogress_requests:sum = sum(http_inprogress_requests) by (job)
|
||||
|
||||
# Drop or rewrite labels in the result time series:
|
||||
new_time_series{label_to_change="new_value",label_to_drop=""} = old_time_series
|
||||
|
||||
Recording rules are evaluated at the interval specified by the
|
||||
`evaluation_interval` field in the Prometheus configuration. During each
|
||||
evaluation cycle, the right-hand-side expression of the rule statement is
|
||||
evaluated at the current instant in time and the resulting sample vector is
|
||||
stored as a new set of time series with the current timestamp and a new metric
|
||||
name (and perhaps an overridden set of labels).
|