Document the native histogram feature flag and PromQL (#11446)

Signed-off-by: beorn7 <beorn@grafana.com>
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
2024-12-24 21:24:05 -08:00 · 2022-10-14 14:46:12 +02:00 · 2022-10-14 14:46:12 +02:00 · 41035469d3
parent 50529b4804
commit 41035469d3
4 changed files with 210 additions and 45 deletions
--- a/docs/feature_flags.md
+++ b/docs/feature_flags.md
@ -103,3 +103,26 @@ When enabled, the default ports for HTTP (`:80`) or HTTPS (`:443`) will _not_ be
 the address used to scrape a target (the value of the `__address__` label), contrary to the default behavior.
 In addition, if a default HTTP or HTTPS port has already been added either in a static configuration or
 by a service discovery mechanism and the respective scheme is specified (`http` or `https`), that port will be removed.
+
+## Native Histograms
+
+`--enable-feature=native-histograms`
+
+When enabled, Prometheus will ingest native histograms (formerly also known as
+sparse histograms or high-res histograms). Native histograms are still highly
+experimental. Expect breaking changes to happen (including those rendering the
+TSDB unreadable).
+
+Native histograms are currently only supported in the traditional Prometheus
+protobuf exposition format. This feature flag therefore also enables a new (and
+also experimental) protobuf parser, through which _all_ metrics are ingested
+(i.e. not only native histograms). Prometheus will try to negotiate the
+protobuf format first. The instrumented target needs to support the protobuf
+format, too, _and_ it needs to expose native histograms. The protobuf format
+allows exposing conventional and native histograms side by side. With this
+feature flag disabled, Prometheus will continue to parse the conventional
+histogram (albeit via the text format). With this flag enabled, Prometheus will
+still ingest those conventional histograms that do not come with a
+corresponding native histogram. However, if a native histogram is present,
+Prometheus will ignore the corresponding conventional histogram, with the
+notable exception of exemplars, which are always ingested.
--- a/docs/querying/basics.md
+++ b/docs/querying/basics.md
@ -32,6 +32,16 @@ expression), only some of these types are legal as the result from a
 user-specified expression. For example, an expression that returns an instant
 vector is the only type that can be directly graphed.

+_Notes about the experimental native histograms:_
+
+* Ingesting native histograms has to be enabled via a [feature
+  flag](../feature_flags/#native-histograms).
+* Once native histograms have been ingested into the TSDB (and even after
+  disabling the feature flag again), both instant vectors and range vectors may
+  now contain samples that aren't simple floating point numbers (float samples)
+  but complete histograms (histogram samples). A vector may contain a mix of
+  float samples and histogram samples.
+
 ## Literals

 ### String literals
--- a/docs/querying/functions.md
+++ b/docs/querying/functions.md
@ -11,6 +11,22 @@ instant-vector)`. This means that there is one argument `v` which is an instant
 vector, which if not provided it will default to the value of the expression
 `vector(time())`.

+_Notes about the experimental native histograms:_
+
+* Ingesting native histograms has to be enabled via a [feature
+  flag](../feature_flags/#native-histograms). As long as no native histograms
+  have been ingested into the TSDB, all functions will behave as usual.
+* Functions that do not explicitly mention native histograms in their
+  documentation (see below) effectively treat a native histogram as a float
+  sample of value 0. (This is confusing and will change before native
+  histograms become a stable feature.)
+* Functions that do already act on native histograms might still change their
+  behavior in the future.
+* If a function requires the same bucket layout between multiple native
+  histograms it acts on, it will automatically convert them
+  appropriately. (With the currently supported bucket schemas, that's always
+  possible.)
+
 ## `abs()`

 `abs(v instant-vector)` returns the input vector with all sample values converted to
@ -19,8 +35,8 @@ their absolute value.
 ## `absent()`

 `absent(v instant-vector)` returns an empty vector if the vector passed to it
-has any elements and a 1-element vector with the value 1 if the vector passed to
-it has no elements.
+has any elements (floats or native histograms) and a 1-element vector with the
+value 1 if the vector passed to it has no elements.

 This is useful for alerting on when no time series exist for a given metric name
 and label combination.
@ -42,8 +58,8 @@ of the 1-element output vector from the input vector.
 ## `absent_over_time()`

 `absent_over_time(v range-vector)` returns an empty vector if the range vector
-passed to it has any elements and a 1-element vector with the value 1 if the
-range vector passed to it has no elements.
+passed to it has any elements (floats or native histograms) and a 1-element
+vector with the value 1 if the range vector passed to it has no elements.

 This is useful for alerting on when no time series exist for a given metric name
 and label combination for a certain amount of time.
@ -130,7 +146,14 @@ between now and 2 hours ago:
 delta(cpu_temp_celsius{host="zeus"}[2h])
 ```

-`delta` should only be used with gauges.
+`delta` acts on native histograms by calculating a new histogram where each
+component (sum and count of observations, buckets) is the difference between
+the respective component in the first and last native histogram in
+`v`. However, each element in `v` that contains a mix of float and native
+histogram samples within the range will be missing from the result vector.
+
+`delta` should only be used with gauges and native histograms where the
+components behave like gauges (so-called gauge histograms).
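
As an illustration outside the patch itself, the component-wise difference described for `delta` can be sketched in Python. The `Hist` class, the bucket map keyed by upper bound, and the `delta_sketch` name are hypothetical stand-ins, not Prometheus data structures. The sketch ignores extrapolation and bucket layout conversion, and it assumes that fewer than two samples yield no result.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union


@dataclass
class Hist:
    """Hypothetical stand-in for a native histogram sample."""
    count: float
    sum: float
    # Assumed layout: bucket upper bound -> observations in that bucket.
    buckets: Dict[float, float] = field(default_factory=dict)


def delta_sketch(samples: List[Union[float, Hist]]) -> Optional[Union[float, Hist]]:
    """Difference between the last and first sample of one series in the range."""
    if len(samples) < 2:
        return None  # assumption: not enough samples, no result
    if len({isinstance(s, Hist) for s in samples}) > 1:
        return None  # mix of float and histogram samples: element is dropped
    first, last = samples[0], samples[-1]
    if not isinstance(first, Hist):
        return last - first
    bounds = sorted(set(first.buckets) | set(last.buckets))
    return Hist(
        count=last.count - first.count,
        sum=last.sum - first.sum,
        buckets={b: last.buckets.get(b, 0.0) - first.buckets.get(b, 0.0) for b in bounds},
    )
```

Negative components in the result are expected here, which is why the text restricts `delta` to gauge histograms.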

 ## `deriv()`

@ -156,15 +179,19 @@ to the nearest integer.

 ## `histogram_count()` and `histogram_sum()`

+_Both functions only act on native histograms, which are an experimental
+feature. The behavior of these functions may change in future versions of
+Prometheus, including their removal from PromQL._
+
 `histogram_count(v instant-vector)` returns the count of observations stored in
-a native Histogram. Samples that are not native Histograms are ignored and do
+a native histogram. Samples that are not native histograms are ignored and do
 not show up in the returned vector.

 Similarly, `histogram_sum(v instant-vector)` returns the sum of observations
-stored in a native Histogram.
+stored in a native histogram.

 Use `histogram_count` in the following way to calculate a rate of observations
-(in this case corresponding to “requests per second”) from a native Histogram:
+(in this case corresponding to “requests per second”) from a native histogram:

    histogram_count(rate(http_request_duration_seconds[10m]))

@ -177,57 +204,121 @@ observed values (in this case corresponding to “average request duration”):

 ## `histogram_fraction()`

-TODO(beorn7): Add documentation.
+_This function only acts on native histograms, which are an experimental
+feature. The behavior of this function may change in future versions of
+Prometheus, including its removal from PromQL._
+
+For a native histogram, `histogram_fraction(lower scalar, upper scalar, v
+instant-vector)` returns the estimated fraction of observations between the
+provided lower and upper values. Samples that are not native histograms are
+ignored and do not show up in the returned vector.
+
+For example, the following expression calculates the fraction of HTTP requests
+over the last hour that took 200ms or less:
+
+    histogram_fraction(0, 0.2, rate(http_request_duration_seconds[1h]))
+
+The error of the estimation depends on the resolution of the underlying native
+histogram and how closely the provided boundaries are aligned with the bucket
+boundaries in the histogram.
+
+`+Inf` and `-Inf` are valid boundary values. For example, if the histogram in
+the expression above included negative observations (which shouldn't be the
+case for request durations), the appropriate lower boundary to include all
+observations less than or equal to 0.2 would be `-Inf` rather than `0`.
+
+Whether the provided boundaries are inclusive or exclusive is only relevant if
+the provided boundaries are precisely aligned with bucket boundaries in the
+underlying native histogram. In this case, the behavior depends on the schema
+definition of the histogram. The currently supported schemas all feature
+inclusive upper boundaries and exclusive lower boundaries for positive values
+(and vice versa for negative values). Without a precise alignment of
+boundaries, the function uses linear interpolation to estimate the
+fraction. With the resulting uncertainty, it becomes irrelevant if the
+boundaries are inclusive or exclusive.
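
As an illustration outside the patch itself, the linear interpolation mentioned above can be sketched in Python. The `(lower, upper, count)` bucket tuples and the `fraction_sketch` name are hypothetical; real native histograms derive their boundaries from a schema. Returning `NaN` for an empty histogram or `NaN` boundaries, and 0 when `lower >= upper`, are assumptions of this sketch rather than statements from the text.

```python
import math
from typing import List, Tuple

# Hypothetical bucket layout: (lower bound, upper bound, observations in bucket).
Bucket = Tuple[float, float, float]


def fraction_sketch(lower: float, upper: float, buckets: List[Bucket]) -> float:
    """Estimate the fraction of observations between lower and upper."""
    total = sum(count for _, _, count in buckets)
    if total == 0 or math.isnan(lower) or math.isnan(upper):
        return math.nan  # assumption of this sketch
    if lower >= upper:
        return 0.0  # assumption of this sketch
    covered = 0.0
    for b_lo, b_hi, count in buckets:
        # Clamp the requested range to this bucket; -Inf/+Inf clamp naturally.
        lo = max(lower, b_lo)
        hi = min(upper, b_hi)
        if hi <= lo:
            continue  # no overlap with this bucket
        # Assume observations are spread evenly across the bucket.
        covered += count * (hi - lo) / (b_hi - b_lo)
    return covered / total
```

When the requested boundaries coincide with bucket boundaries, each overlapping bucket contributes its full count and no interpolation error arises, which matches the remark about alignment above.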

 ## `histogram_quantile()`

-TODO(beorn7): This needs a lot of updates for Histograms as sample value types.
+`histogram_quantile(φ scalar, b instant-vector)` calculates the φ-quantile (0 ≤
+φ ≤ 1) from a [conventional
+histogram](https://prometheus.io/docs/concepts/metric_types/#histogram) or from
+a native histogram. (See [histograms and
+summaries](https://prometheus.io/docs/practices/histograms) for a detailed
+explanation of φ-quantiles and the usage of the (conventional) histogram metric
+type in general.)

-`histogram_quantile(φ scalar, b instant-vector)` calculates the φ-quantile (0 ≤ φ
-≤ 1) from the buckets `b` of a
-[histogram](https://prometheus.io/docs/concepts/metric_types/#histogram). (See
-[histograms and summaries](https://prometheus.io/docs/practices/histograms) for
-a detailed explanation of φ-quantiles and the usage of the histogram metric type
-in general.) The samples in `b` are the counts of observations in each bucket.
-Each sample must have a label `le` where the label value denotes the inclusive
-upper bound of the bucket. (Samples without such a label are silently ignored.)
-The [histogram metric type](https://prometheus.io/docs/concepts/metric_types/#histogram)
-automatically provides time series with the `_bucket` suffix and the appropriate
-labels.
+_Note that native histograms are an experimental feature. The behavior of this
+function when dealing with native histograms may change in future versions of
+Prometheus._
+
+The conventional float samples in `b` are considered the counts of observations
+in each bucket of one or more conventional histograms. Each float sample must
+have a label `le` where the label value denotes the inclusive upper bound of
+the bucket. (Float samples without such a label are silently ignored.) The
+other labels and the metric name are used to identify the buckets belonging to
+each conventional histogram. The [histogram metric
+type](https://prometheus.io/docs/concepts/metric_types/#histogram)
+automatically provides time series with the `_bucket` suffix and the
+appropriate labels.
+
+The native histogram samples in `b` are treated each individually as a separate
+histogram to calculate the quantile from.
+
+As long as no naming collisions arise, `b` may contain a mix of conventional
+and native histograms.

 Use the `rate()` function to specify the time window for the quantile
 calculation.

-Example: A histogram metric is called `http_request_duration_seconds`. To
-calculate the 90th percentile of request durations over the last 10m, use the
-following expression:
+Example: A histogram metric is called `http_request_duration_seconds` (and
+therefore the metric name for the buckets of a conventional histogram is
+`http_request_duration_seconds_bucket`). To calculate the 90th percentile of request
+durations over the last 10m, use the following expression in case
+`http_request_duration_seconds` is a conventional histogram:

    histogram_quantile(0.9, rate(http_request_duration_seconds_bucket[10m]))

+For a native histogram, use the following expression instead:
+
+    histogram_quantile(0.9, rate(http_request_duration_seconds[10m]))
+
 The quantile is calculated for each label combination in
 `http_request_duration_seconds`. To aggregate, use the `sum()` aggregator
 around the `rate()` function. Since the `le` label is required by
-`histogram_quantile()`, it has to be included in the `by` clause. The following
-expression aggregates the 90th percentile by `job`:
+`histogram_quantile()` to deal with conventional histograms, it has to be
+included in the `by` clause. The following expression aggregates the 90th
+percentile by `job` for conventional histograms:

    histogram_quantile(0.9, sum by (job, le) (rate(http_request_duration_seconds_bucket[10m])))
+
+When aggregating native histograms, the expression simplifies to:

-To aggregate everything, specify only the `le` label:
+    histogram_quantile(0.9, sum by (job) (rate(http_request_duration_seconds[10m])))
+
+To aggregate all conventional histograms, specify only the `le` label:

    histogram_quantile(0.9, sum by (le) (rate(http_request_duration_seconds_bucket[10m])))

-The `histogram_quantile()` function interpolates quantile values by
-assuming a linear distribution within a bucket. The highest bucket
-must have an upper bound of `+Inf`. (Otherwise, `NaN` is returned.) If
-a quantile is located in the highest bucket, the upper bound of the
-second highest bucket is returned. A lower limit of the lowest bucket
-is assumed to be 0 if the upper bound of that bucket is greater than
-0. In that case, the usual linear interpolation is applied within that
-bucket. Otherwise, the upper bound of the lowest bucket is returned
-for quantiles located in the lowest bucket.
+With native histograms, aggregating everything works as usual without any `by` clause:
+
+    histogram_quantile(0.9, sum(rate(http_request_duration_seconds[10m])))
+
+The `histogram_quantile()` function interpolates quantile values by
+assuming a linear distribution within a bucket. 
+
+If `b` has 0 observations, `NaN` is returned. For φ < 0, `-Inf` is
+returned. For φ > 1, `+Inf` is returned. For φ = `NaN`, `NaN` is returned.
+
+The following is only relevant for conventional histograms: If `b` contains
+fewer than two buckets, `NaN` is returned. The highest bucket must have an
+upper bound of `+Inf`. (Otherwise, `NaN` is returned.) If a quantile is located
+in the highest bucket, the upper bound of the second highest bucket is
+returned. A lower limit of the lowest bucket is assumed to be 0 if the upper
+bound of that bucket is greater than 0. In that case, the usual linear
+interpolation is applied within that bucket. Otherwise, the upper bound of the
+lowest bucket is returned for quantiles located in the lowest bucket.
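
As an illustration outside the patch itself, the rules listed above for conventional histograms can be sketched in Python. The `quantile_sketch` name and the `(upper bound, cumulative count)` input pairs are hypothetical. The sketch assumes the cumulative counts are already monotonic and belong to a single histogram.

```python
import math
from typing import List, Tuple


def quantile_sketch(phi: float, buckets: List[Tuple[float, float]]) -> float:
    """buckets: (le upper bound, cumulative count) pairs of one histogram."""
    if math.isnan(phi):
        return math.nan
    if phi < 0:
        return -math.inf
    if phi > 1:
        return math.inf
    buckets = sorted(buckets)
    # Fewer than two buckets, or no +Inf bucket: NaN.
    if len(buckets) < 2 or buckets[-1][0] != math.inf:
        return math.nan
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = phi * total
    # First bucket whose cumulative count reaches the rank.
    idx = next(i for i, (_, c) in enumerate(buckets) if c >= rank)
    if idx == len(buckets) - 1:
        # Quantile falls into the +Inf bucket: return the second highest bound.
        return buckets[-2][0]
    upper, count = buckets[idx]
    if idx == 0:
        if upper <= 0:
            return upper  # no assumed lower limit of 0 is possible
        lower, below = 0.0, 0.0
    else:
        lower, below = buckets[idx - 1]
    if count == below:
        return math.nan  # only for rank 0 in an empty lowest bucket (0/0)
    # Linear interpolation within the selected bucket.
    return lower + (upper - lower) * (rank - below) / (count - below)
```

The input pairs correspond to the `le` label values and the (rated) sample values of the `_bucket` series.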

-If `b` has 0 observations, `NaN` is returned. If `b` contains fewer than two buckets,
-`NaN` is returned. For φ < 0, `-Inf` is returned. For φ > 1, `+Inf` is returned. For φ = `NaN`, `NaN` is returned.

 ## `holt_winters()`

@ -269,11 +360,17 @@ over the last 5 minutes, per time series in the range vector:
 increase(http_requests_total{job="api-server"}[5m])
 ```

-`increase` should only be used with counters. It is syntactic sugar
-for `rate(v)` multiplied by the number of seconds under the specified
-time range window, and should be used primarily for human readability.
-Use `rate` in recording rules so that increases are tracked consistently
-on a per-second basis.
+`increase` acts on native histograms by calculating a new histogram where each
+component (sum and count of observations, buckets) is the increase between
+the respective component in the first and last native histogram in
+`v`. However, each element in `v` that contains a mix of float and native
+histogram samples within the range will be missing from the result vector.
+
+`increase` should only be used with counters and native histograms where the
+components behave like counters. It is syntactic sugar for `rate(v)` multiplied
+by the number of seconds under the specified time range window, and should be
+used primarily for human readability. Use `rate` in recording rules so that
+increases are tracked consistently on a per-second basis.

 ## `irate()`

@ -385,8 +482,15 @@ over the last 5 minutes, per time series in the range vector:
 rate(http_requests_total{job="api-server"}[5m])
 ```

-`rate` should only be used with counters. It is best suited for alerting,
-and for graphing of slow-moving counters.
+`rate` acts on native histograms by calculating a new histogram where each
+component (sum and count of observations, buckets) is the rate of increase
+between the respective component in the first and last native histogram in
+`v`. However, each element in `v` that contains a mix of float and native
+histogram samples within the range will be missing from the result vector.
+
+`rate` should only be used with counters and native histograms where the
+components behave like counters. It is best suited for alerting, and for
+graphing of slow-moving counters.

 Note that when combining `rate()` with an aggregation operator (e.g. `sum()`)
 or a function aggregating over time (any function ending in `_over_time`),
--- a/docs/querying/operators.md
+++ b/docs/querying/operators.md
@ -306,3 +306,31 @@ highest to lowest.
 Operators on the same precedence level are left-associative. For example,
 `2 * 3 % 2` is equivalent to `(2 * 3) % 2`. However `^` is right associative,
 so `2 ^ 3 ^ 2` is equivalent to `2 ^ (3 ^ 2)`.
+
+## Operators for native histograms
+
+Native histograms are an experimental feature. Ingesting native histograms has
+to be enabled via a [feature flag](../feature_flags/#native-histograms). Once
+native histograms have been ingested, they can be queried (even after the
+feature flag has been disabled again). However, the operator support for native
+histograms is still very limited.
+
+Logical/set binary operators work as expected even if histogram samples are
+involved. They only check for the existence of a vector element and don't
+change their behavior depending on the sample type of an element (float or
+histogram).
+
+The binary `+` operator between two native histograms and the `sum` aggregation
+operator to aggregate native histograms are fully supported. Even if the
+histograms involved have different bucket layouts, the buckets are
+automatically converted appropriately so that the operation can be
+performed. (With the currently supported bucket schemas, that's always
+possible.) If either operator has to sum up a mix of histogram samples and
+float samples, the corresponding vector element is removed from the output
+vector entirely.
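
As an illustration outside the patch itself, the behavior of `+` and `sum` on histogram samples can be sketched in Python. The `Hist` class and the function names are hypothetical. Bucket layout conversion is not modeled: buckets are simply keyed by upper bound and a missing bucket counts as empty, which is a simplification.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union


@dataclass
class Hist:
    """Hypothetical stand-in for a native histogram sample."""
    count: float
    sum: float
    # Assumed layout: bucket upper bound -> observations in that bucket.
    buckets: Dict[float, float] = field(default_factory=dict)


Sample = Union[float, Hist]


def add_hist(a: Hist, b: Hist) -> Hist:
    """Component-wise addition of two histograms."""
    bounds = sorted(set(a.buckets) | set(b.buckets))
    return Hist(
        count=a.count + b.count,
        sum=a.sum + b.sum,
        buckets={u: a.buckets.get(u, 0.0) + b.buckets.get(u, 0.0) for u in bounds},
    )


def sum_sketch(samples: List[Sample]) -> Optional[Sample]:
    """Aggregate the samples of one output group, or None if it is dropped."""
    if not samples:
        return None
    if len({isinstance(s, Hist) for s in samples}) > 1:
        return None  # mix of float and histogram samples: element is removed
    if isinstance(samples[0], Hist):
        acc = samples[0]
        for h in samples[1:]:
            acc = add_hist(acc, h)
        return acc
    return float(sum(samples))
```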
+
+All other operators do not behave in a meaningful way. They either treat the
+histogram sample as if it were a float sample of value 0, or (in case of
+arithmetic operations between a scalar and a vector) they leave the histogram
+sample unchanged. This behavior will change to a meaningful one before native
+histograms are a stable feature.