prometheus/promql/promqltest/testdata/name_label_dropping.test
Jorge Creixell e9e3d64b7c
PromQL engine: Delay deletion of __name__ label to the end of the query evaluation (#14477)

  - This change allows optionally preserving the `__name__` label via the `label_replace` and `label_join` functions, and helps prevent the dreaded "vector cannot contain metrics with the same labelset" error.
  - The implementation extends the `Series` and `Sample` structs with a boolean flag indicating whether the `__name__` label should be deleted at the end of the query evaluation.
  - The `label_replace` and `label_join` functions can still access the value of the `__name__` label, even if it has previously been marked for deletion. If `__name__` is used as the target label, it won't be dropped at the end of the query evaluation.
  - Fixes https://github.com/prometheus/prometheus/issues/11397
  - See https://github.com/jcreixell/prometheus/pull/2 for previous discussion, including the decision to create this PR and benchmark it before considering other alternatives (like refactoring `labels.Labels`).
  - See https://github.com/jcreixell/prometheus/pull/1 for an alternative implementation using a special label instead of boolean flags.
  - Note: a feature flag `promql-delayed-name-removal` has been added, because this change alters the behavior of some "weird" queries (see https://github.com/prometheus/prometheus/issues/11397#issuecomment-1451998792)
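
A minimal Go sketch of the flag-based approach described in the bullets above may help. It is not the engine's actual code: the `sample` type, the `dropName` field, and the `markDropName` and `finalize` helpers are hypothetical names chosen for illustration. It assumes that functions only flag `__name__` for removal and that a final pass strips the flagged names and checks for duplicate label sets.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"strings"
)

// sample is a simplified, hypothetical stand-in for a PromQL sample.
type sample struct {
	labels   map[string]string
	value    float64
	dropName bool // true: remove __name__ once the whole query is evaluated
}

var errDuplicate = errors.New("vector cannot contain metrics with the same labelset")

// markDropName mimics a function such as rate(): it keeps __name__ in
// place for now and only flags it for later removal.
func markDropName(in []sample) []sample {
	out := make([]sample, len(in))
	for i, s := range in {
		s.dropName = true
		out[i] = s
	}
	return out
}

// key builds a canonical string for a label set so duplicates can be detected.
func key(ls map[string]string) string {
	names := make([]string, 0, len(ls))
	for n := range ls {
		names = append(names, n)
	}
	sort.Strings(names)
	var b strings.Builder
	for _, n := range names {
		fmt.Fprintf(&b, "%s=%q,", n, ls[n])
	}
	return b.String()
}

// finalize runs at the end of the evaluation: it strips every flagged
// __name__ and fails if two samples end up with the same label set.
func finalize(in []sample) ([]sample, error) {
	seen := map[string]bool{}
	out := make([]sample, 0, len(in))
	for _, s := range in {
		ls := make(map[string]string, len(s.labels))
		for n, v := range s.labels {
			if n == "__name__" && s.dropName {
				continue
			}
			ls[n] = v
		}
		k := key(ls)
		if seen[k] {
			return nil, errDuplicate
		}
		seen[k] = true
		out = append(out, sample{labels: ls, value: s.value})
	}
	return out, nil
}

func main() {
	in := []sample{
		{labels: map[string]string{"__name__": "metric", "env": "1"}, value: 0.2},
		{labels: map[string]string{"__name__": "another_metric", "env": "1"}, value: 0.2},
	}
	// Both series collapse to {env="1"} once the names are stripped.
	_, err := finalize(markDropName(in))
	fmt.Println(err)
}
```

In this sketch the duplicate check happens only in the final pass, so an intermediate function still sees distinct series and can use the name before it disappears.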

Example (this always fails, as `__name__` is being dropped by `count_over_time`):

```
count_over_time({__name__!=""}[1m])

=> Error executing query: vector cannot contain metrics with the same labelset
```

Before:

```
label_replace(count_over_time({__name__!=""}[1m]), "__name__", "count_$1", "__name__", "(.+)")

=> Error executing query: vector cannot contain metrics with the same labelset
```

After:

```
label_replace(count_over_time({__name__!=""}[1m]), "__name__", "count_$1", "__name__", "(.+)")

=>
count_go_gc_cycles_automatic_gc_cycles_total{instance="localhost:9090", job="prometheus"} 1
count_go_gc_cycles_forced_gc_cycles_total{instance="localhost:9090", job="prometheus"} 1
...
```
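
The before/after behavior can be approximated with a small Go sketch. The `sample` type, its `dropName` field, and the `labelReplace` helper are hypothetical and much simpler than the real `label_replace` implementation. The sketch assumes the function can read a `__name__` that is flagged for removal and clears the flag when `__name__` is the target label.

```go
package main

import (
	"fmt"
	"regexp"
)

// sample is a simplified, hypothetical stand-in for a PromQL sample.
type sample struct {
	labels   map[string]string
	dropName bool // true: __name__ is scheduled for removal at the end
}

// labelReplace sketches label_replace(v, dst, repl, src, pattern) for one sample.
// The pattern is anchored so that it must match the whole source value.
func labelReplace(s sample, dst, repl, src, pattern string) (sample, error) {
	re, err := regexp.Compile("^(?:" + pattern + ")$")
	if err != nil {
		return s, err
	}
	out := sample{labels: make(map[string]string, len(s.labels)+1), dropName: s.dropName}
	for n, v := range s.labels {
		out.labels[n] = v
	}
	// The source label is still readable even if it is flagged for removal.
	srcVal := s.labels[src]
	m := re.FindStringSubmatchIndex(srcVal)
	if m == nil {
		return out, nil // no match: sample is returned unchanged
	}
	val := string(re.ExpandString(nil, repl, srcVal, m))
	if val == "" {
		delete(out.labels, dst)
	} else {
		out.labels[dst] = val
	}
	if dst == "__name__" {
		// Writing to __name__ explicitly keeps it in the final result.
		out.dropName = false
	}
	return out, nil
}

func main() {
	s := sample{
		labels:   map[string]string{"__name__": "metric", "env": "1"},
		dropName: true, // as if count_over_time had already run
	}
	out, err := labelReplace(s, "__name__", "count_$1", "__name__", "(.+)")
	if err != nil {
		panic(err)
	}
	fmt.Println(out.labels["__name__"], out.dropName) // count_metric false
}
```

With any other target label, the flag stays set in this sketch, so the original name would still be removed at the end of the evaluation.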

Signed-off-by: Jorge Creixell <jcreixell@gmail.com>

Signed-off-by: Björn Rabenstein <github@rabenste.in>
2024-08-29 15:50:39 +02:00


# Test for __name__ label drop.
load 5m
metric{env="1"} 0 60 120
another_metric{env="1"} 60 120 180
# Does not drop __name__ for vector selector
eval instant at 15m metric{env="1"}
metric{env="1"} 120
# Drops __name__ for unary operators
eval instant at 15m -metric
{env="1"} -120
# Drops __name__ for binary operators
eval instant at 15m metric + another_metric
{env="1"} 300
# Does not drop __name__ for binary comparison operators
eval instant at 15m metric <= another_metric
metric{env="1"} 120
# Drops __name__ for binary comparison operators with "bool" modifier
eval instant at 15m metric <= bool another_metric
{env="1"} 1
# Drops __name__ for vector-scalar operations
eval instant at 15m metric * 2
{env="1"} 240
# Drops __name__ for instant-vector functions
eval instant at 15m clamp(metric, 0, 100)
{env="1"} 100
# Drops __name__ for range-vector functions
eval instant at 15m rate(metric{env="1"}[10m])
{env="1"} 0.2
# Does not drop __name__ for last_over_time function
eval instant at 15m last_over_time(metric{env="1"}[10m])
metric{env="1"} 120
# Drops __name__ for other _over_time functions
eval instant at 15m max_over_time(metric{env="1"}[10m])
{env="1"} 120
# Allows relabeling (to-be-dropped) __name__ via label_replace
eval instant at 15m label_replace(rate({env="1"}[10m]), "my_name", "rate_$1", "__name__", "(.+)")
{my_name="rate_metric", env="1"} 0.2
{my_name="rate_another_metric", env="1"} 0.2
# Allows preserving __name__ via label_replace
eval instant at 15m label_replace(rate({env="1"}[10m]), "__name__", "rate_$1", "__name__", "(.+)")
rate_metric{env="1"} 0.2
rate_another_metric{env="1"} 0.2
# Allows relabeling (to-be-dropped) __name__ via label_join
eval instant at 15m label_join(rate({env="1"}[10m]), "my_name", "_", "__name__")
{my_name="metric", env="1"} 0.2
{my_name="another_metric", env="1"} 0.2
# Allows preserving __name__ via label_join
eval instant at 15m label_join(rate({env="1"}[10m]), "__name__", "_", "__name__", "env")
metric_1{env="1"} 0.2
another_metric_1{env="1"} 0.2
# Does not drop metric names for aggregation operators
eval instant at 15m sum by (__name__, env) (metric{env="1"})
metric{env="1"} 120
# Aggregation operators by __name__ lead to duplicate labelset errors (aggregation is partitioned by not yet removed __name__ label)
# This is an accidental side effect of delayed __name__ label dropping
eval_fail instant at 15m sum by (__name__) (rate({env="1"}[10m]))
# Aggregation operators aggregate metrics with same labelset and to-be-dropped names
# This is an accidental side effect of delayed __name__ label dropping
eval instant at 15m sum(rate({env="1"}[10m])) by (env)
{env="1"} 0.4
# Aggregation operators propagate __name__ label dropping information
eval instant at 15m topk(10, sum by (__name__, env) (metric{env="1"}))
metric{env="1"} 120
eval instant at 15m topk(10, sum by (__name__, env) (rate(metric{env="1"}[10m])))
{env="1"} 0.2