PromQL engine: Delay deletion of __name__ label to the end of the query evaluation
- This change allows optionally preserving the `__name__` label via the `label_replace` and `label_join` functions, and helps prevent the dreaded "vector cannot contain metrics with the same labelset" error.
- The implementation extends the `Series` and `Sample` structs with a boolean flag indicating whether the `__name__` label should be deleted at the end of the query evaluation.
- The `label_replace` and `label_join` functions can still access the value of the `__name__` label, even if it has been previously marked for deletion. If `__name__` is used as target label, it won't be dropped at the end of the query evaluation.
- Fixes https://github.com/prometheus/prometheus/issues/11397
- See https://github.com/jcreixell/prometheus/pull/2 for previous discussion, including the decision to create this PR and benchmark it before considering other alternatives (like refactoring `labels.Labels`).
- See https://github.com/jcreixell/prometheus/pull/1 for an alternative implementation using a special label instead of boolean flags.
- Note: a feature flag `promql-delayed-name-removal` has been added as it changes the behavior of some "weird" queries (see https://github.com/prometheus/prometheus/issues/11397#issuecomment-1451998792)
Example (this always fails, as `__name__` is being dropped by `count_over_time`):
```
count_over_time({__name__!=""}[1m])
=> Error executing query: vector cannot contain metrics with the same labelset
```
Before:
```
label_replace(count_over_time({__name__!=""}[1m]), "__name__", "count_$1", "__name__", "(.+)")
=> Error executing query: vector cannot contain metrics with the same labelset
```
After:
```
label_replace(count_over_time({__name__!=""}[1m]), "__name__", "count_$1", "__name__", "(.+)")
=>
count_go_gc_cycles_automatic_gc_cycles_total{instance="localhost:9090", job="prometheus"} 1
count_go_gc_cycles_forced_gc_cycles_total{instance="localhost:9090", job="prometheus"} 1
...
```
Signed-off-by: Jorge Creixell <jcreixell@gmail.com>
---------
Signed-off-by: Jorge Creixell <jcreixell@gmail.com>
Signed-off-by: Björn Rabenstein <github@rabenste.in>
The RunBuiltinTests function accepts a concrete type which makes
it hard to exclude certain tests from the suite. It would be great
if we could skip tests which might not be critical in order to unblock
updates.
By accepting an interface instead, we can inject a custom implementation
which would skips select test cases.
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
Add ability to assert that a query fails with a particular error message
This also adds documentation for the test scripting language in general,
including the new feature.
Signed-off-by: Charles Korn <charles.korn@grafana.com>
---------
Signed-off-by: Charles Korn <charles.korn@grafana.com>