prometheus/docs/migration.md
Jan Fajerski f131cdd4c5
Some checks are pending
buf.build / lint and publish (push) Waiting to run
CI / Go tests (push) Waiting to run
CI / More Go tests (push) Waiting to run
CI / Go tests with previous Go version (push) Waiting to run
CI / UI tests (push) Waiting to run
CI / Go tests on Windows (push) Waiting to run
CI / Mixins tests (push) Waiting to run
CI / Build Prometheus for common architectures (0) (push) Waiting to run
CI / Build Prometheus for common architectures (1) (push) Waiting to run
CI / Build Prometheus for common architectures (2) (push) Waiting to run
CI / Build Prometheus for all architectures (0) (push) Waiting to run
CI / Build Prometheus for all architectures (1) (push) Waiting to run
CI / Build Prometheus for all architectures (10) (push) Waiting to run
CI / Build Prometheus for all architectures (11) (push) Waiting to run
CI / Build Prometheus for all architectures (2) (push) Waiting to run
CI / Build Prometheus for all architectures (3) (push) Waiting to run
CI / Build Prometheus for all architectures (4) (push) Waiting to run
CI / Build Prometheus for all architectures (5) (push) Waiting to run
CI / Build Prometheus for all architectures (6) (push) Waiting to run
CI / Build Prometheus for all architectures (7) (push) Waiting to run
CI / Build Prometheus for all architectures (8) (push) Waiting to run
CI / Build Prometheus for all architectures (9) (push) Waiting to run
CI / Report status of build Prometheus for all architectures (push) Blocked by required conditions
CI / Check generated parser (push) Waiting to run
CI / golangci-lint (push) Waiting to run
CI / fuzzing (push) Waiting to run
CI / codeql (push) Waiting to run
CI / Publish main branch artifacts (push) Blocked by required conditions
CI / Publish release artefacts (push) Blocked by required conditions
CI / Publish UI on npm Registry (push) Blocked by required conditions
Scorecards supply-chain security / Scorecards analysis (push) Waiting to run
3.0 migration guide (#15099)
* docs: 2 to 3 migration guide

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>

* docs/stability: add 3.0 section

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>

* docs/migration: details on enabling legacy name validation

Signed-off-by: Owen Williams <owen.williams@grafana.com>\

* migration: add log format and `le` normalization

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>

* migration: add new enable_http2 default for remote write

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>

---------

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
Signed-off-by: Owen Williams <owen.williams@grafana.com>
Co-authored-by: Owen Williams <owen.williams@grafana.com>
2024-10-25 12:30:13 +02:00

201 lines
10 KiB
Markdown

---
title: Migration
sort_rank: 10
---
# Prometheus 3.0 migration guide
In line with our [stability promise](https://prometheus.io/docs/prometheus/latest/stability/),
the Prometheus 3.0 release contains a number of backwards incompatible changes.
This document offers guidance on migrating from Prometheus 2.x to Prometheus 3.0 and newer versions.
## Flags
- The following feature flags have been removed and they have been added to the
default behavior of Prometheus v3:
- `promql-at-modifier`
- `promql-negative-offset`
- `remote-write-receiver`
- `new-service-discovery-manager`
- `expand-external-labels`
Environment variable references `${var}` or `$var` in external label values
are replaced according to the values of the current environment variables.
References to undefined variables are replaced by the empty string.
The `$` character can be escaped by using `$$`.
- `no-default-scrape-port`
Prometheus v3 will no longer add ports to scrape targets according to the
specified scheme. Target will now appear in labels as configured.
If you rely on scrape targets like
`https://example.com/metrics` or `http://exmaple.com/metrics` to be
represented as `https://example.com/metrics:443` and
`http://example.com/metrics:80` respectively, add them to your target URLs
- `agent`
Instead use the dedicated `--agent` cli flag.
Prometheus v3 will log a warning if you continue to pass these to
`--enable-feature`.
## Configuration
- The scrape job level configuration option `scrape_classic_histograms` has been
renamed to `always_scrape_classic_histograms`. If you use the
`--enable-feature=native-histograms` feature flag to ingest native histograms
and you also want to ingest classic histograms that an endpoint might expose
along with native histograms, be sure to add this configuration or change your
configuration from the old name.
- The `http_config.enable_http2` in `remote_write` items default has been
changed to `false`. In Prometheus v2 the remote write http client would
default to use http2. In order to parallelize multiple remote write queues
across multiple sockets its preferable to not default to http2.
If you prefer to use http2 for remote write you must now set
`http_config.enable_http2: true` in your `remote_write` configuration section.
## PromQL
- The `.` pattern in regular expressions in PromQL matches newline characters.
With this change a regular expressions like `.*` matches strings that include
`\n`. This applies to matchers in queries and relabel configs. For example the
following regular expressions now match the accompanying strings, wheras in
Prometheus v2 these combinations didn't match.
| Regex | Additional matches |
| ----- | ------ |
| ".*" | "foo\n", "Foo\nBar" |
| "foo.?bar" | "foo\nbar" |
| "foo.+bar" | "foo\nbar" |
If you want Prometheus v3 to behave like v2 did, you will have to change your
regular expressions by replacing all `.` patterns with `[^\n]`, e.g.
`foo[^\n]*`.
- Lookback and range selectors are left open and right closed (previously left
closed and right closed). This change affects queries when the evaluation time
perfectly aligns with the sample timestamps. For example assume querying a
timeseries with even spaced samples exactly 1 minute apart. Before Prometheus
3.x, range query with `5m` will mostly return 5 samples. But if the query
evaluation aligns perfectly with a scrape, it would return 6 samples. In
Prometheus 3.x queries like this will always return 5 samples.
This change has likely few effects for everyday use, except for some sub query
use cases.
Query front-ends that align queries usually align sub-queries to multiples of
the step size. These sub queries will likely be affected.
Tests are more likely to affected. To fix those either adjust the expected
number of samples or extend to range by less then one sample interval.
- The `holt_winters` function has been renamed to `double_exponential_smoothing`
and is now guarded by the `promql-experimental-functions` feature flag.
If you want to keep using holt_winters, you have to do both of these things:
- Rename holt_winters to double_exponential_smoothing in your queries.
- Pass `--enable-feature=promql-experimental-functions` in your Prometheus
cli invocation..
## Scrape protocols
Prometheus v3 is more strict concerning the Content-Type header received when
scraping. Prometheus v2 would default to the standard Prometheus text protocol
if the target being scraped did not specify a Content-Type header or if the
header was unparsable or unrecognised. This could lead to incorrect data being
parsed in the scrape. Prometheus v3 will now fail the scrape in such cases.
If a scrape target is not providing the correct Content-Type header the
fallback protocol can be specified using the fallback_scrape_protocol
parameter. See [Prometheus scrape_config documentation.](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config)
This is a breaking change as scrapes that may have succeeded with Prometheus v2
may now fail if this fallback protocol is not specified.
## Miscellaneous
### TSDB format and downgrade
The TSDB format has been changed in Prometheus v2.55 in preparation for changes
to the index format. Consequently a Prometheus v3 tsdb can only be read by a
Prometheus v2.55 or newer.
Before upgrading to Prometheus v3 please upgrade to v2.55 first and confirm
Prometheus works as expected. Only then continue with the upgrade to v3.
### TSDB Storage contract
TSDB compatible storage is now expected to return results matching the specified
selectors. This might impact some third party implementations, most likely
implementing `remote_read`.
This contract is not explicitly enforced, but can cause undefined behavior.
### UTF-8 names
Prometheus v3 supports UTF-8 in metric and label names. This means metric and
label names can change after upgrading according to what is exposed by
endpoints. Furthermore, metric and label names that would have previously been
flagged as invalid no longer will be.
Users wishing to preserve the original validation behavior can update their
prometheus yaml configuration to specify the legacy validation scheme:
```
global:
metric_name_validation_scheme: legacy
```
Or on a per-scrape basis:
```
scrape_configs:
- job_name: job1
metric_name_validation_scheme: utf8
- job_name: job2
metric_name_validation_scheme: legacy
```
### Log message format
Prometheus v3 has adopted `log/slog` over the previous `go-kit/log`. This
results in a change of log message format. An example of the old log format is:
```
ts=2024-10-23T22:01:06.074Z caller=main.go:627 level=info msg="No time or size retention was set so using the default time retention" duration=15d
ts=2024-10-23T22:01:06.074Z caller=main.go:671 level=info msg="Starting Prometheus Server" mode=server version="(version=, branch=, revision=91d80252c3e528728b0f88d254dd720f6be07cb8-modified)"
ts=2024-10-23T22:01:06.074Z caller=main.go:676 level=info build_context="(go=go1.23.0, platform=linux/amd64, user=, date=, tags=unknown)"
ts=2024-10-23T22:01:06.074Z caller=main.go:677 level=info host_details="(Linux 5.15.0-124-generic #134-Ubuntu SMP Fri Sep 27 20:20:17 UTC 2024 x86_64 gigafips (none))"
```
a similar sequence in the new log format looks like this:
```
time=2024-10-24T00:03:07.542+02:00 level=INFO source=/home/user/go/src/github.com/prometheus/prometheus/cmd/prometheus/main.go:640 msg="No time or size retention was set so using the default time retention" duration=15d
time=2024-10-24T00:03:07.542+02:00 level=INFO source=/home/user/go/src/github.com/prometheus/prometheus/cmd/prometheus/main.go:681 msg="Starting Prometheus Server" mode=server version="(version=, branch=, revision=7c7116fea8343795cae6da42960cacd0207a2af8)"
time=2024-10-24T00:03:07.542+02:00 level=INFO source=/home/user/go/src/github.com/prometheus/prometheus/cmd/prometheus/main.go:686 msg="operational information" build_context="(go=go1.23.0, platform=linux/amd64, user=, date=, tags=unknown)" host_details="(Linux 5.15.0-124-generic #134-Ubuntu SMP Fri Sep 27 20:20:17 UTC 2024 x86_64 gigafips (none))" fd_limits="(soft=1048576, hard=1048576)" vm_limits="(soft=unlimited, hard=unlimited)"
```
### `le` and `quantile` label values
In Prometheus v3, the values of the `le` label of classic histograms and the
`quantile` label of summaries are normalized upon ingestions. In Prometheus v2
the value of these labels depended on the scrape protocol (protobuf vs text
format) in some situations. This led to label values changing based on the
scrape protocol. E.g. a metric exposed as `my_classic_hist{le="1"}` would be
ingested as `my_classic_hist{le="1"}` via the text format, but as
`my_classic_hist{le="1.0"}` via protobuf. This changed the identity of the
metric and caused problems when querying the metric.
In Prometheus v3 these label values will always be normalized to a float like
representation. I.e. the above example will always result in
`my_classic_hist{le="1.0"}` being ingested into prometheus, no matter via which
protocol. The effect of this change is that alerts, recording rules and
dashboards that directly reference label values as whole numbers such as
`le="1"` will stop working.
Ways to deal with this change either globally or on a per metric basis:
- Fix references to integer `le`, `quantile` label values, but otherwise do
nothing and accept that some queries that span the transition time will produce
inaccurate or unexpected results.
_This is the recommended solution._
- Use `metric_relabel_config` to retain the old labels when scraping targets.
This should **only** be applied to metrics that currently produce such labels.
```yaml
metric_relabel_configs:
- source_labels:
- quantile
target_label: quantile
regex: (\d+)\.0+
- source_labels:
- le
- __name__
target_label: le
regex: (\d+)\.0+;.*_bucket
```
# Prometheus 2.0 migration guide
For the Prometheus 1.8 to 2.0 please refer to the [Prometheus v2.55 documentation](https://prometheus.io/docs/prometheus/2.55/migration/).