prometheus/docs/migration.md
Jan Fajerski f131cdd4c5
3.0 migration guide (#15099)
* docs: 2 to 3 migration guide

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>

* docs/stability: add 3.0 section

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>

* docs/migration: details on enabling legacy name validation

Signed-off-by: Owen Williams <owen.williams@grafana.com>\

* migration: add log format and `le` normalization

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>

* migration: add new enable_http2 default for remote write

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>

---------

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
Signed-off-by: Owen Williams <owen.williams@grafana.com>
Co-authored-by: Owen Williams <owen.williams@grafana.com>
2024-10-25 12:30:13 +02:00


title sort_rank
Migration 10

Prometheus 3.0 migration guide

In line with our stability promise, the Prometheus 3.0 release contains a number of backwards incompatible changes. This document offers guidance on migrating from Prometheus 2.x to Prometheus 3.0 and newer versions.

Flags

  • The following feature flags have been removed, and the behavior they enabled is now the default in Prometheus v3:

    • promql-at-modifier
    • promql-negative-offset
    • remote-write-receiver
    • new-service-discovery-manager
    • expand-external-labels: Environment variable references ${var} or $var in external label values are replaced according to the values of the current environment variables.
      References to undefined variables are replaced by the empty string. The $ character can be escaped by using $$.
    • no-default-scrape-port: Prometheus v3 will no longer add ports to scrape targets according to the specified scheme. Targets will now appear in labels as configured. If you rely on scrape targets like https://example.com/metrics or http://example.com/metrics being represented as https://example.com/metrics:443 and http://example.com/metrics:80 respectively, add the ports to your target URLs.
    • agent: Instead, use the dedicated --agent CLI flag.

    Prometheus v3 will log a warning if you continue to pass these to --enable-feature.
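The expansion rules for external labels described above can be sketched in a few lines of Python. This is only an illustration of the documented behavior (${var} and $var are replaced, undefined variables become the empty string, $$ yields a literal $). It is not Prometheus's implementation, and the function name expand_external_label is a hypothetical helper.

```python
import re

# Matches "$$", "${NAME}" or "$NAME". The pattern is an assumption made for
# this sketch, not the parser Prometheus uses.
_REFERENCE = re.compile(r"\$(?:(\$)|\{(\w+)\}|(\w+))", re.ASCII)


def expand_external_label(value: str, env: dict) -> str:
    """Expand environment variable references in an external label value."""

    def replace(match: re.Match) -> str:
        if match.group(1):  # "$$" escapes a literal dollar sign
            return "$"
        name = match.group(2) or match.group(3)
        # Undefined variables are replaced by the empty string.
        return env.get(name, "")

    return _REFERENCE.sub(replace, value)


env = {"DC": "eu", "ENV": "prod"}
print(expand_external_label("${DC}-$ENV", env))   # eu-prod
print(expand_external_label("${MISSING}x", env))  # x
print(expand_external_label("cost$$5", env))      # cost$5
```

If your external labels contain literal $ characters, checking them against rules like these before upgrading shows whether they need to be escaped as $$.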

Configuration

  • The scrape job level configuration option scrape_classic_histograms has been renamed to always_scrape_classic_histograms. If you use the --enable-feature=native-histograms feature flag to ingest native histograms and you also want to ingest classic histograms that an endpoint might expose along with native histograms, be sure to add this configuration or change your configuration from the old name.
  • The default of http_config.enable_http2 in remote_write items has been changed to false. In Prometheus v2, the remote write HTTP client defaulted to HTTP/2. To parallelize multiple remote write queues across multiple sockets, it's preferable not to default to HTTP/2. If you prefer to use HTTP/2 for remote write, you must now set http_config.enable_http2: true in your remote_write configuration section.
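If you generate or post-process your configuration programmatically, the scrape_classic_histograms rename can be applied mechanically. The following Python sketch assumes the configuration has already been parsed into a dictionary (for example by a YAML library of your choice). The helper name rename_classic_histogram_option is hypothetical.

```python
import copy

OLD_NAME = "scrape_classic_histograms"
NEW_NAME = "always_scrape_classic_histograms"


def rename_classic_histogram_option(config: dict) -> dict:
    """Return a copy of a parsed config with the v2 option name replaced."""
    migrated = copy.deepcopy(config)
    for job in migrated.get("scrape_configs", []):
        if OLD_NAME in job:
            old_value = job.pop(OLD_NAME)
            # Keep an explicitly configured new-style value if one exists.
            job.setdefault(NEW_NAME, old_value)
    return migrated


v2_config = {
    "scrape_configs": [
        {"job_name": "job1", "scrape_classic_histograms": True},
        {"job_name": "job2"},
    ]
}
print(rename_classic_histogram_option(v2_config))
```

The input dictionary is left untouched, so the old and new configuration can be compared before anything is written back.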

PromQL

  • The . pattern in regular expressions in PromQL now matches newline characters. With this change, a regular expression like .* matches strings that include \n. This applies to matchers in queries and relabel configs. For example, the following regular expressions now match the accompanying strings, whereas in Prometheus v2 these combinations didn't match.
Regex          Additional matches
".*"           "foo\n", "Foo\nBar"
"foo.?bar"     "foo\nbar"
"foo.+bar"     "foo\nbar"

If you want Prometheus v3 to behave like v2 did, you will have to change your regular expressions by replacing all . patterns with [^\n], e.g. foo[^\n]*.
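The difference can be reproduced outside Prometheus. The sketch below uses Python's re module as a stand-in for the regular expression engine Prometheus uses, so it is an approximation: it shows only the dot-matches-newline behavior on fully anchored patterns. The helper names matches_v2 and matches_v3 are hypothetical.

```python
import re


def matches_v2(pattern: str, value: str) -> bool:
    """Fully anchored match where . does NOT match a newline (v2-like)."""
    return re.fullmatch(pattern, value) is not None


def matches_v3(pattern: str, value: str) -> bool:
    """Fully anchored match where . also matches a newline (v3-like)."""
    return re.fullmatch(pattern, value, flags=re.DOTALL) is not None


for pattern, value in [(".*", "foo\n"), ("foo.?bar", "foo\nbar"), ("foo.+bar", "foo\nbar")]:
    print(repr(pattern), repr(value), matches_v2(pattern, value), matches_v3(pattern, value))

# Replacing . with [^\n] restores the v2 result under v3-like matching.
print(matches_v3(r"foo[^\n]*", "foo\nbar"))  # False
print(matches_v3(r"foo[^\n]*", "foobar"))    # True
```

Each of the three pairs in the loop prints False for the v2-like match and True for the v3-like match.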

  • Lookback and range selectors are now left open and right closed (previously left closed and right closed). This change affects queries when the evaluation time aligns exactly with the sample timestamps. For example, assume you query a time series with evenly spaced samples exactly 1 minute apart. Before Prometheus 3.x, a range query with 5m would usually return 5 samples, but if the query evaluation aligned exactly with a scrape, it would return 6 samples. In Prometheus 3.x, queries like this always return 5 samples. This change will likely have few effects on everyday use, except for some subquery use cases. Query front-ends that align queries usually align subqueries to multiples of the step size, so these subqueries will likely be affected. Tests are more likely to be affected. To fix them, either adjust the expected number of samples or extend the range by less than one sample interval.
  • The holt_winters function has been renamed to double_exponential_smoothing and is now guarded by the promql-experimental-functions feature flag. If you want to keep using holt_winters, you have to do both of these things:
    • Rename holt_winters to double_exponential_smoothing in your queries.
    • Pass --enable-feature=promql-experimental-functions in your Prometheus CLI invocation.
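The left-open, right-closed range selector semantics described earlier in this section can be checked with a small calculation. The Python sketch below models samples as plain timestamps in seconds and compares both interval definitions. The function names are hypothetical, and the code models only the boundary rule, not the PromQL engine.

```python
def select_closed_closed(timestamps, eval_time, range_seconds):
    """Prometheus v2 rule: samples in [eval_time - range, eval_time]."""
    start = eval_time - range_seconds
    return [t for t in timestamps if start <= t <= eval_time]


def select_open_closed(timestamps, eval_time, range_seconds):
    """Prometheus v3 rule: samples in (eval_time - range, eval_time]."""
    start = eval_time - range_seconds
    return [t for t in timestamps if start < t <= eval_time]


# Samples exactly one minute apart: 0s, 60s, ..., 900s.
samples = list(range(0, 901, 60))

# Evaluation aligned with a sample timestamp: v2 sees 6 samples, v3 sees 5.
print(len(select_closed_closed(samples, 600, 300)))  # 6
print(len(select_open_closed(samples, 600, 300)))    # 5

# Evaluation not aligned with a sample timestamp: both rules see 5 samples.
print(len(select_closed_closed(samples, 630, 300)))  # 5
print(len(select_open_closed(samples, 630, 300)))    # 5
```

Only the aligned case differs, which is why tests with synthetic, perfectly aligned samples are the most likely place to notice the change.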

Scrape protocols

Prometheus v3 is stricter concerning the Content-Type header received when scraping. Prometheus v2 would default to the standard Prometheus text protocol if the target being scraped did not specify a Content-Type header or if the header was unparsable or unrecognised. This could lead to incorrect data being parsed in the scrape. Prometheus v3 will now fail the scrape in such cases.

If a scrape target does not provide a correct Content-Type header, the fallback protocol can be specified using the fallback_scrape_protocol parameter. See Prometheus scrape_config documentation.

This is a breaking change as scrapes that may have succeeded with Prometheus v2 may now fail if this fallback protocol is not specified.
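The decision described above can be pictured as follows. This Python sketch is a simplified model and not the Prometheus scrape code: the media type table, the protocol family labels, the ScrapeError class and the select_protocol function are all assumptions made for illustration, and real content negotiation also inspects parameters such as the version.

```python
from typing import Optional

# Simplified, assumed mapping from media type to a protocol family label.
KNOWN_MEDIA_TYPES = {
    "text/plain": "PrometheusText",
    "application/openmetrics-text": "OpenMetricsText",
    "application/vnd.google.protobuf": "PrometheusProto",
}


class ScrapeError(Exception):
    """Raised when no protocol can be determined for a scrape response."""


def select_protocol(content_type: Optional[str], fallback: Optional[str] = None) -> str:
    """Pick a parser for a scrape response in the v3 style.

    A missing or unrecognised Content-Type fails the scrape unless a
    fallback protocol has been configured.
    """
    media_type = (content_type or "").split(";", 1)[0].strip().lower()
    protocol = KNOWN_MEDIA_TYPES.get(media_type)
    if protocol is not None:
        return protocol
    if fallback is not None:
        return fallback
    raise ScrapeError(f"unsupported Content-Type {content_type!r} and no fallback configured")


print(select_protocol("text/plain; version=0.0.4"))
# The fallback value here is an illustrative string.
print(select_protocol(None, fallback="PrometheusText0.0.4"))
```

In this model, a target that sends no Content-Type header raises ScrapeError unless a fallback is passed, which mirrors the scrape failure described above.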

Miscellaneous

TSDB format and downgrade

The TSDB format has been changed in Prometheus v2.55 in preparation for changes to the index format. Consequently, a Prometheus v3 TSDB can only be read by Prometheus v2.55 or newer. Before upgrading to Prometheus v3, please upgrade to v2.55 first and confirm that Prometheus works as expected. Only then continue with the upgrade to v3.

TSDB Storage contract

TSDB compatible storage is now expected to return results matching the specified selectors. This might impact some third party implementations, most likely implementing remote_read. This contract is not explicitly enforced, but can cause undefined behavior.

UTF-8 names

Prometheus v3 supports UTF-8 in metric and label names. This means metric and label names can change after upgrading according to what is exposed by endpoints. Furthermore, metric and label names that would have previously been flagged as invalid no longer will be.

Users wishing to preserve the original validation behavior can update their Prometheus YAML configuration to specify the legacy validation scheme:

global:
  metric_name_validation_scheme: legacy

Or on a per-scrape basis:

scrape_configs:
  - job_name: job1
    metric_name_validation_scheme: utf8
  - job_name: job2
    metric_name_validation_scheme: legacy
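To get a feeling for which names are affected, the two validation schemes can be approximated in Python. The legacy check below uses the classic metric name pattern [a-zA-Z_:][a-zA-Z0-9_:]*. The UTF-8 check accepts any non-empty string that encodes as valid UTF-8. Both functions are hypothetical helpers for illustration, not the validation code Prometheus runs.

```python
import re

_LEGACY_METRIC_NAME = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")


def is_valid_legacy_metric_name(name: str) -> bool:
    """Approximation of the legacy scheme for metric names."""
    return _LEGACY_METRIC_NAME.fullmatch(name) is not None


def is_valid_utf8_metric_name(name: str) -> bool:
    """Approximation of the utf8 scheme: non-empty, valid UTF-8."""
    if not name:
        return False
    try:
        name.encode("utf-8")
    except UnicodeEncodeError:
        return False
    return True


for name in ["http_requests_total", "http.requests.total", "温度"]:
    print(name, is_valid_legacy_metric_name(name), is_valid_utf8_metric_name(name))
```

Names such as http.requests.total pass only the UTF-8 check, so they are the ones that start being ingested under their exposed name after the upgrade unless the legacy scheme is configured.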

Log message format

Prometheus v3 has adopted log/slog over the previous go-kit/log. This results in a change of log message format. An example of the old log format is:

ts=2024-10-23T22:01:06.074Z caller=main.go:627 level=info msg="No time or size retention was set so using the default time retention" duration=15d
ts=2024-10-23T22:01:06.074Z caller=main.go:671 level=info msg="Starting Prometheus Server" mode=server version="(version=, branch=, revision=91d80252c3e528728b0f88d254dd720f6be07cb8-modified)"
ts=2024-10-23T22:01:06.074Z caller=main.go:676 level=info build_context="(go=go1.23.0, platform=linux/amd64, user=, date=, tags=unknown)"
ts=2024-10-23T22:01:06.074Z caller=main.go:677 level=info host_details="(Linux 5.15.0-124-generic #134-Ubuntu SMP Fri Sep 27 20:20:17 UTC 2024 x86_64 gigafips (none))"

A similar sequence in the new log format looks like this:

time=2024-10-24T00:03:07.542+02:00 level=INFO source=/home/user/go/src/github.com/prometheus/prometheus/cmd/prometheus/main.go:640 msg="No time or size retention was set so using the default time retention" duration=15d
time=2024-10-24T00:03:07.542+02:00 level=INFO source=/home/user/go/src/github.com/prometheus/prometheus/cmd/prometheus/main.go:681 msg="Starting Prometheus Server" mode=server version="(version=, branch=, revision=7c7116fea8343795cae6da42960cacd0207a2af8)"
time=2024-10-24T00:03:07.542+02:00 level=INFO source=/home/user/go/src/github.com/prometheus/prometheus/cmd/prometheus/main.go:686 msg="operational information" build_context="(go=go1.23.0, platform=linux/amd64, user=, date=, tags=unknown)" host_details="(Linux 5.15.0-124-generic #134-Ubuntu SMP Fri Sep 27 20:20:17 UTC 2024 x86_64 gigafips (none))" fd_limits="(soft=1048576, hard=1048576)" vm_limits="(soft=unlimited, hard=unlimited)"
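If you parse Prometheus logs in a pipeline, the key differences visible in the examples above are ts becoming time, caller becoming source, and the level value being upper case. The Python sketch below reads a key=value line in either format and normalizes it to the new field names with a lower-case level. The function parse_log_line is a hypothetical helper, and the shlex-based tokenizing assumes values are either unquoted or wrapped in double quotes, as in the examples.

```python
import shlex

# Old key name -> new key name, as seen in the example log lines.
_RENAMED_KEYS = {"ts": "time", "caller": "source"}


def parse_log_line(line: str) -> dict:
    """Parse a key=value log line and normalize v2 field names to v3 ones."""
    fields = {}
    for token in shlex.split(line):
        key, separator, value = token.partition("=")
        if not separator:
            continue  # skip tokens that are not key=value pairs
        key = _RENAMED_KEYS.get(key, key)
        if key == "level":
            value = value.lower()  # v2 logs "info", v3 logs "INFO"
        fields[key] = value
    return fields


old_line = 'ts=2024-10-23T22:01:06.074Z caller=main.go:627 level=info msg="Starting Prometheus Server" mode=server'
new_line = 'time=2024-10-24T00:03:07.542+02:00 level=INFO source=/home/user/main.go:681 msg="Starting Prometheus Server" mode=server'

print(parse_log_line(old_line))
print(parse_log_line(new_line))
```

Both lines end up with the same set of keys, so downstream filters can be written once against the normalized names.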

le and quantile label values

In Prometheus v3, the values of the le label of classic histograms and the quantile label of summaries are normalized upon ingestion. In Prometheus v2, the value of these labels depended on the scrape protocol (protobuf vs. text format) in some situations. This led to label values changing based on the scrape protocol. E.g. a metric exposed as my_classic_hist{le="1"} would be ingested as my_classic_hist{le="1"} via the text format, but as my_classic_hist{le="1.0"} via protobuf. This changed the identity of the metric and caused problems when querying the metric. In Prometheus v3, these label values will always be normalized to a float-like representation. I.e. the above example will always result in my_classic_hist{le="1.0"} being ingested into Prometheus, no matter via which protocol. The effect of this change is that alerts, recording rules and dashboards that directly reference label values as whole numbers such as le="1" will stop working.
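As a rough illustration of the normalization, the Python sketch below turns whole-number label values into a float-like form and rewrites simple equality matchers in a query string. Both helpers (normalize_bucket_label and rewrite_integer_matchers) are hypothetical. The formatting is a simplified approximation that may differ from Prometheus for very large or very small magnitudes, and the rewrite only handles plain le="N" and quantile="N" matchers, not regex matchers.

```python
import math
import re


def normalize_bucket_label(value: str) -> str:
    """Approximate the float-like form of an le or quantile label value."""
    number = float(value)
    if math.isnan(number):
        return "NaN"
    if math.isinf(number):
        return "+Inf" if number > 0 else "-Inf"
    # repr() of a Python float always contains a "." or an exponent.
    return repr(number)


_INTEGER_MATCHER = re.compile(r'\b(le|quantile)="(\d+)"')


def rewrite_integer_matchers(expr: str) -> str:
    """Rewrite le="1" style matchers to le="1.0" in a query string."""
    return _INTEGER_MATCHER.sub(r'\1="\2.0"', expr)


print(normalize_bucket_label("1"))     # 1.0
print(normalize_bucket_label("0.5"))   # 0.5
print(normalize_bucket_label("+Inf"))  # +Inf
print(rewrite_integer_matchers('my_classic_hist{le="1"}'))  # my_classic_hist{le="1.0"}
```

A rewrite like this can serve as a first pass over rule files and dashboards, but the results should be reviewed by hand.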

Ways to deal with this change either globally or on a per metric basis:

  • Fix references to integer le and quantile label values, but otherwise do nothing and accept that some queries that span the transition time will produce inaccurate or unexpected results. This is the recommended solution.
  • Use metric_relabel_configs to retain the old labels when scraping targets. This should only be applied to metrics that currently produce such labels.
    metric_relabel_configs:
      - source_labels:
          - quantile
        target_label: quantile
        regex: (\d+)\.0+
      - source_labels:
          - le
          - __name__
        target_label: le
        regex: (\d+)\.0+;.*_bucket
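To see what these two relabeling rules do to individual series, the following Python sketch emulates a replace action with the default replacement of the first capture group. It assumes the usual relabeling behavior of joining source label values with ; and matching the regular expression against the whole joined string. The function apply_replace is a hypothetical helper and uses Python's re module rather than Prometheus's regex engine.

```python
import re


def apply_replace(labels: dict, source_labels: list, target_label: str, regex: str) -> dict:
    """Emulate a relabel 'replace' action whose replacement is capture group 1."""
    joined = ";".join(labels.get(name, "") for name in source_labels)
    match = re.fullmatch(regex, joined)
    updated = dict(labels)
    if match is not None:
        updated[target_label] = match.group(1)
    return updated


bucket = {"__name__": "my_classic_hist_bucket", "le": "1.0"}
print(apply_replace(bucket, ["le", "__name__"], "le", r"(\d+)\.0+;.*_bucket"))

fractional = {"__name__": "my_classic_hist_bucket", "le": "0.5"}
print(apply_replace(fractional, ["le", "__name__"], "le", r"(\d+)\.0+;.*_bucket"))

summary = {"__name__": "my_summary", "quantile": "1.0"}
print(apply_replace(summary, ["quantile"], "quantile", r"(\d+)\.0+"))
```

Only whole-number values such as 1.0 are rewritten back to 1. Fractional values and +Inf do not match the pattern and are left unchanged.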

Prometheus 2.0 migration guide

For the Prometheus 1.8 to 2.0 migration, please refer to the Prometheus v2.55 documentation.