Mirror of https://github.com/prometheus/prometheus.git (synced 2025-01-19 01:30:53 -08:00), commit e9e3d64b7c
PromQL engine: Delay deletion of __name__ label to the end of the query evaluation

- This change allows optionally preserving the `__name__` label via the `label_replace` and `label_join` functions, and helps prevent the dreaded "vector cannot contain metrics with the same labelset" error.
- The implementation extends the `Series` and `Sample` structs with a boolean flag indicating whether the `__name__` label should be deleted at the end of the query evaluation.
- The `label_replace` and `label_join` functions can still access the value of the `__name__` label, even if it has been previously marked for deletion. If `__name__` is used as target label, it won't be dropped at the end of the query evaluation.
- Fixes https://github.com/prometheus/prometheus/issues/11397
- See https://github.com/jcreixell/prometheus/pull/2 for previous discussion, including the decision to create this PR and benchmark it before considering other alternatives (like refactoring `labels.Labels`).
- See https://github.com/jcreixell/prometheus/pull/1 for an alternative implementation using a special label instead of boolean flags.
- Note: a feature flag `promql-delayed-name-removal` has been added as it changes the behavior of some "weird" queries (see https://github.com/prometheus/prometheus/issues/11397#issuecomment-1451998792)

Example (this always fails, as `__name__` is being dropped by `count_over_time`):

```
count_over_time({__name__!=""}[1m])
=> Error executing query: vector cannot contain metrics with the same labelset
```

Before:

```
label_replace(count_over_time({__name__!=""}[1m]), "__name__", "count_$1", "__name__", "(.+)")
=> Error executing query: vector cannot contain metrics with the same labelset
```

After:

```
label_replace(count_over_time({__name__!=""}[1m]), "__name__", "count_$1", "__name__", "(.+)")
=>
count_go_gc_cycles_automatic_gc_cycles_total{instance="localhost:9090", job="prometheus"} 1
count_go_gc_cycles_forced_gc_cycles_total{instance="localhost:9090", job="prometheus"} 1
...
```

Signed-off-by: Jorge Creixell <jcreixell@gmail.com>

---------

Signed-off-by: Jorge Creixell <jcreixell@gmail.com>
Signed-off-by: Björn Rabenstein <github@rabenste.in>
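
The delayed-deletion idea described in the commit message can be illustrated with a small Go sketch. This is not the engine's code: the `Sample` type, the `DropName` field name, and the helpers `markName`, `prefixName`, and `finalize` are all hypothetical stand-ins chosen for illustration. It shows only the core mechanism, assuming a per-sample boolean that is set by name-dropping functions, cleared when `__name__` is written as a target label, and applied once at the end of evaluation.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"strings"
)

const nameLabel = "__name__"

// Sample is a simplified, hypothetical stand-in for the engine's sample type.
type Sample struct {
	Labels   map[string]string
	Value    float64
	DropName bool // assumed flag: remove __name__ when evaluation finishes
}

// markName mimics a function such as count_over_time: instead of deleting
// __name__ right away, it only marks the sample for later deletion.
func markName(s Sample) Sample {
	s.DropName = true
	return s
}

// prefixName mimics label_replace with __name__ as the target label: it can
// still read the marked name, writes a new one, and clears the flag.
func prefixName(s Sample, prefix string) Sample {
	labels := make(map[string]string, len(s.Labels))
	for k, v := range s.Labels {
		labels[k] = v
	}
	labels[nameLabel] = prefix + s.Labels[nameLabel]
	return Sample{Labels: labels, Value: s.Value, DropName: false}
}

// finalize applies the deferred deletions and rejects duplicate label sets.
func finalize(samples []Sample) ([]string, error) {
	seen := make(map[string]bool, len(samples))
	out := make([]string, 0, len(samples))
	for _, s := range samples {
		pairs := make([]string, 0, len(s.Labels))
		for k, v := range s.Labels {
			if k == nameLabel && s.DropName {
				continue // deferred deletion happens here, at the very end
			}
			pairs = append(pairs, fmt.Sprintf("%s=%q", k, v))
		}
		sort.Strings(pairs)
		key := "{" + strings.Join(pairs, ", ") + "}"
		if seen[key] {
			return nil, errors.New("vector cannot contain metrics with the same labelset")
		}
		seen[key] = true
		out = append(out, key)
	}
	return out, nil
}

func main() {
	a := markName(Sample{Labels: map[string]string{nameLabel: "a_total", "job": "prometheus"}, Value: 1})
	b := markName(Sample{Labels: map[string]string{nameLabel: "b_total", "job": "prometheus"}, Value: 1})

	// Both names are dropped at the end, so the label sets collide.
	_, err := finalize([]Sample{a, b})
	fmt.Println("without rename:", err)

	// Renaming before the end keeps the series distinct.
	out, err := finalize([]Sample{prefixName(a, "count_"), prefixName(b, "count_")})
	fmt.Println("with rename:", out, err)
}
```

The sketch keeps the flag next to the labels rather than encoding it as a special label, which matches the boolean-flag approach the commit message describes in contrast to the alternative in jcreixell/prometheus pull 1.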
title |
---|
prometheus |
prometheus
The Prometheus monitoring server
Flags
Flag | Description | Default |
---|---|---|
`-h`, `--help` | Show context-sensitive help (also try --help-long and --help-man). | |
`--version` | Show application version. | |
`--config.file` | Prometheus configuration file path. | `prometheus.yml` |
`--web.listen-address ...` | Address to listen on for UI, API, and telemetry. Can be repeated. | `0.0.0.0:9090` |
`--auto-gomemlimit.ratio` | The ratio of reserved GOMEMLIMIT memory to the detected maximum container or system memory. | `0.9` |
`--web.config.file` | [EXPERIMENTAL] Path to configuration file that can enable TLS or authentication. | |
`--web.read-timeout` | Maximum duration before timing out read of the request, and closing idle connections. | `5m` |
`--web.max-connections` | Maximum number of simultaneous connections across all listeners. | `512` |
`--web.external-url` | The URL under which Prometheus is externally reachable (for example, if Prometheus is served via a reverse proxy). Used for generating relative and absolute links back to Prometheus itself. If the URL has a path portion, it will be used to prefix all HTTP endpoints served by Prometheus. If omitted, relevant URL components will be derived automatically. | |
`--web.route-prefix` | Prefix for the internal routes of web endpoints. Defaults to path of --web.external-url. | |
`--web.user-assets` | Path to static asset directory, available at /user. | |
`--web.enable-lifecycle` | Enable shutdown and reload via HTTP request. | `false` |
`--web.enable-admin-api` | Enable API endpoints for admin control actions. | `false` |
`--web.enable-remote-write-receiver` | Enable API endpoint accepting remote write requests. | `false` |
`--web.remote-write-receiver.accepted-protobuf-messages` | List of the remote write protobuf messages to accept when receiving the remote writes. Supported values: prometheus.WriteRequest, io.prometheus.write.v2.Request | `prometheus.WriteRequest` |
`--web.console.templates` | Path to the console template directory, available at /consoles. | `consoles` |
`--web.console.libraries` | Path to the console library directory. | `console_libraries` |
`--web.page-title` | Document title of Prometheus instance. | `Prometheus Time Series Collection and Processing Server` |
`--web.cors.origin` | Regex for CORS origin. It is fully anchored. Example: 'https?://(domain1\|domain2).com' | `.*` |
`--storage.tsdb.path` | Base path for metrics storage. Use with server mode only. | `data/` |
`--storage.tsdb.retention` | [DEPRECATED] How long to retain samples in storage. This flag has been deprecated, use "storage.tsdb.retention.time" instead. Use with server mode only. | |
`--storage.tsdb.retention.time` | How long to retain samples in storage. When this flag is set it overrides "storage.tsdb.retention". If neither this flag nor "storage.tsdb.retention" nor "storage.tsdb.retention.size" is set, the retention time defaults to 15d. Units Supported: y, w, d, h, m, s, ms. Use with server mode only. | |
`--storage.tsdb.retention.size` | Maximum number of bytes that can be stored for blocks. A unit is required, supported units: B, KB, MB, GB, TB, PB, EB. Ex: "512MB". Based on powers-of-2, so 1KB is 1024B. Use with server mode only. | |
`--storage.tsdb.no-lockfile` | Do not create lockfile in data directory. Use with server mode only. | `false` |
`--storage.tsdb.head-chunks-write-queue-size` | Size of the queue through which head chunks are written to the disk to be m-mapped, 0 disables the queue completely. Experimental. Use with server mode only. | `0` |
`--storage.agent.path` | Base path for metrics storage. Use with agent mode only. | `data-agent/` |
`--storage.agent.wal-compression` | Compress the agent WAL. Use with agent mode only. | `true` |
`--storage.agent.retention.min-time` | Minimum age samples may be before being considered for deletion when the WAL is truncated. Use with agent mode only. | |
`--storage.agent.retention.max-time` | Maximum age samples may be before being forcibly deleted when the WAL is truncated. Use with agent mode only. | |
`--storage.agent.no-lockfile` | Do not create lockfile in data directory. Use with agent mode only. | `false` |
`--storage.remote.flush-deadline` | How long to wait flushing sample on shutdown or config reload. | `1m` |
`--storage.remote.read-sample-limit` | Maximum overall number of samples to return via the remote read interface, in a single query. 0 means no limit. This limit is ignored for streamed response types. Use with server mode only. | `5e7` |
`--storage.remote.read-concurrent-limit` | Maximum number of concurrent remote read calls. 0 means no limit. Use with server mode only. | `10` |
`--storage.remote.read-max-bytes-in-frame` | Maximum number of bytes in a single frame for streaming remote read response types before marshalling. Note that client might have limit on frame size as well. 1MB as recommended by protobuf by default. Use with server mode only. | `1048576` |
`--rules.alert.for-outage-tolerance` | Max time to tolerate Prometheus outage for restoring "for" state of alert. Use with server mode only. | `1h` |
`--rules.alert.for-grace-period` | Minimum duration between alert and restored "for" state. This is maintained only for alerts with configured "for" time greater than grace period. Use with server mode only. | `10m` |
`--rules.alert.resend-delay` | Minimum amount of time to wait before resending an alert to Alertmanager. Use with server mode only. | `1m` |
`--rules.max-concurrent-evals` | Global concurrency limit for independent rules that can run concurrently. When set, "query.max-concurrency" may need to be adjusted accordingly. Use with server mode only. | `4` |
`--alertmanager.notification-queue-capacity` | The capacity of the queue for pending Alertmanager notifications. Use with server mode only. | `10000` |
`--alertmanager.drain-notification-queue-on-shutdown` | Send any outstanding Alertmanager notifications when shutting down. If false, any outstanding Alertmanager notifications will be dropped when shutting down. Use with server mode only. | `true` |
`--query.lookback-delta` | The maximum lookback duration for retrieving metrics during expression evaluations and federation. Use with server mode only. | `5m` |
`--query.timeout` | Maximum time a query may take before being aborted. Use with server mode only. | `2m` |
`--query.max-concurrency` | Maximum number of queries executed concurrently. Use with server mode only. | `20` |
`--query.max-samples` | Maximum number of samples a single query can load into memory. Note that queries will fail if they try to load more samples than this into memory, so this also limits the number of samples a query can return. Use with server mode only. | `50000000` |
`--scrape.name-escaping-scheme` | Method for escaping legacy invalid names when sending to Prometheus that does not support UTF-8. Can be one of "values", "underscores", or "dots". | `values` |
`--enable-feature ...` | Comma separated feature names to enable. Valid options: agent, auto-gomaxprocs, auto-gomemlimit, concurrent-rule-eval, created-timestamp-zero-ingestion, delayed-compaction, exemplar-storage, expand-external-labels, extra-scrape-metrics, memory-snapshot-on-shutdown, native-histograms, new-service-discovery-manager, no-default-scrape-port, otlp-write-receiver, promql-experimental-functions, promql-delayed-name-removal, promql-per-step-stats, remote-write-receiver (DEPRECATED), utf8-names. See https://prometheus.io/docs/prometheus/latest/feature_flags/ for more details. | |
`--log.level` | Only log messages with the given severity or above. One of: [debug, info, warn, error] | `info` |
`--log.format` | Output format of log messages. One of: [logfmt, json] | `logfmt` |
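
The `--storage.tsdb.retention.size` description above states that a unit is required and that units are based on powers of 2, so 1KB is 1024B. A minimal Go sketch of that arithmetic follows. The function `parseBase2Bytes` and the `base2Units` table are hypothetical names written for this illustration, not Prometheus's own parser, and the sketch assumes a plain integer followed by a single unit suffix.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// base2Units lists the suffixes named in the flag description. Larger units
// come first so that the bare "B" suffix is only tried last.
var base2Units = []struct {
	suffix string
	factor uint64
}{
	{"EB", 1 << 60},
	{"PB", 1 << 50},
	{"TB", 1 << 40},
	{"GB", 1 << 30},
	{"MB", 1 << 20},
	{"KB", 1 << 10},
	{"B", 1},
}

// parseBase2Bytes is a hypothetical helper: it converts strings such as
// "512MB" into a byte count using power-of-2 units. It does not guard
// against overflow for very large values.
func parseBase2Bytes(s string) (uint64, error) {
	for _, u := range base2Units {
		if strings.HasSuffix(s, u.suffix) {
			n, err := strconv.ParseUint(strings.TrimSuffix(s, u.suffix), 10, 64)
			if err != nil {
				return 0, fmt.Errorf("invalid size %q: %w", s, err)
			}
			return n * u.factor, nil
		}
	}
	return 0, fmt.Errorf("invalid size %q: a unit is required", s)
}

func main() {
	for _, in := range []string{"1KB", "512MB", "512"} {
		n, err := parseBase2Bytes(in)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s = %d bytes\n", in, n)
	}
}
```

Under these assumptions "512MB" resolves to 512 × 1024 × 1024 bytes, and an input without a unit is rejected, which mirrors the "A unit is required" wording of the flag description.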