Commit graph

12946 commits

Author SHA1 Message Date
gotjosh 63b09944b8
Use labels.Len() instead of manually counting the labels
Signed-off-by: gotjosh <josue.abreu@gmail.com>
2024-04-30 12:25:48 +01:00
gotjosh ccfafae36d
Rename QueryforStateSeries to QueryForStateSeries
Signed-off-by: gotjosh <josue.abreu@gmail.com>
2024-04-30 12:19:18 +01:00
gotjosh 151f6e0ed6
Add an assertion on the count of alerts before adding an active alert
Signed-off-by: gotjosh <josue.abreu@gmail.com>
2024-04-30 12:17:56 +01:00
Arve Knudsen 9189507569 prometheusremotewrite: Add PrometheusConverter.FromMetrics benchmark
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-04-30 13:13:37 +02:00
Arve Knudsen 99f3051f45 OTLP: Use PrometheusConverter directly
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-04-30 13:10:27 +02:00
Arve Knudsen 7f81065b01
Merge pull request #13966 from komisan19/refactor/add_max_func_to_maxTimestamp
refactor: replace maxTimestamp with standard max function
2024-04-30 11:50:52 +02:00
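The refactor merged above swaps a hand-written helper for the `max` built-in that Go has shipped since 1.21. A minimal sketch of the equivalence follows; `maxTimestampOld` is a hypothetical stand-in for the removed helper, not the actual Prometheus code.

```go
package main

import "fmt"

// maxTimestampOld is a hypothetical hand-written helper of the kind the
// refactor removes; the real Prometheus function may have looked different.
func maxTimestampOld(a, b int64) int64 {
	if a > b {
		return a
	}
	return b
}

func main() {
	var a, b int64 = 1714470000000, 1714470600000

	// The built-in max (Go 1.21+) works on any ordered type, so the
	// helper is no longer needed.
	fmt.Println(maxTimestampOld(a, b) == max(a, b))
}
```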
Arve Knudsen 759ca8b207
Merge branch 'main' into refactor/add_max_func_to_maxTimestamp
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-04-30 11:50:37 +02:00
Jiekun 0cd3a22a18
docs: [ovh sd] Added missing label for OVH dedicated server in service discovery doc
Signed-off-by: Jiekun <zhujiekun@52tt.com>
2024-04-30 17:35:28 +08:00
Jesus Vazquez 7554384dac
otlp: Prometheus to own its own copy of the otlptranslator package (#13991)
After a lot of productive discussion between the Prometheus and
OpenTelemetry communities, we decided that it made sense for Prometheus to
own its own copy of the code in charge of handling OTLP ingestion
traffic.

This commit removes the README and update-copy.sh files, which described
the previous steps for updating the code.

It also updates the licensing of all the files to make the
OpenTelemetry provenance explicit and to state the new ownership.

Signed-off-by: Jesus Vazquez <jesusvzpg@gmail.com>
Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-04-30 11:29:52 +02:00
guangwu 9fda9443d4
fix(promql/query_logger): close file in error handling (#13948)
Signed-off-by: guoguangwu <guoguangwug@gmail.com>
2024-04-30 10:47:10 +02:00
komisan19 b974a99279 fix
Signed-off-by: komisan19 <18901496+komisan19@users.noreply.github.com>
2024-04-30 10:45:50 +09:00
Arthur Silva Sens 34ee8c6078
Merge pull request #13982 from tesla59/tesla/storage-doc
docs: storage.md: clarify storage.tsdb.retention.time description
2024-04-29 15:33:27 -03:00
Arthur Silva Sens 3d42466894
Merge pull request #13997 from bboreham/api-marshalling
bugfix: API: encode empty Vector/Matrix as [] not null
2024-04-29 15:26:01 -03:00
Bryan Boreham 5c8ffaa77c bugfix: API: encode empty Vector/Matrix as []
If the underlying data is `nil` the default encoding
will render `"null"` which is not accepted by
(some) Prometheus client libraries.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-29 19:08:10 +01:00
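The behaviour this bugfix addresses can be reproduced with a nil slice. The sketch below assumes the standard library's `encoding/json` rather than the jsoniter encoder the API uses, and the `sample` type and `encodeVector` helper are hypothetical names for illustration only.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sample is a hypothetical stand-in for one element of a query result.
type sample struct {
	Metric string  `json:"metric"`
	Value  float64 `json:"value"`
}

// encodeVector marshals v, substituting an empty slice for nil so that
// the output is "[]" rather than "null".
func encodeVector(v []sample) (string, error) {
	if v == nil {
		v = []sample{}
	}
	b, err := json.Marshal(v)
	return string(b), err
}

func main() {
	var empty []sample // nil slice

	raw, _ := json.Marshal(empty)
	fmt.Println(string(raw)) // null

	fixed, _ := encodeVector(empty)
	fmt.Println(fixed) // []
}
```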
Bryan Boreham 00247b5d87 test: API: check empty responses
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-29 19:08:10 +01:00
Bryan Boreham e0a00f45db refactor: API: separate typed and unsafe marshalling
The typed versions are used when we call from one marshaller to another.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-29 19:08:10 +01:00
Bryan Boreham 66a1c3daad refactor: API: be explicit that we marshal empty objects
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-29 19:08:10 +01:00
Bryan Boreham c8aed6b0ec tests: API: Let nil expected response mean skip check
When we want to check just the json encoding.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-29 19:08:10 +01:00
Bryan Boreham 5a339ba359 tests: API: Use jsoniter when encoding
So that tests use the same encoding as the api.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-29 19:08:10 +01:00
Bryan Boreham 2c4a36376d tests: API: simplify check of error response
Since we already use require.JSONEq in similar cases.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-29 19:08:10 +01:00
Julien ed4e50e792
Merge pull request #13992 from heckler1/heckler1/discovery-client-go
discovery(k8s): Only register client-go metrics adapters when needed
2024-04-29 11:18:49 +02:00
Neeraj Gartia 99f9d32499
UTF-8: updates UI parser to support UTF-8 characters (#13590)
Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>
2024-04-29 11:14:01 +02:00
Heyoxe f7e923c3bb
fix(scaleway-sd): use public IPs if no private IP present (#13941)
* fix(scaleway-sd): use public IPs if no private IP present
* tests(scaleway-sd): add instance with routed public IP and no private IP

---------

Signed-off-by: Heyoxe <32708033+Heyoxe@users.noreply.github.com>
2024-04-27 15:01:30 +01:00
Nishant Singh c8b23980c9
Update docs/storage.md
Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>
Signed-off-by: Nishant Singh <nishant@heim.id>
2024-04-27 13:50:50 +05:30
Nishant Singh 801314901c
Update docs/storage.md
Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>
Signed-off-by: Nishant Singh <nishant@heim.id>
2024-04-27 13:50:41 +05:30
Stephen Heckler 31a4217784 discovery(k8s): Only register client-go metrics adapters when needed
Previously the metrics adapters for client-go were registered in an init function.
This resulted in clobbering default metrics providers when these packages are imported
into an application that leverages the default client-go metrics registry.

Instead, let's only register these adapters when requested.

Signed-off-by: Stephen Heckler <sheckler@cloudflare.com>
2024-04-25 12:33:29 -05:00
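The change described above moves registration out of an `init` function and behind an explicit call. A minimal sketch of that pattern follows, assuming a process-wide provider slot; `provider`, `RegisterAdapters` and `currentProvider` are hypothetical names and do not reflect the client-go or Prometheus APIs.

```go
package main

import (
	"fmt"
	"sync"
)

var (
	mu sync.Mutex
	// provider is a hypothetical process-wide slot that an importing
	// application may already have configured.
	provider = "default"

	registerOnce sync.Once
)

// RegisterAdapters installs the adapter only when a caller asks for it.
// Because nothing runs in init, merely importing the package leaves the
// existing provider untouched.
func RegisterAdapters() {
	registerOnce.Do(func() {
		mu.Lock()
		defer mu.Unlock()
		provider = "prometheus-adapter"
	})
}

// currentProvider reports which provider is installed.
func currentProvider() string {
	mu.Lock()
	defer mu.Unlock()
	return provider
}

func main() {
	fmt.Println(currentProvider()) // still the default after import
	RegisterAdapters()
	RegisterAdapters() // repeated calls are harmless
	fmt.Println(currentProvider())
}
```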
Arthur Silva Sens 0305490e4e
Merge pull request #13987 from prometheus/nativeHis-flag-ingestion
bugfix: Decouple native histogram ingestions and protobuf parsing
2024-04-25 10:56:40 -03:00
George Robinson dde2e5eb73
Improve comments around resending resolved alerts (#13990)
Signed-off-by: George Robinson <george.robinson@grafana.com>
2024-04-25 14:18:50 +02:00
Arthur Silva Sens 9195d51469
Prepare v2.52 release
Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com>
2024-04-24 17:19:43 -03:00
Arthur Silva Sens 7aacef9b42
bugfix: Decouple native histogram ingestions and protobuf parsing
Up until this point, if a scrape was done with the protobuf format Prometheus would always try to ingest native histograms even with the feature flag disabled. This causes problems with other feature-flags that depend on the protobuf format, like 'created-timestamp-zero-ingestion'. This commit decouples native histogram parsing from ingestion, making sure ingestion only happens when the 'native-histogram' feature-flag is enabled.

Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com>
2024-04-24 17:02:52 -03:00
gotjosh cc2207148e
fix typo
Signed-off-by: gotjosh <josue.abreu@gmail.com>
2024-04-24 19:20:57 +01:00
gotjosh 2de2fee035
Allow the result map for the series set to be sized beforehand with a hint.
Signed-off-by: gotjosh <josue.abreu@gmail.com>
2024-04-24 19:10:34 +01:00
gotjosh 6cfc584308
- Add a changelog entry
- Improve variable name of the map produced by the series set

Signed-off-by: gotjosh <josue.abreu@gmail.com>
2024-04-24 19:02:47 +01:00
gotjosh fa75985c1c
Use the string representation of the labels instead of the hash
Signed-off-by: gotjosh <josue.abreu@gmail.com>
2024-04-24 18:46:05 +01:00
gotjosh 276201598c
Fix tests and a bug with the series lookup logic.
Signed-off-by: gotjosh <josue.abreu@gmail.com>
2024-04-24 18:46:05 +01:00
gotjosh e6dcbd2e26
bug: nil check against the series set not errors
Signed-off-by: gotjosh <josue.abreu@gmail.com>
2024-04-24 18:46:05 +01:00
gotjosh 4daaa59c08
Rule Manager: Only query once per alert rule when restoring alert state
Prometheus restores alert state between restarts and updates. For each rule, it looks at the alerts that are meant to be active and then queries the `ALERTS_FOR_STATE` series for _each_ alert within the rules.

If the alert rule has 120 instances (or series), it'll execute the same query 120 times, each with slightly different labels.

This PR changes the approach so that we only query once per alert rule and then match the corresponding alert that we're about to restore against the series-set. While the approach might use a bit more memory at start-up (if even?), the restore process is only run once per restart, so I'd consider this a big win.

This builds on top of #13974

Signed-off-by: gotjosh <josue.abreu@gmail.com>
2024-04-24 18:46:05 +01:00
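The query-once approach from this commit can be sketched as a single lookup map keyed by the string form of each label set. All types and names below (`labelSet`, `series`, `key`, `restore`) are hypothetical simplifications, not the rule manager's actual code.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// labelSet is a simplified stand-in for a set of labels.
type labelSet map[string]string

// key builds a deterministic string form of a label set. It is naive:
// it assumes names and values contain neither '=' nor ','.
func key(ls labelSet) string {
	names := make([]string, 0, len(ls))
	for n := range ls {
		names = append(names, n)
	}
	sort.Strings(names)
	var sb strings.Builder
	for _, n := range names {
		sb.WriteString(n)
		sb.WriteByte('=')
		sb.WriteString(ls[n])
		sb.WriteByte(',')
	}
	return sb.String()
}

// series is one result of the single per-rule query.
type series struct {
	labels      labelSet
	activeSince int64
}

// restore indexes the query result once, then matches every alert
// against that index instead of issuing one query per alert.
func restore(alerts []labelSet, queried []series) map[string]int64 {
	byKey := make(map[string]int64, len(queried)) // sized with a hint
	for _, s := range queried {
		byKey[key(s.labels)] = s.activeSince
	}
	restored := make(map[string]int64)
	for _, a := range alerts {
		if ts, ok := byKey[key(a)]; ok {
			restored[key(a)] = ts
		}
	}
	return restored
}

func main() {
	alerts := []labelSet{
		{"alertname": "HighLatency", "instance": "a"},
		{"alertname": "HighLatency", "instance": "b"},
	}
	queried := []series{
		{labels: labelSet{"instance": "a", "alertname": "HighLatency"}, activeSince: 100},
	}
	fmt.Println(len(restore(alerts, queried)))
}
```

The trade-off matches the one the commit message describes: one map held in memory during restore, in exchange for a single query per rule.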
gotjosh 4ac78063ee
Merge pull request #13974 from prometheus/measure-restore-time-rules
Rule Manager: Add `rule_group_last_restore_duration_seconds` to measure restore time per rule group
2024-04-24 16:04:30 +01:00
Alan Protasio d15869af32
Avoid creating new slices for labels values on postings for matchers (#13958)
* Avoid creating new slices for labels values on postings for matchers

Signed-off-by: alanprot <alanprot@gmail.com>

* refactor

Signed-off-by: alanprot <alanprot@gmail.com>

---------

Signed-off-by: alanprot <alanprot@gmail.com>
2024-04-24 16:41:33 +02:00
gotjosh 5beb2fe005
Improve the metric description
Signed-off-by: gotjosh <josue.abreu@gmail.com>
2024-04-24 15:24:35 +01:00
gotjosh d672eda979
Add a changelog entry
Signed-off-by: gotjosh <josue.abreu@gmail.com>
2024-04-24 14:31:18 +01:00
gotjosh 381a77ac1e
Change variable name to restoreStartTime from now and introduce a log line to record total time
Signed-off-by: gotjosh <josue.abreu@gmail.com>
2024-04-24 14:21:11 +01:00
Arthur Silva Sens 29505211af
Merge pull request #13986 from prometheus/main
promtool: Fix panic on extended tsdb analyze (#13976)
2024-04-24 09:37:09 -03:00
Will Hegedus bd1878700b
promtool: Fix panic on extended tsdb analyze (#13976)
Currently, running promtool tsdb analyze with the --extended flag
will cause an 'index out of range' error if running it
against a block that does not have any native histogram chunks.

This change ensures that promtool won't try to display data that doesn't exist.

Signed-off-by: Will Hegedus <whegedus@linode.com>
2024-04-24 11:35:34 +10:00
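The class of bug fixed here, indexing into a collection that turns out to be empty, is usually avoided with a length check before the first access. The sketch below is a generic illustration under that assumption; `summarize` is a hypothetical helper and not promtool's code.

```go
package main

import "fmt"

// summarize reports the largest value in sizes, or ok=false when there
// is nothing to report. The length check prevents an index-out-of-range
// panic on empty input.
func summarize(sizes []int) (largest int, ok bool) {
	if len(sizes) == 0 {
		return 0, false
	}
	largest = sizes[0]
	for _, s := range sizes[1:] {
		if s > largest {
			largest = s
		}
	}
	return largest, true
}

func main() {
	if v, ok := summarize(nil); ok {
		fmt.Println("largest:", v)
	} else {
		fmt.Println("nothing to display")
	}
}
```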
tesla59 5e638b7f44 docs: storage.md: clarify storage.tsdb.retention.time description
Signed-off-by: tesla59 <nishant@heim.id>
2024-04-24 02:58:25 +05:30
gotjosh e7219e3d36
Rule Manager: Add rule_group_last_restore_duration_seconds to measure restore time per rule group
When a rule group changes or Prometheus is restarted, we need to restore the active alerts that were firing for the corresponding rule. To do that, Prometheus queries the `ALERTS_FOR_STATE` series for the previous state and restores it. If a given rule has high cardinality (think 100s of 1000s of series), this process can take a bit of time. This is the first of a series of PRs to improve this problem, and I'd like to start by exposing the time it takes to restore a rule group as a gauge.

Signed-off-by: gotjosh <josue.abreu@gmail.com>
2024-04-23 09:57:08 +01:00
Arthur Silva Sens 76b0318ed5
Merge pull request #13962 from prometheus/dependabot/go_modules/github.com/aws/aws-sdk-go-1.51.25
build(deps): bump github.com/aws/aws-sdk-go from 1.51.24 to 1.51.25
2024-04-22 09:26:07 -03:00
Arthur Silva Sens a903ef83ee
Merge pull request #13961 from prometheus/dependabot/go_modules/github.com/hetznercloud/hcloud-go/v2-2.7.2
build(deps): bump github.com/hetznercloud/hcloud-go/v2 from 2.7.1 to 2.7.2
2024-04-22 09:25:47 -03:00
komisan19 3d84d4d6dc fix
Signed-off-by: komisan19 <18901496+komisan19@users.noreply.github.com>
2024-04-22 19:04:00 +09:00
komisan19 5ab24a06d0 refactor: add max func to maxTimestamp
Signed-off-by: komisan19 <18901496+komisan19@users.noreply.github.com>
2024-04-21 23:39:25 +09:00