Commit graph

334 commits

Author SHA1 Message Date
Marco Pracucci 7ad111b27e
Merge remote-tracking branch 'remotes/prometheus/main' into update-upstream 2023-07-06 15:13:54 +02:00
Marco Pracucci 0ad967ebd3
Merge pull request #508 from grafana/sync-upstream-2023-06-30
Sync upstream 2023 06 30
2023-07-06 14:54:33 +02:00
Charles Korn 5e8b550ad5
Add missing InstallCodec call to benchmark
@bboreham noticed that `BenchmarkRespond` was missing a call to `InstallCodec` in 
https://github.com/prometheus/prometheus/pull/11905#discussion_r1251255249
2023-07-05 10:36:39 +10:00
Julien Pivotto 0186ec7873
Merge pull request #12516 from vinted/convert_queryopts_to_interface
promql: convert QueryOpts to interface
2023-07-04 23:38:31 +02:00
Jesus Vazquez 3961006613 Create release candidate 2.45.0-rc.0 (#12435)
Signed-off-by: Jesus Vazquez <jesusvzpg@gmail.com>
2023-07-04 15:01:01 +00:00
Arthur Silva Sens 873411ffce Add feature flag to squash metadata from /api/v1/metadata (#12391)
Signed-off-by: ArthurSens <arthursens2005@gmail.com>
2023-07-04 15:01:01 +00:00
Julien Pivotto 986fde06b2
Merge pull request #11688 from damnever/fix/datamodelvalidation-remotewriteapi
Validate the metric names and labels in the remote write handler
2023-07-04 13:52:02 +02:00
Giedrius Statkevičius 3f230fc9f8 promql: convert QueryOpts to interface
Convert QueryOpts to an interface so that downstream projects like
https://github.com/thanos-community/promql-engine could extend the query
options with engine specific options that are not in the original
engine.

Will be used to enable query analysis per-query.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2023-07-03 16:20:31 +03:00
Julien Pivotto e043b273a6
Merge pull request #12439 from prometheus/release-2.45
Merge release 2.45.0 back to main
2023-06-17 10:16:48 +02:00
Arthur Silva Sens 1ea477f4bc
Add feature flag to squash metadata from /api/v1/metadata (#12391)
Signed-off-by: ArthurSens <arthursens2005@gmail.com>
2023-06-12 16:17:20 +01:00
Jesus Vazquez bfa466d00f
Create release candidate 2.45.0-rc.0 (#12435)
Signed-off-by: Jesus Vazquez <jesusvzpg@gmail.com>
2023-06-07 12:29:04 +02:00
Jeanette Tan 1be0816b46 Merge remote-tracking branch 'upstream/main' 2023-05-23 00:20:36 +08:00
Baskar Shanmugam 905a0bd63a
Added 'limit' query parameter support to /api/v1/status/tsdb endpoint (#12336)
* Added 'topN' query parameter support to /api/v1/status/tsdb endpoint

Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com>

* Updated query parameter for tsdb status to 'limit'

Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com>

* Corrected Stats() parameter name from topN to limit

Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com>

* Fixed p.Stats CI failure

Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com>

---------

Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com>
2023-05-22 14:37:07 +02:00
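
A request using the `limit` parameter described in this commit could be built roughly as follows. This is only a sketch: the helper name `tsdbStatusURL` and the `localhost:9090` address are assumptions for illustration; only the endpoint path and the parameter name come from the commit message.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// tsdbStatusURL (hypothetical helper) builds a request URL for the TSDB
// status endpoint and sets the `limit` query parameter on it.
func tsdbStatusURL(base string, limit int) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = "/api/v1/status/tsdb"
	q := url.Values{}
	q.Set("limit", strconv.Itoa(limit))
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	// localhost:9090 is an assumed address, not something the commit specifies.
	s, err := tsdbStatusURL("http://localhost:9090", 20)
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```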
Jeanette Tan 0fccba0db9 Merge remote-tracking branch 'upstream/main' 2023-04-26 21:25:21 +08:00
Vladimir Varankin d281ebb178 web: display GOMEMLIMIT in runtime info
Signed-off-by: Vladimir Varankin <vladimir@varank.in>
2023-04-23 20:24:34 +02:00
Julien Pivotto 8f1dc4a70f
Merge pull request #12248 from yeya24/consistent-response
Use same error for instant and range query when 400
2023-04-21 11:44:20 +02:00
Julien Pivotto e2512078e5
Merge pull request #12241 from mmorel-35/linter/nilerr
enable gocritic, unconvert and unused linters
2023-04-20 15:13:31 +02:00
gotjosh 2f22c8b7f8
Merge pull request #12270 from prometheus/gotjosh/allow-filtering-of-rules-by-name-api
Rules API: Allow filtering by rule name
2023-04-20 12:03:08 +01:00
gotjosh e78be38cc0
don't show empty groups
Signed-off-by: gotjosh <josue.abreu@gmail.com>
2023-04-20 11:20:20 +01:00
Matthieu MOREL bae9a21200
Merge branch 'main' into linter/nilerr
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2023-04-19 19:56:39 +02:00
beorn7 5b53aa1108 style: Replace else if cascades with switch
Wiser coders than myself have come to the conclusion that a `switch`
statement is almost always superior to a statement that includes any
`else if`.

The exceptions that I have found in our codebase are just these two:

* The `else if` is followed by an additional statement before the next
  condition (separated by a `;`).
* The whole thing is within a `for` loop and `break` statements are
  used. In this case, using `switch` would require tagging the `for`
  loop, which probably tips the balance.

Why are `switch` statements more readable?

For one, fewer curly braces. But more importantly, the conditions all
have the same alignment, so the whole thing follows the natural flow
of going down a list of conditions. With `else if`, in contrast, all
conditions but the first are "hidden" behind `} else if `, harder to
spot and (for no good reason) presented differently from the first
condition.

I'm sure the aforementioned wise coders can list even more reasons.

In any case, I like it so much that I have found myself recommending
it in code reviews. I would like to make it a habit in our code base,
without making it a hard requirement that we would test on the CI. But
for that, there has to be a role model, so this commit eliminates all
`else if` occurrences, unless it is autogenerated code or fits one of
the exceptions above.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-04-19 17:22:31 +02:00
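
To illustrate the style argument made in this commit message, here is a minimal sketch of the same logic written both ways. The functions `classifyElseIf` and `classifySwitch` are hypothetical and do not come from the Prometheus codebase.

```go
package main

import "fmt"

// classifyElseIf is written as an else-if cascade: every condition after the
// first sits behind "} else if", so the conditions do not line up.
func classifyElseIf(code int) string {
	if code < 200 {
		return "informational"
	} else if code < 300 {
		return "success"
	} else if code < 400 {
		return "redirect"
	} else {
		return "error"
	}
}

// classifySwitch expresses the same logic as a tagless switch: all conditions
// share the same alignment and read as a list from top to bottom.
func classifySwitch(code int) string {
	switch {
	case code < 200:
		return "informational"
	case code < 300:
		return "success"
	case code < 400:
		return "redirect"
	default:
		return "error"
	}
}

func main() {
	for _, c := range []int{100, 204, 301, 503} {
		fmt.Println(c, classifyElseIf(c), classifySwitch(c))
	}
}
```

Both functions return the same result for every input; only the layout differs.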
beorn7 c3c7d44d84 lint: Adjust to the lint warnings raised by current versions of golangci-lint
We haven't updated golangci-lint in our CI yet, but this commit prepares
for that.

There are a lot of new warnings, and it is mostly because the "revive"
linter got updated. I agree with most of the new warnings, mostly
around not naming unused function parameters (although it is justified
in some cases for documentation purposes – while things like mocks are
a good example where not naming the parameter is clearer).

I'm pretty upset about the "empty block" warning to include `for`
loops. It's such a common pattern to do something in the head of the
`for` loop and then have an empty block. There is still an open issue
about this: https://github.com/mgechev/revive/issues/810 I have
disabled "revive" altogether in files where empty blocks are used
excessively, and I have made the effort to add individual
`// nolint:revive` where empty blocks are used just once or twice.
It's borderline noisy, though, but let's go with it for now.

I should mention that none of the "empty block" warnings for `for`
loop bodies were legitimate.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-04-19 17:10:10 +02:00
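
The "empty block" pattern discussed in this commit message looks roughly like the sketch below: all the work happens in the head of the `for` loop and the body stays empty. The `counter` type and `drain` function are hypothetical, and the placement of the `// nolint:revive` comment is an assumption about how such a suppression might be written.

```go
package main

import "fmt"

// counter is a hypothetical iterator-like type used only for illustration.
type counter struct{ n, limit int }

// Next advances the counter and reports whether it moved forward.
func (c *counter) Next() bool {
	if c.n >= c.limit {
		return false
	}
	c.n++
	return true
}

// drain exhausts the counter. The loop condition does all the work, so the
// loop body is intentionally empty.
func drain(c *counter) int {
	for c.Next() { // nolint:revive
	}
	return c.n
}

func main() {
	fmt.Println(drain(&counter{limit: 3}))
}
```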
gotjosh 96b6463f25
review comments
Signed-off-by: gotjosh <josue.abreu@gmail.com>
2023-04-18 16:26:32 +01:00
Đurica Yuri Nikolić d162bb51b6
Merge pull request #485 from grafana/yuri/bring-prometheus-upstream
Synch with Prometheus up to 2023-04-18 (b028112)
2023-04-18 15:07:26 +02:00
gotjosh f3394bf7a1
Rules API: Allow filtering by rule name
Introduces support for a new query parameter in the `/rules` API endpoint that allows filtering by rule names.

If all the rules of a group are filtered, we skip the group entirely.

Signed-off-by: gotjosh <josue.abreu@gmail.com>
2023-04-18 10:12:08 +01:00
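
The filtering behaviour described in this commit, including skipping groups whose rules are all filtered out, can be sketched as follows. The types `Rule` and `RuleGroup` and the function `filterByRuleName` are simplified stand-ins invented for this example, not the actual API types.

```go
package main

import "fmt"

// Rule and RuleGroup are simplified, hypothetical stand-ins.
type Rule struct{ Name string }

type RuleGroup struct {
	Name  string
	Rules []Rule
}

// filterByRuleName keeps only rules whose name is in names. A group that ends
// up with no rules is skipped entirely. No names means no filtering.
func filterByRuleName(groups []RuleGroup, names ...string) []RuleGroup {
	if len(names) == 0 {
		return groups
	}
	wanted := make(map[string]struct{}, len(names))
	for _, n := range names {
		wanted[n] = struct{}{}
	}
	var out []RuleGroup
	for _, g := range groups {
		var kept []Rule
		for _, r := range g.Rules {
			if _, ok := wanted[r.Name]; ok {
				kept = append(kept, r)
			}
		}
		if len(kept) == 0 {
			continue // don't show empty groups
		}
		out = append(out, RuleGroup{Name: g.Name, Rules: kept})
	}
	return out
}

func main() {
	groups := []RuleGroup{
		{Name: "a", Rules: []Rule{{Name: "up"}, {Name: "down"}}},
		{Name: "b", Rules: []Rule{{Name: "latency"}}},
	}
	fmt.Println(filterByRuleName(groups, "up"))
}
```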
Ben Ye fd3630b9a3 add ctx to QueryEngine interface
Signed-off-by: Ben Ye <benye@amazon.com>
2023-04-17 21:32:38 -07:00
Jeanette Tan 894f657c48 Fix bugs from merge 2023-04-14 18:23:02 +08:00
Jeanette Tan 1570114ae1 Merge remote-tracking branch 'upstream/main' 2023-04-14 17:34:40 +08:00
Matthieu MOREL fb3eb21230 enable gocritic, unconvert and unused linters
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2023-04-13 19:20:22 +00:00
beorn7 c0879d64cf promql: Separate Point into FPoint and HPoint
In other words: Instead of having a “polymorphous” `Point` that can
either contain a float value or a histogram value, use an `FPoint` for
floats and an `HPoint` for histograms.

This seemingly small change has a _lot_ of repercussions throughout
the codebase.

The idea here is to avoid the increase in size of `Point` arrays that
happened after native histograms had been added.

The higher-level data structures (`Sample`, `Series`, etc.) are still
“polymorphous”. The same idea could be applied to them, but at each
step the trade-offs needed to be evaluated.

The idea with this change is to do the minimum necessary to get back
to pre-histogram performance for functions that do not touch
histograms. Here are comparisons for the `changes` function. The test
data doesn't include histograms yet. Ideally, there would be no change
in the benchmark result at all.

First runtime v2.39 compared to directly prior to this commit:

```
name                                                  old time/op    new time/op    delta
RangeQuery/expr=changes(a_one[1d]),steps=1-16            391µs ± 2%     542µs ± 1%  +38.58%  (p=0.000 n=9+8)
RangeQuery/expr=changes(a_one[1d]),steps=10-16           452µs ± 2%     617µs ± 2%  +36.48%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_one[1d]),steps=100-16         1.12ms ± 1%    1.36ms ± 2%  +21.58%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_one[1d]),steps=1000-16        7.83ms ± 1%    8.94ms ± 1%  +14.21%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_ten[1d]),steps=1-16           2.98ms ± 0%    3.30ms ± 1%  +10.67%  (p=0.000 n=9+10)
RangeQuery/expr=changes(a_ten[1d]),steps=10-16          3.66ms ± 1%    4.10ms ± 1%  +11.82%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_ten[1d]),steps=100-16         10.5ms ± 0%    11.8ms ± 1%  +12.50%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_ten[1d]),steps=1000-16        77.6ms ± 1%    87.4ms ± 1%  +12.63%  (p=0.000 n=9+9)
RangeQuery/expr=changes(a_hundred[1d]),steps=1-16       30.4ms ± 2%    32.8ms ± 1%   +8.01%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=10-16      37.1ms ± 2%    40.6ms ± 2%   +9.64%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=100-16      105ms ± 1%     117ms ± 1%  +11.69%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=1000-16     783ms ± 3%     876ms ± 1%  +11.83%  (p=0.000 n=9+10)
```

And then runtime v2.39 compared to after this commit:

```
name                                                  old time/op    new time/op    delta
RangeQuery/expr=changes(a_one[1d]),steps=1-16            391µs ± 2%     547µs ± 1%  +39.84%  (p=0.000 n=9+8)
RangeQuery/expr=changes(a_one[1d]),steps=10-16           452µs ± 2%     616µs ± 2%  +36.15%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_one[1d]),steps=100-16         1.12ms ± 1%    1.26ms ± 1%  +12.20%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_one[1d]),steps=1000-16        7.83ms ± 1%    7.95ms ± 1%   +1.59%  (p=0.000 n=10+8)
RangeQuery/expr=changes(a_ten[1d]),steps=1-16           2.98ms ± 0%    3.38ms ± 2%  +13.49%  (p=0.000 n=9+10)
RangeQuery/expr=changes(a_ten[1d]),steps=10-16          3.66ms ± 1%    4.02ms ± 1%   +9.80%  (p=0.000 n=10+9)
RangeQuery/expr=changes(a_ten[1d]),steps=100-16         10.5ms ± 0%    10.8ms ± 1%   +3.08%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_ten[1d]),steps=1000-16        77.6ms ± 1%    78.1ms ± 1%   +0.58%  (p=0.035 n=9+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=1-16       30.4ms ± 2%    33.5ms ± 4%  +10.18%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=10-16      37.1ms ± 2%    40.0ms ± 1%   +7.98%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=100-16      105ms ± 1%     107ms ± 1%   +1.92%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=1000-16     783ms ± 3%     775ms ± 1%   -1.02%  (p=0.019 n=9+9)
```

In summary, the runtime doesn't really improve with this change for
queries with just a few steps. For queries with many steps, this
commit essentially reinstates the old performance. This is good
because the many-step queries are the ones that matter most (longest
absolute runtime).

In terms of allocations, though, this commit doesn't make a dent at
all (numbers not shown). The reason is that most of the allocations
happen in the sampleRingIterator (in the storage package), which has
to be addressed in a separate commit.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-04-13 19:25:16 +02:00
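
The size argument behind this commit can be sketched with simplified types. The struct layouts below are assumptions made for illustration (in particular, `FloatHistogram` is a placeholder, and the field names are guesses), not the definitions from the `promql` package.

```go
package main

import (
	"fmt"
	"unsafe"
)

// FloatHistogram is a placeholder for the real histogram type.
type FloatHistogram struct {
	Count, Sum float64
}

// Point is the "polymorphous" shape: every element carries both a float
// value and a histogram pointer, even when only one of them is used.
type Point struct {
	T int64
	V float64
	H *FloatHistogram
}

// FPoint holds only a float sample.
type FPoint struct {
	T int64
	F float64
}

// HPoint holds only a histogram sample.
type HPoint struct {
	T int64
	H *FloatHistogram
}

// sizes returns the in-memory size of one element of each type.
func sizes() (point, fpoint, hpoint uintptr) {
	return unsafe.Sizeof(Point{}), unsafe.Sizeof(FPoint{}), unsafe.Sizeof(HPoint{})
}

func main() {
	p, f, h := sizes()
	fmt.Printf("Point=%d FPoint=%d HPoint=%d bytes\n", p, f, h)
}
```

With these layouts on a typical 64-bit platform, a `Point` takes 24 bytes while an `FPoint` takes 16, so a slice of float-only samples shrinks by a third.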
Ben Ye fb67d368a2 use consistent error for instant and range query 400
Signed-off-by: Ben Ye <benye@amazon.com>
2023-04-11 13:45:34 -07:00
Xiaochao Dong (@damnever) 2b7202c4cc Validate the metric names and labels in the remote write handler
Signed-off-by: Xiaochao Dong (@damnever) <the.xcdong@gmail.com>
2023-04-05 19:09:05 +08:00
Ganesh Vernekar 41649ceb1b
Merge remote-tracking branch 'upstream/main' into codesome/sync-prom
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2023-03-22 08:35:08 +05:30
pbudner 46683eadf7 fix: advertise correct flag to enable remote write receiver
Signed-off-by: pbudner <mail@pascalbudner.de>
2023-03-11 13:50:52 +01:00
Yuri Nikolic 752416d0d8 Fixing conflicts with commit 58d3f148bf 2023-03-08 17:39:31 +01:00
Julien Pivotto db2d759b81 Add support for lookbackdelta per query via the API
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
2023-03-08 00:30:05 +01:00
Charles Korn 94d858d2f4
Correct license header in web/api/v1/codec_test.go.
This was incorrectly added as part of #428.

Signed-off-by: Charles Korn <charles.korn@grafana.com>
2023-02-27 13:34:52 +11:00
Charles Korn 5d62640e9b
Handle case where default codec cannot encode the response.
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2023-02-24 14:47:24 +11:00
Charles Korn 374c3f4dec
Implement fully-featured content negotiation for API requests, and allow overriding the default API codec.
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2023-02-24 14:04:43 +11:00
Fish-pro 43d77f7c41 Use http constants instead of string
Signed-off-by: Fish-pro <zechun.chen@daocloud.io>
2023-02-10 10:21:05 +08:00
Charles Korn d2d23d9849
Expose QueryData so that implementations of Codec.CanEncode() can perform a type assertion against Response.Data. (#427)
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2023-02-03 05:38:40 +01:00
Charles Korn d9063441c1
Add extension point for returning different content types from API endpoints (#412)
* Add initial sketch of Codec interface.

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Introduce JSON codec.

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Expose Response type so that consuming applications (eg. Mimir) can implement their own Codecs.

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Add sketch of what supporting different codecs could look like.

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Rename fallbackCodec to defaultCodec.

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Remove defaultCodec as a field on API.

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Rename AddCodec() and clarify expected behaviour.

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Modify TestRespond to test JsonCodec directly.

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Refactor existing respond() test in preparation for content negotiation test cases.

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Add tests for content negotiation.

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Add missing documentation comments.

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Add another test case.

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Rename JsonCodec to JSONCodec.

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Fix linting issue.

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Fallback to JSON codec if no acceptable codec can be found for the Accept header.

Signed-off-by: Charles Korn <charles.korn@grafana.com>

* Move custom jsoniter code into json_codec.go.

Signed-off-by: Charles Korn <charles.korn@grafana.com>

---------

Signed-off-by: Charles Korn <charles.korn@grafana.com>
2023-02-01 09:33:50 +01:00
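
The extension point described in this commit can be pictured with the sketch below. It is not the actual implementation: the method set of `Codec`, the `textCodec` type and the `negotiate` function are assumptions for illustration, and the Accept handling is deliberately simplified (no quality values or wildcard subtypes). Only the ideas of a codec interface, a `CanEncode` check and the fallback to JSON come from the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Response is a simplified stand-in for the API response type.
type Response struct {
	Status string `json:"status"`
	Data   any    `json:"data,omitempty"`
}

// Codec is a hypothetical version of the interface sketched in the commit.
type Codec interface {
	ContentType() string
	CanEncode(resp *Response) bool
	Encode(resp *Response) ([]byte, error)
}

type jsonCodec struct{}

func (jsonCodec) ContentType() string                { return "application/json" }
func (jsonCodec) CanEncode(*Response) bool           { return true }
func (jsonCodec) Encode(r *Response) ([]byte, error) { return json.Marshal(r) }

// textCodec is an invented codec that can only encode string payloads.
type textCodec struct{}

func (textCodec) ContentType() string { return "text/plain" }
func (textCodec) CanEncode(r *Response) bool {
	_, ok := r.Data.(string)
	return ok
}
func (textCodec) Encode(r *Response) ([]byte, error) {
	s, ok := r.Data.(string)
	if !ok {
		return nil, fmt.Errorf("textCodec: unsupported data type %T", r.Data)
	}
	return []byte(s), nil
}

// negotiate walks the Accept header in order and returns the first codec that
// matches and can encode resp. It falls back to codecs[0] (assumed to be JSON).
func negotiate(accept string, resp *Response, codecs []Codec) Codec {
	for _, clause := range strings.Split(accept, ",") {
		mediaType, _, _ := strings.Cut(clause, ";")
		mediaType = strings.TrimSpace(mediaType)
		for _, c := range codecs {
			if (mediaType == c.ContentType() || mediaType == "*/*") && c.CanEncode(resp) {
				return c
			}
		}
	}
	return codecs[0]
}

func main() {
	codecs := []Codec{jsonCodec{}, textCodec{}}
	resp := &Response{Status: "success", Data: "hello"}
	fmt.Println(negotiate("text/plain", resp, codecs).ContentType())
	fmt.Println(negotiate("application/xml", resp, codecs).ContentType())
}
```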
Marco Pracucci 3db77b4491
API: change HTTP status code tracked in metrics from 503/422 to 499 if a request is canceled
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2023-01-26 13:06:37 +01:00
Julien Pivotto 2c408289f8 Add stabilizing to UI
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
2023-01-19 11:33:54 +01:00
Julien Pivotto ce55e5074d Add 'keep_firing_for' field to alerting rules
This commit adds a new 'keep_firing_for' field to Prometheus alerting
rules. The 'keep_firing_for' field specifies the minimum amount of time
that an alert should remain firing, even if the expression does not
return any results.

This feature was discussed at a previous dev summit, and it was
determined that a feature like this would be useful in order to allow
the expression time to stabilize and prevent confusing resolved messages
from being propagated through Alertmanager.

This approach is simpler than having two PromQL queries, as was
sometimes discussed, and it should be easy to implement.

This commit does not include tests for the 'keep_firing_for' field. This
is intentional, as the purpose of this commit is to gather comments on
the proposed design of the 'keep_firing_for' field before implementing
tests. Once the design of the 'keep_firing_for' field has been finalized,
a follow-up commit will be submitted with tests.

See https://github.com/prometheus/prometheus/issues/11570

Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
2023-01-13 12:11:39 +01:00
Łukasz Mierzwa e1b7082008
Show individual scrape pools on /targets page (#11142)
* Add API endpoints for getting scrape pool names

This adds an api/v1/scrape_pools endpoint that returns the list of *names* of all configured scrape pools.
Having it makes it possible to find out which scrape pools are defined without listing and parsing all targets.

The second change adds scrapePool query parameter support to the api/v1/targets endpoint, which allows
filtering the returned targets down to those belonging to the given scrape pool.

Both changes make it possible to query the data of a specific scrape pool, rather than getting all the targets for all scrape pools.
The problem with the api/v1/targets endpoint is that it returns a huge amount of data if you configure a lot of scrape pools.

Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>

* Add a scrape pool selector on /targets page

The current targets page lists all possible targets. This works great if you only have a few scrape pools configured,
but for systems with a lot of scrape pools and targets this slows things down a lot.
Not only does the /targets page load very slowly in such a case (waiting for a huge API response), it also takes
a long time to render, due to the huge number of elements.
This change adds a dropdown selector so that it's possible to select only the scrape pool of interest to view.
There's also a scrapePool query param that will open the selected pool automatically.

Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>

Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
2022-12-23 11:55:08 +01:00
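
The two API changes in this commit, listing scrape pool names and filtering targets by pool, can be sketched as follows. The `Target` type and both functions are simplified stand-ins invented for this example and do not mirror the real handler code.

```go
package main

import (
	"fmt"
	"sort"
)

// Target is a simplified, hypothetical stand-in for a scrape target.
type Target struct {
	ScrapePool string
	URL        string
}

// scrapePoolNames returns the sorted, de-duplicated pool names, which is much
// smaller than the full target list.
func scrapePoolNames(targets []Target) []string {
	seen := make(map[string]struct{})
	var names []string
	for _, t := range targets {
		if _, ok := seen[t.ScrapePool]; !ok {
			seen[t.ScrapePool] = struct{}{}
			names = append(names, t.ScrapePool)
		}
	}
	sort.Strings(names)
	return names
}

// filterByScrapePool returns only the targets of the given pool. An empty
// pool name means no filtering.
func filterByScrapePool(targets []Target, pool string) []Target {
	if pool == "" {
		return targets
	}
	var out []Target
	for _, t := range targets {
		if t.ScrapePool == pool {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	targets := []Target{
		{ScrapePool: "node", URL: "http://a:9100/metrics"},
		{ScrapePool: "app", URL: "http://b:8080/metrics"},
		{ScrapePool: "node", URL: "http://c:9100/metrics"},
	}
	fmt.Println(scrapePoolNames(targets))
	fmt.Println(len(filterByScrapePool(targets, "node")))
}
```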
Bryan Boreham fd57569683 Update package web tests for new labels.Labels type
Use `FromStrings` instead of assuming the data structure.

And don't sort individual labels, since `labels.Labels` are always sorted.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-12-19 15:22:09 +00:00
Ganesh Vernekar e3719d670b
Merge remote-tracking branch 'upstream/main' into sparsehistogram
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2022-10-25 14:38:56 -04:00
Alan Protasio 5ac12ac351
api: Wrapped promQL based API errors with returnAPIError function (#11356)
* wrap api error on get series/labels on `returnAPIError` function

Signed-off-by: Alan Protasio <approtas@amazon.com>

* lint

Signed-off-by: Alan Protasio <approtas@amazon.com>

* query exemplars

Signed-off-by: Alan Protasio <approtas@amazon.com>

Signed-off-by: Alan Protasio <approtas@amazon.com>
2022-10-20 11:17:00 +02:00
Jesus Vazquez e934d0f011 Merge 'main' into sparsehistogram
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
2022-10-05 22:14:49 +02:00