Commit graph

1488 commits

Author SHA1 Message Date
cui fliter 6e7ac76981
fix problematic link (#12405)
Signed-off-by: cui fliter <imcusg@gmail.com>
2023-05-29 10:26:11 +02:00
Baskar Shanmugam 905a0bd63a
Added 'limit' query parameter support to /api/v1/status/tsdb endpoint (#12336)
* Added 'topN' query parameter support to /api/v1/status/tsdb endpoint

Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com>

* Updated query parameter for tsdb status to 'limit'

Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com>

* Corrected Stats() parameter name from topN to limit

Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com>

* Fixed p.Stats CI failure

Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com>

---------

Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com>
2023-05-22 14:37:07 +02:00
Bryan Boreham a073e04a9b
Merge pull request #12366 from prometheus/release-2.44
Merge release 2.44 back to main
2023-05-16 18:06:29 +01:00
Bryan Boreham 1ac5131f69
Release 2.44.0 (#12364)
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-05-14 07:13:03 +01:00
beorn7 9e500345f3 textparse/scrape: Add option to scrape both classic and native histograms
So far, if a target exposes a histogram with both classic and native
buckets, a native-histogram enabled Prometheus would ignore the
classic buckets. With the new scrape config option
`scrape_classic_histograms` set, both buckets will be ingested,
creating all the series of a classic histogram in parallel to the
native histogram series. For example, a histogram `foo` would create a
native histogram series `foo` and classic series called `foo_sum`,
`foo_count`, and `foo_bucket`.

This feature can be used in a migration strategy from classic to
native histograms, where it is desired to have a transition period
during which both native and classic histograms are present.

Note that two bugs in classic histogram parsing were found and fixed
as a byproduct of testing the new feature:

1. Series created from classic _gauge_ histograms didn't get the
   _sum/_count/_bucket prefix set.
2. Values of classic _float_ histograms weren't parsed properly.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-05-13 01:32:25 +02:00
Bryan Boreham 94d9367bbf
Create 2.44.0-rc.2 (#12341)
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-05-08 11:30:29 +01:00
Bryan Boreham 3d26faade4
Create 2.44.0-rc.1 (#12323)
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-05-03 16:18:28 +01:00
Bryan Boreham aeccf9e770 Bump version to 2.44.0-rc0
Including CHANGELOG.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-04-24 15:38:43 +00:00
Vladimir Varankin d281ebb178 web: display GOMEMLIMIT in runtime info
Signed-off-by: Vladimir Varankin <vladimir@varank.in>
2023-04-23 20:24:34 +02:00
Julien Pivotto 8f1dc4a70f
Merge pull request #12248 from yeya24/consistent-response
Use same error for instant and range query when 400
2023-04-21 11:44:20 +02:00
Julien Pivotto e2512078e5
Merge pull request #12241 from mmorel-35/linter/nilerr
enable gocritic, unconvert and unused linters
2023-04-20 15:13:31 +02:00
gotjosh 2f22c8b7f8
Merge pull request #12270 from prometheus/gotjosh/allow-filtering-of-rules-by-name-api
Rules API: Allow filtering by rule name
2023-04-20 12:03:08 +01:00
gotjosh e78be38cc0
don't show empty groups
Signed-off-by: gotjosh <josue.abreu@gmail.com>
2023-04-20 11:20:20 +01:00
Matthieu MOREL bae9a21200
Merge branch 'main' into linter/nilerr
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2023-04-19 19:56:39 +02:00
beorn7 5b53aa1108 style: Replace else if cascades with switch
Wiser coders than myself have come to the conclusion that a `switch`
statement is almost always superior to a statement that includes any
`else if`.

The exceptions that I have found in our codebase are just these two:

* The `if else` is followed by an additional statement before the next
  condition (separated by a `;`).
* The whole thing is within a `for` loop and `break` statements are
  used. In this case, using `switch` would require tagging the `for`
  loop, which probably tips the balance.

Why are `switch` statements more readable?

For one, fewer curly braces. But more importantly, the conditions all
have the same alignment, so the whole thing follows the natural flow
of going down a list of conditions. With `else if`, in contrast, all
conditions but the first are "hidden" behind `} else if `, harder to
spot and (for no good reason) presented differently from the first
condition.

I'm sure the aforemention wise coders can list even more reasons.

In any case, I like it so much that I have found myself recommending
it in code reviews. I would like to make it a habit in our code base,
without making it a hard requirement that we would test on the CI. But
for that, there has to be a role model, so this commit eliminates all
`if else` occurrences, unless it is autogenerated code or fits one of
the exceptions above.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-04-19 17:22:31 +02:00
beorn7 c3c7d44d84 lint: Adjust to the lint warnings raised by current versions of golint-ci
We haven't updated golint-ci in our CI yet, but this commit prepares
for that.

There are a lot of new warnings, and it is mostly because the "revive"
linter got updated. I agree with most of the new warnings, mostly
around not naming unused function parameters (although it is justified
in some cases for documentation purposes – while things like mocks are
a good example where not naming the parameter is clearer).

I'm pretty upset about the "empty block" warning to include `for`
loops. It's such a common pattern to do something in the head of the
`for` loop and then have an empty block. There is still an open issue
about this: https://github.com/mgechev/revive/issues/810 I have
disabled "revive" altogether in files where empty blocks are used
excessively, and I have made the effort to add individual
`// nolint:revive` where empty blocks are used just once or twice.
It's borderline noisy, though, but let's go with it for now.

I should mention that none of the "empty block" warnings for `for`
loop bodies were legitimate.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-04-19 17:10:10 +02:00
gotjosh 96b6463f25
review comments
Signed-off-by: gotjosh <josue.abreu@gmail.com>
2023-04-18 16:26:32 +01:00
gotjosh f3394bf7a1
Rules API: Allow filtering by rule name
Introduces support for a new query parameter in the `/rules` API endpoint that allows filtering by rule names.

If all the rules of a group are filtered, we skip the group entirely.

Signed-off-by: gotjosh <josue.abreu@gmail.com>
2023-04-18 10:12:08 +01:00
Ben Ye fd3630b9a3 add ctx to QueryEngine interface
Signed-off-by: Ben Ye <benye@amazon.com>
2023-04-17 21:32:38 -07:00
Matthieu MOREL fb3eb21230 enable gocritic, unconvert and unused linters
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2023-04-13 19:20:22 +00:00
beorn7 817a2396cb Name float values as "floats", not as "values"
In the past, every sample value was a float, so it was fine to call a
variable holding such a float "value" or "sample". With native
histograms, a sample might have a histogram value. And a histogram
value is still a value. Calling a float value just "value" or "sample"
or "V" is therefore misleading. Over the last few commits, I already
renamed many variables, but this cleans up a few more places where the
changes are more invasive.

Note that we do not to attempt naming in the JSON APIs or in the
protobufs. That would be quite a disruption. However, internally, we
can call variables as we want, and we should go with the option of
avoiding misunderstandings.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-04-13 19:25:24 +02:00
beorn7 630bcb494b storage: Use separate sample types for histogram vs. float
Previously, we had one “polymorphous” `sample` type in the `storage`
package. This commit breaks it up into `fSample`, `hSample`, and
`fhSample`, each still implementing the `tsdbutil.Sample` interface.

This reduces allocations in `sampleRing.Add` but inflicts the penalty
of the interface wrapper, which makes things worse in total.

This commit therefore just demonstrates the step taken. The next
commit will tackle the interface overhead problem.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-04-13 19:25:24 +02:00
beorn7 c0879d64cf promql: Separate Point into FPoint and HPoint
In other words: Instead of having a “polymorphous” `Point` that can
either contain a float value or a histogram value, use an `FPoint` for
floats and an `HPoint` for histograms.

This seemingly small change has a _lot_ of repercussions throughout
the codebase.

The idea here is to avoid the increase in size of `Point` arrays that
happened after native histograms had been added.

The higher-level data structures (`Sample`, `Series`, etc.) are still
“polymorphous”. The same idea could be applied to them, but at each
step the trade-offs needed to be evaluated.

The idea with this change is to do the minimum necessary to get back
to pre-histogram performance for functions that do not touch
histograms. Here are comparisons for the `changes` function. The test
data doesn't include histograms yet. Ideally, there would be no change
in the benchmark result at all.

First runtime v2.39 compared to directly prior to this commit:

```
name                                                  old time/op    new time/op    delta
RangeQuery/expr=changes(a_one[1d]),steps=1-16            391µs ± 2%     542µs ± 1%  +38.58%  (p=0.000 n=9+8)
RangeQuery/expr=changes(a_one[1d]),steps=10-16           452µs ± 2%     617µs ± 2%  +36.48%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_one[1d]),steps=100-16         1.12ms ± 1%    1.36ms ± 2%  +21.58%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_one[1d]),steps=1000-16        7.83ms ± 1%    8.94ms ± 1%  +14.21%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_ten[1d]),steps=1-16           2.98ms ± 0%    3.30ms ± 1%  +10.67%  (p=0.000 n=9+10)
RangeQuery/expr=changes(a_ten[1d]),steps=10-16          3.66ms ± 1%    4.10ms ± 1%  +11.82%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_ten[1d]),steps=100-16         10.5ms ± 0%    11.8ms ± 1%  +12.50%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_ten[1d]),steps=1000-16        77.6ms ± 1%    87.4ms ± 1%  +12.63%  (p=0.000 n=9+9)
RangeQuery/expr=changes(a_hundred[1d]),steps=1-16       30.4ms ± 2%    32.8ms ± 1%   +8.01%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=10-16      37.1ms ± 2%    40.6ms ± 2%   +9.64%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=100-16      105ms ± 1%     117ms ± 1%  +11.69%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=1000-16     783ms ± 3%     876ms ± 1%  +11.83%  (p=0.000 n=9+10)
```

And then runtime v2.39 compared to after this commit:

```
name                                                  old time/op    new time/op    delta
RangeQuery/expr=changes(a_one[1d]),steps=1-16            391µs ± 2%     547µs ± 1%  +39.84%  (p=0.000 n=9+8)
RangeQuery/expr=changes(a_one[1d]),steps=10-16           452µs ± 2%     616µs ± 2%  +36.15%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_one[1d]),steps=100-16         1.12ms ± 1%    1.26ms ± 1%  +12.20%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_one[1d]),steps=1000-16        7.83ms ± 1%    7.95ms ± 1%   +1.59%  (p=0.000 n=10+8)
RangeQuery/expr=changes(a_ten[1d]),steps=1-16           2.98ms ± 0%    3.38ms ± 2%  +13.49%  (p=0.000 n=9+10)
RangeQuery/expr=changes(a_ten[1d]),steps=10-16          3.66ms ± 1%    4.02ms ± 1%   +9.80%  (p=0.000 n=10+9)
RangeQuery/expr=changes(a_ten[1d]),steps=100-16         10.5ms ± 0%    10.8ms ± 1%   +3.08%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_ten[1d]),steps=1000-16        77.6ms ± 1%    78.1ms ± 1%   +0.58%  (p=0.035 n=9+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=1-16       30.4ms ± 2%    33.5ms ± 4%  +10.18%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=10-16      37.1ms ± 2%    40.0ms ± 1%   +7.98%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=100-16      105ms ± 1%     107ms ± 1%   +1.92%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=1000-16     783ms ± 3%     775ms ± 1%   -1.02%  (p=0.019 n=9+9)
```

In summary, the runtime doesn't really improve with this change for
queries with just a few steps. For queries with many steps, this
commit essentially reinstates the old performance. This is good
because the many-step queries are the one that matter most (longest
absolute runtime).

In terms of allocations, though, this commit doesn't make a dent at
all (numbers not shown). The reason is that most of the allocations
happen in the sampleRingIterator (in the storage package), which has
to be addressed in a separate commit.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-04-13 19:25:16 +02:00
Ben Ye fb67d368a2 use consistent error for instant and range query 400
Signed-off-by: Ben Ye <benye@amazon.com>
2023-04-11 13:45:34 -07:00
Hayk Davtyan 408f31f786
[WebUI/ScrapePoolList] Case-insensitive search of "Scrape Pools" (#12207)
Signed-off-by: hayk96 <hayko5999@gmail.com>
2023-04-02 11:37:58 +02:00
Alison 3ac49d4ae2
Set SourceMap to false to fix UI in MCR builds (#12175)
* Set sourceMap to false

Signed-off-by: Alison Burgess <alburgess@microsoft.com>

* add sourcemap=false to package.json build

Signed-off-by: Alison Burgess <alburgess@microsoft.com>

* set sourcemap to true in tsconfig

Signed-off-by: Alison Burgess <alburgess@microsoft.com>

---------

Signed-off-by: Alison Burgess <alburgess@microsoft.com>
Co-authored-by: Alison Burgess <alburgess@microsoft.com>
2023-03-23 22:25:48 +01:00
Julien Pivotto de50efbf7a
Merge pull request #12165 from prometheus/release-2.43
Merge 2.43 in main
2023-03-21 17:26:28 +01:00
Julien Pivotto 1070c9b06c Release 2.43.0
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
2023-03-21 13:07:51 +01:00
Julien Pivotto 2c6168be5f Release 2.43.0-rc.1
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
2023-03-16 20:21:40 +01:00
pbudner 46683eadf7 fix: advertise correct flag to enable remote write receiver
Signed-off-by: pbudner <mail@pascalbudner.de>
2023-03-11 13:50:52 +01:00
Julien Pivotto b6d91e8bf8 Release 2.43.0-rc.0
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
2023-03-09 14:30:02 +01:00
Julien Pivotto db2d759b81 Add support for lookbackdelta per query via the API
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
2023-03-08 00:30:05 +01:00
Julien Pivotto 8e4350dd59 Directly include SVG logo in the page.
Use HTML <svg> element to include Prometheus logo in the status page.

Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
2023-03-04 00:36:07 +01:00
Shan Aminzadeh 3f6f5d3357
Scope GroupBy labels to metric (#11914)
Signed-off-by: Shan Aminzadeh <shan.aminzadeh@chronosphere.io>
2023-02-17 10:23:16 +01:00
Fish-pro 43d77f7c41 Use http constants instead of string
Signed-off-by: Fish-pro <zechun.chen@daocloud.io>
2023-02-10 10:21:05 +08:00
Julien Pivotto c70d85baed
Merge pull request #11916 from prometheus/release-2.42
Merge 2.42 to main
2023-02-02 10:17:27 +01:00
Kemal Akkoyun 225c61122d
Cut v2.42.0 (#11912)
Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>
2023-01-31 18:04:46 +01:00
Kemal Akkoyun 7b9cc7eea3
Cut v2.42.0-rc.0 (#11902)
* Cut v2.42.0-rc.0

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Add missing log items

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>
2023-01-27 13:43:47 +01:00
Bartlomiej Plotka 6dcfb71740
Merge pull request #11897 from pracucci/propose-to-change-query-canceled-status-code
API: change HTTP status code from 503/422 to 499 if a request is canceled
2023-01-26 13:51:43 +00:00
Marco Pracucci 3db77b4491
API: change HTTP status code tracked in metrics form 503/422 to 499 if a request is canceled
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2023-01-26 13:06:37 +01:00
Kemal Akkoyun bab3b4e3c3 Upgrade dependencies
Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>
2023-01-26 09:19:50 +01:00
Kemal Akkoyun ae597cac62
Upgrade several UI dependencies (#11894)
* build(deps): bump @codemirror/autocomplete in /web/ui

Bumps [@codemirror/autocomplete](https://github.com/codemirror/autocomplete) from 6.3.0 to 6.4.0.
- [Release notes](https://github.com/codemirror/autocomplete/releases)
- [Changelog](https://github.com/codemirror/autocomplete/blob/main/CHANGELOG.md)
- [Commits](https://github.com/codemirror/autocomplete/compare/6.3.0...6.4.0)

---
updated-dependencies:
- dependency-name: "@codemirror/autocomplete"
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* build(deps): bump @fortawesome/fontawesome-svg-core in /web/ui

Bumps [@fortawesome/fontawesome-svg-core](https://github.com/FortAwesome/Font-Awesome) from 6.2.0 to 6.2.1.
- [Release notes](https://github.com/FortAwesome/Font-Awesome/releases)
- [Changelog](https://github.com/FortAwesome/Font-Awesome/blob/6.x/CHANGELOG.md)
- [Commits](https://github.com/FortAwesome/Font-Awesome/compare/6.2.0...6.2.1)

---
updated-dependencies:
- dependency-name: "@fortawesome/fontawesome-svg-core"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-01-25 19:49:14 +01:00
Julien Pivotto 44ef49805c
Merge pull request #11890 from prometheus/release-2.41
Merge back Release 2.41
2023-01-25 11:22:46 +01:00
Shan Aminzadeh cdfd18ce00
[lezer-promql] Fix package.json main to point to correct cjs module (#11888)
Signed-off-by: Shan Aminzadeh <shan.aminzadeh@chronosphere.io>
2023-01-25 11:00:59 +01:00
Julien Pivotto 2c408289f8 Add stabilizing to UI
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
2023-01-19 11:33:54 +01:00
Julien Pivotto ce55e5074d Add 'keep_firing_for' field to alerting rules
This commit adds a new 'keep_firing_for' field to Prometheus alerting
rules. The 'resolve_delay' field specifies the minimum amount of time
that an alert should remain firing, even if the expression does not
return any results.

This feature was discussed at a previous dev summit, and it was
determined that a feature like this would be useful in order to allow
the expression time to stabilize and prevent confusing resolved messages
from being propagated through Alertmanager.

This approach is simpler than having two PromQL queries, as was
sometimes discussed, and it should be easy to implement.

This commit does not include tests for the 'resolve_delay' field.  This
is intentional, as the purpose of this commit is to gather comments on
the proposed design of the 'resolve_delay' field before implementing
tests. Once the design of the 'resolve_delay' field has been finalized,
a follow-up commit will be submitted with tests."

See https://github.com/prometheus/prometheus/issues/11570

Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
2023-01-13 12:11:39 +01:00
beorn7 d121db7a65
federate: Fix PeekBack usage
In most cases, there is no sample at `maxt`, so `PeekBack` has to be
used. So far, `PeekBack` did not return a float histogram, and we
disregarded even any returned normal histogram. This fixes both, and
also tweaks the unit test to discover the problem (by using an earlier
timestamp than "now" for the samples in the TSDB).

Signed-off-by: beorn7 <beorn@grafana.com>
2023-01-12 20:43:02 +05:30
Ganesh Vernekar 7a88bc3581
Test federation with native histograms
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2023-01-12 20:43:02 +05:30
Ganesh Vernekar 33f880d123
Add native histogram support in federation
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2023-01-12 20:42:59 +05:30
Levi Harrison 3b4cbf8da4
Inject readiness state through context (#11617)
Signed-off-by: Levi Harrison <git@leviharrison.dev>

Signed-off-by: Levi Harrison <git@leviharrison.dev>
2023-01-09 00:04:00 +01:00