Return annotations (warnings and infos) from PromQL queries
This generalizes the warnings we have already used before (but only for problems with remote read) as "annotations".
Annotations can be warnings or infos (the latter could be false positives). We do not treat them differently in the API for now and return them all as "warnings". It would be easy to distinguish them and return infos separately, should that appear useful in the future.
The new annotations are then used to create a lot of warnings or infos during PromQL evaluations. Partially these are things we have wanted for a long time (e.g. inform the user that they have applied `rate` to a metric that doesn't look like a counter), but the new native histograms have created even more needs for those annotations (e.g. if a query tries to aggregate float numbers with histograms).
The annotations added here are not yet complete. A prominent example would be a warning about a range too short for a rate calculation. But such a warning is trickier to create with good fidelity, and we will tackle it later.
Another TODO is to take annotations into account when evaluating recording rules.
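As a rough illustration of the idea, annotations could be modeled as a deduplicated set of errors that wrap either a "warning" or an "info" sentinel. This is only a sketch: the names `Annotations`, `ErrWarning`, `ErrInfo`, `AsWarnings`, and `CountInfos` are assumptions made for this example and are not taken from the commit.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinel errors used to classify an annotation.
var (
	ErrWarning = errors.New("PromQL warning")
	ErrInfo    = errors.New("PromQL info")
)

// Annotations is a set of annotation errors, deduplicated by message.
type Annotations map[string]error

// Add stores err; adding the same message twice keeps a single entry.
func (a Annotations) Add(err error) {
	a[err.Error()] = err
}

// AsWarnings returns all messages, warnings and infos alike, mirroring
// the decision to return everything as "warnings" for now.
func (a Annotations) AsWarnings() []string {
	out := make([]string, 0, len(a))
	for msg := range a {
		out = append(out, msg)
	}
	return out
}

// CountInfos reports how many annotations wrap ErrInfo, showing that the
// two kinds could be separated later if that turns out to be useful.
func (a Annotations) CountInfos() int {
	n := 0
	for _, err := range a {
		if errors.Is(err, ErrInfo) {
			n++
		}
	}
	return n
}

func main() {
	a := Annotations{}
	a.Add(fmt.Errorf("%w: metric might not be a counter: %q", ErrInfo, "foo"))
	a.Add(fmt.Errorf("%w: mix of float and histogram samples", ErrWarning))
	a.Add(fmt.Errorf("%w: mix of float and histogram samples", ErrWarning))
	fmt.Println(len(a.AsWarnings()), a.CountInfos())
}
```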
---------
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
* Move /targets page discovered labels to expandable section
The current tooltip for showing the pre-relabeling discovered labels for a
target is notoriously unreliable and can get cut off when there are many
labels. This PR introduces a (hopefully unobtrusive enough) expander/collapser
button for the discovered labels of each target, and then the discovered labels
are shown in a more persistent way underneath the final target labels, instead
of using a tooltip.
Fixes https://github.com/prometheus/prometheus/issues/9175#issuecomment-1713074341
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Remove obsolete test snapshot
Signed-off-by: Julius Volz <julius.volz@gmail.com>
---------
Signed-off-by: Julius Volz <julius.volz@gmail.com>
This commit adds the option --include-workspace-root to the npm scripts
in ui_release.sh in order to also include the version in the web/ui
package.json files when bumping the version. This also avoids issues
when building directly with npm install on some systems.
Signed-off-by: Daniel Mellado <dmellado@redhat.com>
So far, `ValidateHistogram` would not detect if the count did not
include the count in the zero bucket. This commit fixes the problem
and updates all the tests that have been undetected offenders so far.
Note that this problem would only ever create false negatives, so we
never falsely refused to store a histogram because of it.
On the other hand, `ValidateFloatHistogram` has been too strict with
the count being at least as large as the sum of the counts in all the
buckets. Float precision issues could create false positives here, see
products of PromQL evaluations, it's actually quite hard to put an
upper limit on the floating point imprecision. Users could produce the
weirdest expressions, maxing out float precision problems. Therefore,
this commit simply removes that particular check from
`ValidateFloatHistogram`.
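The integer-histogram check described above can be sketched as follows. The `histogram` struct and `validateCount` function are simplified stand-ins invented for this example, not the actual validation code; the point is only that the zero bucket must be included when comparing bucket observations against the count.

```go
package main

import "fmt"

// histogram is a hypothetical, simplified integer histogram.
type histogram struct {
	Count     uint64
	ZeroCount uint64
	Buckets   []uint64 // absolute counts of the regular buckets
}

// validateCount rejects a histogram whose buckets (including the zero
// bucket) hold more observations than the count field claims.
func validateCount(h histogram) error {
	inBuckets := h.ZeroCount // the part that was previously left out
	for _, c := range h.Buckets {
		inBuckets += c
	}
	if inBuckets > h.Count {
		return fmt.Errorf("%d observations found in buckets, but the count is %d", inBuckets, h.Count)
	}
	return nil
}

func main() {
	bad := histogram{Count: 5, ZeroCount: 2, Buckets: []uint64{2, 2}}
	good := histogram{Count: 6, ZeroCount: 2, Buckets: []uint64{2, 2}}
	fmt.Println(validateCount(bad))
	fmt.Println(validateCount(good))
}
```

For float histograms, an exact comparison like this one is the kind of check that floating point imprecision can trip up, which is why the commit removes it there.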
Signed-off-by: beorn7 <beorn@grafana.com>
It's possible (quite common on Kubernetes) to have a service discovery
return thousands of targets then drop most of them in relabel rules.
The main place this data is used is to display in the web UI, where
you don't want thousands of lines of display.
The new limit is `keep_dropped_targets`, which defaults to 0
for backwards-compatibility.
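A minimal sketch of how such a limit could behave: all dropped targets are counted, but only a bounded number are retained for display. The `dropTracker` type is hypothetical, and treating a limit of 0 as "no limit" is an assumption based on the backwards-compatible default mentioned above.

```go
package main

import "fmt"

// dropTracker is a hypothetical helper that counts every dropped target
// but keeps at most `limit` of them in memory.
type dropTracker struct {
	limit int
	kept  []string
	total int
}

// add records a dropped target. A limit of 0 is assumed to mean
// "keep everything", which preserves the previous behaviour.
func (d *dropTracker) add(target string) {
	d.total++
	if d.limit == 0 || len(d.kept) < d.limit {
		d.kept = append(d.kept, target)
	}
}

func main() {
	d := &dropTracker{limit: 2}
	for _, t := range []string{"pod-a", "pod-b", "pod-c"} {
		d.add(t)
	}
	fmt.Println(len(d.kept), d.total)
}
```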
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Add OTLP Ingestion endpoint
We copy files from the otel-collector-contrib. See the README in
`storage/remote/otlptranslator/README.md`.
This supersedes: https://github.com/prometheus/prometheus/pull/11965
Signed-off-by: gouthamve <gouthamve@gmail.com>
* Return a 200 OK
It is what the OTEL Golang SDK expects :(
https://github.com/open-telemetry/opentelemetry-go/issues/4363
Signed-off-by: Goutham <gouthamve@gmail.com>
---------
Signed-off-by: gouthamve <gouthamve@gmail.com>
Signed-off-by: Goutham <gouthamve@gmail.com>
Native histograms without a zero threshold aren't federated properly.
This adds a test to prove the specific failure mode, which is that
histograms with a zero threshold of zero are federated as classic
histograms.
The underlying reason is that the protobuf parser identifies a native
histogram by detecting a zero bucket or by detecting integer buckets.
Therefore, a float histogram with a zero threshold of zero and an
unpopulated zero bucket falls through the cracks (no integer buckets,
no zero bucket).
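The failure mode can be illustrated with a simplified detection heuristic. The `protoHistogram` struct and `looksNative` function below are hypothetical stand-ins for the protobuf message and the parser logic; they only model the "zero bucket or integer buckets" rule described above.

```go
package main

import "fmt"

// protoHistogram is a hypothetical, reduced view of the protobuf message.
type protoHistogram struct {
	ZeroThreshold  float64
	ZeroCount      uint64
	ZeroCountFloat float64
	PositiveDelta  []int64   // integer bucket deltas
	PositiveCount  []float64 // float bucket counts
}

// looksNative mimics the described heuristic: a histogram is treated as
// native if it has a zero bucket or integer buckets.
func looksNative(h protoHistogram) bool {
	hasZeroBucket := h.ZeroThreshold != 0 || h.ZeroCount != 0 || h.ZeroCountFloat != 0
	hasIntBuckets := len(h.PositiveDelta) > 0
	return hasZeroBucket || hasIntBuckets
}

func main() {
	// A float histogram with a zero threshold of zero and an unpopulated
	// zero bucket: neither condition holds, so it is misclassified.
	floatHist := protoHistogram{PositiveCount: []float64{1, 2}}
	intHist := protoHistogram{PositiveDelta: []int64{1, 1}}
	fmt.Println(looksNative(floatHist), looksNative(intHist))
}
```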
This commit also adds a test case for the latter.
Signed-off-by: beorn7 <beorn@grafana.com>
Convert QueryOpts to an interface so that downstream projects like
https://github.com/thanos-community/promql-engine could extend the query
options with engine specific options that are not in the original
engine.
Will be used to enable query analysis per-query.
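A sketch of why an interface helps here: a downstream engine can wrap the base options and add its own fields, and then recover them with a type assertion. The method set of `QueryOpts` and the `analyzeOpts` extension below are assumptions for illustration only.

```go
package main

import (
	"fmt"
	"time"
)

// QueryOpts as an interface; the exact method set is an assumption.
type QueryOpts interface {
	EnablePerStepStats() bool
	LookbackDelta() time.Duration
}

// baseOpts is a plain implementation with the standard options.
type baseOpts struct {
	perStep  bool
	lookback time.Duration
}

func (o baseOpts) EnablePerStepStats() bool     { return o.perStep }
func (o baseOpts) LookbackDelta() time.Duration { return o.lookback }

// analyzeOpts is a hypothetical downstream extension that adds an
// engine-specific option on top of the standard ones.
type analyzeOpts struct {
	QueryOpts
	Analyze bool
}

// wantsAnalysis shows how a downstream engine could detect its extension.
func wantsAnalysis(o QueryOpts) bool {
	ext, ok := o.(analyzeOpts)
	return ok && ext.Analyze
}

func main() {
	base := baseOpts{lookback: 5 * time.Minute}
	ext := analyzeOpts{QueryOpts: base, Analyze: true}
	fmt.Println(wantsAnalysis(base), wantsAnalysis(ext), ext.LookbackDelta())
}
```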
Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
So far, if a target exposes a histogram with both classic and native
buckets, a native-histogram enabled Prometheus would ignore the
classic buckets. With the new scrape config option
`scrape_classic_histograms` set, both buckets will be ingested,
creating all the series of a classic histogram in parallel to the
native histogram series. For example, a histogram `foo` would create a
native histogram series `foo` and classic series called `foo_sum`,
`foo_count`, and `foo_bucket`.
This feature can be used in a migration strategy from classic to
native histograms, where it is desired to have a transition period
during which both native and classic histograms are present.
Note that two bugs in classic histogram parsing were found and fixed
as a byproduct of testing the new feature:
1. Series created from classic _gauge_ histograms didn't get the
_sum/_count/_bucket suffix set.
2. Values of classic _float_ histograms weren't parsed properly.
Signed-off-by: beorn7 <beorn@grafana.com>
Wiser coders than myself have come to the conclusion that a `switch`
statement is almost always superior to a statement that includes any
`else if`.
The exceptions that I have found in our codebase are just these two:
* The `else if` is followed by an additional statement before the next
condition (separated by a `;`).
* The whole thing is within a `for` loop and `break` statements are
used. In this case, using `switch` would require tagging the `for`
loop, which probably tips the balance.
Why are `switch` statements more readable?
For one, fewer curly braces. But more importantly, the conditions all
have the same alignment, so the whole thing follows the natural flow
of going down a list of conditions. With `else if`, in contrast, all
conditions but the first are "hidden" behind `} else if `, harder to
spot and (for no good reason) presented differently from the first
condition.
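To make the comparison concrete, here is the same logic written both ways. The functions are made up for this illustration and do not come from the codebase.

```go
package main

import "fmt"

// classifyElseIf uses an else-if chain: every condition after the first
// is tucked away behind "} else if".
func classifyElseIf(n int) string {
	if n < 0 {
		return "negative"
	} else if n == 0 {
		return "zero"
	} else {
		return "positive"
	}
}

// classifySwitch expresses the same logic with a tagless switch: all
// conditions are aligned and read like a list.
func classifySwitch(n int) string {
	switch {
	case n < 0:
		return "negative"
	case n == 0:
		return "zero"
	default:
		return "positive"
	}
}

func main() {
	for _, n := range []int{-1, 0, 1} {
		fmt.Println(n, classifyElseIf(n), classifySwitch(n))
	}
}
```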
I'm sure the aforementioned wise coders can list even more reasons.
In any case, I like it so much that I have found myself recommending
it in code reviews. I would like to make it a habit in our code base,
without making it a hard requirement that we would test on the CI. But
for that, there has to be a role model, so this commit eliminates all
`else if` occurrences, unless it is autogenerated code or fits one of
the exceptions above.
Signed-off-by: beorn7 <beorn@grafana.com>
We haven't updated golangci-lint in our CI yet, but this commit prepares
for that.
There are a lot of new warnings, and it is mostly because the "revive"
linter got updated. I agree with most of the new warnings, mostly
around not naming unused function parameters (although it is justified
in some cases for documentation purposes – while things like mocks are
a good example where not naming the parameter is clearer).
I'm pretty upset about the "empty block" warning now covering `for`
loops. It's such a common pattern to do something in the head of the
`for` loop and then have an empty block. There is still an open issue
about this: https://github.com/mgechev/revive/issues/810 I have
disabled "revive" altogether in files where empty blocks are used
excessively, and I have made the effort to add individual
`// nolint:revive` where empty blocks are used just once or twice.
It's borderline noisy, though, but let's go with it for now.
I should mention that none of the "empty block" warnings for `for`
loop bodies were legitimate.
Signed-off-by: beorn7 <beorn@grafana.com>
Introduces support for a new query parameter in the `/rules` API endpoint that allows filtering by rule names.
If all the rules of a group are filtered, we skip the group entirely.
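The filtering behaviour can be sketched like this. The `rule` and `group` types and the `filterGroups` function are simplified, hypothetical stand-ins for the API's data structures; an empty name set is assumed to mean "no filtering".

```go
package main

import "fmt"

// rule and group are hypothetical, reduced versions of the API types.
type rule struct{ Name string }

type group struct {
	Name  string
	Rules []rule
}

// filterGroups keeps only rules whose name is in names. Groups that end
// up with no rules are skipped entirely.
func filterGroups(groups []group, names map[string]struct{}) []group {
	if len(names) == 0 {
		return groups // assumption: no filter means everything is returned
	}
	var out []group
	for _, g := range groups {
		var kept []rule
		for _, r := range g.Rules {
			if _, ok := names[r.Name]; ok {
				kept = append(kept, r)
			}
		}
		if len(kept) == 0 {
			continue
		}
		out = append(out, group{Name: g.Name, Rules: kept})
	}
	return out
}

func main() {
	groups := []group{
		{Name: "g1", Rules: []rule{{Name: "HighLatency"}, {Name: "HighErrors"}}},
		{Name: "g2", Rules: []rule{{Name: "DiskFull"}}},
	}
	filtered := filterGroups(groups, map[string]struct{}{"HighLatency": {}})
	fmt.Println(len(filtered), filtered[0].Name, len(filtered[0].Rules))
}
```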
Signed-off-by: gotjosh <josue.abreu@gmail.com>
In the past, every sample value was a float, so it was fine to call a
variable holding such a float "value" or "sample". With native
histograms, a sample might have a histogram value. And a histogram
value is still a value. Calling a float value just "value" or "sample"
or "V" is therefore misleading. Over the last few commits, I already
renamed many variables, but this cleans up a few more places where the
changes are more invasive.
Note that we do not attempt to change the naming in the JSON APIs or in the
protobufs. That would be quite a disruption. However, internally, we
can call variables as we want, and we should go with the option of
avoiding misunderstandings.
Signed-off-by: beorn7 <beorn@grafana.com>
Previously, we had one “polymorphous” `sample` type in the `storage`
package. This commit breaks it up into `fSample`, `hSample`, and
`fhSample`, each still implementing the `tsdbutil.Sample` interface.
This reduces allocations in `sampleRing.Add` but inflicts the penalty
of the interface wrapper, which makes things worse in total.
This commit therefore just demonstrates the step taken. The next
commit will tackle the interface overhead problem.
Signed-off-by: beorn7 <beorn@grafana.com>
In other words: Instead of having a “polymorphous” `Point` that can
either contain a float value or a histogram value, use an `FPoint` for
floats and an `HPoint` for histograms.
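The size argument can be illustrated with simplified types. The struct layouts below are assumptions for this sketch (the real types may carry different fields); the point is only that a float-only point no longer pays for a histogram pointer.

```go
package main

import (
	"fmt"
	"unsafe"
)

// FloatHistogram is a placeholder for the real histogram type.
type FloatHistogram struct{ Count, Sum float64 }

// Point is the old "polymorphous" shape: a float value plus a histogram
// pointer that is nil for plain float samples.
type Point struct {
	T int64
	V float64
	H *FloatHistogram
}

// FPoint holds only a float value.
type FPoint struct {
	T int64
	F float64
}

// HPoint holds only a histogram value.
type HPoint struct {
	T int64
	H *FloatHistogram
}

// sizes returns the in-memory size of the polymorphous and the
// float-only point, in bytes.
func sizes() (poly, fOnly uintptr) {
	return unsafe.Sizeof(Point{}), unsafe.Sizeof(FPoint{})
}

func main() {
	poly, fOnly := sizes()
	fmt.Printf("Point: %d bytes, FPoint: %d bytes\n", poly, fOnly)
}
```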
This seemingly small change has a _lot_ of repercussions throughout
the codebase.
The idea here is to avoid the increase in size of `Point` arrays that
happened after native histograms had been added.
The higher-level data structures (`Sample`, `Series`, etc.) are still
“polymorphous”. The same idea could be applied to them, but at each
step the trade-offs needed to be evaluated.
The idea with this change is to do the minimum necessary to get back
to pre-histogram performance for functions that do not touch
histograms. Here are comparisons for the `changes` function. The test
data doesn't include histograms yet. Ideally, there would be no change
in the benchmark result at all.
First runtime v2.39 compared to directly prior to this commit:
```
name old time/op new time/op delta
RangeQuery/expr=changes(a_one[1d]),steps=1-16 391µs ± 2% 542µs ± 1% +38.58% (p=0.000 n=9+8)
RangeQuery/expr=changes(a_one[1d]),steps=10-16 452µs ± 2% 617µs ± 2% +36.48% (p=0.000 n=10+10)
RangeQuery/expr=changes(a_one[1d]),steps=100-16 1.12ms ± 1% 1.36ms ± 2% +21.58% (p=0.000 n=8+10)
RangeQuery/expr=changes(a_one[1d]),steps=1000-16 7.83ms ± 1% 8.94ms ± 1% +14.21% (p=0.000 n=10+10)
RangeQuery/expr=changes(a_ten[1d]),steps=1-16 2.98ms ± 0% 3.30ms ± 1% +10.67% (p=0.000 n=9+10)
RangeQuery/expr=changes(a_ten[1d]),steps=10-16 3.66ms ± 1% 4.10ms ± 1% +11.82% (p=0.000 n=10+10)
RangeQuery/expr=changes(a_ten[1d]),steps=100-16 10.5ms ± 0% 11.8ms ± 1% +12.50% (p=0.000 n=8+10)
RangeQuery/expr=changes(a_ten[1d]),steps=1000-16 77.6ms ± 1% 87.4ms ± 1% +12.63% (p=0.000 n=9+9)
RangeQuery/expr=changes(a_hundred[1d]),steps=1-16 30.4ms ± 2% 32.8ms ± 1% +8.01% (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=10-16 37.1ms ± 2% 40.6ms ± 2% +9.64% (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=100-16 105ms ± 1% 117ms ± 1% +11.69% (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=1000-16 783ms ± 3% 876ms ± 1% +11.83% (p=0.000 n=9+10)
```
And then runtime v2.39 compared to after this commit:
```
name old time/op new time/op delta
RangeQuery/expr=changes(a_one[1d]),steps=1-16 391µs ± 2% 547µs ± 1% +39.84% (p=0.000 n=9+8)
RangeQuery/expr=changes(a_one[1d]),steps=10-16 452µs ± 2% 616µs ± 2% +36.15% (p=0.000 n=10+10)
RangeQuery/expr=changes(a_one[1d]),steps=100-16 1.12ms ± 1% 1.26ms ± 1% +12.20% (p=0.000 n=8+10)
RangeQuery/expr=changes(a_one[1d]),steps=1000-16 7.83ms ± 1% 7.95ms ± 1% +1.59% (p=0.000 n=10+8)
RangeQuery/expr=changes(a_ten[1d]),steps=1-16 2.98ms ± 0% 3.38ms ± 2% +13.49% (p=0.000 n=9+10)
RangeQuery/expr=changes(a_ten[1d]),steps=10-16 3.66ms ± 1% 4.02ms ± 1% +9.80% (p=0.000 n=10+9)
RangeQuery/expr=changes(a_ten[1d]),steps=100-16 10.5ms ± 0% 10.8ms ± 1% +3.08% (p=0.000 n=8+10)
RangeQuery/expr=changes(a_ten[1d]),steps=1000-16 77.6ms ± 1% 78.1ms ± 1% +0.58% (p=0.035 n=9+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=1-16 30.4ms ± 2% 33.5ms ± 4% +10.18% (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=10-16 37.1ms ± 2% 40.0ms ± 1% +7.98% (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=100-16 105ms ± 1% 107ms ± 1% +1.92% (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=1000-16 783ms ± 3% 775ms ± 1% -1.02% (p=0.019 n=9+9)
```
In summary, the runtime doesn't really improve with this change for
queries with just a few steps. For queries with many steps, this
commit essentially reinstates the old performance. This is good
because the many-step queries are the ones that matter most (longest
absolute runtime).
In terms of allocations, though, this commit doesn't make a dent at
all (numbers not shown). The reason is that most of the allocations
happen in the sampleRingIterator (in the storage package), which has
to be addressed in a separate commit.
Signed-off-by: beorn7 <beorn@grafana.com>
This commit adds a new 'keep_firing_for' field to Prometheus alerting
rules. The 'keep_firing_for' field specifies the minimum amount of time
that an alert should remain firing, even if the expression does not
return any results.
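A minimal sketch of the intended semantics, under the assumption that the rule evaluator remembers when the expression last stopped returning results for an alert. The function `stillFiring` and the helpers `at` and `keepFor` are invented for this example.

```go
package main

import (
	"fmt"
	"time"
)

// keepFor is an example value for the 'keep_firing_for' setting.
const keepFor = 5 * time.Minute

// at returns a timestamp the given number of minutes after the epoch,
// just to keep the example readable.
func at(minutes int) time.Time {
	return time.Unix(0, 0).Add(time.Duration(minutes) * time.Minute)
}

// stillFiring reports whether an alert should remain firing at `now`.
// exprActive says whether the expression currently returns a result;
// inactiveSince is when it stopped doing so (ignored while active).
func stillFiring(exprActive bool, inactiveSince, now time.Time, keepFiringFor time.Duration) bool {
	if exprActive {
		return true
	}
	return now.Sub(inactiveSince) < keepFiringFor
}

func main() {
	fmt.Println(stillFiring(false, at(0), at(3), keepFor))
	fmt.Println(stillFiring(false, at(0), at(5), keepFor))
}
```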
This feature was discussed at a previous dev summit, and it was
determined that a feature like this would be useful in order to allow
the expression time to stabilize and prevent confusing resolved messages
from being propagated through Alertmanager.
This approach is simpler than having two PromQL queries, as was
sometimes discussed, and it should be easy to implement.
This commit does not include tests for the 'keep_firing_for' field. This
is intentional, as the purpose of this commit is to gather comments on
the proposed design of the 'keep_firing_for' field before implementing
tests. Once the design of the 'keep_firing_for' field has been finalized,
a follow-up commit will be submitted with tests.
See https://github.com/prometheus/prometheus/issues/11570
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
In most cases, there is no sample at `maxt`, so `PeekBack` has to be
used. So far, `PeekBack` did not return a float histogram, and we
even disregarded any returned normal histogram. This fixes both, and
also tweaks the unit test to discover the problem (by using an earlier
timestamp than "now" for the samples in the TSDB).
Signed-off-by: beorn7 <beorn@grafana.com>
* Add API endpoints for getting scrape pool names
This adds api/v1/scrape_pools endpoint that returns the list of *names* of all the scrape pools configured.
Having it makes it possible to find out which scrape pools are defined without having to list and parse all targets.
The second change adds scrapePool query parameter support to the api/v1/targets endpoint, which allows
filtering the returned targets down to those belonging to the passed scrape pool name.
Both changes make it possible to query for the data of a specific scrape pool, rather than getting all the targets for all possible scrape pools.
The problem with the api/v1/targets endpoint is that it returns a huge amount of data if you configure a lot of scrape pools.
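From a client's perspective, using the new parameter could look like the following sketch. Only the endpoint path and the `scrapePool` parameter name come from the text above; the helper `targetsURL`, the base address, and the pool name are assumptions.

```go
package main

import (
	"fmt"
	"net/url"
)

// targetsURL builds the URL for the targets endpoint, optionally
// restricted to a single scrape pool.
func targetsURL(base, scrapePool string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = "/api/v1/targets"
	if scrapePool != "" {
		q := u.Query()
		q.Set("scrapePool", scrapePool)
		u.RawQuery = q.Encode()
	}
	return u.String(), nil
}

func main() {
	// The address and pool name are placeholders.
	s, err := targetsURL("http://localhost:9090", "node")
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```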
Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
* Add a scrape pool selector on /targets page
Current targets page lists all possible targets. This works great if you only have a few scrape pools configured,
but for systems with a lot of scrape pools and targets this slows things down a lot.
Not only does the /targets page load very slowly in such a case (waiting for a huge API response), but it also takes
a long time to render, due to the huge number of elements.
This change adds a dropdown selector so it's possible to select only the interesting scrape pool to view.
There's also a scrapePool query param that will open the selected pool automatically.
Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
Use `FromStrings` instead of assuming the data structure.
And don't sort individual labels, since `labels.Labels` are always sorted.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Patterned after `Chunk.Iterator()`: pass the old iterator in so it
can be re-used to avoid allocating a new object.
(This commit does not do any re-use; it is just changing all the method
signatures so re-use is possible in later commits.)
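What re-use in a later commit could look like is sketched below. The `Iterator` interface, `sliceIterator`, and `series` types are hypothetical and much simpler than the real ones; the pattern is that the caller hands back the previous iterator, and the implementation resets it when the concrete type matches.

```go
package main

import "fmt"

// Iterator is a hypothetical, minimal sample iterator.
type Iterator interface {
	Next() bool
	At() float64
}

// sliceIterator iterates over an in-memory slice of values.
type sliceIterator struct {
	vals []float64
	idx  int
}

func (it *sliceIterator) Next() bool  { it.idx++; return it.idx < len(it.vals) }
func (it *sliceIterator) At() float64 { return it.vals[it.idx] }

type series struct{ vals []float64 }

// Iterator returns an iterator over the series. If reuse is a
// *sliceIterator, it is reset and returned instead of allocating a new one.
func (s series) Iterator(reuse Iterator) Iterator {
	if it, ok := reuse.(*sliceIterator); ok {
		it.vals, it.idx = s.vals, -1
		return it
	}
	return &sliceIterator{vals: s.vals, idx: -1}
}

func main() {
	var it Iterator
	for _, s := range []series{{vals: []float64{1, 2}}, {vals: []float64{3}}} {
		it = s.Iterator(it) // allocates only on the first pass
		for it.Next() {
			fmt.Println(it.At())
		}
	}
}
```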
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
We have 2 bugfixes, one which is important for Windows users and
another one on native histograms. I think it is worth cutting another
bugfix release before 2.41 comes out.
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
* Switch from 'sanity' to more inclusive language
"Removing ableist language in code is important; it helps to create and
maintain an environment that welcomes all developers of all backgrounds,
while emphasizing that we as developers select the most articulate,
precise, descriptive language we can rather than relying on metaphors.
The phrase sanity check is ableist, and unnecessarily references mental
health in our code bases. It denotes that people with mental illnesses
are inferior, wrong, or incorrect, and the phrase sanity continues to be
used by employers and other individuals to discriminate against these
people."
From https://gist.github.com/seanmhanson/fe370c2d8bd2b3228680e38899baf5cc
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* wrap api error on get series/labels on `returnAPIError` function
Signed-off-by: Alan Protasio <approtas@amazon.com>
* lint
Signed-off-by: Alan Protasio <approtas@amazon.com>
* query exemplars
Signed-off-by: Alan Protasio <approtas@amazon.com>
Signed-off-by: Alan Protasio <approtas@amazon.com>
Use new experimental package `golang.org/x/exp/slices`.
slices.Sort works on values that are directly comparable, like ints,
so avoids the overhead of an interface call to `.Less()`.
Left tests unchanged, because they don't need the speed and it may be
a cross-check that slices.Sort gives the same answer.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Export `marshalTimestamp` and `marshalValue` functions by moving them under their own util package.
Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>
* fix the way to get the list of workspaces
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* update UI dependencies
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* Introduce out-of-order TSDB support
This implementation is based on this design doc:
https://docs.google.com/document/d/1Kppm7qL9C-BJB1j6yb6-9ObG3AbdZnFUBYPNNWwDBYM/edit?usp=sharing
This commit adds support for accepting out-of-order ("OOO") samples into the TSDB
up to a configurable time allowance. If OOO is enabled, overlapping queries
are automatically enabled.
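The "configurable time allowance" can be pictured with a deliberately simplified acceptance check. The function `acceptSample` is hypothetical and ignores per-series state; it only shows how a window behind the newest timestamp decides whether an old sample is still accepted.

```go
package main

import "fmt"

// acceptSample decides whether a sample with timestamp t (in ms) may be
// appended, given the newest timestamp seen so far and the OOO window.
// This is a simplification: real bookkeeping is more involved.
func acceptSample(t, maxT, oooWindow int64) (ok, outOfOrder bool) {
	if t >= maxT {
		return true, false // in order
	}
	if oooWindow > 0 && t >= maxT-oooWindow {
		return true, true // old, but within the allowance
	}
	return false, false // too old, or OOO support disabled
}

func main() {
	const maxT, window = 10_000, 3_000
	for _, t := range []int64{11_000, 8_000, 6_000} {
		ok, ooo := acceptSample(t, maxT, window)
		fmt.Println(t, ok, ooo)
	}
}
```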
Most of the additions have been borrowed from
https://github.com/grafana/mimir-prometheus/
Here is the list of the original commits cherry-picked
from mimir-prometheus into this branch:
- 4b2198d7ec
- 2836e5513f
- 00b379c3a5
- ff0dc75758
- a632c73352
- c6f3d4ab33
- 5e8406a1d4
- abde1e0ba1
- e70e769889
- df59320886
Co-authored-by: Jesus Vazquez <jesus.vazquez@grafana.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
Co-authored-by: Dieter Plaetinck <dieter@grafana.com>
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* gofumpt files
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* Add license header to missing files
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* Fix OOO tests due to existing chunk disk mapper implementation
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* Fix truncate int overflow
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* Add Sync method to the WAL and update tests
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* remove useless sync
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* Update minOOOTime after truncating Head
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Fix lint
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Add a unit test
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* Load OutOfOrderTimeWindow only once per appender
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* Fix OOO Head LabelValues and PostingsForMatchers
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* Fix replay of OOO mmap chunks
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Remove unnecessary err check
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* Prevent panic with ApplyConfig
Signed-off-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* Run OOO compaction after restart if there is OOO data from WBL
Signed-off-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* Apply Bartek's suggestions
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* Refactor OOO compaction
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Address comments and TODOs
- Added a comment explaining why we need the allow overlapping
compaction toggle
- Clarified TSDBConfig OutOfOrderTimeWindow doc
- Added an owner to all the TODOs in the code
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* Run go format
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* Fix remaining review comments
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Fix tests
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Change wbl reference when truncating ooo in TestHeadMinOOOTimeUpdate
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* Fix TestWBLAndMmapReplay test failure on windows
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Address most of the feedback
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Refactor the block meta for out of order
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Fix windows error
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Fix review comments
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Signed-off-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
Co-authored-by: Dieter Plaetinck <dieter@grafana.com>
Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com>
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
* Allow copying label-value pair to buffer on click
Kept similar DOM structure to keep test compatibility.
Using `navigator.clipboard` API since it is used by the current standard browsers.
React hot toast is used to notify that the text was successfully copied into clipboard.
Signed-off-by: lpessoa <luisalmeida@yape.com.pe>
* Using reactstrap for toast notification
Using the bootstrap toast notification provided by reactstrap.
Clipboard handling is managed using React.Context via a shared callback.
Updated css according to CR suggestions.
Signed-off-by: lpessoa <luisalmeida@yape.com.pe>
* Changes from CR comments
Cleaning up renderFormatted method.
Renamed Clipboard to ToastContext.
Updated tests.
Signed-off-by: Luis Pessoa <luisalmeida@yape.com.pe>
Signed-off-by: lpessoa <luisalmeida@yape.com.pe>
Signed-off-by: Luis Pessoa <luisalmeida@yape.com.pe>
* Update go to 1.19, set min version to 1.18
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
* Update golangci-lint
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
Some frameworks issue HEAD requests to determine health.
This resolves prometheus/prometheus#11159
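A minimal sketch of a health handler that answers both GET and HEAD. The handler name, path, and response text are placeholders rather than the actual implementation; `statusFor` is a small helper added so the behaviour can be checked without starting a server.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// healthy answers GET and HEAD with 200; HEAD gets no body.
func healthy(w http.ResponseWriter, r *http.Request) {
	switch r.Method {
	case http.MethodGet:
		w.WriteHeader(http.StatusOK)
		fmt.Fprintln(w, "Healthy.")
	case http.MethodHead:
		w.WriteHeader(http.StatusOK)
	default:
		w.WriteHeader(http.StatusMethodNotAllowed)
	}
}

// statusFor runs the handler in-process and returns the status code.
func statusFor(method string) int {
	req := httptest.NewRequest(method, "/-/healthy", nil)
	rec := httptest.NewRecorder()
	healthy(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor(http.MethodGet), statusFor(http.MethodHead), statusFor(http.MethodPost))
}
```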
Signed-off-by: Nicolas Dumazet <nicdumz.commits@gmail.com>
* Tweak colors in the dark theme to improve contrast
Some colors from the dark theme used in the query editor have a very low
contrast ratio with the background.
Signed-off-by: Jorge Luis Betancourt Gonzalez <jorge-luis.betancourt@trivago.com>
* Avoid duplicated function call when in dark mode
Co-authored-by: Julius Volz <julius.volz@gmail.com>
Signed-off-by: Jorge Luis Betancourt Gonzalez <jorge-luis.betancourt@trivago.com>
* Apply styles for the matching bracket when focused in dark mode
Signed-off-by: Jorge Luis Betancourt Gonzalez <jorge-luis.betancourt@trivago.com>
* Improve style of the matching brackets when focused
Signed-off-by: Jorge Luis Betancourt Gonzalez <jorge-luis.betancourt@trivago.com>
Co-authored-by: Julius Volz <julius.volz@gmail.com>
* Allow formatting PromQL expressions in the UI
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Improve error handling, also catch HTTP errors
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Remove now-unneeded async property
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Disable format button when already formatted
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Disable format button when there are linter errors
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Remove disabling of format button again for linter errors
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Add /api/v1/format_query API endpoint for formatting queries
This uses the formatting functionality introduced in
https://github.com/prometheus/prometheus/pull/10544.
I've chosen "query" instead of "expr" in both the endpoint and parameter
names to stay consistent with the existing API endpoints. Otherwise, I
would have preferred to use the term "expr".
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Add docs for /api/v1/format_query endpoint
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Add note that formatting expressions removes comments
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* enable ui module publication
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* use main changelog of Prometheus to reflect the changes of the packages
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* ignore changelog and license in the libs
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* replace perses references
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
This follows a simple function-based approach to access the count and
sum fields of a native Histogram. It might be more elegant to
implement “accessors” via the dot operator, as considered in the
brainstorming doc [1]. However, that would require the introduction of
a whole new concept in PromQL. For the PoC, we should be fine with the
function-based approach. Even the obvious inefficiencies (rate'ing a
whole histogram twice when we only want to rate the count and the
sum once each) could be optimized behind the scenes.
Note that the function-based approach elegantly solves the problem of
detecting counter resets in the sum of observations in the case of
negative observations. (Since the whole native Histogram is rate'd,
the counter reset is detected for the Histogram as a whole.)
We will decide later if an “accessor” approach is really needed. It
would change the example expression for average duration in
functions.md from
    histogram_sum(rate(http_request_duration_seconds[10m]))
    /
    histogram_count(rate(http_request_duration_seconds[10m]))
to
    rate(http_request_duration_seconds.sum[10m])
    /
    rate(http_request_duration_seconds.count[10m])
[1]: https://docs.google.com/document/d/1ch6ru8GKg03N02jRjYriurt-CZqUVY09evPg6yKTA1s/edit
Signed-off-by: beorn7 <beorn@grafana.com>
* bump codemirror to v0.20.x and lezer to v0.16.x
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* bump codemirror to v6 and lezer to v1
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* stop treating warning as error for UI
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
We would like to implement the tsdb/status API in certain Thanos
components.
In order to match the Prometheus API and avoid duplicating code,
this commit makes the structs used in the status API public.
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
This moves prometheus_ready to the web package and links it with the ready variable that decides if HTTP requests should return 200 or 503.
This is a follow-up change from #10682
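The link between the readiness flag, the HTTP status, and the gauge can be sketched as follows. The `server` type and its methods are hypothetical; the gauge is represented by a plain function so the example needs no metrics library.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// server holds a single readiness flag that drives both the HTTP
// responses and the value exported for the prometheus_ready gauge.
type server struct {
	ready int32 // 1 when ready; accessed atomically
}

func (s *server) setReady()     { atomic.StoreInt32(&s.ready, 1) }
func (s *server) isReady() bool { return atomic.LoadInt32(&s.ready) == 1 }

// gaugeValue is a stand-in for the gauge: it is derived from the same
// flag, so the metric and the HTTP behaviour cannot disagree.
func (s *server) gaugeValue() float64 {
	if s.isReady() {
		return 1
	}
	return 0
}

// handle returns 503 until the server is ready, then 200.
func (s *server) handle(w http.ResponseWriter, r *http.Request) {
	if !s.isReady() {
		http.Error(w, "Service Unavailable", http.StatusServiceUnavailable)
		return
	}
	fmt.Fprintln(w, "OK")
}

// status runs the handler in-process and returns the status code.
func (s *server) status() int {
	rec := httptest.NewRecorder()
	s.handle(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Code
}

func main() {
	s := &server{}
	fmt.Println(s.status(), s.gaugeValue())
	s.setReady()
	fmt.Println(s.status(), s.gaugeValue())
}
```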
Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
"Labels is a sorted set of labels. Order has to be guaranteed upon
instantiation." says the comment, so fix all the tests that break this
rule.
For `BenchmarkLabelValuesWithMatchers()` and
`BenchmarkHeadLabelValuesWithMatchers()` the amount of work done changes
significantly if you put the labels in order, because all series refs
get neatly partitioned by the `tens` label, so I renamed the labels
to maintain the previous behaviour.
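The invariant can be sketched with a simplified constructor that sorts on instantiation, so callers never have to think about ordering. The `Label`, `Labels`, `FromStrings`, and `isSorted` definitions below are reduced re-implementations written for this example, not the package's code.

```go
package main

import (
	"fmt"
	"sort"
)

// Label and Labels are simplified versions of the real types.
type Label struct{ Name, Value string }

type Labels []Label

// FromStrings builds Labels from name/value pairs and sorts them by
// name, so the ordering guarantee holds from the moment of creation.
func FromStrings(ss ...string) Labels {
	if len(ss)%2 != 0 {
		panic("invalid number of strings")
	}
	ls := make(Labels, 0, len(ss)/2)
	for i := 0; i < len(ss); i += 2 {
		ls = append(ls, Label{Name: ss[i], Value: ss[i+1]})
	}
	sort.Slice(ls, func(i, j int) bool { return ls[i].Name < ls[j].Name })
	return ls
}

// isSorted reports whether the labels are ordered by name.
func isSorted(ls Labels) bool {
	return sort.SliceIsSorted(ls, func(i, j int) bool { return ls[i].Name < ls[j].Name })
}

func main() {
	literal := Labels{{Name: "tens", Value: "1"}, {Name: "a", Value: "x"}} // breaks the rule
	built := FromStrings("tens", "1", "a", "x")
	fmt.Println(isSorted(literal), isSorted(built))
}
```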
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
While empty buckets can make sense in the internal representation (by
joining spans that would otherwise need more overhead for separate
representation), there are no spans in the JSON rendering. Therefore,
the JSON should not contain any empty buckets, since any buckets not
included in the output count as empty anyway.
This changes both the inefficient MarshalJSON implementation as well
as the jsoniter implementation.
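The idea of dropping empty buckets before rendering can be sketched like this. The `bucket` struct and its JSON shape are invented for illustration and do not match the actual API format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// bucket is a hypothetical, simplified JSON representation.
type bucket struct {
	Lower float64 `json:"lower"`
	Upper float64 `json:"upper"`
	Count float64 `json:"count"`
}

// nonEmpty returns only the buckets that contain observations.
func nonEmpty(bs []bucket) []bucket {
	out := make([]bucket, 0, len(bs))
	for _, b := range bs {
		if b.Count != 0 {
			out = append(out, b)
		}
	}
	return out
}

// marshalBuckets renders the buckets as JSON, leaving out empty ones.
func marshalBuckets(bs []bucket) (string, error) {
	data, err := json.Marshal(nonEmpty(bs))
	return string(data), err
}

func main() {
	s, err := marshalBuckets([]bucket{
		{Lower: 0.5, Upper: 1, Count: 3},
		{Lower: 1, Upper: 2, Count: 0}, // empty: omitted from the output
		{Lower: 2, Upper: 4, Count: 1},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```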
Signed-off-by: beorn7 <beorn@grafana.com>
* refactor: move from io/ioutil to io and os packages
* use fs.DirEntry instead of os.FileInfo after os.ReadDir
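For readers unfamiliar with the newer API, this is roughly what the migration looks like in practice. The function names and the temporary directory layout are made up for this example; `os.ReadDir` and `fs.DirEntry` are the standard library pieces referred to above.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// isRegular uses the type bits carried by the DirEntry, so no extra
// stat call is needed to tell files from directories.
func isRegular(e fs.DirEntry) bool {
	return e.Type().IsRegular()
}

// regularFileNames lists the regular files in dir. os.ReadDir replaces
// ioutil.ReadDir and returns []fs.DirEntry instead of []os.FileInfo.
func regularFileNames(dir string) ([]string, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, err
	}
	var names []string
	for _, e := range entries {
		if isRegular(e) {
			names = append(names, e.Name())
		}
	}
	return names, nil
}

// demo creates a temporary directory with one file and one
// subdirectory and lists the regular files in it.
func demo() ([]string, error) {
	dir, err := os.MkdirTemp("", "readdir-demo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	if err := os.WriteFile(filepath.Join(dir, "a.txt"), []byte("x"), 0o600); err != nil {
		return nil, err
	}
	if err := os.Mkdir(filepath.Join(dir, "sub"), 0o700); err != nil {
		return nil, err
	}
	return regularFileNames(dir)
}

func main() {
	names, err := demo()
	fmt.Println(names, err)
}
```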
Signed-off-by: MOREL Matthieu <matthieu.morel@cnp.fr>
This now even enables jsoniter marshaling of Points in an instant
query (which previously used the traditional JSON marshaling).
Signed-off-by: beorn7 <beorn@grafana.com>
* create lezer-promql module + move codemirror to a pure esm module + unified dependencies
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* ignore test utils file and remove the type "module" in package.json
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* use jest to run the lezer-promql test
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* give an automatic way to update the ui dependencies
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* update all dependencies using make update-npm-deps
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* fix react-app test
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* remove generated file
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* remove unnecessary backslash in script
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* fix reviews
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* rewording
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* use npx to run lezer-generator
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
This allows other implementations to inject their own statistics that
they're gathering in data linked from the context.Context. For example,
Cortex can inject its stats.Stats value under the `cortex` key.
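One way such an injection could work is sketched below. The context key, the `extraStats` map, and the helper names are all assumptions for this example; only the idea of carrying implementation-specific stats in the `context.Context` under a name like `cortex` comes from the text above.

```go
package main

import (
	"context"
	"fmt"
)

// statsKey is an unexported context key type, so other packages cannot
// collide with it.
type statsKey struct{}

// extraStats maps a name (e.g. "cortex") to an implementation-specific
// stats value.
type extraStats map[string]interface{}

// extraStatsFrom returns the extra stats stored in ctx, or nil.
func extraStatsFrom(ctx context.Context) extraStats {
	s, _ := ctx.Value(statsKey{}).(extraStats)
	return s
}

// withExtraStats returns a context that carries v under name, keeping
// any stats that were injected earlier.
func withExtraStats(ctx context.Context, name string, v interface{}) context.Context {
	merged := extraStats{}
	for k, val := range extraStatsFrom(ctx) {
		merged[k] = val
	}
	merged[name] = v
	return context.WithValue(ctx, statsKey{}, merged)
}

// demo injects a placeholder value under the "cortex" key and reads it
// back the way a stats renderer might.
func demo() extraStats {
	ctx := withExtraStats(context.Background(), "cortex", 42)
	return extraStatsFrom(ctx)
}

func main() {
	fmt.Println(demo())
}
```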
Signed-off-by: Andrew Bloomgarden <blmgrdn@amazon.com>
We always track total samples queried and add those to the standard set
of stats queries can report.
We also allow optionally tracking per-step samples queried. This must be
enabled both at the engine and query level to be tracked and rendered.
The engine flag is exposed via a Prometheus feature flag, while the
query flag is set when stats=all.
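The two-level gating can be sketched as follows. The `sampleStats` type and its methods are hypothetical stand-ins, not the engine's real types: the total is always counted, while per-step counts are only allocated when both the engine and the query enable them.

```go
package main

import "fmt"

// sampleStats is a hypothetical container for sample counts.
type sampleStats struct {
	Total   int64
	PerStep []int64 // nil unless per-step tracking is enabled
}

// newSampleStats enables per-step tracking only if both the engine-level
// and the query-level switch are on.
func newSampleStats(engineEnabled, queryEnabled bool, steps int) *sampleStats {
	s := &sampleStats{}
	if engineEnabled && queryEnabled {
		s.PerStep = make([]int64, steps)
	}
	return s
}

// add records n samples for the given step.
func (s *sampleStats) add(step int, n int64) {
	s.Total += n
	if s.PerStep != nil {
		s.PerStep[step] += n
	}
}

func main() {
	s := newSampleStats(true, true, 3)
	s.add(0, 10)
	s.add(2, 5)
	fmt.Println(s.Total, s.PerStep)
}
```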
Co-authored-by: Alan Protasio <approtas@amazon.com>
Co-authored-by: Andrew Bloomgarden <blmgrdn@amazon.com>
Co-authored-by: Harkishen Singh <harkishensingh@hotmail.com>
Signed-off-by: Andrew Bloomgarden <blmgrdn@amazon.com>
This change makes sure that the git worktree is not changed while
compressing assets, making it better for local development.
To achieve this, the compression script keeps the un-compressed assets
and generates the go:embed directory when compressing the files.
A .gitignore file has been added to ignore generated files.
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
This spares codemirror-promql clients from choosing manually between cjs and esm; the bundler can decide instead.
Signed-off-by: Gabriel Bernal <gbernal@redhat.com>
* Fix DataTable tests and missing value key warning
Fixes issues introduced in https://github.com/prometheus/prometheus/pull/10376
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Fix more DataTable brokenness
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* remove vfsgen usages
Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
* web: use embed package for static assets
This requires go 1.16.
Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
* circleci: drop go generate in web/ui
Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
* Makefile: compress web assets before build
This commit adds compression before (and decompression after) prometheus
is built. This ensures that gzipped assets are embedded in the prometheus
binary if the builtinassets build tag is passed. If the build tag is
not passed, this step is still executed but has no effect.
All this is executed in a subshell so that we can run the decompress
step even if the build step fails, but retain the exit code of promu.
This cleanup could also cover interrupts, but I left that out for now.
Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
* Add a tooltip for unix times (ISO strings)
Signed-off-by: Ondrej Kokes <ondrej.kokes@gmail.com>
* Leverage useLocalTime to adjust ISO string tooltips
Signed-off-by: Ondrej Kokes <ondrej.kokes@gmail.com>
* revert pre styling removal
Signed-off-by: Ondrej Kokes <ondrej.kokes@gmail.com>
* Upgrade create-react-app to v5
Some other dependencies need to be upgraded as well, and some TypeScript errors are fixed.
Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
* Use ESM imports for codemirror-promql
Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
* Update FontAwesome to v6
Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
* Run gofumpt on all files
Getting golangci-lint errors when building on my laptop, possibly because I have a newer version of gofumpt than the one the code was formatted with.
Run gofumpt -w -extra on all files as it will be needed in the future anyway.
* Update golangci-lint to v1.44.2
v1.44.0 upgraded gofumpt, so bumping the version in CI will help keep formatting correct for everyone.
* Address golangci-lint error
Getting 'error-strings: error strings should not be capitalized or end with punctuation or a newline' from revive here.
Drop the newline.
Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
* add prometheus logo to the list of files that should be served at the root
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* move prometheus logo to src/images
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* Add Prometheus logo in react UI
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
* Use REM
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
* increase the margin top of the navbar-brand
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
Co-authored-by: Julien Pivotto <roidelapluie@inuits.eu>
* Fix bug that sets the range input to the resolution
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Address review comments
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* rework the target page
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* put back the URL of the endpoint
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* replace old code with the new one and change function style
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* align filter and search bar on the same row
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* remove unnecessary return
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* upgrade kvsearch to v0.3.0
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* fix unit test
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* add missing style on column
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* add placeholder and autofocus
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* put back the previous table design
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* fix issue related to the position of the tooltip
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* fix health filter
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* fix test on label tooltip
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* simplify filter condition
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* rework service discovery page
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* introduced generic custom infinite scroll component
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* adjust the placeholder in discovery page
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* ignore missing return type
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* rework alert page
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* update snapshot to match the new rendering
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* fix infinite scroll component usage in alert
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* align checkbox like it was before
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* propose a more responsive line to display the buttons and the search bar
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* add a script to update the snapshot and update it
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* text in span won't be wrapped
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* create a component to handle the search bar with debounce
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* Update web/ui/react-app/src/pages/serviceDiscovery/Services.tsx
Co-authored-by: Julius Volz <julius.volz@gmail.com>
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* Nits after PR 10051 merge (#10159)
Signed-off-by: Marco Pracucci <marco@pracucci.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>
* tsdb/agent: Fix deadlock from simultaneous GC and write (#10166)
* tsdb/agent: Fix deadlock from simultaneous GC and write
This commit fixes a potential deadlock where storing in-memory series
references could deadlock with a WAL GC cycle.
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
* add missing license header
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
* order local imports
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
* align deadlock testing with discovery/manager_test.go method
Also prevents GCs from running concurrently, which could also cause a
deadlock (even though it's currently impossible for two GCs to run
concurrently).
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
* bump @nexucis/kvsearch to v0.4.0
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* Bump github.com/prometheus/client_golang to v1.12.0
Signed-off-by: beorn7 <beorn@grafana.com>
* Cut v2.33.0-rc.1
Signed-off-by: beorn7 <beorn@grafana.com>
Co-authored-by: Augustin Husson <husson.augustin@gmail.com>
Co-authored-by: Julius Volz <julius.volz@gmail.com>
Co-authored-by: Mauro Stettler <mauro.stettler@gmail.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>
Co-authored-by: Robert Fratto <robertfratto@gmail.com>