Commit graph

8525 commits

Author SHA1 Message Date
beorn7 cf698f71e5 Add an explanation for the quantile aggregation operator
Sadly, just linking to the Histogram best practice document, as done
for `histogram_quantile`, would be confusing here because the best
practice document only deals with quantiles in the context of
Histograms and Summaries, which is very different from the context of
the `quantile` aggregator and `quantile_over_time` function, which is
already a source of a lot of confusion.

Thus, I think the least bad solution is to add a short explanation in
this section directly. There isn't even a good resource on the
internet we can link to. A lot of statisticians use φ-quantiles, but
they don't have a generally accepted name for it.

I have added the explanation after the other detailed explanations of
`count_values`, `topk` and `bottomk`. I think that fits quite nicely
into the flow.

Signed-off-by: beorn7 <beorn@grafana.com>
2020-07-06 19:53:12 +02:00
Björn Rabenstein ad7da8fd35
Merge pull request #7463 from yeya24/support-time-promtool
Support time range in promtool query labels
2020-07-06 13:37:04 +02:00
Guangwen Feng ad9449d1f1
Add unit test case for func HasAlertingRules in manager.go (#7502)
Signed-off-by: Guangwen Feng <fenggw-fnst@cn.fujitsu.com>
2020-07-06 10:35:16 +01:00
Harkishen Singh f32307b656
Increments WAL corruption metric on WAL corruption during checkpointing (#7491)
* Increments wal corruption metric on error during checkpointing

Signed-off-by: Harkishen-Singh <harkishensingh@hotmail.com>

* check for wal corruption error

Signed-off-by: Harkishen-Singh <harkishensingh@hotmail.com>
2020-07-05 11:25:42 +05:30
Ganesh Vernekar e65e2e0dac
Fix panic from db metrics (#7501)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-07-05 10:11:42 +05:30
Hu Shuai b962029031
Add a unit test for MergeLabels in storage/remote/codec.go. (#7499)
This PR is about adding a unit test for MergeLabels in storage/remote/codec.go.

Signed-off-by: Hu Shuai <hus.fnst@cn.fujitsu.com>
2020-07-04 22:17:19 -06:00
John Bampton 98a69b77d1
Fix spelling (#7512)
Signed-off-by: John Bampton <jbampton@users.noreply.github.com>
2020-07-04 14:54:26 +02:00
yeya24 797e48c1a3 support time range in promtool query labels
Updated prometheus/client_golang and json-iterator/go

Signed-off-by: yeya24 <yb532204897@gmail.com>
2020-07-03 11:29:39 -04:00
Julien Pivotto 74a6959d46
Docs: fix types (#7508)
I have batched a bunch of fixes around types in the documentation.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-07-02 23:27:14 +02:00
Julien Pivotto e1f9816a33
Openstack: Reduce timeouts (#7507)
Set saner values for openstack timeouts

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-07-02 22:50:32 +02:00
Steffen Neubauer 9c9b872087
OpenStack SD: Add availability config option, to choose endpoint type (#7494)
* OpenStack SD: Add availability config option, to choose endpoint type

In some environments Prometheus must query OpenStack via an alternative
endpoint type (gophercloud calls this `availability`.

This commit implements this option.

Co-Authored-By: Dennis Kuhn <d.kuhn@syseleven.de>
Signed-off-by: Steffen Neubauer <s.neubauer@syseleven.de>
2020-07-02 15:17:56 +01:00
Mikael Johansson a98be49c52
Change rule index numbering from 0 to n+1 on errors found in rule files (#7495)
* Change rule index numbering from 0 to n+1 on errors found in rule files

Signed-off-by: Mikael Johansson <mik.json@gmail.com>
2020-07-02 11:09:01 +01:00
Julien Pivotto aa452d8ab4
digitalocean: use a safer pagination method (#7498)
this method is documented here: https://github.com/digitalocean/godo

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-07-02 00:13:39 +02:00
Hu Shuai 578f2b7974
Add a unit test for labelsToOpenAPILabelSet in notifier/notifier.go. (#7492)
This PR is about adding a unit test for labelsToOpenAPILabelSet in notifier/notifier.go.

Signed-off-by: Hu Shuai <hus.fnst@cn.fujitsu.com>
2020-07-01 08:51:32 +01:00
Julien Pivotto 72425d4e3d
Add group() aggregator (#7480)
* Add group() aggregator

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-06-30 16:51:18 +02:00
Tom Wilkie 27b1009acd
Rename the dashboard in the mixin to 'Prometheus Overview'. (#7489)
Due to https://github.com/grafana/grafana/issues/15642, this prevents users putting this dashboard in a Grafana folder called 'Prometheus'.

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2020-06-30 15:45:44 +01:00
Hu Shuai a94b570dc6
Add a unit test for newAzureResourceFromID in discovery/azure/azure.go. (#7484)
This PR is about adding a unit test for newAzureResourceFromID in discovery/azure/azure.go.

Signed-off-by: Hu Shuai <hus.fnst@cn.fujitsu.com>
2020-06-30 11:11:57 +01:00
Guangwen Feng f50786bcd1
Add unit test case to improve test coverage for quote.go (#7482)
Signed-off-by: Guangwen Feng <fenggw-fnst@cn.fujitsu.com>
2020-06-30 10:33:02 +02:00
Bartlomiej Plotka 1861bf38f5
tombstones: Fixed Add method in order to support trimming time series; Simplified the algo. (#7471)
* tombstones: Fixed Add method in order to support edge trimming; Simplified the algo.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Removed duplicated test case.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed comment, removed "edge" mention.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Removed trimming word.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-06-29 17:00:22 +01:00
Guangwen Feng 487f1e07ff
Add unit test case for func State in alerting.go (#7476)
Signed-off-by: Guangwen Feng <fenggw-fnst@cn.fujitsu.com>
2020-06-29 13:16:52 +01:00
Simon Pasquier 3155642108
web/ui/react-app: bump elliptic package version (#7477)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-06-29 10:28:05 +02:00
Guangwen Feng 9ab072b470
Fix golint issue caused by typo (#7475)
Signed-off-by: Guangwen Feng <fenggw-fnst@cn.fujitsu.com>
2020-06-28 11:03:09 +01:00
Julien Pivotto 800c0aefcf
Fix types in k8s+dns docs (#7474)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-06-28 09:13:23 +02:00
Mark Hansen f0a439bfc5
Web: Scroll hash-fragment links with navbar height (#7456)
Previously, hash-fragment links like this:
http://mark-t510:9090/targets#job-alertmanager

Would scroll to have the header at the top, obscured by the nav bar.

Tested in both old and new UIs.

Fixes #7434

Signed-off-by: Mark Hansen <markhansen@google.com>
2020-06-27 09:12:11 +02:00
Frederic Branczyk d17d88935c
rules: Use narrower interface for rule manager loading of for state (#7472)
To load ALERT_FOR_STATE only `storage.Queryable` interface is required,
so this patch uses this narrower interface for to perform this.

Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com>
2020-06-26 19:06:36 +01:00
Ganesh Vernekar d2e631041e
Merge pull request #7465 from prometheus/merge-release-2.19
Merge release 2.19.2 into master
2020-06-26 18:55:41 +05:30
Julien Pivotto 59de58d380
Docker Swarm service discovery (#7420)
* Docker Swarm service discovery

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-06-26 12:25:58 +02:00
Julien Pivotto 0444a419d7
Consul: document health meta label (#7466)
implemented in #5313

fixes #770

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-06-26 12:14:51 +02:00
Marco Pracucci cef4dd6fff
Optimized label regex matcher with literal prefix and/or suffix (#7453)
* Optimized label regex matcher with literal prefix and/or suffix

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added license

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added more tests cases with newlines

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Restored deleted test

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2020-06-26 15:19:09 +05:30
Ganesh Vernekar a4c2ea1ca3
Merge remote-tracking branch 'upstream/master' into merge-release-2.19
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-06-26 14:33:50 +05:30
Ganesh Vernekar c448ada63d
Cut v2.19.2 (#7464)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-06-26 14:27:23 +05:30
Chris Marchbanks b299aba6cf
Fix panic when updating a remote write queue (#7452)
Right now Queue Manager metrics are registered when the metrics struct
is created, which happens before a changed queue is shutdown and the old
metrics are unregistered. In the case of named queues or updates to
external labels the apply config will panic due to duplicate metrics.

Instead, register the metrics as part of starting the queue as we always
guarantee that Stop will be called before a new Start.

Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
2020-06-26 12:03:52 +05:30
Chris Marchbanks d78656c244
Pending Samples metric includes samples in channel (#7335)
* Pending Samples metric includes samples in channel

The pending samples metric should also include samples waiting in the
channels to be sent to provide a more accurate measure. In addition,
make sure that the pending samples is reset to 0 anytime a queue is
started as we remake all of the shards at that time.

Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>

* Log the number of dropped samples on hard shutdown

Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
2020-06-25 14:48:30 -06:00
Chris Marchbanks ec45e3d029
Remove duplicate test in labels_test.go (#7461)
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
2020-06-25 12:42:04 -06:00
Ganesh Vernekar 082c17b691
Introduce SortedLabelValues/LabelValues to speedup queries for high cardinality (#7448)
* Introduce LabelValuesUnsorted to speedup queries for high cardinality

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Add sort check

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-06-25 14:10:29 +01:00
Callum Styan 325f54b178
Add Bartek as remote read maintainer. (#7444)
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2020-06-24 16:47:11 -07:00
Simon Pasquier cf6890a3a8
web/ui: bump jQuery to 3.5.1 for the legacy UI (#7447)
jQuery prior to 3.4.0 is affected by an Object.prototype pollution
vulnerability (CVE-2019-11358). Even though our code doesn't seem to be
vulnerable to the issue, lets upgrade to the latest jQuery release so we
don't have to bother.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-06-24 16:14:28 +02:00
Bartlomiej Plotka b788986717
storage: Adjusted fully storage layer support for chunk iterators: Remote read client, readyStorage, fanout. (#7059)
* Fixed nits introduced by https://github.com/prometheus/prometheus/pull/7334
* Added ChunkQueryable implementation to fanout and readyStorage.
* Added more comments.
* Changed NewVerticalChunkSeriesMerger to CompactingChunkSeriesMerger, removed tiny interface by reusing VerticalSeriesMergeFunc for overlapping algorithm for
both chunks and series, for both querying and compacting (!) + made sure duplicates are merged.
* Added ErrChunkSeriesSet
* Added Samples interface for seamless []promb.Sample to []tsdbutil.Sample conversion.
* Deprecating non chunks serieset based StreamChunkedReadResponses, added chunk one.
* Improved tests.
* Split remote client into Write (old storage) and read.
* Queryable client is now SampleAndChunkQueryable. Since we cannot use nice QueryableFunc I moved
all config based options to sampleAndChunkQueryableClient to aboid boilerplate.

In next commit: Changes for TSDB.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-06-24 14:41:52 +01:00
Guangwen Feng b41adab735
Add unit test case for func FromStrings in labels.go (#7321)
Signed-off-by: Guangwen Feng <fenggw-fnst@cn.fujitsu.com>
2020-06-24 11:54:30 +01:00
Nuno Cardoso f97d2ddb6e
REACT UI: CollapsibleAlertPanel - value field more friendly human readable (scientific notation -> number) (#7426)
* value field more human readable

Signed-off-by: kisc <nuno_kisc@hotmail.com>

* fix typo

Signed-off-by: Nuno Cardoso <nuno_kisc@hotmail.com>

* add function convertSCToNumber

Signed-off-by: nunokisc <nuno_kisc@hotmail.com>

* add convertSCToNumber test

Signed-off-by: nunokisc <nuno_kisc@hotmail.com>

* normalize function name

Signed-off-by: kisc <nuno_kisc@hotmail.com>

* convertScientificNotationToNumber to parsePrometheusFloat

Signed-off-by: kisc <nuno_kisc@hotmail.com>
2020-06-23 20:10:56 +02:00
Marco Pracucci 153f859b74
Fixed returned API status code on error (#7435)
* Fixed returned API status code on error

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed linter

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Simplified code

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2020-06-22 19:59:35 +05:30
Ben Kochie 08780a9ec9
Merge pull request #7433 from prometheus/superq/force_sync
Update repo sync policies
2020-06-22 15:26:55 +02:00
Ben Kochie 9932ff648a
Update repo sync policies
* Don't try and sync non-apache license files.
* Force create CODE_OF_CONDUCT.md.
* Switch to using an array of files to update.

Signed-off-by: Ben Kochie <superq@gmail.com>
2020-06-22 12:37:07 +02:00
Joe Lei 74a73ba1cf
fix analyze limit not work expected (#7430)
Signed-off-by: joelei <thezero12@hotmail.com>
2020-06-22 10:38:10 +01:00
Harkishen Singh 70b0a34616
Exit early on invalid config file (#7399)
* Reload config file at start

Signed-off-by: Harkishen-Singh <harkishensingh@hotmail.com>

* relocated config checking

Signed-off-by: Harkishen-Singh <harkishensingh@hotmail.com>

* change log lever

Signed-off-by: Harkishen-Singh <harkishensingh@hotmail.com>

* add helpful comment

Signed-off-by: Harkishen-Singh <harkishensingh@hotmail.com>
2020-06-21 21:26:59 +05:30
Ben Kochie 27f89ac651
Merge pull request #7428 from roidelapluie/reposync
repo_sync: fix variable names
2020-06-21 07:41:13 +02:00
Julien Pivotto 89fd3ebdfe repo_sync: fix variable names
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-06-21 02:18:21 +02:00
Pierre Souchay 1508678001
Use 10m timeouts for watches (#7423)
use ?wait=10m will give results as fast as usual when data is changing
but will perform far less requests when services do not change.

On large infrastructure, this will reduce quite a lot the number of
qps on Consul servers while having the same performance for freshness
of results.

Signed-off-by: Pierre Souchay <p.souchay@criteo.com>
2020-06-20 20:22:45 +01:00
Julien Pivotto fb9a1a872e
DigitalOcean: limit refresh timeout (#7425)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-06-20 09:45:28 +02:00
Ganesh Vernekar 74207c0465
Merge pull request #7421 from codesome/merge-release-2.19
Merge release 2.19 into master
2020-06-19 15:31:32 +05:30