Commit graph

5166 commits

Author SHA1 Message Date
Simon Pasquier 0798f14e02 Add TestCoordinationWithEmptyProvider
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-12 16:13:15 +02:00
Simon Pasquier 48989d8996 discovery: add more tests
Co-authored-by: Camille Janicki <camille.janicki@gmail.com>
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-12 16:13:15 +02:00
Goutham Veeramachaneni 068eaa5dbf
Release 2.4.0 (#4592)
Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
2018-09-11 16:13:52 +05:30
Goutham Veeramachaneni d075d78bfd
release 2.4.0-rc.0 (#4576)
Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
2018-09-06 19:04:59 +05:30
Goutham Veeramachaneni 3e87c04b83
Logger is nil for API. Fixes #4577 (#4583)
Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
2018-09-06 16:07:48 +05:30
Tom Wilkie 457e4bb58e
Limit the number of samples remote read can return. (#4532)
* Limit the number of samples remote read can return.

- Return 413 entity too large.
- Limit can be set be a flag.  Allow 0 to mean no limit.
- Include limit in error message.
- Set default limit to 50M (* 16 bytes = 800MB).

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-09-05 15:50:50 +02:00
Krasi Georgiev ba7eb733e8 tidy up the discovery logs,updating loops and selects (#4556)
* tidy up the discovery logs,updating loops and selects

few objects renamings

removed a very noise debug log on the k8s discovery. It would be usefull
to show some summary rather than every update as this is impossible to
follow.

added most comments as debug logs so each block becomes self
explanatory.

when the discovery receiving channel is full will retry again on the
next cycle.

Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>

* add noop logger for the SD manager tests.

Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>

* spelling nits

Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
2018-09-05 17:02:47 +05:30
Tariq Ibrahim f708fd5c99 Adding support for multiple azure environments (#4569)
Signed-off-by: Tariq Ibrahim <tariq.ibrahim@microsoft.com>
2018-09-04 17:55:40 +02:00
Simon Pasquier 40517dc4b3
Merge pull request #4539 from tariq1890/fix_docker_label
Fix Dockerfile label syntax
2018-09-04 09:22:30 +02:00
Simon Pasquier 674c76adb8 discovery: coalesce identical SD configurations (#3912)
* discovery: coalesce identical SD configurations

Instead of creating as many SD providers as declared in the
configuration, the discovery manager merges identical configurations
into the same provider and keeps track of the subscribers. When
the manager receives target updates from a SD provider, it will
broadcast the updates to all interested subscribers.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-01 08:51:31 +01:00
Simon Pasquier 75bd348135 web: clean up api/v2 (#4554)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-08-29 12:55:46 +05:30
Chris Marchbanks 63ed9d1b70 Send EndsAt along with alerts (#4550)
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
2018-08-28 16:05:00 +01:00
Thomas Jackson f0a1ebc19d Add SelectParams to Select for federation (#4546)
When prom2 came out the storage querier interface consolidated to a
single Select() method. While doing this it makes it impossible as the
implementer of the querier to know if you are being called for metadata
or actual data. The workaround has been to check if the SelectParams are
nil, which the federation call is always nil. This has 2 negative
consequences (1) remote implementations interpret this as a metadata
call, which makes the federation endpoint return nothing. (2) this means
that the storage implementations don't get the same information passed
down to them as far as SelectParams goes.

This diff simply adds SelectParams to the Select() call in the
federation handler

Mitigation for #4057

Signed-off-by: Thomas Jackson <jacksontj.89@gmail.com>
2018-08-28 11:23:31 +01:00
Chris Marchbanks 87f1dad16d throttle resends of alerts to 1 minute by default (#4538)
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
2018-08-27 17:41:42 +01:00
Krasi Georgiev 53691ae261 Simplify SD update throttling (#4523)
* simplfied SD updates throtling

Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>

* add default to catch cases when we don't have new updates.

Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
2018-08-27 17:12:11 +02:00
Tariq Ibrahim a188693015 Adding unit tests for the util package (#4534)
Signed-off-by: Tariq Ibrahim <tariq.ibrahim@microsoft.com>
2018-08-27 15:55:49 +01:00
Tariq Ibrahim c7693f6c68 Fix Dockerfile label syntax
Signed-off-by: Tariq Ibrahim <tariq.ibrahim@microsoft.com>
2018-08-26 19:43:53 -07:00
Dan Cech 9f4cb06a37 use Welford/Knuth method to compute standard deviation and variance (#4533)
* use Welford/Knuth method to compute standard deviation and variance, avoids float precision issues
* use better method for calculating avg and avg_over_time

Signed-off-by: Dan Cech <dcech@grafana.com>
2018-08-26 10:28:47 +01:00
Daisy T 7d01ead689 change time.duration to model.duration for standardization (#4479)
Signed-off-by: Daisy T <daisyts@gmx.com>
2018-08-24 16:55:21 +02:00
Simon Pasquier 3581377e5d Replace go-bindata with vfsgen (#4430)
Looking at https://tech.townsourced.com/post/embedding-static-files-in-go/ (which was mentioned in the issue), vfsgen has all the needed features.

In particular:

- Reproducible builds (no issue with timestamping).
- Well maintained and relatively popular.
- Integration with go generate.
- Self-contained (no external dependency).

* [WIP] Replace go-bindata by vfsgen

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Add license + remove doc.go

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Generate templates assets

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Use new templates assets

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* split static assets

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Idempotent make assets

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Update vendor/

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* vendor vfsgendev

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Update README.md

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Simplify assets generation

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Fix README.md

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Use generate helper program instead of vfsgen

This avoids installing vfsgendev in the target environment.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Remove unused vfsgen package

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Fix Makefile

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* vendoring shurcooL/vfsgen

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Fix go generate command

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Sync web/ui/assets_vfsdata.go

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-08-24 09:03:10 +02:00
Max Inden ecf676cf97 web/api: Expose rule health and last error (#4501)
Expose rule health and last evaluation error on `/api/v1/rules`.

Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
2018-08-23 18:30:10 +05:30
Krasi Georgiev 12fe204ea6
move runtime debug funcs in own package (#4494)
To make local debuging with `go run` easyer moved all files into a
dedicate package `runtime`.
This allows running prometheus just by using `go run main.go` instead of
passing mani files like `go run main.go limits_default.go ...`

Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
2018-08-22 13:41:11 +03:00
Goutham Veeramachaneni f3b7c22827 rules: add comment about lock taking (#4525)
Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
2018-08-21 21:30:08 +02:00
Ganesh Vernekar c663477688 Fixed TestUpdate in rules/manager_test.go (#4516)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2018-08-20 18:21:05 +05:30
Fabian Reinartz f571b69010
Merge pull request #4514 from jkohen/ec2-targets
Expose EC2 instance owner as a discovery label.
2018-08-20 08:43:44 +02:00
Javier Kohen 1c89984778 Expose EC2 instance owner as a discovery label.
This exposes the OwnerID field of the DescribeInstances respons as .

Signed-off-by: Javier Kohen <jkohen@google.com>
2018-08-17 11:30:18 -04:00
Julius Volz 8fbe1b5133
Handle a bunch of unchecked errors (#4461)
There are many more (mostly finalizers like Close/Stop/etc.), but most of
the others seemed like one couldn't do much about them anyway.

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-08-17 17:24:35 +02:00
Yecheng Fu d4eae8cc0c Wait for all internal discoveries are done before exiting. (#4508)
Signed-off-by: Yecheng Fu <cofyc.jackson@gmail.com>
2018-08-17 18:50:22 +05:30
Ganesh Vernekar a0a9e7df91 Fix TestForStateRestore (#4476) (#4512)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2018-08-16 22:56:15 +05:30
Goutham Veeramachaneni 71855a22a4
Add tracing spans to promql (#4436)
* Add spans to promql

Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>

* Simplify timer and span tracking.

Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
2018-08-16 13:11:34 +05:30
Julien Pivotto 0b4d22b245 rules/manager: remove a no-longer-relevant comment (#4503)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2018-08-15 09:33:39 +01:00
Chris Marchbanks 11155c7028 Existing alert labels will update based on templates (#4500)
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
2018-08-15 08:52:08 +01:00
Brian Brazil cd54add5b8
Clarify that {a="b",a!="c"} is possible. (#4492)
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2018-08-13 11:38:57 +01:00
Fabian Reinartz b04ab71268
Merge pull request #4488 from jkohen/patch-3
Populate __meta_gce_instance_id discovery label
2018-08-11 09:52:28 +02:00
Javier Kohen 403ac08ece Expose __meta_gce_instance_id as an integer (instead of raw bytes).
Signed-off-by: Javier Kohen <jkohen@google.com>
2018-08-10 16:21:46 -04:00
Javier Kohen 2d4bcb3ee1 Document the new __meta_gce_instance_id discovery label.
Signed-off-by: Javier Kohen <jkohen@google.com>
2018-08-10 11:59:22 -04:00
Javier Kohen 7e9549b398 Added __meta_gce_instance_id discovery label
Populated from instance.ID. I will follow up with a change to the documentation.

Signed-off-by: Javier Kohen <jkohen@google.com>
2018-08-10 11:57:55 -04:00
Simon Pasquier b7054f3a78
Merge pull request #4443 from simonpasquier/fix-consul-connections-leak
discovery/consul: close idle connections on stop
2018-08-10 17:43:39 +02:00
Simon Pasquier d6c590018b
Merge pull request #4472 from simonpasquier/better-gofmt-msg
Makefile: improve 'make style' reporting
2018-08-08 09:32:17 +02:00
Simon Pasquier 08c2f50382
Merge pull request #4418 from simonpasquier/log-vm-limits
prometheus: log virtual memory limits
2018-08-07 16:27:46 +02:00
Simon Pasquier a827f91aff Makefile: improve 'make style' reporting
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-08-07 14:54:20 +02:00
Fabian Reinartz cc878ea559
Merge pull request #4471 from prometheus/uptsdb
vendor: update TSDB
2018-08-07 14:29:26 +02:00
Fabian Reinartz b7e2f407de rules: Fix double-locking of mutex
Signed-off-by: Fabian Reinartz <freinartz@google.com>
2018-08-07 07:33:39 -04:00
Fabian Reinartz 817e379a0e vendor: update TSDB
Signed-off-by: Fabian Reinartz <freinartz@google.com>
2018-08-07 07:33:39 -04:00
Benji Visser 46fb4078a6 handle nil pointer in ec2 discovery (#4469)
This handles a nil pointer that was being accessed in EC2 discovery.

Fixes: #4441

Signed-off-by: noqcks <benny@noqcks.io>
2018-08-07 08:35:22 +01:00
Benji Visser 8bb6e0dd6e Show rule evaluation errors on rules page (#4457)
* adding information about the health and errors for Rules

adding Health() and LastError() to the Rule interface. This will allow
us to easily surface information about rules.

Signed-off-by: noqcks <benny@noqcks.io>

* updating rules.html with fields for Rule errors and health state

Signed-off-by: noqcks <benny@noqcks.io>

* fix code comment grammar & access Rule health/error info using a mutex

Signed-off-by: noqcks <benny@noqcks.io>

* s/Errors/Error/ in rules.html to remain consistent with targets.html

Signed-off-by: noqcks <benny@noqcks.io>

* adding periods to code comments in reporting/alerting

Signed-off-by: noqcks <benny@noqcks.io>

* putting health/error below mutex in struct field

Signed-off-by: noqcks <benny@noqcks.io>
2018-08-07 00:33:45 +02:00
Julius Volz 2b8fc062a8
rules: HTML-escape rule YAML marshal errors (#4464)
This was pointed out by `gosec`.

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-08-05 14:01:51 +02:00
Dave Henderson 73a08f0045 promtool - Adding --step flag to 'query range' subcommand (#4454)
Signed-off-by: Dave Henderson <dhenderson@gmail.com>
2018-08-05 11:03:18 +02:00
Julius Volz 159e1537d2
Remove /heap endpoint (#4460)
It was added 5 years ago by Matt and I'm not sure anyone ever used
it after public release (since we have /debug/pprof/heap as well).

It also lacked error checking and allows people to write to disk over HTTP.

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-08-04 21:31:28 +02:00
Julius Volz 90521a65f8
Remove error return value from NotifyFunc() (#4459)
It's always nil and we also forgot to check it.

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-08-04 21:31:12 +02:00