Commit graph

297 commits

Author SHA1 Message Date
Dustin Hooten ca60bf298c React UI: Implement /targets page (#6276)
* Add LastScrapeDuration to targets endpoint

Signed-off-by: Dustin Hooten <dhooten@splunk.com>

* Add Scrape job name to targets endpoint

Signed-off-by: Dustin Hooten <dhooten@splunk.com>

* Implement the /targets page in react

Signed-off-by: Dustin Hooten <dhooten@splunk.com>

* Add state query param to targets endpoint

Signed-off-by: Dustin Hooten <dhooten@splunk.com>

* Use state filter in api call

Signed-off-by: Dustin Hooten <dhooten@splunk.com>

* api feedback

Signed-off-by: Dustin Hooten <dhooten@splunk.com>

* pr feedback frontend

Signed-off-by: Dustin Hooten <dhooten@splunk.com>

* Implement and use localstorage hook

Signed-off-by: Dustin Hooten <dhooten@splunk.com>

* PR feedback

Signed-off-by: Dustin Hooten <dhooten@splunk.com>
2019-11-11 22:42:24 +01:00
Boyko cb7cbad5f9 WIP: status page - API and UI (#6243)
* status page initial commit

Signed-off-by: Boyko Lalov <boyskila@gmail.com>
Signed-off-by: blalov <boyko.lalov@tick42.com>

* refactor useFetch

Signed-off-by: Boyko Lalov <boyskila@gmail.com>
Signed-off-by: blalov <boyko.lalov@tick42.com>

* refactoring

Signed-off-by: Boyko Lalov <boyskila@gmail.com>
Signed-off-by: blalov <boyko.lalov@tick42.com>

* adding tests

Signed-off-by: Boyko Lalov <boyskila@gmail.com>
Signed-off-by: blalov <boyko.lalov@tick42.com>

* snapshot testing

Signed-off-by: Boyko Lalov <boyskila@gmail.com>
Signed-off-by: blalov <boyko.lalov@tick42.com>

* fix wrong go files formatting

Signed-off-by: Boyko Lalov <boyskila@gmail.com>
Signed-off-by: blalov <boyko.lalov@tick42.com>

* change the snapshot library

Signed-off-by: Boyko Lalov <boyskila@gmail.com>
Signed-off-by: blalov <boyko.lalov@tick42.com>

* update api paths

Signed-off-by: Boyko Lalov <boyskila@gmail.com>
Signed-off-by: blalov <boyko.lalov@tick42.com>

* move test folder outside src

Signed-off-by: Boyko Lalov <boyskila@gmail.com>
Signed-off-by: blalov <boyko.lalov@tick42.com>

* useFetches tests

Signed-off-by: blalov <boyko.lalov@tick42.com>

* sticky navbar

Signed-off-by: Boyko Lalov <boyskila@gmail.com>
Signed-off-by: blalov <boyko.lalov@tick42.com>

* handle runtimeInfo error on Gather() and add json tags to RuntimeInfo struct

Signed-off-by: blalov <boyko.lalov@tick42.com>

* refactor alert managers section

Signed-off-by: blalov <boyko.lalov@tick42.com>
2019-11-02 16:53:32 +01:00
Yao Zengzeng 8744afdd1e cleanup redundant code of TestEndpoints (#6022)
Signed-off-by: YaoZengzeng <yaozengzeng@huawei.com>
2019-09-18 11:40:50 +01:00
Goutham Veeramachaneni 4c648eddf4
vendor: Update json-iterator (#6030)
Also add explicit support for NaN/Inf

Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
2019-09-17 19:22:26 +05:30
Yao Zengzeng 21c9789083 multiple queries test for StreamRead (#5969)
Signed-off-by: YaoZengzeng <yaozengzeng@zju.edu.cn>
2019-08-29 11:57:38 +01:00
Bartek Płotka 48b2c9c8ea
remote-read: streamed chunked server side; Extended protobuf; Added chunked, checksumed reader (#5703)
Part of: https://github.com/prometheus/prometheus/issues/4517 and https://github.com/improbable-eng/thanos/issues/488

Changes:
* Extended protobuf for chunked remote read and negotation.
* Added checksumed, chunked Writer/Reader.
* Added Server side implementation for chunked streamed remote-read.


Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
2019-08-19 21:16:10 +01:00
Ganesh Vernekar 5ecef3542d
Cleanup after merging tsdb into prometheus
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2019-08-13 14:04:14 +05:30
Chris Marchbanks 0685eb5395
Refactor testutil.NewStorage into a new package
This avoids a circular dependency between the testutil and storage
packages.

Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
2019-08-08 19:43:04 -06:00
@aifsair 0f00737308 Fix log config setup (#5807)
The return value of Set() was not checked.
Hence, the typo ("al" instead of "af") wasn't catched.

Signed-off-by: François (fser) <fser@code-libre.org>
2019-07-29 18:00:30 +01:00
Tariq Ibrahim 9fe9e66cfd
remove the dependency on cockroachdb in prometheus
Signed-off-by: Tariq Ibrahim <tariq181290@gmail.com>
2019-07-09 01:27:17 -07:00
Thomas Jackson fef150f1b5 Add tests to ensure we can marshal and unmarshal our min/max times (#5734)
* Add tests to ensure we can marshal and unmarshal our min/max times

Related to https://github.com/prometheus/client_golang/issues/614

Instead of implementing all the time parsing, we can special-case handle
these 2 times. This means if times in this format show up that
time.Parse can't handle they will still error, but we can marshal/parse
our own min/max time

Signed-off-by: Thomas Jackson <jacksontj.89@gmail.com>
2019-07-08 10:43:59 +01:00
Thomas Jackson 91d7175eaa Add storage.Warnings to LabelValues and LabelNames (#5673)
Fixes #5661

Signed-off-by: Thomas Jackson <jacksontj.89@gmail.com>
2019-06-17 08:31:17 +01:00
Alex Salt d6a4daa26a web api: handle alert with Infinity/NaN values (#5582)
* web/api/v1: alert value as string in alert/rules endpoints

Signed-off-by: Alexander Saltykov <alexander-s@yandex-team.ru>
2019-05-21 10:41:54 +01:00
Simon Pasquier 45506841e6
*: enable all default linters (#5504)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-05-03 15:11:28 +02:00
Bjoern Rabenstein 38d518c0fe Rework #5009 after comments
Signed-off-by: Bjoern Rabenstein <bjoern@rabenste.in>
2019-04-17 01:40:10 +02:00
Simon Pasquier 81c4248081
*: bump gRPC and protobuf dependencies (#5367)
The goal is to remove almost all references to the
golang.org/x/net/context package.

github.com/gogo/protobuf => v1.2.1
google.golang.org/grpc => v1.19.1
github.com/grpc-ecosystem/grpc-gateway => v1.18.5

It also replaces github.com/cockroachdb/cmux by github.com/soheilhy/cmux
because of [1] which fixes #3909 incidentally.

[1] https://github.com/grpc/grpc-go/issues/2636

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-04-04 11:55:32 +02:00
Bob Shannon 8c8bb82d04 Add support for POSTing to /series endpoint (#5422)
* Add support for POSTing to /series endpoint
* Document query API POST support

Signed-off-by: Bob Shannon <bob.m.shannon@gmail.com>
2019-04-02 18:00:29 +01:00
Tariq Ibrahim 8fdfa8abea refine error handling in prometheus (#5388)
i) Uses the more idiomatic Wrap and Wrapf methods for creating nested errors.
ii) Fixes some incorrect usages of fmt.Errorf where the error messages don't have any formatting directives.
iii) Does away with the use of fmt package for errors in favour of pkg/errors

Signed-off-by: tariqibrahim <tariq181290@gmail.com>
2019-03-26 00:01:12 +01:00
Bharath 91306bdf24 Support non POST methods for Lifecycle and Admin APIs (#5376)
Signed-off-by: Bharath Thiruveedula <bharath_ves@hotmail.com>
2019-03-20 17:33:45 +00:00
Tom Wilkie c7b3535997 Use pkg/relabelling in remote write.
- Unmarshall external_labels config as labels.Labels, add tests.
- Convert some more uses of model.LabelSet to labels.Labels.
- Remove old relabel pkg (fixes #3647).
- Validate external label names.

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2019-03-18 20:31:12 +00:00
Palash Nigam 09208b1a58 queryRange: Add more descriptive error messages (#5229)
Fixes: https://github.com/prometheus/prometheus/issues/4811

Signed-off-by: Palash Nigam <npalash25@gmail.com>
2019-02-19 19:16:14 +00:00
Callum Styan 6f69e31398 Tail the TSDB WAL for remote_write
This change switches the remote_write API to use the TSDB WAL.  This should reduce memory usage and prevent sample loss when the remote end point is down.

We use the new LiveReader from TSDB to tail WAL segments.  Logic for finding the tracking segment is included in this PR.  The WAL is tailed once for each remote_write endpoint specified. Reading from the segment is based on a ticker rather than relying on fsnotify write events, which were found to be complicated and unreliable in early prototypes.

Enqueuing a sample for sending via remote_write can now block, to provide back pressure.  Queues are still required to acheive parallelism and batching.  We have updated the queue config based on new defaults for queue capacity and pending samples values - much smaller values are now possible.  The remote_write resharding code has been updated to prevent deadlocks, and extra tests have been added for these cases.

As part of this change, we attempt to guarantee that samples are not lost; however this initial version doesn't guarantee this across Prometheus restarts or non-retryable errors from the remote end (eg 400s).

This changes also includes the following optimisations:
- only marshal the proto request once, not once per retry
- maintain a single copy of the labels for given series to reduce GC pressure

Other minor tweaks:
- only reshard if we've also successfully sent recently
- add pending samples, latest sent timestamp, WAL events processed metrics

Co-authored-by: Chris Marchbanks <csmarchbanks.com> (initial prototype)
Co-authored-by: Tom Wilkie <tom.wilkie@gmail.com> (sharding changes)
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2019-02-12 11:39:13 +00:00
zhulongcheng fd964426a7 web: predeclare and reuse errors (#5180)
Predeclare and reuse errors to reduce duplicate code

Signed-off-by: zhulongcheng <zhulongcheng.me@gmail.com>
2019-02-04 13:06:26 +01:00
zhulongcheng a75f8a8e05 update error message in extractTimeRange (#5179)
Update error message in the extractTimeRange function
to match function's logic

Signed-off-by: zhulongcheng <zhulongcheng.me@gmail.com>
2019-02-03 09:29:23 +00:00
Hrishikesh Barman a1f34bec2e Added CORS Origin flag (#5011)
Signed-off-by: Hrishikesh Barman <hrishikeshbman@gmail.com>
2019-01-17 15:01:06 +00:00
Matt Layher 302148fd69 *: apply gofmt -s
Signed-off-by: Matt Layher <mdlayher@gmail.com>
2019-01-16 17:28:14 -05:00
Callum Styan 5358f76c5c update remote write path proto so that Labels/Timeseries can't be nil (#4957)
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2019-01-15 19:13:39 +00:00
Simon Pasquier 375ad1185c
*: bump gRPC dependencies (#5075)
* *: bump gRPC dependencies

This change updates the gRPC dependencies to more recent versions:

* github.com/gogo/protobuf => v1.2.0
* github.com/grpc-ecosystem/grpc-gateway => v1.6.3
* google.golang.org/grpc => v1.17.0

In addition scripts/genproto.sh leverages Go modules information instead of
hardcoding SHA1 commits. This ensures that the code is generated from
the exact same sources.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Run 'make proto' in CI

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Revert tabs -> spaces change

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Fix 'make proto' step

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* 'go get' grpc/protobuf dependencies

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Prepopulate cache with go mod download

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-01-15 15:32:05 +01:00
Simon Pasquier f678e27eb6
*: use latest release of staticcheck (#5057)
* *: use latest release of staticcheck

It also fixes a couple of things in the code flagged by the additional
checks.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Use official release of staticcheck

Also run 'go list' before staticcheck to avoid failures when downloading packages.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-01-04 14:47:38 +01:00
Tom Wilkie 6e08029b56
Move err to be the last return value from storage.Select. (#5054)
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2019-01-02 11:10:13 +00:00
SenXuDC 8fd0c0ab2e fix typo imeplements -> implements (#4979)
Signed-off-by: SenXuDC <sen.xu@daocloud.io>
2018-12-18 11:52:16 +01:00
Julius Volz 11a52be1d8
Better rounding for incoming query timestamps (#4941)
Fixes https://github.com/prometheus/prometheus/issues/4939

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-12-03 20:25:54 +08:00
mknapphrt f0e9196dca Return warnings on a remote read fail (#4832)
Signed-off-by: Mark Knapp <mknapp@hudson-trading.com>
2018-11-30 14:27:12 +00:00
Ben Kochie c6399296dc
Fix spelling/typos (#4921)
* Fix spelling/typos

Fix spelling/typos reported by codespell/misspell.
* UK -> US spelling changes.

Signed-off-by: Ben Kochie <superq@gmail.com>
2018-11-27 17:44:29 +01:00
Fabian Reinartz a9803e9ecb Correctly skip mismatching targets
Signed-off-by: Fabian Reinartz <freinartz@google.com>
2018-11-23 17:10:31 +01:00
Alex Yu 5dcce32ef8 update promlog to latest version (#4876)
* update promlog to latest version

Signed-off-by: Alex Yu <yu.alex96@gmail.com>

* Update api tests, fix main setup

Signed-off-by: Alex Yu <yu.alex96@gmail.com>

* tidy go.sum

Signed-off-by: Alex Yu <yu.alex96@gmail.com>

* revendor prometheus/common

Signed-off-by: Alex Yu <yu.alex96@gmail.com>

* only initialize config; use kingpin for remote_storage_adapter

Signed-off-by: Alex Yu <yu.alex96@gmail.com>

* actually parse the flags

Signed-off-by: Alex Yu <yu.alex96@gmail.com>

* clean up imports

Signed-off-by: Alex Yu <yu.alex96@gmail.com>
2018-11-23 14:22:40 +01:00
Ganesh Vernekar ca93fd544b /api/v1/labels endpoint for getting all label names (#4835)
* vendor: update tsdb

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* /api/v1/labels endpoint

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* regex matchers for API

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Add docs

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Matchers behaving as OR

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Removed the matchers

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* vendor: update tsdb using go mod

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* vendor update: tsdb

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Added LabelNames() to storage.Querier

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Test for api.labelNames

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Nits

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2018-11-19 15:51:14 +05:30
Simon Pasquier 6fa8de132b
web/v1/api: add tests for admin actions (#4767)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-11-15 14:22:16 +01:00
Goutham Veeramachaneni 7acedbce64
web(api): Make query and range api errors match
Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
2018-11-14 15:25:54 +05:30
Simon Pasquier 181f07ef26
web: avoid proxy to connect to the local gRPC server (#4572)
By default the gRPC client of the REST API gateway relies on the
HTTP_PROXY variable to connect to the local gRPC server which isn't
desired as the server runs in the same process. This change uses a
custom dialer that connects directly to the server's address.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-11-13 14:42:23 +01:00
Simon Pasquier a308a186e4 web/api/v1: fix targets endpoint
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-10-25 11:10:26 +02:00
Brian Brazil 9c03e11c2c Hook OpenMetrics parser into scraping.
Extend metadata api to support units.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2018-10-18 13:58:00 +01:00
Simon Pasquier c4a6acfb1e
*: move to go 1.11 (#4626)
* *: move to go 1.11

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Reduce number of places where we specify the Go version

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-10-16 09:41:45 +02:00
Goutham Veeramachaneni ffb7f829ec
Merge pull request #4730 from prometheus/release-2.4
Release 2.4
2018-10-12 14:15:42 -07:00
Tariq Ibrahim d371697841 Adding new metric type to track in-flight queries via the remote read API endpoint. (#4699)
* Adding new metric type to track in-flight queries via the remote read API endpoint.

Signed-off-by: tariqibrahim <tariq.ibrahim@microsoft.com>

* fix review comments

Signed-off-by: tariqibrahim <tariq.ibrahim@microsoft.com>

* fix comments

Signed-off-by: tariqibrahim <tariq.ibrahim@microsoft.com>
2018-10-10 16:09:08 -07:00
Callum Styan 9bca041285 WIP: keep track of samples per query, set a max # of samples (#4513)
* keep track of samples per query, set a max # of samples that can be in
memory at once

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2018-10-02 12:59:19 +01:00
Simon Pasquier 3c00eeaf16 web/api/v1: fix optional skip_head for snapshot (#4674)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-10-02 16:18:07 +05:30
Krasi Georgiev 47a673c3a0
process scrape loops reloading in parallel (#4526)
The scrape manage receiver's channel now just saves the target sets
and another backgorund runner updates the scrape loops every 5 seconds.
This is so that the scrape manager doesn't block the receiving channel
when it does the long background reloading of the scrape loops.

Active and dropped targets are now saved in each scrape pool instead of
the scrape manager. This is mainly to avoid races when getting the
targets via the web api.

When reloading the scrape loops now happens in parallel to speed up the
final disared state and this also speeds up the prometheus's shutting
down.

Also updated some funcs signatures in the web package for consistency.

Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
2018-09-26 12:20:56 +03:00
Tom Wilkie 4c52400708
Limit concurrent remote reads. (#4656)
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-09-25 20:07:34 +01:00
Tom Wilkie d3a1ff1abf
Reduce memory usage of remote read by reducing pointer usage. (#4655)
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-09-25 19:14:00 +01:00
Goutham Veeramachaneni 3e87c04b83
Logger is nil for API. Fixes #4577 (#4583)
Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
2018-09-06 16:07:48 +05:30
Tom Wilkie 457e4bb58e
Limit the number of samples remote read can return. (#4532)
* Limit the number of samples remote read can return.

- Return 413 entity too large.
- Limit can be set be a flag.  Allow 0 to mean no limit.
- Include limit in error message.
- Set default limit to 50M (* 16 bytes = 800MB).

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-09-05 15:50:50 +02:00
Simon Pasquier 75bd348135 web: clean up api/v2 (#4554)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-08-29 12:55:46 +05:30
Max Inden ecf676cf97 web/api: Expose rule health and last error (#4501)
Expose rule health and last evaluation error on `/api/v1/rules`.

Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
2018-08-23 18:30:10 +05:30
Julius Volz 8fbe1b5133
Handle a bunch of unchecked errors (#4461)
There are many more (mostly finalizers like Close/Stop/etc.), but most of
the others seemed like one couldn't do much about them anyway.

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-08-17 17:24:35 +02:00
Ganesh Vernekar f1db699dff Persist alert 'for' state across restarts (#4061)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2018-08-02 11:18:24 +01:00
Max Leonard Inden 71fafad099
api/v1: Coninue work exposing rules and alerts
Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
2018-07-30 15:31:51 +02:00
mg03 31f8ca0dfb
api v1 alerts/rules json endpoint
Signed-off-by: mg03 <mgeng03@gmail.com>
2018-07-30 15:29:44 +02:00
Tom Wilkie b1f600343f
Merge pull request #4359 from prometheus/report-errors
Log errors encountered when marshalling and writing responses.
2018-07-25 13:39:04 +01:00
Tom Wilkie 02534510ca Review feedback.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-07-25 13:35:47 +01:00
Tom Wilkie 901e6d1f82 Review feedback.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-07-25 13:17:10 +01:00
Thomas Jackson 92c6f0c92e Add offset to selectParams (#4226)
* Add Start/End to SelectParams
* Make remote read use the new selectParams for start/end

This commit will continue sending the start/end time of the remote read
query as the overarching promql time and the specific range of data that
the query is intersted in receiving a response to is now part of the
ReadHints (upstream discussion in #4226).

* Remove unused vendored code

The genproto.sh script was updated, but the code wasn't regenerated.
This simply removes the vendored deps that are no longer part of the
codegen output.

Signed-off-by: Thomas Jackson <jacksontj.89@gmail.com>
2018-07-18 04:58:00 +01:00
Tom Wilkie f83155b11e Review feedback.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-07-13 19:31:23 +01:00
Tom Wilkie ccb2ee607b Log errors encountered when marshalling and writing responses.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-07-06 18:44:45 +01:00
Tom Wilkie fcc3f43acd spelling.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-06-18 17:32:44 +01:00
Tom Wilkie ae29512444 Extend API tests to cover remote read API.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-06-18 17:21:12 +01:00
Fabian Reinartz 057a5ae2b1 Address comments
Signed-off-by: Fabian Reinartz <freinartz@google.com>
2018-06-06 11:21:17 -04:00
Fabian Reinartz ad4c33c1ff scrape,api: provide per-target metric metadata
This adds a per-target cache of scraped metadata. The metadata is only
available for the lifecycle of the attached target. An API endpoint allows
to select metadata by metric name and a label selection of targets.

Signed-off-by: Fabian Reinartz <freinartz@google.com>
2018-06-06 05:56:10 -04:00
Brian Brazil dd6781add2 Optimise PromQL (#3966)
* Move range logic to 'eval'

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Make aggregegate range aware

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* PromQL is statically typed, so don't eval to find the type.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Extend rangewrapper to multiple exprs

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Start making function evaluation ranged

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Make instant queries a special case of range queries

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Eliminate evalString

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Evaluate range vector functions one series at a time

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Make unary operators range aware

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Make binops range aware

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Pass time to range-aware functions.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Make simple _over_time functions range aware

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Reduce allocs when working with matrix selectors

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Add basic benchmark for range evaluation

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Reuse objects for function arguments

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Do dropmetricname and allocating output vector only once.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Add range-aware support for range vector functions with params

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Optimise holt_winters, cut cpu and allocs by ~25%

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Make rate&friends range aware

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Make more functions range aware. Document calling convention.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Make date functions range aware

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Make simple math functions range aware

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Convert more functions to be range aware

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Make more functions range aware

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Specialcase timestamp() with vector selector arg for range awareness

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Remove transition code for functions

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Remove the rest of the engine transition code

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Remove more obselete code

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Remove the last uses of the eval* functions

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Remove engine finalizers to prevent corruption

The finalizers set by matrixSelector were being called
just before the value they were retruning to the pool
was then being provided to the caller. Thus a concurrent query
could corrupt the data that the user has just been returned.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Add new benchmark suite for range functinos

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Migrate existing benchmarks to new system

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Expand promql benchmarks

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Simply test by removing unused range code

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* When testing instant queries, check range queries too.

To protect against subsequent steps in a range query being
affected by the previous steps, add a test that evaluates
an instant query that we know works again as a range query
with the tiimestamp we care about not being the first step.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Reuse ring for matrix iters. Put query results back in pool.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Reuse buffer when iterating over matrix selectors

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Unary minus should remove metric name

Cut down benchmarks for faster runs.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Reduce repetition in benchmark test cases

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Work series by series when doing normal vectorSelectors

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Optimise benchmark setup, cuts time by 60%

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Have rangeWrapper use an evalNodeHelper to cache across steps

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Use evalNodeHelper with functions

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Cache dropMetricName within a node evaluation.

This saves both the calculations and allocs done by dropMetricName
across steps.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Reuse input vectors in rangewrapper

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Reuse the point slices in the matrixes input/output by rangeWrapper

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Make benchmark setup faster using AddFast

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Simplify benchmark code.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Add caching in VectorBinop

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Use xor to have one-level resultMetric hash key

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Add more benchmarks

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Call Query.Close in apiv1

This allows point slices allocated for the response data
to be reused by later queries, saving allocations.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Optimise histogram_quantile

It's now 5-10% faster with 97% less garbage generated for 1k steps

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Make the input collection in rangeVector linear rather than quadratic

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Optimise label_replace, for 1k steps 15x fewer allocs and 3x faster

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Optimise label_join, 1.8x faster and 11x less memory for 1k steps

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Expand benchmarks, cleanup comments, simplify numSteps logic.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Address Fabian's comments

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Comments from Alin.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Address jrv's comments

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Remove dead code

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Address Simon's comments.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Rename populateIterators, pre-init some sizes

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Handle case where function has non-matrix args first

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Split rangeWrapper out to rangeEval function, improve comments

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Cleanup and make things more consistent

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Make EvalNodeHelper public

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

* Fabian's comments.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2018-06-04 15:47:45 +02:00
Henri DF 2952387ed1 Pass query hints down into remote read query proto (#4122)
Signed-off-by: Henri DF <henridf@gmail.com>
2018-05-08 09:48:13 +01:00
beorn7 94ff07b81d Merge branch 'release-2.2'
Signed-off-by: beorn7 <beorn@soundcloud.com>
2018-04-10 16:50:35 +02:00
Krasi Georgiev ddd46de6f4 Races/3994 (#4005)
Fix race by properly locking access to scrape pools. Use separate mutex for information needed by UI so that UI isn't blocked when targets are being updated.
2018-04-09 15:18:25 +01:00
Mario Trangoni 464e747f1e fix some comments typos (#4059) 2018-04-08 10:51:54 +01:00
Brian Brazil cc39021b2b Provide custom marshalling for Point
Point has a non-standard marshalling, and is also
where the vast majority of CPU time is spent so
it is worth optimising.
2018-03-21 15:02:01 +00:00
Brian Brazil 299b78a887 Switch to json-iterator for v1 api.
This makes queries ~15% faster and cuts cpu
time spent on json encoding by ~40%.
2018-03-21 15:02:01 +00:00
Brian Brazil 8ede14b24c Add unittests for Point json output 2018-03-21 15:02:01 +00:00
Brian Brazil ecd0a9c6ba web: Add benchmark for respond() 2018-03-21 15:02:01 +00:00
Simon Pasquier 83325c8d82 web: replace deprecated InstrumentHandler() (#3862)
* web: replace deprecated InstrumentHandler()

This change replaces the deprecated InstrumentHandler function by the
equivalent functions from the promhttp package.

The following metrics are removed:

* http_request_duration_microseconds (Summary).
* http_request_size_bytes (Summary).
* http_requests_total (Counter).

And the following metrics are added instead:

* prometheus_http_request_duration_seconds (Histogram).
* prometheus_http_response_size_bytes (Histogram).
* promhttp_metric_handler_requests_in_flight (Gauge).
* promhttp_metric_handler_requests_total (Counter).

* Update github.com/prometheus/common/route package

* web: refactor using the new prometheus/common/route package
2018-03-21 08:16:16 +00:00
Fabian Reinartz 3e6c890aea api: add flag to skip head on snapshots 2018-03-08 13:07:12 +01:00
Conor Broderick 99006d3baf Added dropped targets API to targets endpoint (#3870) 2018-02-21 17:26:18 +00:00
Conor Broderick 1fd20fc954 Add dropped alertmanagers to alertmanagers API (#3865) 2018-02-21 09:00:07 +00:00
Bartek Plotka 93a63ac5fd api: Added v1/status/flags endpoint. (#3864)
Endpoint URL: /api/v1/status/flags
Example Output:
```json
{
  "status": "success",
  "data": {
    "alertmanager.notification-queue-capacity": "10000",
    "alertmanager.timeout": "10s",
    "completion-bash": "false",
    "completion-script-bash": "false",
    "completion-script-zsh": "false",
    "config.file": "my_cool_prometheus.yaml",
    "help": "false",
    "help-long": "false",
    "help-man": "false",
    "log.level": "info",
    "query.lookback-delta": "5m",
    "query.max-concurrency": "20",
    "query.timeout": "2m",
    "storage.tsdb.max-block-duration": "36h",
    "storage.tsdb.min-block-duration": "2h",
    "storage.tsdb.no-lockfile": "false",
    "storage.tsdb.path": "data/",
    "storage.tsdb.retention": "15d",
    "version": "false",
    "web.console.libraries": "console_libraries",
    "web.console.templates": "consoles",
    "web.enable-admin-api": "false",
    "web.enable-lifecycle": "false",
    "web.external-url": "",
    "web.listen-address": "0.0.0.0:9090",
    "web.max-connections": "512",
    "web.read-timeout": "5m",
    "web.route-prefix": "/",
    "web.user-assets": ""
  }
}
```

Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
2018-02-21 08:49:02 +00:00
Fabian Reinartz 7ccd4b39b8 *: implement query params
This adds a parameter to the storage selection interface which allows
query engine(s) to pass information about the operations surrounding a
data selection.
This can for example be used by remote storage backends to infer the
correct downsampling aggregates that need to be provided.
2018-02-13 12:17:22 +01:00
Krasi Georgiev b75428ec19 rename package retrieve to scrape
no fucnctinal changes just renaming retrieval to scrape
2018-02-01 09:55:07 +00:00
Goutham Veeramachaneni 2d73d2b892
Merge pull request #3570 from Gouthamve/colon-snapshot
Make the date returned by snapshot script friendly
2017-12-11 19:04:10 -08:00
Goutham Veeramachaneni bee6864c14 Make the date returned by snapshot script friendly
Fixes #3568

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-12-10 15:14:31 -06:00
Ed Schouten bb724f1bef Deprecate DeduplicateSeriesSet() in favor of NewMergeSeriesSet().
Federation makes use of dedupedSeriesSet to merge SeriesSets for every
query into one output stream. If many match[] arguments are provided,
many dedupedSeriesSet objects will get chained. This has the downside of
causing a potential O(n*k) running time, where n is the number of series
and k the number of match[] arguments.

In the mean time, the storage package provides a mergeSeriesSet that
accomplishes the same with an O(n*log(k)) running time by making use of
a binary heap. Let's just get rid of dedupedSeriesSet and change all
existing callers to use mergeSeriesSet.
2017-12-10 19:51:20 +01:00
Goutham Veeramachaneni e0d917e2f5
Merge pull request #3523 from Gouthamve/clean-tomb
Add endpoint to cleanup tombstones
2017-12-07 14:39:24 -06:00
Goutham Veeramachaneni f0599d4dbf Incorporate review-feedback
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-12-07 09:06:04 -06:00
Julius Volz ab11a457e8 Remove obsolete TODO in API code
In https://github.com/prometheus/prometheus/pull/3230/files, contexts were
added to the Querier() method instead, and Cortex is fine with that.
2017-12-07 23:01:13 +08:00
Goutham Veeramachaneni 311edc5a38 Merge branch 'master' into clean-tomb
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-12-05 10:23:21 -06:00
Goutham Veeramachaneni d8515b2580 Move Admin APIs to v1
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-12-04 00:13:43 +05:30
Brian Brazil d7b3df5ae1 Fix staticcheck errors 2017-12-02 14:52:13 +00:00
Goutham Veeramachaneni 3de10e3b44
Add CleanTombstones API endpoint
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-11-30 19:51:44 +05:30
Fabian Reinartz 83cd270ea4 *: adapt to storage interface changes 2017-11-23 19:05:04 +01:00
David Kaltschmidt af75ce02c1 Review feedback
* renamed MakeQueryStats
* added stats to query() as well
* gofmt
2017-11-16 16:30:48 +01:00
David Kaltschmidt c93e54d240 Adds execution timer stats to the range query
API consumers should be able to get insight into the query run times.
The UI currently measures total roundtrip times. This PR allows for more
fine grained metrics to be exposed.

* adds new timer for total execution time (queue + eval)

* expose new timer, queue timer, and eval timer in stats field of the
 range query response:
```json
{
  "status": "success",
  "data": {
    "resultType": "matrix",
    "result": [],
    "stats": {
      "execQueueTimeNs": 4683,
      "execTotalTimeNs": 2086587,
      "totalEvalTimeNs": 2077851
    }
  }
}
```

* stats field is optional, only set when query parameter `stats` is not
empty

Try it via
```sh
curl 'http://localhost:9090/api/v1/query_range?query=up&start=1486480279&end=1486483879&step=14000&stats=true'
```

Review feedback

* moved query stats json generation to query_stats.go
* use seconds for all query timers
* expose all timers available
* Changed ExecTotalTime string representation from Exec queue total time to Exec total time
2017-11-16 16:05:10 +01:00
Alexey Miroshkin 8c681f4a6c Provide POST endpoint for query+query_range (#3322)
This PR fixes #3072 by providing POST endpoints for `query` and `query_range`.
POST request must be made with `Content-Type: application/x-www-form-urlencoded` header.
2017-11-11 01:53:48 +01:00
Tom Wilkie 746752b946 Merge external labels in order. 2017-10-26 11:44:49 +01:00
Tom Wilkie b22485bef0 Remove spurious test import. 2017-10-26 11:09:43 +01:00