API consumers should be able to get insight into the query run times.
The UI currently measures total roundtrip times. This PR allows for more
fine grained metrics to be exposed.
* adds new timer for total execution time (queue + eval)
* expose new timer, queue timer, and eval timer in stats field of the
range query response:
```json
{
"status": "success",
"data": {
"resultType": "matrix",
"result": [],
"stats": {
"execQueueTimeNs": 4683,
"execTotalTimeNs": 2086587,
"totalEvalTimeNs": 2077851
}
}
}
```
* stats field is optional, only set when query parameter `stats` is not
empty
Try it via
```sh
curl 'http://localhost:9090/api/v1/query_range?query=up&start=1486480279&end=1486483879&step=14000&stats=true'
```
Review feedback
* moved query stats json generation to query_stats.go
* use seconds for all query timers
* expose all timers available
* Changed ExecTotalTime string representation from Exec queue total time to Exec total time
This PR fixes#3072 by providing POST endpoints for `query` and `query_range`.
POST request must be made with `Content-Type: application/x-www-form-urlencoded` header.
* Re-add contexts to storage.Storage.Querier()
These are needed when replacing the storage by a multi-tenant
implementation where the tenant is stored in the context.
The 1.x query interfaces already had contexts, but they got lost in 2.x.
* Convert promql.Engine to use native contexts
This PR adds the `/status/config` endpoint which exposes the currently
loaded Prometheus config. This is the same config that is displayed on
`/config` in the UI in YAML format. The response payload looks like
such:
```
{
"status": "success",
"data": {
"yaml": <CONFIG>
}
}
```
* Use request.Context() instead of a global map of contexts.
* Add some basic opentracing instrumentation on the query path.
* Remove tracehandler endpoint.
* Fixed int64 overflow for timestamp in v1/api parseDuration and parseTime
This led to unexpected results on wrong query with "(...)&start=148966367200.372&end=1489667272.372"
That query is wrong because of `start > end` but actually internal int64 overflow caused start to be something around MinInt64 (huge negative value) and was passing validation.
BTW: Not sure if negative timestamp makes sense even.. But model.Earliest is actually MinInt64, can someone explain me why?
Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
* Added missing trailing periods on comments.
Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
* MOved to only `<` and `>`. Removed equal.
Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
retreival.Target contains a mutex. It was copied in the Targets()
call. This potentially can wreak a lot of havoc.
It might even have caused the issues reported as #2266 and #2262 .
This extracts Querier as an instantiateable and closeable object
rather than just defining extending methods of the storage interface.
This improves composability and allows abstracting query transactions,
which can be useful for transaction-level caches, consistent data views,
and encapsulating teardown.
This is based on https://github.com/prometheus/prometheus/pull/1997.
This adds contexts to the relevant Storage methods and already passes
PromQL's new per-query context into the storage's query methods.
The immediate motivation supporting multi-tenancy in Frankenstein, but
this could also be used by Prometheus's normal local storage to support
cancellations and timeouts at some point.
For Weaveworks' Frankenstein, we need to support multitenancy. In
Frankenstein, we initially solved this without modifying the promql
package at all: we constructed a new promql.Engine for every
query and injected a storage implementation into that engine which would
be primed to only collect data for a given user.
This is problematic to upstream, however. Prometheus assumes that there
is only one engine: the query concurrency gate is part of the engine,
and the engine contains one central cancellable context to shut down all
queries. Also, creating a new engine for every query seems like overkill.
Thus, we want to be able to pass per-query contexts into a single engine.
This change gets rid of the promql.Engine's built-in base context and
allows passing in a per-query context instead. Central cancellation of
all queries is still possible by deriving all passed-in contexts from
one central one, but this is now the responsibility of the caller. The
central query context is now created in main() and passed into the
relevant components (web handler / API, rule manager).
In a next step, the per-query context would have to be passed to the
storage implementation, so that the storage can implement multi-tenancy
or other features based on the contextual information.
See discussion in
https://groups.google.com/forum/#!topic/prometheus-developers/bkuGbVlvQ9g
The main idea is that the user of a storage shouldn't have to deal with
fingerprints anymore, and should not need to do an individual preload
call for each metric. The storage interface needs to be made more
high-level to not expose these details.
This also makes it easier to reuse the same storage interface for remote
storages later, as fewer roundtrips are required and the fingerprint
concept doesn't work well across the network.
NOTE: this deliberately gets rid of a small optimization in the old
query Analyzer, where we dedupe instants and ranges for the same series.
This should have a minor impact, as most queries do not have multiple
selectors loading the same series (and at the same offset).
I got feedback from different sources about rules and targets being
too heavy in the status tab if their are lots of them.
This change also allows for more fine-granular locking.
Prometheus is Apache 2 licensed, and most source files have the
appropriate copyright license header, but some were missing it without
apparent reason. Correct that by adding it.
The chunk encoding was hardcoded there because it mostly doesn't
matter what encoding is chosen in that test. Since type 1 is
battle-hardened enough, I'm switching to type 2 here so that we can
catch unexpected problems as a byproduct. My expectation is that the
chunk encoding doesn't matter anyway, as said, but then "unexpected
problems" contains the word "unexpected".
WIP: This needs more tests.
It now gets a from and through value, which it may opportunistically
use to optimize the retrieval. With possible future range indices,
this could be used in a very efficient way. This change merely applies
some easy checks, which should nevertheless solve the use case of
heavy rule evaluations on servers with a lot of series churn.
Idea is the following:
- Only archive series that are at least as old as the headChunkTimeout
(which was already extremely unlikely to happen).
- Then maintain a high watermark for the last archival, i.e. no
archived series has a sample more recent than that watermark.
- Any query that doesn't reach to a time before that watermark doesn't
have to touch the archive index at all. (A production server at
Soundcloud with the aforementioned series churn and heavy rule
evaluations spends 50% of its CPU time in archive index
lookups. Since rule evaluations usually only touch very recent
values, most of those lookup should disappear with this change.)
- Federation with a very broad label matcher will profit from this,
too.
As a byproduct, the un-needed MetricForFingerprint method was removed
from the Storage interface.
It's actually happening in several places (and for flags, we use the
standard Go time.Duration...). This at least reduces all our
home-grown parsing to one place (in model).
This adapts some functionality from the Go standard library for string
literal lexing and unquoting/unescaping.
The following string types are now supported:
Double- or single-quoted strings:
These support all escape sequences that Go supports in double-quoted
string literals. The difference is that Prometheus also has
single-quoted strings (instead of single-quoted runes in Go). Raw
newlines are not allowed.
Backtick-quoted raw strings:
Strings quoted in backticks are treated as raw strings just like in Go
and may contain raw newlines and other special characters directly.
Fixes https://github.com/prometheus/prometheus/issues/1122
Fixes https://github.com/prometheus/prometheus/issues/1121
This is with `golint -min_confidence=0.5`.
I left several lint warnings untouched because they were either
incorrect or I felt it was better not to change them at the moment.
This change is conceptually very simple, although the diff is large. It
switches logging from "github.com/golang/glog" to
"github.com/prometheus/log", while not actually changing any log
messages. V(1)-style logging has been changed to be log.Debug*().