Commit graph

224 commits

Author SHA1 Message Date
Brian Brazil c4b4a58e3a Correctly handle on() in alerts. (#2096)
Fixes #2082
2016-10-19 18:38:26 +01:00
Fabian Reinartz 8fa18d564a storage: enhance Querier interface usage
This extracts Querier as an instantiateable and closeable object
rather than just defining extending methods of the storage interface.
This improves composability and allows abstracting query transactions,
which can be useful for transaction-level caches, consistent data views,
and encapsulating teardown.
2016-10-16 10:39:29 +02:00
Fabian Reinartz ccbce0c51f promql: handle NaN in changes() correctly 2016-09-30 11:04:25 +02:00
Julius Volz c187308366 storage: Contextify storage interfaces.
This is based on https://github.com/prometheus/prometheus/pull/1997.

This adds contexts to the relevant Storage methods and already passes
PromQL's new per-query context into the storage's query methods.
The immediate motivation supporting multi-tenancy in Frankenstein, but
this could also be used by Prometheus's normal local storage to support
cancellations and timeouts at some point.
2016-09-19 16:29:07 +02:00
Julius Volz ed5a0f0abe promql: Allow per-query contexts.
For Weaveworks' Frankenstein, we need to support multitenancy. In
Frankenstein, we initially solved this without modifying the promql
package at all: we constructed a new promql.Engine for every
query and injected a storage implementation into that engine which would
be primed to only collect data for a given user.

This is problematic to upstream, however. Prometheus assumes that there
is only one engine: the query concurrency gate is part of the engine,
and the engine contains one central cancellable context to shut down all
queries. Also, creating a new engine for every query seems like overkill.

Thus, we want to be able to pass per-query contexts into a single engine.

This change gets rid of the promql.Engine's built-in base context and
allows passing in a per-query context instead. Central cancellation of
all queries is still possible by deriving all passed-in contexts from
one central one, but this is now the responsibility of the caller. The
central query context is now created in main() and passed into the
relevant components (web handler / API, rule manager).

In a next step, the per-query context would have to be passed to the
storage implementation, so that the storage can implement multi-tenancy
or other features based on the contextual information.
2016-09-19 15:38:17 +02:00
Tobias Schmidt 29ced0090f Fix common english misspellings 2016-09-14 23:23:28 -04:00
Matt Bostock a0201036fa PromQL: Add tests for time/date funcs with arg
Add tests for the date and time functions where an argument is
specified.

Suggested by @grobie:
https://github.com/prometheus/prometheus/pull/1984#issuecomment-246508286

`1136239445` is the reference time used by Go:
https://golang.org/src/time/format.go
2016-09-12 23:12:43 +01:00
Matt Bostock 9628eb5998 PromQL: Add minute() function
Returns the minutes from the current time in UTC. Related to the
`hour()` function.

Fixes #1983.
2016-09-12 20:34:23 +01:00
Tobias Schmidt 04ae6196f2 Fix parsing of label names which are also keywords
The current separation between lexer and parser is a bit fuzzy when it
comes to operators, aggregators and other keywords. The lexer already
tries to determine the type of a token, even though that type might
change depending on the context.

This led to the problematic behavior that no tokens known to the lexer
could be used as label names, including operators (and, by, ...),
aggregators (count, quantile, ...) or other keywords (for, offset, ...).

This change additionally checks whether an identifier is one of these
types. We might want to check whether the specific item identification
should be moved from the lexer to the parser.
2016-09-07 17:45:58 -04:00
Fabian Reinartz ab88057063 Merge pull request #1908 from prometheus/on-dates
Add various time and date functions
2016-08-30 11:03:23 +02:00
Brian Brazil 4680daf237 Default date functions to current time. 2016-08-29 18:22:12 +01:00
Fabian Reinartz 23ddbd64aa Merge pull request #1925 from hashmap/1898-test-race
Fix data race in lexer and lexer test
2016-08-29 09:28:02 +02:00
Alexey Miroshkin bf0e441576 Instantiate lexer inline for the test
Don't use the lex constructor, remove the constructor introduced in the
prevous commit.
2016-08-29 09:20:43 +02:00
Alexey Miroshkin 485f7dde08 Fix data race in lexer and lexer test
As described in #1898 'go test -race' detects a race in lexer code. This
pacth fixes it and also add '-race' option to test target to prevent
regression.
2016-08-26 17:07:17 +02:00
beorn7 71571a8ec4 promql: Fix (and simplify) populating iterators
This was only relevant so far for the benchmark suite as it would
recycle Expr for repetitions. However, the append is unnecessary as
each node is only inspected once when populating iterators, and
population must always start from scratch.

This also introduces error checking during benchmarks and fixes the so
far undetected test errors during benchmarking.

Also, remove a style nit (two golint warnings less…).
2016-08-24 18:37:09 +02:00
Brian Brazil ea1318f38b Short names of some date related functions 2016-08-23 22:34:22 +01:00
Brian Brazil d2ca2b496a Add days_in_month function. 2016-08-22 21:15:35 +01:00
Brian Brazil 0ed31c8c47 Sort list of functions. 2016-08-22 21:15:34 +01:00
Brian Brazil fd7822829c Add date related functions.
Add day_of_month, day_of_week, hour_of_day, month_of_year and year.
This only work for UTC, and ignore leap seconds the same as Go.
2016-08-22 21:15:30 +01:00
Fabian Stäber 08b6556ee6 Assume counters start at zero after reset. 2016-08-12 20:21:04 +02:00
Fabian Reinartz 98c0d33567 Merge pull request #1875 from brancz/idelta-function
add idelta function
2016-08-08 12:33:07 +02:00
Frederic Branczyk f02df4138c refactor duplication of irate and idelta functions implementations 2016-08-08 10:52:00 +02:00
Frederic Branczyk dbf83666bb add idelta function
similar to the irate function the idelta function calculates the delta
function with the last two values
2016-08-08 10:40:50 +02:00
Frederic Branczyk 0ce5e7fe6d move legacy test for delta function 2016-08-08 10:02:58 +02:00
Julius Volz 3bfec97d46 Make the storage interface higher-level.
See discussion in
https://groups.google.com/forum/#!topic/prometheus-developers/bkuGbVlvQ9g

The main idea is that the user of a storage shouldn't have to deal with
fingerprints anymore, and should not need to do an individual preload
call for each metric. The storage interface needs to be made more
high-level to not expose these details.

This also makes it easier to reuse the same storage interface for remote
storages later, as fewer roundtrips are required and the fingerprint
concept doesn't work well across the network.

NOTE: this deliberately gets rid of a small optimization in the old
query Analyzer, where we dedupe instants and ranges for the same series.
This should have a minor impact, as most queries do not have multiple
selectors loading the same series (and at the same offset).
2016-07-25 13:59:22 +02:00
Brian Brazil 0303ccc6a7 Add quantile aggregator. 2016-07-21 00:09:19 +01:00
Brian Brazil 15f9fe0a45 Factor out quantile fucntion. 2016-07-20 23:56:18 +01:00
Brian Brazil b0342ba9ec Add quantile_over_time function 2016-07-20 23:56:18 +01:00
beorn7 fc6737b7fb storage: improve index lookups
tl;dr: This is not a fundamental solution to the indexing problem
(like tindex is) but it at least avoids utilizing the intersection
problem to the greatest possible amount.

In more detail:

Imagine the following query:

    nicely:aggregating:rule{job="foo",env="prod"}

While it uses a nicely aggregating recording rule (which might have a
very low cardinality), Prometheus still intersects the low number of
fingerprints for `{__name__="nicely:aggregating:rule"}` with the many
thousands of fingerprints matching `{job="foo"}` and with the millions
of fingerprints matching `{env="prod"}`. This totally innocuous query
is dead slow if the Prometheus server has a lot of time series with
the `{env="prod"}` label. Ironically, if you make the query more
complicated, it becomes blazingly fast:

    nicely:aggregating:rule{job=~"foo",env=~"prod"}

Why so? Because Prometheus only intersects with non-Equal matchers if
there are no Equal matchers. That's good in this case because it
retrieves the few fingerprints for
`{__name__="nicely:aggregating:rule"}` and then starts right ahead to
retrieve the metric for those FPs and checking individually if they
match the other matchers.

This change is generalizing the idea of when to stop intersecting FPs
and go into "retrieve metrics and check them individually against
remaining matchers" mode:

- First, sort all matchers by "expected cardinality". Matchers
  matching the empty string are always worst (and never used for
  intersections). Equal matchers are in general consider best, but by
  using some crude heuristics, we declare some better than others
  (instance labels or anything that looks like a recording rule).

- Then go through the matchers until we hit a threshold of remaining
  FPs in the intersection. This threshold is higher if we are already
  in the non-Equal matcher area as intersection is even more expensive
  here.

- Once the threshold has been reached (or we have run out of matchers
  that do not match the empty string), start with "retrieve metrics
  and check them individually against remaining matchers".

A beefy server at SoundCloud was spending 67% of its CPU time in index
lookups (fingerprintsForLabelPairs), serving mostly a dashboard that
is exclusively built with recording rules. With this change, it spends
only 35% in fingerprintsForLabelPairs. The CPU usage dropped from 26
cores to 18 cores. The median latency for query_range dropped from 14s
to 50ms(!). As expected, higher percentile latency didn't improve that
much because the new approach is _occasionally_ running into the worst
case while the old one was _systematically_ doing so. The 99th
percentile latency is now about as high as the median before (14s)
while it was almost twice as high before (26s).
2016-07-20 17:35:53 +02:00
Brian Brazil 40f8da699e Merge pull request #1815 from prometheus/stddev
Add stddev_over_time and stdvar_over_time.
2016-07-19 15:48:32 +01:00
Brian Brazil 1edd6875f5 Add stddev_over_time and stdvar_over_time. 2016-07-16 00:34:44 +01:00
Fabian Reinartz f8bb0ee91f Merge pull request #1793 from prometheus/count_values
Add count_values() aggregator.
2016-07-08 11:50:42 +02:00
Brian Brazil 875818d060 Clean out old keywords 2016-07-07 05:30:48 +01:00
Brian Brazil 16690736ab Add count_values() aggregator.
This is useful for counting how many instances
of a job are running a particular version/build.

Fixes #622
2016-07-05 17:14:01 +01:00
Brian Brazil 7f23a4a099 Add type check on topk/bottomk parameter. 2016-07-04 18:03:05 +01:00
Brian Brazil fa9cc15573 Add topk/bottomk tests for multiple buckets. 2016-07-04 13:18:28 +01:00
Brian Brazil 3b0c182eee Move topk/bottomk unittests over to aggregators. 2016-07-04 13:18:28 +01:00
Brian Brazil 3e5136e36d Make topk/bottomk aggregators. 2016-07-04 13:18:19 +01:00
Fabian Reinartz 4d1985e405 Merge pull request #1778 from mattbostock/fix_annotations
promql: Fix annotations conflated with labels
2016-07-01 11:45:18 +02:00
Matt Bostock cc98e164d3 promql: Fix annotations conflated with labels
When converting `AlertStmt` to a string, the alert rule labels were
printed as `ANNOTATIONS` instead of the annotations themselves.

Fix and add a test to catch future regressions.
2016-07-01 10:39:17 +01:00
Brian Brazil 3b89616d82 Allow on, ignoring, by and without wit empty laberls.
This offers new semantics in allowing on() for matching
two single-element vectors with no known common labels.
Previosuly this was often done using on(dummy).

This also allows making it explict that you meant
to do an aggregation without labels via by().

Fixes #1597.
2016-06-24 14:12:51 +01:00
Brian Brazil 246a817300 Flip vector matching to be ignoring by default.
This is a noop semantically.
2016-06-23 17:23:44 +01:00
Julius Volz b7b6717438 Separate query interface out of local.Storage.
PromQL only requires a much narrower interface than local.Storage in
order to run queries. Narrower interfaces are easier to replace and
test, too.

We could also change the web interface to use local.Querier, except that
we'll probably use appending functions from there in the future.
2016-06-23 15:14:38 +02:00
Fabian Reinartz 0e281f5500 Merge pull request #1687 from royels/issue-1629
Added power binop
2016-06-23 10:28:57 +02:00
royels 2fdc5717a3 promql: add power binary operation 2016-06-22 23:34:46 -04:00
Fatih Arslan 362e44501a promql: fix printing annotations of an *AlertStmt
Currently the printer doesn't print the annotations of an `*AlertStmt`
declaration. I've added a test case as well, which fails for the current
master.
2016-06-16 17:43:54 +03:00
beorn7 e3ec8fa83b Merge branch 'release-0.19' 2016-05-29 21:06:44 +02:00
beorn7 5408666387 Correctly stringify GROUP_x modifiers without labels
Since rule evaluations work via String(), this fixes evaluation of
rules containing GROUP_x modifiers without labels. This change is the
minimal bugfix (so that we can release a fixed version without
risk). It does not intend to implement any additional features (like
allowing `GROUP_LEFT()` or `ON()` or even `ON` - see discussion in
https://github.com/prometheus/prometheus/issues/1597 ).
2016-05-28 20:15:02 +02:00
Ali Reza e7eba75690 remove keeping_extra because it's replaced with keep_common
change all keepExtra label into keepCommon, and move action into removed list

change incorrect token list
2016-05-27 00:02:04 +07:00
Brian Brazil 74094947ea effect -> affect 2016-05-12 15:14:48 +01:00