Commit graph

106 commits

Author SHA1 Message Date
Tobias Guggenmos b18b6cb332 PromQL: Avoid lexer item copies and allocations (#6584)
* PromQL: Avoid lexer item copies and allocations

Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>
2020-01-09 11:26:58 +00:00
Tobias Guggenmos 4aef43a9f5 Cleanup: Remove parser switching logic (#6583)
During the PromQL parser rewrite there was some logic put in place that allowed switching between the non generated and the generated parser. Since the parser is now fully generated this is not needed anymore.

Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>
2020-01-08 14:59:25 +00:00
Tobias Guggenmos 2064abab40 Cleanup: Remove unused variable (#6581)
Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>
2020-01-08 14:58:52 +00:00
Tobias Guggenmos 7c7746257c Cleanup: remove function that does not do anything (#6580)
Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>
2020-01-08 14:58:21 +00:00
Tobias Guggenmos 3d6cf1c289 PromQL: Make parser completely generated (#6548)
Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>
2020-01-08 11:04:47 +00:00
Josh Soref 91d76c8023 Spelling (#6517)
* spelling: alertmanager

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: attributes

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: autocomplete

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: bootstrap

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: caught

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: chunkenc

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: compaction

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: corrupted

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: deletable

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: expected

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: fine-grained

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: initialized

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: iteration

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: javascript

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: multiple

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: number

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: overlapping

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: possible

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: postings

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: procedure

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: programmatic

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: queuing

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: querier

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: repairing

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: received

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: reproducible

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: retention

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: sample

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: segements

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: semantic

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: software [LICENSE]

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: staging

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: timestamp

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: unfortunately

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: uvarint

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: subsequently

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: ressamples

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-01-02 15:54:09 +01:00
Tobias Guggenmos 213a8fe89a PromQL: Parse Series descriptions using the generated parser (#6494)
* Use generated parser for series descriptions

Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>
2019-12-21 08:44:12 +00:00
Tobias Guggenmos db1258f2a5 PromQL: Refactor error message generation (#6481)
* Add parser method to produce errors messages about unexpected items
* PromQL: use parser.unexpected in generated parser

Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>
2019-12-18 17:36:43 +00:00
Tobias Guggenmos 9e34f08ac3 PromQL: Parse grouping opts with the generated parser (#6472)
* PromQL: Parse grouping opts with the generated parser

Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>
2019-12-18 14:18:52 +00:00
Tobias Guggenmos 53615412b4 PromQL: Parse Metrics using generated parser (#6466)
* Parse Metrics with the generated parser

Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>
2019-12-16 16:38:18 +00:00
Tobias Guggenmos 8cb4a48e2e PromQL: Parse label sets using the generated parser (#6432)
* Add grammar for label_sets
* Parse label Sets using the generated parser
* Allow trailing commas for label sets and selectors
* Add test to trigger all possible error messages for label matchers

Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>
2019-12-16 13:58:46 +00:00
Tobias Guggenmos 5c503d85f7 PromQL: export lexer (#6435)
Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>
2019-12-09 19:03:31 +00:00
Tobias Guggenmos 35c1f31721 PromQL: Use more standart format for error positions (#6433)
The most common format (used by go, gcc and clang) for compiler error positions seems to be

`filename:line:char:` or `line:char:` if the filename is unknown.

This PR adapts the PromQL parser to use this convention.

Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>
2019-12-09 16:39:03 +00:00
Tobias Guggenmos 3bb715031f PromQL: Use generated parser to parse label matchers (#6410)
Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>
2019-12-05 16:16:12 +00:00
Tobias Guggenmos 408574a6e1 promql: Allow injecting fake tokens into the generated parser (#6381)
* promql: Allow injecting fake tokens into the generated parser

Yacc grammars do not support having multiple start symbols.

To work around that restriction, it is possible to inject fake tokens into the lexer stream,
as described here https://www.gnu.org/software/bison/manual/html_node/Multiple-start_002dsymbols.html .

This is part of the parser rewrite effort described in #6256.

Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>
2019-11-27 12:59:03 +00:00
Tobias Guggenmos bbd92b85da promql: Use capitalized names for item types (#6371)
For yacc generated parsers there is the convention to capitalize the names of item types provided by the lexer, which makes it easy to distinct lexer tokens (capitalized) from nonterminal symbols (not capitalized) in language grammars.

This convention is also followed by the (non generated) go compiler (see https://golang.org/pkg/go/token/#Token).

Part of the parser rewrite described in #6256.

Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>
2019-11-26 13:29:42 +00:00
Tobias Guggenmos c229ed17e2 promql: Implement yyLexer interface (#6370)
This is the first step towards a generated lexer as described in #6256.

It adds methods to the parser struct, that make it implement the yyLexer interface required by a yacc generated parser, as described here: https://godoc.org/golang.org/x/tools/cmd/goyacc .

The yyLexer interface is implemented by the parser struct instead of the lexer struct for the following reasons:

* Both parsers have a lookahead that the lexer does not know about. This solution makes it possible to synchronize these lookaheads when switching parsers.
* The routines to handle parser errors are not accessible to the lexer.

Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>
2019-11-26 13:28:36 +00:00
Tobias Guggenmos c63259b83c promql: Clean up parser struct (#6360)
* promql: Clean up parser struct

The parser struct used two have two somewhat misused fields:

peekCount int
token     [3]item

By reading the code carefully one notices, that peekCount always has the value 0 or 1 and that only the first element of token is ever accessed.

To make this clearer, this commit replaces the token array with a single variable and the peekCount int with a boolean.

Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>
2019-11-25 12:29:14 +00:00
Tobias Guggenmos fe80cf4734 promql: Eliminate dead code (#6215)
peek() already ensures to not return a ItemComment so checking for this
is redundant.

Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>
2019-10-25 10:18:20 +02:00
Bjoern Rabenstein a92ef68dd8 Fix staticcheck errors
Not sure why they only show up now.

Signed-off-by: Bjoern Rabenstein <bjoern@rabenste.in>
2019-04-17 01:40:10 +02:00
Tariq Ibrahim 8fdfa8abea refine error handling in prometheus (#5388)
i) Uses the more idiomatic Wrap and Wrapf methods for creating nested errors.
ii) Fixes some incorrect usages of fmt.Errorf where the error messages don't have any formatting directives.
iii) Does away with the use of fmt package for errors in favour of pkg/errors

Signed-off-by: tariqibrahim <tariq181290@gmail.com>
2019-03-26 00:01:12 +01:00
Julius Volz 8155cc4992
Expose lexer item types (#5358)
* Expose lexer item types

We have generally agreed to expose AST types / values that are necessary
to make sense of the AST outside of the promql package. Currently the
`UnaryExpr`, `BinaryExpr`, and `AggregateExpr` AST nodes store the lexer
item type to indicate the operator type, but since the individual item
types aren't exposed, an external user of the package cannot determine
the operator type. So this PR exposes them.

Although not all item types are required to make sense of the AST (some
are really only used in the lexer), I decided to expose them all here to
be somewhat more consistent. Another option would be to not use lexer
item types at all in AST nodes.

The concrete motivation is my work on the PromQL->Flux transpiler, but
this ought to be useful for other cases as well.

Signed-off-by: Julius Volz <julius.volz@gmail.com>

* Fix item type names in tests

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2019-03-14 20:53:55 +01:00
Nguyen Hai Truong aed9ea144a Remove duplicated words in comments
Although it is spelling mistakes, it might make an affects
while reading.

Co-Authored-By: Kim Bao Long longkb@vn.fujitsu.com
Signed-off-by: Nguyen Hai Truong <truongnh@vn.fujitsu.com>
2019-02-20 17:41:02 -08:00
Ganesh Vernekar dbe55c1352 Subquery (#4831)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2018-12-22 13:47:13 +00:00
Brian Brazil 8edaa8ad4d
Fix goroutine leak in lexer/parser. (#4858)
When there was an error in the parser, the
lexer goroutine was left running.

Also make runtime panic test actually test things.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2018-11-12 18:47:13 +00:00
Frederic Branczyk bda9781ccd
Merge pull request #3839 from brancz/remove-old-alert-record
promql: Remove old and unused alerting/reconding syntax
2018-11-06 15:53:27 +01:00
Ganesh Vernekar 73db8b8cea [bugfix] Parse negative value in PromQL (#4564)
* Parse negative value in PromQL
* Enforce space between values

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2018-09-13 09:08:01 +01:00
Frederic Branczyk b0b3e3dd74
promql: Remove old and unused alerting/reconding syntax
Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com>
2018-08-07 15:14:06 +02:00
Henri DF 986674a790 Make some lexing errors more informative (#4167)
Signed-off-by: Henri DF <henridf@gmail.com>
2018-05-16 16:18:15 +01:00
Warren Fernandes d49a3df55b Parser test cleanup (#3977)
* parser test cleanup

- Test against the exported package functions instead of the private functions.

* Improves readability of TestParseSeries

- Moves package function closer to parser function
2018-03-20 14:30:52 +00:00
Nikunj Aggarwal 998dfcbac6 Expose itemtype outside the package (#3933) 2018-03-08 16:52:44 +00:00
ferhat elmas ffa673f7d8 General simplifications (#3887)
Another try as in #1516
2018-02-26 07:58:10 +00:00
Brian Brazil 99905f82a6 Remove keep_common modifier.
See #3060
2017-10-05 13:27:48 +01:00
Fabian Reinartz d21f149745 *: migrate to go-kit/log 2017-09-08 22:01:51 +05:30
Fabian Reinartz ca2b68889b Merge branch 'master' into dev-2.0 2017-06-23 13:15:44 +02:00
Harsh Agarwal 16867c89a7 implement label_join issue 1147 (#2806)
Replace OptionalArgs int with Variadic int.
2017-06-16 14:51:22 +01:00
Brian Brazil 80b40e6d91 Add initial staleness handing to promql.
For instant vectors, if "stale" is the newest sample
ignore the timeseries.

For range vectors, filter out "stale" samples.

Make it possible to inject "stale" samples in promql tests.
2017-05-16 18:33:51 +01:00
Fabian Reinartz 71fe0c58a8 promql: misc fixes 2016-12-28 11:32:15 +01:00
Fabian Reinartz fecf9532b9 *: fix misc compile errors 2016-12-25 11:42:57 +01:00
Fabian Reinartz 9ea10d5265 promql: use labels.Builder to modify labels 2016-12-24 14:35:24 +01:00
Fabian Reinartz c6cd998905 promql: use local labels, add conversion 2016-12-24 14:01:37 +01:00
Fabian Reinartz 09666e2e2a promql: make scalar public 2016-12-24 10:44:04 +01:00
Fabian Reinartz b3f71df350 promql: make matrix exported 2016-12-24 10:42:54 +01:00
Fabian Reinartz a62df87022 promql: rename vector 2016-12-24 10:40:09 +01:00
Fabian Reinartz 15a931dbdb promql: migrate model types, use tsdb interfaces 2016-12-24 00:39:52 +01:00
Tristan Colgate 68fc15fe4e Report type names in the form used in documentation 2016-11-18 10:12:55 +00:00
Tobias Schmidt 29ced0090f Fix common english misspellings 2016-09-14 23:23:28 -04:00
Tobias Schmidt 04ae6196f2 Fix parsing of label names which are also keywords
The current separation between lexer and parser is a bit fuzzy when it
comes to operators, aggregators and other keywords. The lexer already
tries to determine the type of a token, even though that type might
change depending on the context.

This led to the problematic behavior that no tokens known to the lexer
could be used as label names, including operators (and, by, ...),
aggregators (count, quantile, ...) or other keywords (for, offset, ...).

This change additionally checks whether an identifier is one of these
types. We might want to check whether the specific item identification
should be moved from the lexer to the parser.
2016-09-07 17:45:58 -04:00
Brian Brazil 0303ccc6a7 Add quantile aggregator. 2016-07-21 00:09:19 +01:00
beorn7 fc6737b7fb storage: improve index lookups
tl;dr: This is not a fundamental solution to the indexing problem
(like tindex is) but it at least avoids utilizing the intersection
problem to the greatest possible amount.

In more detail:

Imagine the following query:

    nicely:aggregating:rule{job="foo",env="prod"}

While it uses a nicely aggregating recording rule (which might have a
very low cardinality), Prometheus still intersects the low number of
fingerprints for `{__name__="nicely:aggregating:rule"}` with the many
thousands of fingerprints matching `{job="foo"}` and with the millions
of fingerprints matching `{env="prod"}`. This totally innocuous query
is dead slow if the Prometheus server has a lot of time series with
the `{env="prod"}` label. Ironically, if you make the query more
complicated, it becomes blazingly fast:

    nicely:aggregating:rule{job=~"foo",env=~"prod"}

Why so? Because Prometheus only intersects with non-Equal matchers if
there are no Equal matchers. That's good in this case because it
retrieves the few fingerprints for
`{__name__="nicely:aggregating:rule"}` and then starts right ahead to
retrieve the metric for those FPs and checking individually if they
match the other matchers.

This change is generalizing the idea of when to stop intersecting FPs
and go into "retrieve metrics and check them individually against
remaining matchers" mode:

- First, sort all matchers by "expected cardinality". Matchers
  matching the empty string are always worst (and never used for
  intersections). Equal matchers are in general consider best, but by
  using some crude heuristics, we declare some better than others
  (instance labels or anything that looks like a recording rule).

- Then go through the matchers until we hit a threshold of remaining
  FPs in the intersection. This threshold is higher if we are already
  in the non-Equal matcher area as intersection is even more expensive
  here.

- Once the threshold has been reached (or we have run out of matchers
  that do not match the empty string), start with "retrieve metrics
  and check them individually against remaining matchers".

A beefy server at SoundCloud was spending 67% of its CPU time in index
lookups (fingerprintsForLabelPairs), serving mostly a dashboard that
is exclusively built with recording rules. With this change, it spends
only 35% in fingerprintsForLabelPairs. The CPU usage dropped from 26
cores to 18 cores. The median latency for query_range dropped from 14s
to 50ms(!). As expected, higher percentile latency didn't improve that
much because the new approach is _occasionally_ running into the worst
case while the old one was _systematically_ doing so. The 99th
percentile latency is now about as high as the median before (14s)
while it was almost twice as high before (26s).
2016-07-20 17:35:53 +02:00