537 commits

Author SHA1 Message Date
Julien Pivotto 6f9e7ff750
Drop metric name in bool comparison between two instant vectors (#7819)
* Drop metric name in bool comparison between two instant vectors

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-08-22 21:04:03 +02:00
Julien Pivotto 20ab94fedf
Hints: Separating out the range and offsets of PromQL subqueries (#7667)
Fix #7629

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-08-11 07:21:39 +01:00
Annanay Agarwal 118aeab02c
Make context key type public (#7748)
Signed-off-by: Annanay <annanayagarwal@gmail.com>
2020-08-05 09:51:36 +01:00
Julien Pivotto d867491364
Human-friendly durations in PromQL (#7713)
* Add support for user-friendly durations

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-08-04 21:12:41 +02:00
johncming 31929b83d5
promql: use explicit type declaration instead of string. (#7716)
Signed-off-by: johncming <johncming@yahoo.com>
2020-08-02 09:57:38 +01:00
johncming 1c1b394e5e
promql: Swap order of parseBrokenJSON. (#7718)
Signed-off-by: johncming <johncming@yahoo.com>
2020-08-02 09:48:57 +01:00
Bartlomiej Plotka e6d7cc5fa4
tsdb: Added ChunkQueryable implementations to db; unified MergeSeriesSets and vertical to single struct. (#7069)
* tsdb: Added ChunkQueryable implementations to db; unified compactor, querier and fanout block iterating.

Chained to https://github.com/prometheus/prometheus/pull/7059

* NewMerge(Chunk)Querier now takes multiple primaries allowing tsdb DB code to use it.
* Added single SeriesEntry / ChunkEntry for all series implementations.
* Unified all vertical, and non vertical for compact and querying to single
merge series / chunk sets by reusing VerticalSeriesMergeFunc for overlapping algorithm (same logic as before)
* Added block (Base/Chunk/)Querier for block querying. We then use populateAndTomb(Base/Chunk/) to iterate over chunks or samples.
* Refactored endpoint tests and querier tests to include subtests.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed comments from Brian and Beorn.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed snapshot test and added chunk iterator support for DBReadOnly.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed race when iterating over Ats first.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed tests.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed populate block tests.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed endpoints test.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed test.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Added test & fixed case of head open chunk.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed DBReadOnly tests and bug producing 1 sample chunks.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Added cases for partial block overlap for multiple full chunks.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Added extra tests for chunk meta after compaction.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed small vertical merge bug and added more tests for that.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-07-31 16:03:02 +01:00
Annanay 9bba8a6eae Merge branch 'master' into appender-context
Signed-off-by: Annanay <annanayagarwal@gmail.com>
2020-07-30 16:43:18 +05:30
Julien Pivotto 22acb87e09
refactoring: make sure that query_duration_seconds metrics are the same (#7668)
* refactoring: make sure that query_duration_seconds are the same

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-07-25 11:55:59 +02:00
Owen Diehl 00b7bdb1b6
parser.{Expr,Statement} publicly implementable (#7639)
* parser.{Expr,Statement} publicly implementable

Signed-off-by: Owen Diehl <ow.diehl@gmail.com>
2020-07-25 09:05:58 +01:00
Annanay 7f98a744e5 Add context to Appender interface
Signed-off-by: Annanay <annanayagarwal@gmail.com>
2020-07-24 19:40:51 +05:30
Guangwen Feng 6b7ac2ac1b
Add unit test case to improve test coverage for matcher.go (#7658)
Signed-off-by: Guangwen Feng <fenggw-fnst@cn.fujitsu.com>
2020-07-24 11:21:42 +01:00
Julien Pivotto 93e9c010f3
Add more Go leak tests (#7652)
* Implement go leak test for promql

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>

* Implement go leak test for Consul SD

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>

* Implement go leak test in discovery manager

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-07-24 10:10:20 +01:00
Bartlomiej Plotka 841b13641c
promql: Refactored subquery hint tests and added todos. (#7636)
* promql: Refactored subquery hint tests and added todos.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* fmt.


Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixes.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-07-23 23:05:43 +01:00
Bartlomiej Plotka a0df8a383a
promql: Removed global and add ability to have better interval for subqueries if not specified (#7628)
* promql: Removed global and add ability to have better interval for subqueries if not specified

## Changes
* Refactored tests for better hints testing
* Added various TODO in places to enhance.
* Moved DefaultEvalInterval global to opts with func(rangeMillis int64) int64 function instead

Motivation: At Thanos we would love to have better control over the subquery step/interval.
This is important for choosing a proper resolution. I think having a proper step also does no harm for
Prometheus and remote read users. Especially on a stateless querier we do not know the evaluation interval,
and in fact assuming a global one can be wrong even for Prometheus.

I think ideally we could try to have at least 3 samples within the range, the same
way the Prometheus UI and Grafana assume.

Anyway, this interface allows deciding on a per-PromQL-user basis.

Open question: Is taking parent interval a smart move?

Motivation for removing global: I spent 1h fighting with:


=== RUN   TestEvaluations
    TestEvaluations: promql_test.go:31: unexpected error: error evaluating query "absent_over_time(rate(nonexistant[5m])[5m:])" (line 687): unexpected error: runtime error: integer divide by zero
--- FAIL: TestEvaluations (0.32s)
FAIL

In the end I found that this fails on most versions, including this master, if you run this test alone. If run together with many
other tests it passes. This is due to SetDefaultEvaluationInterval(1 * time.Minute)
in a test that is run before TestEvaluations. Thanks to globals (:

Let's fix it by dropping this global.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Added issue links for TODOs.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Removed irrelevant changes.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-07-22 14:39:51 +01:00
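
The idea in a0df8a383a above, replacing a package-level default interval with a func(rangeMillis int64) int64 supplied through options, can be sketched roughly as follows. This is a minimal illustration and not the engine's actual code: engineOpts, subqueryInterval, threeSamples and subquerySteps are hypothetical names, and only the function shape and the "about 3 samples within the range" heuristic come from the commit message.

```go
package main

import "fmt"

// engineOpts is a hypothetical stand-in for the engine options. Only the
// func(rangeMillis int64) int64 shape is taken from the commit message.
type engineOpts struct {
	subqueryInterval func(rangeMillis int64) int64
}

// threeSamples picks an interval that yields about three samples in the
// range. It never returns zero, so callers can divide by it safely.
func threeSamples(rangeMillis int64) int64 {
	if iv := rangeMillis / 3; iv > 0 {
		return iv
	}
	return 1
}

// subquerySteps counts how many evaluation steps fit in the range. The
// interval comes from the options, not from a mutable global that another
// test may or may not have set beforehand.
func subquerySteps(opts engineOpts, rangeMillis int64) int64 {
	return rangeMillis / opts.subqueryInterval(rangeMillis)
}

func main() {
	opts := engineOpts{subqueryInterval: threeSamples}
	fmt.Println(subquerySteps(opts, 300000)) // a 5m range, in milliseconds
}
```

Because the interval is passed in explicitly, the result no longer depends on the order in which tests run.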
Guangwen Feng b30654211c
Fix incorrect arguments order in TestExprString (#7602)
Signed-off-by: Guangwen Feng <fenggw-fnst@cn.fujitsu.com>
2020-07-17 13:38:04 +01:00
Julien Pivotto d77b56e88e
Fix avg_over_time for nan and float64 overflows (#7346)
* Fix avg_over_time with Inf and NaN values

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-07-13 17:30:50 +02:00
Tobias Guggenmos 1f73073d73
Make without a valid metric identifier (#7533)
Discussion see #7532.

cc @juliusv

Signed-off-by: Tobias Guggenmos <tobias.guggenmos@uni-ulm.de>
2020-07-08 12:58:12 +02:00
Julien Pivotto 72425d4e3d
Add group() aggregator (#7480)
* Add group() aggregator

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-06-30 16:51:18 +02:00
Guangwen Feng 9ab072b470
Fix golint issue caused by typo (#7475)
Signed-off-by: Guangwen Feng <fenggw-fnst@cn.fujitsu.com>
2020-06-28 11:03:09 +01:00
Linas Medžiūnas 7eaffa7180
Fix off-by-one error in funcHistogramQuantile / ensureMonotonic (#7393)
* Fix off-by-one error in funcHistogramQuantile / ensureMonotonic
* Additional coverage for nonmonotonic histogram buckets

Signed-off-by: Linas Medziunas <linas.medziunas@gmail.com>
2020-06-15 11:32:10 +01:00
Kemal Akkoyun 66dfb951c4
*: Consistent Error/Warning handling for SeriesSet iterator: Allowing Async Select (#7251)
* Add errors and Warnings to SeriesSet

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Change Querier interface and refactor accordingly

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Refactor promql/engine to propagate warnings at eval stage

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Address review issues

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Make sure all the series from all Selects are pre-advanced

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Address review issues

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Separate merge series sets

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Clean

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Refactor merge querier failure handling

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Refactored and simplified fanout with improvements from incoming chunk iterator PRs.

* Secondary logic is hidden, instead of weird failed series set logic we had.
* Fanout is well commented
* Fanout closing records all errors
* MergeQuerier improved API (clearer)
* deferredGenericMergeSeriesSet is not needed as we return no samples anyway for failed series sets (next = false).

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fix formatting

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Fix CI issues

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Added final tests for error handling.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed Brian's comments.

* Moved hints in populate to be allocated only when needed.
* Used sync.Once in secondary Querier to achieve all-or-nothing partial response logic.
* Select after first Next is done will panic.

NOTE: in lazySeriesSet in theory we could just panic, I think however we can
totally just return error, it will panic in expand anyway.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Utilize errWithWarnings

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Fix recently introduced expansion issue

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Add tests for secondary querier error handling

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Implement lazy merge

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Add name to test cases

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Reorganize

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Address review comments

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Address review comments

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Remove redundant warnings

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Fix rebase mistake

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-06-09 17:57:31 +01:00
Julien Pivotto 4284dd1f1b
promql: cleanup: use errors.As (#7351)
This was TODO because circleci was not in go1.13 yet.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-06-05 19:51:25 +02:00
B++ d6374ae1b6
Return NaN for histogram_quantile when buckets have 0 observations (#7318)
Signed-off-by: jberny <f.bernardi89@gmail.com>
2020-06-01 09:40:39 +01:00
Julien Pivotto 58c445e6ef
Fuzz: limit input size (#7317)
We know that fuzzParseExpr and fuzzParseMetricSelector make use of heavy
things like regexes, which take a fairly big amount of memory.

OSS-Fuzz does not offer a proper way to increase the memory [1], therefore
we limit the input size [2].

[1] https://google.github.io/oss-fuzz/faq/#how-do-you-handle-timeouts-and-ooms
[2] https://google.github.io/oss-fuzz/getting-started/new-project-guide/#input-size

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-05-31 09:42:56 +02:00
Brian Brazil 3932a7149f
Correctly track points no longer used by matrixIterSlice's slice. (#7307)
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2020-05-28 13:36:30 +01:00
Callum Styan 5bb7f00d00
change labelset comparison in promql engine to avoid false positive during detection of duplicates (#7058)
* Use go1.14 new hash/maphash to hash both RHS and LHS instead of XOR'ing
which has been resulting in hash collisions.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Refactor engine labelset signature generation, just use labels.Labels
instead of hashes.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Address review comments; function comments + store result of
lhs.String+rhs.String as key.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Replace all signatureFunc usage with signatureFuncString.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Make optimizations to labels String function and generation of rhs+lhs
as string in resultMetric.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Use separate string functions that don't use strconv just for engine
maps.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Use a byte invalid separator instead of quoting and have a buffer
attached to EvalNodeHelper instead of using a global pool in the labels
package.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Address review comments.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Address more review comments, labels has a function that now builds a
byte slice without turning it into a string.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Use two different non-ascii hex codes as byte separators between labels
and between sets of labels when building bytes of a Labels struct.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* We only need the 2nd byte invalid sep. at the beginning of a
labels.Bytes

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2020-05-12 14:03:15 -07:00
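
One way XOR-combined hashes can produce false duplicate detections, as 5bb7f00d00 above describes, is that XOR is symmetric: swapping the two sides gives the same value. The sketch below shows that single failure mode and a separator-joined key that avoids it. It is an illustration under assumptions, not the engine's code: hashString, xorSignature and keySignature are hypothetical names, FNV is used only as a convenient standard-library hash, and the 0xff separator is an assumed example of a byte that cannot occur in valid UTF-8 label text.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// hashString hashes a string with FNV-1a (chosen here only for convenience).
func hashString(s string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(s))
	return h.Sum64()
}

// xorSignature combines both sides with XOR. XOR is commutative, so
// (lhs, rhs) and (rhs, lhs) always produce the same signature.
func xorSignature(lhs, rhs string) uint64 {
	return hashString(lhs) ^ hashString(rhs)
}

// sep is a byte that is invalid in UTF-8, so it cannot appear inside
// either side and the joined key stays unambiguous.
const sep = "\xff"

// keySignature keeps both sides verbatim, joined by the separator.
func keySignature(lhs, rhs string) string {
	return lhs + sep + rhs
}

func main() {
	a, b := `job="api"`, `job="db"`
	fmt.Println(xorSignature(a, b) == xorSignature(b, a)) // symmetric: collides
	fmt.Println(keySignature(a, b) == keySignature(b, a)) // order preserved
}
```

Using the full byte sequence as a map key trades some memory for the guarantee that two different label-set pairs never share a key.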
Ben Ye 1e4e37144d
Fixed wrongly handled not ready TSDB on web and API. (#7182)
* fix federate endpoint panic

Signed-off-by: yeya24 <yb532204897@gmail.com>

* Fixed all cases of not ready TSDB being wrongly handled.

* Fixed issue for federation.
* Ensured this will never happen again thanks to interfaces
* Fixes same issue for stats.
* Added tests for readiness.
* Fixed bug in stats. It was:
   status.MaxTime = db.Head().MaxTime()
   status.MinTime = db.Head().MaxTime()


Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed Brian's comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed Brian's comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-04-29 17:16:14 +01:00
Vasily Sliouniaev 0393b188c9
Add Jaeger (#7148)
* Trace remote read

Signed-off-by: vas <vasily.sliouniaev@jet.com>

* Use jaeger

Signed-off-by: vas <vasily.sliouniaev@jet.com>
2020-04-23 02:05:55 +02:00
Julien Pivotto 1f6f8e60ee promql/parser: Cleanup generatedParserResult across reuse
Reusing the same generatedParserResult ends up in strange panics:
See #7131 and #7127.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-04-16 01:51:08 +02:00
Marek Slabicki 8224ddec23
Capitalizing first letter of all log lines (#7043)
Signed-off-by: Marek Slabicki <thaniri@gmail.com>
2020-04-11 09:22:18 +01:00
Brian Brazil 7646cbca32
Use .UTC everywhere we use time.Unix (#7066)
time.Unix attaches the local timezone, which can then
leak out (e.g. in the alert json). While this is harmless,
we should be consistent.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2020-03-29 17:35:39 +01:00
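
The behaviour that 7646cbca32 above refers to is easy to see in isolation: time.Unix returns a Time in the local location, and calling UTC() on it pins the location without changing the instant. A minimal sketch follows; the helper name utcLocationName is hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// utcLocationName converts Unix seconds to a Time pinned to UTC and
// returns the name of its location.
func utcLocationName(sec int64) string {
	return time.Unix(sec, 0).UTC().Location().String()
}

func main() {
	local := time.Unix(0, 0) // carries the machine's local location
	utc := local.UTC()       // same instant, UTC location

	fmt.Println(local.Location()) // the local location of the host
	fmt.Println(utc.Location())
	fmt.Println(local.Equal(utc)) // the instant itself is unchanged
}
```

Since both values denote the same instant, the difference only shows up when the time is formatted or serialized, which matches the alert JSON example in the commit message.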
Ben Kochie 269e7c8091
Fix golint issues.
Signed-off-by: Ben Kochie <superq@gmail.com>
2020-03-23 20:38:43 +01:00
johncming bdc45c2b9e
remove unused code. (#7019)
Signed-off-by: johncming <johncming@yahoo.com>
2020-03-21 07:15:51 +00:00
Björn Rabenstein 1da83305be
Merge pull request #7009 from prometheus/release-2.17
Merge release-2.17 into master
2020-03-19 13:46:28 +01:00
Tobias Guggenmos 012161d90d
PromQL: Fix lexer error handling (#6958)
* PromQL: Fix lexer error handling

This fixes bugs in the handling of lexer errors that are only noticeable for users of the language server and caused https://github.com/prometheus-community/promql-langserver/issues/104 .

Signed-off-by: Tobias Guggenmos <tobias.guggenmos@uni-ulm.de>

* Add test for error position ranges

Signed-off-by: Tobias Guggenmos <tobias.guggenmos@uni-ulm.de>
2020-03-16 15:47:47 +01:00
Björn Rabenstein a28fa010ee
TSDB: Extract parts out of populateSeries (#6983)
This addresses fabxc's TODO.

More importantly, it now properly defers the
querier.Close(). Previously, if a panic happened after creation of the
querier within the populateSeries function, querier.Close() was never called.

The latter was responsible for #6977.

Signed-off-by: beorn7 <beorn@grafana.com>
2020-03-14 09:03:40 +01:00
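
The point made in a28fa010ee above, that a deferred Close still runs when the function panics, can be sketched as follows. fakeQuerier, populate and closedAfterPanic are hypothetical stand-ins and do not reproduce the actual populateSeries code.

```go
package main

import "fmt"

// fakeQuerier is a hypothetical stand-in for a querier that must be closed.
type fakeQuerier struct{ closed bool }

func (q *fakeQuerier) Close() error {
	q.closed = true
	return nil
}

// populate defers Close right after it takes ownership of the querier, so
// Close runs even when the body panics.
func populate(q *fakeQuerier) {
	defer q.Close()
	panic("simulated failure while populating series")
}

// closedAfterPanic runs populate, recovers from its panic, and reports
// whether the querier was closed during unwinding.
func closedAfterPanic() (closed bool) {
	q := &fakeQuerier{}
	defer func() {
		recover()
		closed = q.closed
	}()
	populate(q)
	return false // not reached: populate always panics
}

func main() {
	fmt.Println(closedAfterPanic())
}
```

Without the defer, a Close call placed at the end of populate would be skipped by the panic, which is the kind of leak the commit message attributes to #6977.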
Bartlomiej Plotka fe802f29c9 storage: Removed SelectSorted method; Simplified interface; Added requirement for remote read to sort response.
This is technically a BREAKING CHANGE, but it was like this from the beginning: I just noticed that in
Prometheus we rely on remote read being sorted. This is because we use selected data from remote reads in MergeSeriesSet,
which relies on sorting.

I found during work on https://github.com/prometheus/prometheus/pull/5882 that
we do so many repetitions because of this, for no good reason. I think
I found a good balance between convenience and readability with just one method.
The smaller the interface, the better.

Also I don't know what TestSelectSorted was testing, but now it's testing sorting.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-03-13 13:06:25 +00:00
Björn Rabenstein bc703b6456
Use struct{} as underlying type for context keys (#6965)
This is an alternative to #6963.

Signed-off-by: beorn7 <beorn@grafana.com>
2020-03-11 15:05:35 +01:00
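
The pattern named in bc703b6456 above, a struct{}-based type as a context key, looks roughly like this in isolation. The names queryOriginKey, withOrigin, originFrom and roundTrip are hypothetical and are not the keys used in Prometheus.

```go
package main

import (
	"context"
	"fmt"
)

// queryOriginKey is a hypothetical key type. An empty struct takes no
// space, and because the type is distinct, a key of any other type can
// never compare equal to it, even one that looks the same elsewhere.
type queryOriginKey struct{}

// withOrigin stores the origin under the typed key.
func withOrigin(ctx context.Context, origin string) context.Context {
	return context.WithValue(ctx, queryOriginKey{}, origin)
}

// originFrom reads the origin back, reporting whether it was set.
func originFrom(ctx context.Context) (string, bool) {
	v, ok := ctx.Value(queryOriginKey{}).(string)
	return v, ok
}

// roundTrip stores and reads a value, for demonstration.
func roundTrip(origin string) (string, bool) {
	return originFrom(withOrigin(context.Background(), origin))
}

func main() {
	fmt.Println(roundTrip("rule-manager"))
	fmt.Println(originFrom(context.Background()))
}
```

A plain string key would be matched by any package that happens to use the same string, which a dedicated key type rules out.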
Julien Pivotto 5ddd1dcf0f
Fix panic when parsing varargs (#6940)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-03-08 13:09:24 +01:00
Tobias Guggenmos 1dbd799354
PromQL: Fix regression tests (#6935)
This PR fixes the regression tests for the issue fixed in #6931 .

The reason for that is that all of the invalid queries that triggered the regression have become more or less valid syntax in #6933 (they might still fail typechecking).

Signed-off-by: Tobias Guggenmos <tobias.guggenmos@uni-ulm.de>
2020-03-06 08:17:01 +00:00
Brian Brazil 44ad28dd5e
PromQL: Allow more keywords as metric names (#6933)
* Allow more keywords as metric names
* Add documentation about forbidden keywords

Signed-off-by: Tobias Guggenmos <tobias.guggenmos@uni-ulm.de>
2020-03-05 13:20:53 +00:00
Brian Brazil 7164b58945
PromQL: Fix parser panic (#6931)
Signed-off-by: Tobias Guggenmos <tobias.guggenmos@uni-ulm.de>
2020-03-05 08:03:38 +00:00
李国忠 2bf4952049
remove Unused parameter 'sf' in calcTrendValue function (#6900)
Signed-off-by: fuling <fuling.lgz@alibaba-inc.com>
2020-03-03 08:23:27 +00:00
LongKB 82f7ed208b
Remove some duplicated words (#6882)
Signed-off-by: Pham Duc Hanh <hanhpd@fujitsu.com>
2020-02-27 07:08:31 +01:00
Tobias Guggenmos 3d74fcfa6a Bartek's suggestions
Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>
2020-02-25 13:57:30 +01:00
Tobias Guggenmos f9db320e5a Look up function call in all cases
Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>
2020-02-24 13:45:03 +01:00
Tobias Guggenmos 9ebf6bd1e6 Remove superfluous blank lines
Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>
2020-02-21 13:36:57 +01:00
Tobias Guggenmos 7143d64fc1
Julien's suggestion
Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>

Co-Authored-By: Julien Pivotto <roidelapluie@gmail.com>
2020-02-21 13:34:15 +01:00
Tobias Guggenmos 4124828c00 Add test to check that promql.FunctionCalls and parser.Functions contain the same functions
Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>
2020-02-21 12:43:30 +01:00