Commit graph

207 commits

Author SHA1 Message Date
Jesus Vazquez c1b669bf9b
Add out-of-order sample support to the TSDB (#11075)
* Introduce out-of-order TSDB support

This implementation is based on this design doc:
https://docs.google.com/document/d/1Kppm7qL9C-BJB1j6yb6-9ObG3AbdZnFUBYPNNWwDBYM/edit?usp=sharing

This commit adds support to accept out-of-order ("OOO") sample into the TSDB
up to a configurable time allowance. If OOO is enabled, overlapping querying
are automatically enabled.

Most of the additions have been borrowed from
https://github.com/grafana/mimir-prometheus/
Here is the list ist of the original commits cherry picked
from mimir-prometheus into this branch:
- 4b2198d7ec
- 2836e5513f
- 00b379c3a5
- ff0dc75758
- a632c73352
- c6f3d4ab33
- 5e8406a1d4
- abde1e0ba1
- e70e769889
- df59320886

Co-authored-by: Jesus Vazquez <jesus.vazquez@grafana.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
Co-authored-by: Dieter Plaetinck <dieter@grafana.com>
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* gofumpt files

Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* Add license header to missing files

Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* Fix OOO tests due to existing chunk disk mapper implementation

Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* Fix truncate int overflow

Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* Add Sync method to the WAL and update tests

Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* remove useless sync

Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* Update minOOOTime after truncating Head

* Update minOOOTime after truncating Head

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix lint

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Add a unit test

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* Load OutOfOrderTimeWindow only once per appender

Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* Fix OOO Head LabelValues and PostingsForMatchers

Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* Fix replay of OOO mmap chunks

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Remove unnecessary err check

Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* Prevent panic with ApplyConfig

Signed-off-by: Ganesh Vernekar 15064823+codesome@users.noreply.github.com
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* Run OOO compaction after restart if there is OOO data from WBL

Signed-off-by: Ganesh Vernekar 15064823+codesome@users.noreply.github.com
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* Apply Bartek's suggestions

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* Refactor OOO compaction

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Address comments and TODOs

- Added a comment explaining why we need the allow overlapping
  compaction toggle
- Clarified TSDBConfig OutOfOrderTimeWindow doc
- Added an owner to all the TODOs in the code

Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* Run go format

Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* Fix remaining review comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix tests

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Change wbl reference when truncating ooo in TestHeadMinOOOTimeUpdate

Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* Fix TestWBLAndMmapReplay test failure on windows

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Address most of the feedback

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Refactor the block meta for out of order

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix windows error

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Signed-off-by: Ganesh Vernekar 15064823+codesome@users.noreply.github.com
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
Co-authored-by: Dieter Plaetinck <dieter@grafana.com>
Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com>
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
2022-09-20 22:35:50 +05:30
Bryan Boreham c438b50133 cmd/promtool: in tests use labels.FromStrings
Replacing code which assumes the internal structure of `Labels`.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-09-09 13:34:49 +02:00
Cosrider bef6556ca5
delete redundant alias (#11180)
Signed-off-by: Cosrider <cosrider7@gmail.com>

Signed-off-by: Cosrider <cosrider7@gmail.com>
2022-08-31 15:50:38 +02:00
Bryan Boreham 8b863c42dd
Optimise relabeling by re-using memory (#11147)
* model/relabel: Add benchmark

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* model/relabel: re-use Builder across relabels

Saves memory allocations.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* labels.Builder: allow re-use of result slice

This reduces memory allocations where the caller has a suitable slice available.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* model/relabel: re-use source values slice

To reduce memory allocations.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Unwind one change causing test failures

Restore original behaviour in PopulateLabels, where we must not overwrite the input set.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* relabel: simplify values optimisation

Use a stack-based array for up to 16 source labels, which will be the
vast majority of cases.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* lint

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-08-19 15:27:52 +05:30
Levi Harrison fa9bc5184a
Update and fix interface (#11131)
Signed-off-by: Levi Harrison <git@leviharrison.dev>
2022-08-10 10:14:52 +02:00
Levi Harrison d61459d826
no-default-scrape-port feature flag (#9523)
* Add `no-default-scrape-port` flag

Signed-off-by: Levi Harrison <git@leviharrison.dev>
2022-07-20 13:35:47 +02:00
Julien Pivotto 13bd4fd3c8
Fix promtool check config not erroring properly on failures (#10952)
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
2022-07-01 14:38:49 +02:00
David Leadbeater 355b8bcf0b
Add --lint-fatal option (#10815)
This keeps the previous behaviour of printing details about duplicate
rules but doesn't exit with a fatal exit code unless turned on.

Signed-off-by: David Leadbeater <dgl@dgl.cx>
2022-06-03 23:33:39 +10:00
Matthieu MOREL 36eee11434
refactor (package cmd): move from github.com/pkg/errors to 'errors' and 'fmt' packages (#10733)
Signed-off-by: Matthieu MOREL <mmorel-35@users.noreply.github.com>

Co-authored-by: Matthieu MOREL <mmorel-35@users.noreply.github.com>
2022-05-24 16:58:59 +10:00
Ben Ye af5ea213f7
promtool: support matchers when querying label values (#10727)
* promtool: support matchers when querying label values

Signed-off-by: Ben Ye <ben.ye@bytedance.com>

* address review comment

Signed-off-by: Ben Ye <ben.ye@bytedance.com>
2022-05-23 11:10:45 +10:00
Matthieu MOREL e2ede285a2
refactor: move from io/ioutil to io and os packages (#10528)
* refactor: move from io/ioutil to io and os packages
* use fs.DirEntry instead of os.FileInfo after os.ReadDir

Signed-off-by: MOREL Matthieu <matthieu.morel@cnp.fr>
2022-04-27 11:24:36 +02:00
Filip Petkovski 1c1b174a8e
Add a --lint flag to the promtool check rules and check config commands (#10435)
* Add a --lint flag to the promtool check rules and check config commands

Checking rules with promtool emits warnings in the case of duplicate rules.
These warnings do not result in a non-zero exit code and are difficult to
spot in CI environments. Additionally, checking for duplicates is closer
to a lint check rather than a syntax check.

This commit adds a --lint flag to commands which include checking rules.
The flag can be used to enable or disable certain linting options
and cause the execution to return a non-zero exit code in case
those options are not met.

Signed-off-by: fpetkovski <filip.petkovsky@gmail.com>

* Exit with status 3 on lint error

Signed-off-by: fpetkovski <filip.petkovsky@gmail.com>
2022-04-06 00:05:11 -04:00
Julien Pivotto f9d8e5245a
Plugins support (#10495)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2022-03-29 14:44:39 +02:00
Wilbert Guo 83a2e52bc2
Add SyncForState Implementation for Ruler HA (#10070)
* continuously syncing activeAt for alerts

Signed-off-by: Yijie Qin <qinyijie@amazon.com>
Signed-off-by: Wilbert Guo <wilbeguo@amazon.com>

* add import

Signed-off-by: Yijie Qin <qinyijie@amazon.com>
Signed-off-by: Wilbert Guo <wilbeguo@amazon.com>

* Refactor SyncForState and add unit tests

Signed-off-by: Wilbert Guo <wilbeguo@amazon.com>

* Format code

Signed-off-by: Wilbert Guo <wilbeguo@amazon.com>

* Add hook for syncForState

Signed-off-by: Wilbert Guo <wilbeguo@amazon.com>

Fix go lint

Signed-off-by: Wilbert Guo <wilbeguo@amazon.com>

Refactor syncForState override implementation

Signed-off-by: Wilbert Guo <wilbeguo@amazon.com>

Add syncForState override func as argument to Update()

Signed-off-by: Wilbert Guo <wilbeguo@amazon.com>

Fix go formatting

Signed-off-by: Wilbert Guo <wilbeguo@amazon.com>

Fix circleci test errors

Signed-off-by: Wilbert Guo <wilbeguo@amazon.com>

Remove overrideFunc as argument to run()

Signed-off-by: Wilbert Guo <wilbeguo@amazon.com>

* remove the syncForState

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

* use the override function to decide if need to replace the activeAt or not

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

* fix test case

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

* fix format

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

* Trigger build

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

* fixing comments

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

* return the result of map of alerts instead of single one

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

* upper case the QueryforStateSeries

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

* use a more generic rule group post process function type

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

* fix indentation

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

* fix gofmt

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

* fix lint

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

* fixing naming

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

* fix comments

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

* add the lastEvalTimestamp as parameter

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

* fmt

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

* change funcType to func

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

Co-authored-by: Yijie Qin <qinyijie@amazon.com>
Co-authored-by: Yijie Qin <63399121+qinxx108@users.noreply.github.com>
2022-03-29 02:16:46 +02:00
Alan Protasio 606ef33d91 Track and report Samples Queried per query
We always track total samples queried and add those to the standard set
of stats queries can report.

We also allow optionally tracking per-step samples queried. This must be
enabled both at the engine and query level to be tracked and rendered.
The engine flag is exposed via a Prometheus feature flag, while the
query flag is set when stats=all.

Co-authored-by: Alan Protasio <approtas@amazon.com>
Co-authored-by: Andrew Bloomgarden <blmgrdn@amazon.com>
Co-authored-by: Harkishen Singh <harkishensingh@hotmail.com>
Signed-off-by: Andrew Bloomgarden <blmgrdn@amazon.com>
2022-03-21 23:49:17 +01:00
ian woolf 025528a5d6
cmd: use os.MkdirTemp instead of ioutil.TempDir (#10320)
Signed-off-by: ianwoolf <btw515wolf2@gmail.com>
2022-03-08 14:08:53 +01:00
Łukasz Mierzwa a4317bf0ec
Run gofumpt on all files (#10392)
* Run gofumpt on all files

Getting golangci-lint errors when building on my laptop, possibly because I have newer version of gofumpt then what it was formatted with.
Run gofumpt -w -extra on all files as it will be needed in the future anyway.

* Update golangci-lint to v1.44.2

v1.44.0 upgraded gofumpt so bumping version in CI will help keep formatting correct for everyone

* Address golangci-lint error

Getting 'error-strings: error strings should not be capitalized or end with punctuation or a newline' from revive here.
Drop new line.

Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
2022-03-03 17:21:05 +01:00
beorn7 b39f2739e5 PromQL: Always enable negative offset and @ modifier
This follows the line of argument that the invariant of not looking
ahead of the query time was merely emerging behavior and not a
documented stable feature. Any query that looks ahead of the query
time was simply invalid before the introduction of the negative offset
and the @ modifier.

Signed-off-by: beorn7 <beorn@grafana.com>
2022-01-11 17:08:55 +01:00
chenlujjj 2ce94ac196
Add '--weight' flag to 'promtool check metrics' command (#10045) 2022-01-07 16:58:28 -05:00
David Leadbeater a961062c37
Disable time based retention in tests (#8818)
Fixes #7699.

Signed-off-by: David Leadbeater <dgl@dgl.cx>
2022-01-02 23:46:03 +01:00
Jessica G 174a1147d5
Merge pull request #9861 from JessicaGreben/minor-prom-improvements
Add exit code constants in promtool
2021-12-31 12:07:02 -08:00
jessicagreben 4b03fa3100 replace config exit code with failure exit code
Signed-off-by: jessicagreben <jessicagrebens@gmail.com>
2021-12-30 05:37:57 -08:00
jessicagreben 59f7ef06d0 update exit code for sd
Signed-off-by: jessicagreben <jessicagrebens@gmail.com>
2021-12-18 04:45:15 -08:00
Nicholas Blott c92673fb14
Remove check against cfg so interval/ timeout are always set (#10023)
Signed-off-by: Nicholas Blott <blottn@tcd.ie>
2021-12-16 13:28:46 +01:00
zzehring 42628899b5 promtool: Add --syntax-only flag for check config
This commit adds a `--syntax-only` flag for `promtool check config`.
When passing in this flag, promtool will omit various file existence
checks that would cause the check to fail (e.g. the check would not
fail if `rule_files` files don't exist at their respective paths).

This functionality will allow CI systems to check the syntax of
configs without worrying about referenced files.

Fixes: #5222
Signed-off-by: zzehring <zack.zehring@grafana.com>
2021-12-02 15:33:11 -05:00
jessicagreben 99bb56fc46 add errcodes from sd file
Signed-off-by: jessicagreben <jessicagrebens@gmail.com>
2021-12-01 04:45:18 -08:00
Filip Petkovski 5849521e90
promtool: Fix credentials file check (#9883)
The promtool check config command still uses the bearer_token_file
field which is deprecated in favour of authorization.credentials_file.

This commit modifies the command to use the new field insted.

Fixes #9874

Signed-off-by: fpetkovski <filip.petkovsky@gmail.com>
2021-11-30 15:02:07 +11:00
jessicagreben 764f2d03a5 add const for exit codes
Signed-off-by: jessicagreben <jessicagrebens@gmail.com>
2021-11-24 09:17:49 -08:00
Matheus Alcantara d9a8c453a0
cmd: use t.TempDir instead of ioutil.TempDir on tests (#9852) 2021-11-23 20:09:28 -05:00
beorn7 c954cd9d1d Move packages out of deprecated pkg directory
This creates a new `model` directory and moves all data-model related
packages over there:
  exemplar labels relabel rulefmt textparse timestamp value

All the others are more or less utilities and have been moved to `util`:
  gate logging modetimevfs pool runtime

Signed-off-by: beorn7 <beorn@grafana.com>
2021-11-09 08:03:10 +01:00
Dieter Plaetinck cda025b5b5
TSDB: demistify SeriesRefs and ChunkRefs (#9536)
* TSDB: demistify seriesRefs and ChunkRefs

The TSDB package contains many types of series and chunk references,
all shrouded in uint types.  Often the same uint value may
actually mean one of different types, in non-obvious ways.

This PR aims to clarify the code and help navigating to relevant docs,
usage, etc much quicker.

Concretely:

* Use appropriately named types and document their semantics and
  relations.
* Make multiplexing and demuxing of types explicit
  (on the boundaries between concrete implementations and generic
  interfaces).
* Casting between different types should be free.  None of the changes
  should have any impact on how the code runs.

TODO: Implement BlockSeriesRef where appropriate (for a future PR)

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* feedback

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* agent: demistify seriesRefs and ChunkRefs

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
2021-11-06 15:40:04 +05:30
Mateusz Gozdek b7bdf6fab2 Fix imports formatting
According to
2829908806 (r58457095).

Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
2021-11-02 19:52:34 +01:00
Mateusz Gozdek 1a6c2283a3 Format Go source files using 'gofumpt -w -s -extra'
Part of #9557

Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
2021-11-02 19:52:34 +01:00
Darshan Chaudhary a7e554b158
add check service-discovery command (#8970)
Signed-off-by: darshanime <deathbullet@gmail.com>
2021-11-01 14:42:12 +01:00
Arthur Silva Sens be2599c853
config: Make remote-write required for Agent mode (#9618)
* config: Make remote-write required for Agent mode

Signed-off-by: ArthurSens <arthursens2005@gmail.com>
2021-10-30 01:41:40 +02:00
David Leadbeater c91c2bbea5
promtool: Show more human readable got/exp output (#8064)
Avoid using %#v, nothing needs to parse this, so escaping " and so on
leads to hard to read output.

Add new lines, number and indentation to each alert series output.

Signed-off-by: David Leadbeater <dgl@dgl.cx>
2021-10-28 22:17:18 +11:00
DrAuYueng 69e309d202
Expose TargetsFromGroup/AlertmanagerFromGroup func and reuse this for (#9343)
static/file sd config check in promtool

Signed-off-by: DrAuYueng <ouyang1204@gmail.com>
2021-10-28 02:01:28 +02:00
Julien Pivotto 73255e15f6 Address golint failures from revive
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2021-10-23 00:53:11 +02:00
Will Tran 97b0738895
add --max-block-duration in promtool create-blocks-from rules (#9511)
* support maxBlockDuration for promtool tsdb create-blocks-from rules

Fixes #9465

Signed-off-by: Will Tran <will@autonomic.ai>

* don't hardcode 2h as the default block size in rules test

Signed-off-by: Will Tran <will@autonomic.ai>
2021-10-21 23:28:37 +02:00
jessicagreben 60d0990886 add more explicit label values
Signed-off-by: jessicagreben <jessicagrebens@gmail.com>
2021-10-18 01:04:13 +02:00
jessicagreben 3da87d2f39 add unit test to check label rule labels override
Signed-off-by: jessicagreben <jessicagrebens@gmail.com>
2021-10-18 01:04:13 +02:00
Julien Pivotto f8372bc6b9 backfill: Apply rule labels after query labels
Fix #9419

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2021-10-18 01:04:13 +02:00
Julien Pivotto bd217c58a7
Backfill: Do not query after --end (#9340)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2021-09-15 16:02:41 +02:00
Julien Pivotto 1ea774f184
Merge pull request #9339 from roidelapluie/remove-double-align
backfill: Do not align the start of the group since we align every rule.
2021-09-14 23:46:25 +02:00
Julien Pivotto 2bde71ec5f
Merge pull request #9338 from prometheus/release-2.30
merge back release 2.30
2021-09-14 23:46:11 +02:00
Julien Pivotto 691ce066fb backfill: Do not align the start of the group since we align every rule.
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2021-09-14 23:13:06 +02:00
jessicagreben b0a21f9eab rm overlap, add label builder to fix name bug
Signed-off-by: jessicagreben <jessicagrebens@gmail.com>
2021-09-13 10:32:08 -07:00
fpetkovski 449f874679 promtool: add extended flag for tsdb analysis
The compaction analysis which runs under promtool tsdb analyze can be an
intensive process which slows down the entire command.

This commit adds an --extended flag to tsdb analyze which can be toggled
for running long running tasks, such as compaction analysis.

Signed-off-by: fpetkovski <filip.petkovsky@gmail.com>
2021-09-08 10:50:01 +02:00
Julien Pivotto ad642a85c0
Merge pull request #9304 from LeviHarrison/backfill-fix-date
Rules backfill: fix new rule importer message
2021-09-07 18:01:03 +02:00
Julien Pivotto bd24e2fb92
Merge pull request #9303 from LeviHarrison/backfill-return-1
Rules backfill: return 1 if unsuccessful
2021-09-07 18:00:42 +02:00