Commit graph

68 commits

Author SHA1 Message Date
Julien Duchesne 8855c2e626
Add prometheus_tsdb_clean_start metric (#8824)
Add cleanup of the lockfile when the db is cleanly closed

The metric describes the status of the lockfile on startup
0: Already existed
1: Did not exist
-1: Disabled

Therefore, if the min value over time of this metric is 0, that means that executions have exited uncleanly
We can then use that metric to have a much lower threshold on the crashlooping alert:

If the metric exists and it has been zero, two restarts is enough to trigger the alarm
If it does not exist (old prom version for example), the current five restarts threshold remains

Signed-off-by: Julien Duchesne <julien.duchesne@grafana.com>

* Change metric name + set unset value to -1

Signed-off-by: Julien Duchesne <julien.duchesne@grafana.com>

* Only check the last value of the clean start alert

Signed-off-by: Julien Duchesne <julien.duchesne@grafana.com>

* Fix test + nit

Signed-off-by: Julien Duchesne <julien.duchesne@grafana.com>
2021-06-16 15:03:02 +05:30
Levi Harrison b5f6f8fb36 Switched to go-kit/log
Signed-off-by: Levi Harrison <git@leviharrison.dev>
2021-06-11 12:28:36 -04:00
Levi Harrison 7bc11dcb06
React UI: Add Starting Screen (#8662)
* Added walreplay API endpoint

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Added starting page to react-ui

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Documented the new endpoint

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Fixed typos

Signed-off-by: Levi Harrison <git@leviharrison.dev>

Co-authored-by: Julius Volz <julius.volz@gmail.com>

* Removed logo

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Changed isResponding to isUnexpected

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Changed width of progress bar

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Changed width of progress bar

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Added DB stats object

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Updated starting page to work with new fields

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Passing nil

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Passing nil (pt. 2)

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Passing nil (pt. 3)

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Passing nil (and also implementing a method this time) (pt. 4)

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Passing nil (and also implementing a method this time) (pt. 5)

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Changed const to let

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Passing nil (pt. 6)

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Remove SetStats method

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Added comma

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Changed api

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Changed to triple equals

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Fixed data response types

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Don't return pointer

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Changed version

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Fixed interface issue

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Fixed pointer

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Fixed copying lock value error

Signed-off-by: Levi Harrison <git@leviharrison.dev>

Co-authored-by: Julius Volz <julius.volz@gmail.com>
2021-06-05 15:29:32 +01:00
Ben Ye 0a8912433a
allow compact series merger to be configurable (#8836)
Signed-off-by: yeya24 <yb532204897@gmail.com>
2021-05-18 18:38:37 +02:00
nberkley f9e2dd0697
Add support for smaller block chunk segment allocations (#8478)
* Add support for --storage.tsdb.max-chunk-size to suport small chunks for space limited prometheus instances.

Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com>

* Update tsdb/compact.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com>

* Update tsdb/db.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com>

* Update cmd/prometheus/main.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com>

* Change naming scheme to

Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com>

* Add a lower bound to --storage.tsdb.max-block-chunk-segment-size

Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com>

* Update storage.md to explain what a chunk segment is

Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com>

* Apply suggestions from code review

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com>

* Force tests

Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com>

* Fix code style

Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com>

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
2021-04-15 14:25:01 +05:30
Bryan Boreham c7a62b95ce
GetRef() now returns the label set (#8641)
The purpose of GetRef() is to allow Append() to be called without
the caller needing to copy the labels. To avoid a race where a series
is removed from TSDB between the calls to GetRef() and Append(), we
return TSDB's copy of the labels.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2021-03-24 15:24:58 +00:00
Bryan Boreham d614ae9ecf
[RFC] Add method to get reference number for TSDB Appender (#8600)
* Add method to get reference number for TSDB Appender

In situations where we need to copy labels before calling Add(),
GetRef() allows to check first, then call AddFast() in the case that the series
is already known.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Add explicit interface for GetRef() method

Suggested in code review by @bwplotka

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Rename OptionalGetRef to GetRef

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Simplify return value of GetRef()

0 can be relied on to mean 'no reference'

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2021-03-19 19:28:55 +00:00
Callum Styan 289ba11b79
Add circular in-memory exemplars storage (#6635)
* Add circular in-memory exemplars storage

Signed-off-by: Callum Styan <callumstyan@gmail.com>
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Signed-off-by: Martin Disibio <mdisibio@gmail.com>

Co-authored-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Co-authored-by: Tom Wilkie <tom.wilkie@gmail.com>
Co-authored-by: Martin Disibio <mdisibio@gmail.com>

* Fix some comments, clean up exemplar metrics struct and exemplar tests.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Fix exemplar query api null vs empty array issue.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

Co-authored-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Co-authored-by: Tom Wilkie <tom.wilkie@gmail.com>
Co-authored-by: Martin Disibio <mdisibio@gmail.com>
2021-03-16 15:17:45 +05:30
Arthur Silva Sens 6a3d55db0a
Rolling tombstones clean up (#8007)
* CleanupTombstones refactored, now reloading blocks after every compaction.

The goal is to remove deletable blocks after every compaction and, thus, decrease disk space used when cleaning tombstones.

Signed-off-by: arthursens <arthursens2005@gmail.com>

* Protect DB against parallel reloads

Signed-off-by: ArthurSens <arthursens2005@gmail.com>

* Fix typos

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

Co-authored-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2021-02-17 11:02:43 +05:30
Dustin Hooten b9f0baf6ff
Combine NewHead() args into a HeadOptions struct (#8452)
* Combine NewHead() args into a HeadOptions struct

Signed-off-by: Dustin Hooten <dustinhooten@gmail.com>

* remove overrides params

Signed-off-by: Dustin Hooten <dustinhooten@gmail.com>

* address pr feedback

Signed-off-by: Dustin Hooten <dustinhooten@gmail.com>
2021-02-09 19:42:48 +05:30
Nguyen Le Vu Long fbe960f2c1
fix: remove pre-2.21 tmp blocks on start (#8353)
* fix: remove pre-2.21 tmp blocks on start

Signed-off-by: Nguyen Le Vu Long <vulongvn98@gmail.com>

* fix: commenting

Signed-off-by: Nguyen Le Vu Long <vulongvn98@gmail.com>
2021-01-09 10:02:26 +01:00
Arthur Silva Sens 7e932637e3
Reload tsdb blocks every minute (#8340)
* Reload tsdb blocks every minute

Signed-off-by: ArthurSens <arthursens2005@gmail.com>

* Proteced tsdb with mutex locks

Signed-off-by: ArthurSens <arthursens2005@gmail.com>
2021-01-07 13:00:08 +05:30
arthursens 8493704b9b Change seconds()*1000 to milliseconds()
Signed-off-by: arthursens <arthursens2005@gmail.com>
2020-12-25 10:45:23 -03:00
Ganesh Vernekar faa1554aa1
Don't call runtime.GC() after compaction (#8276)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-12-22 14:44:17 +00:00
Arthur Silva Sens 64a106c5dd
Logging added for when compaction takes more than the block time range (#8151)
* Logging added for when compaction takes more than the block time range

Signed-off-by: arthursens <arthursens2005@gmail.com>

* Log only if no errors were already logged

Signed-off-by: arthursens <arthursens2005@gmail.com>

* Log duration as human readable string

Signed-off-by: arthursens <arthursens2005@gmail.com>

* Move logging from compactHead() to Compact()

Signed-off-by: arthursens <arthursens2005@gmail.com>

* Compute duration of all head compactions plus wal truncation

Signed-off-by: arthursens <arthursens2005@gmail.com>

* Remove named return added os first commits

Signed-off-by: arthursens <arthursens2005@gmail.com>

* Address nits

Signed-off-by: arthursens <arthursens2005@gmail.com>

* Change miliseconds to seconds to make fuzzit tests happy

Signed-off-by: ArthurSens <arthursens2005@gmail.com>
2020-12-07 21:29:43 +00:00
Marco Pracucci db19e05d93
Add option to customise head chunks write buffer size (#8201)
* Add option to customise head chunks write buffer size

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed tests

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2020-11-19 18:30:47 +05:30
Bartlomiej Plotka 3d8826a3d4
MultiError: Refactored MultiError for more concise and safe usage. (#8066)
* MultiError: Refactored MultiError for more concise and safe usage.

* Less lines
* Goland IDE was marking every usage of old MultiError "potential nil" error
* It was easy to forgot using Err() when error was returned, now it's safely assured on compile time.

NOTE: Potentially I would rename package to merrors. (: In different PR.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed review comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fix after rebase.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-10-28 15:24:58 +00:00
Ganesh Vernekar 3245b3267b
Don't use returned DB to close resources on TSDB startup error (#8113)
* Don't use returned DB to close resources on TSDB startup error

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Add unit test and fix another panic

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix review comment

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-10-28 15:39:03 +05:30
Julien Pivotto 4e5b1722b3
Move away from testutil, refactor imports (#8087)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-10-22 11:00:08 +02:00
Arthur Silva Sens c5a832b394
Close resources after failing to startup TSDB (#8031)
* Close resources after failing to startup TSDB

Signed-off-by: arthursens <arthursens2005@gmail.com>

* Return close error instead of logging

Signed-off-by: arthursens <arthursens2005@gmail.com>

* Change named return's name

Signed-off-by: arthursens <arthursens2005@gmail.com>
2020-10-21 20:38:28 +05:30
Brian Brazil fdf1c29224
Fix metric description of prometheus_tsdb_symbol_table_size_bytes. (#8080)
This is how much memory we use to load in the on-disk
symbol tables, not the size of the tables themselves.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2020-10-21 14:35:40 +01:00
Bartlomiej Plotka 2fe1e9fa93
Create a checkpoint only at the end of Compact call (#8067)
* Create a checkpoint only at the end of Compact call

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix review comments

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix Bartek's offline reviews

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Introduce TruncateInMemory and TruncateWAL

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Small enhancements and test fixing attempts

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix tests

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Add TestOneCheckpointPerCompactCall

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Don't truncate WAL on block compaction

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Simplified the algo.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Better protection around calling truncateWAL, truncate WAL on Head compaction error

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

Co-authored-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-10-19 20:57:08 +05:30
Julien Pivotto 59733b1a26
TSDB: use blocks instead of db.blocks in condition (#8068)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-10-19 16:51:54 +05:30
Gayathri Venkatesh 73e2ce1bd6
Do not ignore reload errors in compactHead (#8051)
* Modified unknownRefs to unknownRefs.Load()

Signed-off-by: GayathriVenkatesh <gayaa2010@gmail.com>

* Modified db.go

Signed-off-by: GayathriVenkatesh <gayaa2010@gmail.com>

* Revert "Modified unknownRefs to unknownRefs.Load()"

This reverts commit 79caf595fa9b9193878dc0dd9dec17d58851ae42.

Signed-off-by: GayathriVenkatesh <gayaa2010@gmail.com>

* Made changes to reload error in db.go

Signed-off-by: GayathriVenkatesh <gayaa2010@gmail.com>
2020-10-14 15:05:24 +05:30
Arthur Silva Sens 4f45e201cc
Promtool tsdb list now prints block sizes (#7993)
* promtool tsdb list now prints blocks' size

Signed-off-by: arthursens <arthursens2005@gmail.com>
2020-10-12 23:15:40 +02:00
johncming b521612042
tsdb: simplify code. (#7792)
Signed-off-by: johncming <johncming@yahoo.com>
2020-08-14 15:15:08 +05:30
johncming d19fc71903
tsdb: use NewRangeHead instead. (#7793)
Signed-off-by: johncming <johncming@yahoo.com>
2020-08-13 10:55:35 +01:00
Bartlomiej Plotka f16cbc20d6
tsdb: Bug fix for further continued deletions after crash deletions; added more tests. (#7777)
* tsdb: Bug fix for further continued after crash deletions; added more tests.

Additionally: Added log line for block removal.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed comment.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-08-11 15:53:23 +01:00
Bartlomiej Plotka 4ae2ef94e0
tsdb: Delete blocks atomically; Remove tmp blocks on start; Added test. (#7772)
## Changes:

* Rename dir when deleting
* Ignoring blocks with broken meta.json on start (we do that on reload)
* Compactor writes <ulid>.tmp-for-creation blocks instead of just .tmp
* Delete tmp-for-creation and tmp-for-deletion blocks during DB open.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-08-11 06:56:08 +01:00
Bartlomiej Plotka e6d7cc5fa4
tsdb: Added ChunkQueryable implementations to db; unified MergeSeriesSets and vertical to single struct. (#7069)
* tsdb: Added ChunkQueryable implementations to db; unified compactor, querier and fanout block iterating.

Chained to https://github.com/prometheus/prometheus/pull/7059

* NewMerge(Chunk)Querier now takies multiple primaries allowing tsdb DB code to use it.
* Added single SeriesEntry / ChunkEntry for all series implementations.
* Unified all vertical, and non vertical for compact and querying to single
merge series / chunk sets by reusing VerticalSeriesMergeFunc for overlapping algorithm (same logic as before)
* Added block (Base/Chunk/)Querier for block querying. We then use populateAndTomb(Base/Chunk/) to iterate over chunks or samples.
* Refactored endpoint tests and querier tests to include subtests.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed comments from Brian and Beorn.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed snapshot test and added chunk iterator support for DBReadOnly.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed race when iterating over Ats first.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed tests.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed populate block tests.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed endpoints test.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed test.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Added test & fixed case of head open chunk.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed DBReadOnly tests and bug producing 1 sample chunks.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Added cases for partial block overlap for multiple full chunks.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Added extra tests for chunk meta after compaction.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed small vertical merge bug and added more tests for that.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-07-31 16:03:02 +01:00
Annanay 9bba8a6eae Merge branch 'master' into appender-context
Signed-off-by: Annanay <annanayagarwal@gmail.com>
2020-07-30 16:43:18 +05:30
Annanay 89129cd39a Address comments
Signed-off-by: Annanay <annanayagarwal@gmail.com>
2020-07-30 16:41:13 +05:30
Javier Palomo Almena 348ff4285f
tsdb: Replace sync/atomic with uber-go/atomic in tsdb (#7659)
* tsdb/chunks: Replace sync/atomic with uber-go/atomic

Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>

* tsdb/heaad: Replace sync/atomic with uber-go/atomic

Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>

* vendor: Make go.uber.org/atomic a direct dependency

There is no modifications to go.sum and vendor/ because
it was already vendored.

Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>

* tsdb: Remove comments referring to the sync/atomic alignment bug

Related: https://golang.org/pkg/sync/atomic/#pkg-note-BUG

Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
2020-07-28 10:12:42 +05:30
Annanay 7f98a744e5 Add context to Appender interface
Signed-off-by: Annanay <annanayagarwal@gmail.com>
2020-07-24 19:40:51 +05:30
Ganesh Vernekar 4a8531a64b
BlocksToDelete function in DB options (#7638)
* Optional retention filter for DB

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix review comments

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Specify len for the map creation

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-07-22 20:49:33 +05:30
Ganesh Vernekar e65e2e0dac
Fix panic from db metrics (#7501)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-07-05 10:11:42 +05:30
Bartlomiej Plotka b788986717
storage: Adjusted fully storage layer support for chunk iterators: Remote read client, readyStorage, fanout. (#7059)
* Fixed nits introduced by https://github.com/prometheus/prometheus/pull/7334
* Added ChunkQueryable implementation to fanout and readyStorage.
* Added more comments.
* Changed NewVerticalChunkSeriesMerger to CompactingChunkSeriesMerger, removed tiny interface by reusing VerticalSeriesMergeFunc for overlapping algorithm for
both chunks and series, for both querying and compacting (!) + made sure duplicates are merged.
* Added ErrChunkSeriesSet
* Added Samples interface for seamless []promb.Sample to []tsdbutil.Sample conversion.
* Deprecating non chunks serieset based StreamChunkedReadResponses, added chunk one.
* Improved tests.
* Split remote client into Write (old storage) and read.
* Queryable client is now SampleAndChunkQueryable. Since we cannot use nice QueryableFunc I moved
all config based options to sampleAndChunkQueryableClient to aboid boilerplate.

In next commit: Changes for TSDB.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-06-24 14:41:52 +01:00
Simon Pasquier 2f12049371
tsdb: improve logs when encountering corruption (#7308)
* tsdb: improve logs when encountering corruption

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Wrap corrupted block errors

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Add file path to head chunks

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-06-17 16:40:00 +02:00
Krasimir Georgiev f4dd45609a
Use min and maxt of the range head when creating a block (#7282)
Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
2020-05-22 17:00:06 +05:30
Ganesh Vernekar 1c99adb9fd
Callbacks for lifecycle of series in TSDB (#7159)
* Callbacks for lifecycle of series in TSDB

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Add more comments

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-05-20 18:52:08 +05:30
Ganesh Vernekar d4b9fe801f
M-map full chunks of Head from disk (#6679)
When appending to the head and a chunk is full it is flushed to the disk and m-mapped (memory mapped) to free up memory

Prom startup now happens in these stages
 - Iterate the m-maped chunks from disk and keep a map of series reference to its slice of mmapped chunks.
- Iterate the WAL as usual. Whenever we create a new series, look for it's mmapped chunks in the map created before and add it to that series.

If a head chunk is corrupted the currpted one and all chunks after that are deleted and the data after the corruption is recovered from the existing WAL which means that a corruption in m-mapped files results in NO data loss.

[Mmaped chunks format](https://github.com/prometheus/prometheus/blob/master/tsdb/docs/format/head_chunks.md)  - main difference is that the chunk for mmaping now also includes series reference because there is no index for mapping series to chunks.
[The block chunks](https://github.com/prometheus/prometheus/blob/master/tsdb/docs/format/chunks.md) are accessed from the index which includes the offsets for the chunks in the chunks file - example - chunks of series ID have offsets 200, 500 etc in the chunk files.
In case of mmaped chunks, the offsets are stored in memory and accessed from that. During WAL replay, these offsets are restored by iterating all m-mapped chunks as stated above by matching the series id present in the chunk header and offset of that chunk in that file.

**Prombench results**

_WAL Replay_

1h Wal reply time
30% less wal reply time - 4m31 vs 3m36
2h Wal reply time
20% less wal reply time - 8m16 vs 7m

_Memory During WAL Replay_

High Churn:
10-15% less RAM -  32gb vs 28gb
20% less RAM after compaction 34gb vs 27gb
No Churn:
20-30% less RAM -  23gb vs 18gb
40% less RAM after compaction 32.5gb vs 20gb

Screenshots are in [this comment](https://github.com/prometheus/prometheus/pull/6679#issuecomment-621678932)


Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-05-06 21:00:00 +05:30
Ben Ye 1e4e37144d
Fixed wrongly handled not ready TSDB on web and API. (#7182)
* fix federate endpoint panic

Signed-off-by: yeya24 <yb532204897@gmail.com>

* Fixed all cases of not ready TSDB being wrongly handled.

* Fixed issue for federation.
* Ensured this will never happen again thanks to interfaces
* Fixes same issue for stats.
* Added tests for readiness.
* Fixed bug in stats. It was:
   status.MaxTime = db.Head().MaxTime()
   status.MinTime = db.Head().MaxTime()


Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed Brian's comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed Brian's comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-04-29 17:16:14 +01:00
Marek Slabicki 8224ddec23
Capitalizing first letter of all log lines (#7043)
Signed-off-by: Marek Slabicki <thaniri@gmail.com>
2020-04-11 09:22:18 +01:00
Brad Walker 3348930df5
Replace fileutil.ReadDir with ioutil.ReadDir (#7029) (#7033)
* tsdb: Replace fileutil.ReadDir with ioutil.ReadDir (#7029)

Signed-off-by: Brad Walker <brad@bradmwalker.com>

* tsdb: Remove fileutil.ReadDir (#7029)

Signed-off-by: Brad Walker <brad@bradmwalker.com>
2020-04-06 19:04:20 +05:30
Ben Kochie 269e7c8091
Fix golint issues.
Signed-off-by: Ben Kochie <superq@gmail.com>
2020-03-23 20:38:43 +01:00
Ganesh Vernekar e64a149984
Close Head in DBReadOnly.FlushWAL (#7022)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-03-23 14:49:44 +05:30
李国忠 261cbab8e9
remove Unused parameter 'reg' in wal.Open function (#6941)
Signed-off-by: fuling <fuling.lgz@alibaba-inc.com>
2020-03-10 11:01:47 +05:30
Bartlomiej Plotka a20bebf7eb Moved readyStorage to main.
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-02-17 18:03:57 +00:00
Bartlomiej Plotka 59c9d6ef45 Addressed Brian's comments, moved metrics to main.go
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-02-17 18:03:57 +00:00
Bartlomiej Plotka cfba92a133 Addressed comments.
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-02-17 18:03:57 +00:00