78 commits

Author SHA1 Message Date
Ganesh Vernekar d4b9fe801f
M-map full chunks of Head from disk (#6679)
When appending to the head, a chunk that becomes full is flushed to disk and m-mapped (memory-mapped) to free up memory.

Prometheus startup now happens in these stages:
- Iterate the m-mapped chunks from disk and keep a map of series reference to its slice of m-mapped chunks.
- Iterate the WAL as usual. Whenever a new series is created, look for its m-mapped chunks in the map created before and add them to that series.

If a head chunk is corrupted, the corrupted chunk and all chunks after it are deleted, and the data after the corruption is recovered from the existing WAL. A corruption in m-mapped files therefore results in NO data loss.

[M-mapped chunks format](https://github.com/prometheus/prometheus/blob/master/tsdb/docs/format/head_chunks.md): the main difference is that each m-mapped chunk now also includes the series reference, because there is no index for mapping series to chunks.
[The block chunks](https://github.com/prometheus/prometheus/blob/master/tsdb/docs/format/chunks.md) are accessed from the index, which includes the offsets of the chunks in the chunks file. For example, the chunks of a given series ID have offsets 200, 500, etc. in the chunk files.
For m-mapped chunks, the offsets are stored in memory and accessed from there. During WAL replay, these offsets are restored by iterating over all m-mapped chunks, as stated above, and matching the series ID present in each chunk header with the offset of that chunk in its file.
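
The two startup stages and the corruption handling described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `mmappedChunk` and `memSeries` types, the `corrupted` flag (standing in for a failed integrity check), and the functions `buildChunkIndex` and `newSeriesFromWAL` are all hypothetical names invented for this example.

```go
package main

import "fmt"

// mmappedChunk is a hypothetical, simplified record of one chunk found on disk.
type mmappedChunk struct {
	seriesRef uint64 // series reference stored in the chunk header
	fileSeq   int    // sequence number of the head chunk file
	offset    int    // byte offset of the chunk within that file
	corrupted bool   // stands in for a failed integrity check while iterating
}

// buildChunkIndex walks the chunks in on-disk order and groups them by series
// reference (stage one). It stops at the first corrupted chunk and returns how
// many chunks were kept, so the caller can delete the rest and rely on the WAL.
func buildChunkIndex(chunks []mmappedChunk) (map[uint64][]mmappedChunk, int) {
	index := make(map[uint64][]mmappedChunk)
	for i, c := range chunks {
		if c.corrupted {
			return index, i
		}
		index[c.seriesRef] = append(index[c.seriesRef], c)
	}
	return index, len(chunks)
}

// memSeries is a hypothetical in-memory series created during WAL replay.
type memSeries struct {
	ref     uint64
	mmapped []mmappedChunk
}

// newSeriesFromWAL mimics stage two: when WAL replay creates a series, it picks
// up whatever m-mapped chunks were indexed for that series reference.
func newSeriesFromWAL(ref uint64, index map[uint64][]mmappedChunk) *memSeries {
	return &memSeries{ref: ref, mmapped: index[ref]}
}

func main() {
	onDisk := []mmappedChunk{
		{seriesRef: 1, fileSeq: 1, offset: 8},
		{seriesRef: 2, fileSeq: 1, offset: 200},
		{seriesRef: 1, fileSeq: 1, offset: 500},
		{seriesRef: 2, fileSeq: 2, offset: 8, corrupted: true},
		{seriesRef: 1, fileSeq: 2, offset: 300},
	}

	index, kept := buildChunkIndex(onDisk)
	fmt.Println("chunks kept:", kept, "of", len(onDisk))

	s := newSeriesFromWAL(1, index)
	for _, c := range s.mmapped {
		fmt.Printf("series %d: file %d, offset %d\n", s.ref, c.fileSeq, c.offset)
	}
}
```

In this sketch the intact chunk at offset 300 in file 2 is discarded along with the corrupted one before it, which matches the rule that everything after a corruption is dropped and recovered from the WAL instead.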

**Prombench results**

_WAL Replay_

1h WAL replay time
30% less WAL replay time - 4m31 vs 3m36
2h WAL replay time
20% less WAL replay time - 8m16 vs 7m

_Memory During WAL Replay_

High Churn:
10-15% less RAM - 32GB vs 28GB
20% less RAM after compaction - 34GB vs 27GB
No Churn:
20-30% less RAM - 23GB vs 18GB
40% less RAM after compaction - 32.5GB vs 20GB

Screenshots are in [this comment](https://github.com/prometheus/prometheus/pull/6679#issuecomment-621678932)


Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-05-06 21:00:00 +05:30
Ben Ye 1e4e37144d
Fixed wrongly handled not ready TSDB on web and API. (#7182)
* fix federate endpoint panic

Signed-off-by: yeya24 <yb532204897@gmail.com>

* Fixed all cases of not ready TSDB being wrongly handled.

* Fixed issue for federation.
* Ensured this will never happen again thanks to interfaces
* Fixes same issue for stats.
* Added tests for readiness.
* Fixed bug in stats. It was:
   status.MaxTime = db.Head().MaxTime()
   status.MinTime = db.Head().MaxTime()
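
The bug quoted above assigns the head's maximum time to both fields. A minimal sketch of what the corrected assignment presumably looks like is shown below; the `headTimes` interface, `fakeHead`, `status`, and `headStatus` are hypothetical stand-ins for this example and are not taken from the commit.

```go
package main

import "fmt"

// headTimes is a hypothetical stand-in for the part of the head that exposes
// its time range.
type headTimes interface {
	MinTime() int64
	MaxTime() int64
}

// fakeHead is a test double with a fixed time range.
type fakeHead struct{ mint, maxt int64 }

func (h fakeHead) MinTime() int64 { return h.mint }
func (h fakeHead) MaxTime() int64 { return h.maxt }

// status is a hypothetical, reduced version of the stats payload.
type status struct {
	MinTime int64
	MaxTime int64
}

// headStatus fills each field from its matching accessor. The buggy version
// quoted in the commit message used MaxTime() for both fields.
func headStatus(h headTimes) status {
	return status{
		MinTime: h.MinTime(),
		MaxTime: h.MaxTime(),
	}
}

func main() {
	st := headStatus(fakeHead{mint: 1000, maxt: 5000})
	fmt.Println(st.MinTime, st.MaxTime)
}
```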


Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed Brian's comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed Brian's comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-04-29 17:16:14 +01:00
Marek Slabicki 8224ddec23
Capitalizing first letter of all log lines (#7043)
Signed-off-by: Marek Slabicki <thaniri@gmail.com>
2020-04-11 09:22:18 +01:00
Brad Walker 3348930df5
Replace fileutil.ReadDir with ioutil.ReadDir (#7029) (#7033)
* tsdb: Replace fileutil.ReadDir with ioutil.ReadDir (#7029)

Signed-off-by: Brad Walker <brad@bradmwalker.com>

* tsdb: Remove fileutil.ReadDir (#7029)

Signed-off-by: Brad Walker <brad@bradmwalker.com>
2020-04-06 19:04:20 +05:30
Ben Kochie 269e7c8091
Fix golint issues.
Signed-off-by: Ben Kochie <superq@gmail.com>
2020-03-23 20:38:43 +01:00
Ganesh Vernekar e64a149984
Close Head in DBReadOnly.FlushWAL (#7022)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-03-23 14:49:44 +05:30
李国忠 261cbab8e9
remove Unused parameter 'reg' in wal.Open function (#6941)
Signed-off-by: fuling <fuling.lgz@alibaba-inc.com>
2020-03-10 11:01:47 +05:30
Bartlomiej Plotka a20bebf7eb Moved readyStorage to main.
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-02-17 18:03:57 +00:00
Bartlomiej Plotka 59c9d6ef45 Addressed Brian's comments, moved metrics to main.go
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-02-17 18:03:57 +00:00
Bartlomiej Plotka cfba92a133 Addressed comments.
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-02-17 18:03:57 +00:00
Bartlomiej Plotka 2cf637fbf5 Addressed comments.
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-02-17 18:03:57 +00:00
Bartlomiej Plotka 34426766d8 Unify Iterator interfaces. All point to storage now.
This is part of https://github.com/prometheus/prometheus/pull/5882 that can be done to simplify things.
All todos I added will be fixed in follow up PRs.

* querier.Querier, querier.Appender, querier.SeriesSet, and querier.Series interfaces merged
with storage interface.go. All imports that.
* querier.SeriesIterator replaced by chunkenc.Iterator
* Added chunkenc.Iterator.Seek method and tests for xor implementation (?)
* Since we properly handle SelectParams for Select methods I adjusted min max
based on that. This should help in terms of performance for queries with functions like offset.
* added Seek to deletedIterator and test.
* storage/tsdb was removed as it was only unnecessary glue with incompatible structs.
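
To illustrate what a `Seek` method adds to a sample iterator, here is a minimal slice-backed sketch. It assumes the common semantics of advancing to the first sample whose timestamp is greater than or equal to the target and never moving backwards; the `sample` and `sliceIterator` types are hypothetical and are not the chunkenc implementation.

```go
package main

import "fmt"

// sample is a hypothetical timestamp/value pair.
type sample struct {
	t int64
	v float64
}

// sliceIterator is a hypothetical iterator over samples sorted by timestamp.
type sliceIterator struct {
	samples []sample
	idx     int // -1 until the first Next or Seek call
}

func newSliceIterator(s []sample) *sliceIterator {
	return &sliceIterator{samples: s, idx: -1}
}

// Next advances by one sample and reports whether a sample is available.
func (it *sliceIterator) Next() bool {
	if it.idx+1 >= len(it.samples) {
		it.idx = len(it.samples)
		return false
	}
	it.idx++
	return true
}

// Seek advances to the first sample with timestamp >= t. If the current
// sample already satisfies that, the iterator does not move.
func (it *sliceIterator) Seek(t int64) bool {
	if it.idx < 0 {
		it.idx = 0
	}
	for it.idx < len(it.samples) {
		if it.samples[it.idx].t >= t {
			return true
		}
		it.idx++
	}
	return false
}

// At returns the current sample. It must only be called after Next or Seek
// returned true.
func (it *sliceIterator) At() (int64, float64) {
	s := it.samples[it.idx]
	return s.t, s.v
}

func main() {
	it := newSliceIterator([]sample{{10, 1}, {20, 2}, {30, 3}})
	if it.Seek(15) {
		t, v := it.At()
		fmt.Println("after Seek(15):", t, v)
	}
	for it.Next() {
		t, v := it.At()
		fmt.Println("next:", t, v)
	}
}
```

Seeking lets a caller skip samples outside the requested time range without visiting each one through `Next`, which is the kind of saving the min/max adjustment mentioned above relies on.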

No logic was changed, only different source of abstractions, so no need for benchmarks.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-02-17 18:03:54 +00:00
Bartlomiej Plotka 88af973663
Merge pull request #6820 from codesome/break-compact
Break DB.Compact into DB.CompactHead and DB.CompactBlocks
2020-02-17 13:20:21 +00:00
Zhou Hao e628fd7735
fix comments spelling (#6829)
Signed-off-by: Zhou Hao <zhouhao@cn.fujitsu.com>
2020-02-17 12:45:11 +01:00
Ganesh Vernekar 6f1d2ec73e
Break DB.Compact into DB.compactHead and DB.compactBlocks. Add DB.CompactHead.
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-02-14 20:33:26 +05:30
Thor 17d8c49919
made stripe size configurable (#6644)
Signed-off-by: Thor <thansen@digitalocean.com>
2020-01-30 12:42:43 +05:30
Ganesh Vernekar e0733a99e3
Expose DB.Compact()
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-01-20 12:59:49 +05:30
Julien Pivotto 398bd84d6f small tsdb fixes (#6616)
* tsdb: register compactions_skipped_total

That metric was not registered.

I also reordered the metrics in the list.

* tsdb: display correct error when WAL can't be read

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-01-13 22:15:45 +00:00
Josh Soref 91d76c8023 Spelling (#6517)
* spelling: alertmanager

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: attributes

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: autocomplete

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: bootstrap

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: caught

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: chunkenc

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: compaction

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: corrupted

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: deletable

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: expected

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: fine-grained

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: initialized

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: iteration

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: javascript

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: multiple

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: number

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: overlapping

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: possible

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: postings

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: procedure

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: programmatic

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: queuing

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: querier

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: repairing

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: received

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: reproducible

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: retention

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: sample

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: segements

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: semantic

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: software [LICENSE]

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: staging

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: timestamp

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: unfortunately

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: uvarint

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: subsequently

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: ressamples

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-01-02 15:54:09 +01:00
johncming 0ae048ff78 tsdb: add error details in log. (#6415)
Signed-off-by: johncming <johncming@yahoo.com>
2019-12-09 10:37:01 +00:00
Dipack P Panjabi e2dd5b61ef Added CreateBlock and CreateHead functions to new file (#6331)
* Added CreateBlock and CreateHead functions to new file to make it reusable across packages.

Signed-off-by: Dipack P Panjabi <dipack.panjabi@gmail.com>
2019-11-21 19:10:25 +07:00
Tom Wilkie de0a772b8e Port tsdb to use pkg/labels. (#6326)
* Port tsdb to use pkg/labels.

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>

* Get tests passing.

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>

* Remove useless cast.

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>

* Appease linters.

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>

* Fix review comments

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2019-11-18 11:53:33 -08:00
Dipack P Panjabi ce7bab04dd Compute WAL size and use it during retention size checks (#5886)
* Compute WAL size and account for it when applying the retention settings.

Signed-off-by: Dipack P Panjabi <dpanjabi@hudson-trading.com>
2019-11-12 09:40:16 +07:00
Krasi Georgiev 81d284f806
Merge the 2.13 release branch to master (#6117)
2019-10-09 17:41:46 +02:00
陈谭军 103f26d188 fix the wrong word (#6069)
Signed-off-by: chentanjun <2799194073@qq.com>
2019-09-30 09:54:55 -06:00
Lucas Servén Marín 8ab628b354 tsdb: allow readonly DB to create flush WAL (#6006)
This PR gives the readonly DB the ability to create blocks from the WAL.
In order to implement this, we modify DBReadOnly.Blocks() to return an
empty slice and no error if no blocks are found.
xref: https://github.com/prometheus/tsdb/issues/346#issuecomment-520786524

Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
2019-09-13 11:25:21 +01:00
Ganesh Vernekar 5ecef3542d
Cleanup after merging tsdb into prometheus
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2019-08-13 14:04:14 +05:30
Ganesh Vernekar 7cf09b0395
Moving tsdb into its own subdirectory
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2019-08-13 13:58:49 +05:30
Renamed from db.go