Commit graph

261 commits

Author SHA1 Message Date
Fabian Reinartz def912ce0e Integrate new WAL and checkpoints
Remove the old WAL and drop in the new one

Signed-off-by: Fabian Reinartz <freinartz@google.com>
2018-07-19 07:25:30 -04:00
codwu 667e539a7a Merge branch 'master' of https://github.com/prometheus/tsdb into tsdb-delete 2018-07-06 20:21:32 +08:00
Benoît Knecht 1e1b2e163d Make interval overlap comparisons more explicit
Blocks are half-open intervals [a, b), while all other intervals
(chunks, head, ...) are closed intervals [a, b].

Make that distinction explicit by defining `OverlapsClosedInterval()`
methods for blocks and chunks, and using them in place of the more
generic `intervalOverlap()` function.

This change also fixes `db.Querier()` and `db.Delete()`, which could
previously return one extraneous block at the end of the specified
interval.

Signed-off-by: Benoît Knecht <benoit.knecht@fsfe.org>
2018-07-02 10:35:08 +02:00
Fabian Reinartz ea607b9fc3 Log series on rollback
Signed-off-by: Fabian Reinartz <freinartz@google.com>
2018-06-28 09:04:07 -04:00
codwu 84a45cb79a add rwmutex to prevent concurrent map read when delete series
Signed-off-by: codwu <wuhan9087@163.com>
2018-06-08 19:52:01 +08:00
Goutham Veeramachaneni 90d55672d1
Merge pull request #286 from mattbostock/rename_high_timestamp
head: Rename highTimestamp to maxt
2018-04-05 17:41:48 +05:30
Matt Bostock 55ea5ae6b1 head: Rename highTimestamp to maxt
`maxt` seems more consistent with `mint` and other uses of `maxt`
elsewhere in the code, if I've understand the intent correctly.
2018-02-21 16:01:12 +00:00
Simon Pasquier 79defa54df Fix crash when a series has no block 2018-02-21 16:45:06 +01:00
Matt Bostock bc97c6c6be Misc fixes (#285)
* Fix typo in head.go

pralellize -> paralellize

* Remove commented out code

It's dead code, remove it.

* Correct reference to sample buffer
2018-02-21 16:38:59 +01:00
Matt Bostock 50211e89ad Fix typos in comments (#254)
a the -> the
timestmap -> timestamp
badded -> padded
its -> it is
callers -> caller's
2018-01-13 17:51:50 +00:00
Fabian Reinartz 67f0ca8f0e Move index and chunk encoders to own packages 2017-12-21 11:27:54 +01:00
Goutham Veeramachaneni 239cbae154
Merge pull request #228 from Gouthamve/not-matchers
Select series with label unset for != and !~
2017-12-21 12:10:51 +05:30
Goutham Veeramachaneni 3158b03e6c Select series with label unset for != and !~
Fixes https://github.com/prometheus/prometheus/issues/3575

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
2017-12-19 02:47:31 +00:00
Goutham Veeramachaneni 05d62ca842 Make sure gc'ed chunks are handled properly
Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
2017-12-14 19:48:39 +00:00
Fabian Reinartz f1512a368a Expose ChunkSeriesSet and lookups methods. 2017-11-13 14:02:32 +01:00
Fabian Reinartz 3ef4326114 Refactor tombstone reader types 2017-11-13 13:38:07 +01:00
Fabian Reinartz e5ce2bef43 Add explicit error to Querier.Select
This has been a frequent source of debugging pain since errors are
potentially delayed to a much later point. They bubble up in an
unrelated execution path.
2017-11-13 12:16:58 +01:00
Julius Volz 1dad3370fd Close WAL when closing the DB
Also, the `wal` field of the `DB` was not used anywhere, so this removes
it.
2017-11-11 14:56:23 +01:00
Oren Shomron 6ca5e52b69 Typo in prometheus_tsdb_head_samples_appended_total description (#188) 2017-10-25 19:12:18 +01:00
Fabian Reinartz 82796db37b Ensure near-empty chunks end at correct boundary
We were determining a chunk's end time once it was one quarter full to
compute it so all chunks have uniform number of samples.
This accidentally skipped the case where series started near the end of
a chunk range/block and never reached that threshold. As a result they
got persisted but were continued across the range.

This resulted in corrupted persisted data.
2017-10-25 09:51:55 +02:00
Fabian Reinartz c93162c751 Merge pull request #185 from prometheus/walslice
Limit WAL sample processing batch size
2017-10-25 09:30:52 +02:00
Fabian Reinartz 820bd8b170 Merge pull request #181 from prometheus/gcchunk
Fix dangling chunk reference panic
2017-10-25 09:30:16 +02:00
Fabian Reinartz 7bc07d80b6 Merge pull request #180 from prometheus/fixbugz
Fix race in symbol table re-creation
2017-10-24 11:39:13 +02:00
Fabian Reinartz 9749aa2a3e head: limit WAL sample processing batch size 2017-10-23 16:22:24 +02:00
Fabian Reinartz d17104f1f0 Prefix all metrics with prometheus_* 2017-10-20 12:32:32 +02:00
Fabian Reinartz ea817e169b Return nop iterator for invalid chunk references 2017-10-20 09:43:52 +02:00
Fabian Reinartz 6dcca97755 Fix race in symbol table re-creation 2017-10-20 09:29:03 +02:00
Fabian Reinartz 7f8fa07cf7 Merge pull request #176 from prometheus/races
Races
2017-10-12 15:27:08 +02:00
Fabian Reinartz 065f42f58c head: track number of series not found errors in metric 2017-10-12 15:25:12 +02:00
Fabian Reinartz 88305e7612 Access chunk time range while holding lock 2017-10-11 10:17:59 +02:00
Fabian Reinartz 106eaf39d1 Ensure workers terminated fully before reading unknownRefs 2017-10-11 10:12:29 +02:00
Fabian Reinartz 665955da48 Clarify postings index semantics, handle staleness
The postings list index may point to series that no longer
exist during garbage collection. This clarifies that this is valid
behavior.
It would be possible, though more complex, to always keep them in sync.
However, series existance means nothing in itself as the queried time
range defines whether there's actual data. Thus our definition is sane
overall as long as drift is kept small.
2017-10-11 09:37:19 +02:00
Fabian Reinartz c3e502b194 Merge pull request #168 from prometheus/fasterwal
wal: decode and process in separate threads.
2017-10-10 18:11:44 +02:00
Fabian Reinartz fb9da52b11 Add more verbose error handling for closing, reduce locking
This commit introduces error returns in various places and is explicit
about closing persisted blocks.
{Index,Chunk,Tombstone}Readers are more consistent about their Close()
method. Whenever a reader is retrieved, the corresponding close method
must eventually be called. We use this to track pending readers against
persisted blocks.

Querier's against the DB no longer hold a read lock for their entire
lifecycle. This avoids long running queriers to starve new ones when we
have to acquire a write lock when reloading blocks.
2017-10-10 12:13:37 +02:00
Fabian Reinartz 7efb830d70 wal: parallelize sample processing 2017-10-09 15:22:38 +02:00
Simon Pasquier e858c0826c Fix innocuous typo in variable names
This change fixes the variable names holding the tsdb_head_max_time and
tsdb_head_min_time metrics. It is a cosmetic change to improve the
code readability as the metric values are taken from the correct
variables.
2017-10-09 12:24:53 +02:00
Fabian Reinartz d3682d701c wal: decode and process in separate threads. 2017-10-06 14:46:52 +02:00
Fabian Reinartz cd2e26b7fc Load postings in batch on startup
This allows to insert IDs to postings out of order until
a trigger function is called. This avoids the insertion sort we usually
do which can be very costly since WAL entries are more out of order than
regular adds.
2017-10-06 10:39:10 +02:00
Goutham Veeramachaneni 203012169a
snapshot: Remove truncation check to restore func.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-10-04 16:58:07 +05:30
Goutham Veeramachaneni c35d3a65bd
Add levels to all log lines.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-09-28 12:49:34 +05:30
Fabian Reinartz 69f105f4f9 Merge pull request #151 from prometheus/waltrunc
Use boolean function instead of postings to drop WAL series
2017-09-21 15:04:43 +02:00
Fabian Reinartz 1e88ba06b4 Use boolean function instead of postings to drop WAL series
There is not guarantee or requirement for WAL writers to only add
series entries in increasing order of IDs. A postings list cannot look
back and thus unordered WAL entries would skip over IDs to not truncate
from the WAL.
We replace it with a simple boolean check function that does not require
order.
2017-09-21 13:31:01 +02:00
Fabian Reinartz 6ee254e353 Ensure postings are always sorted
IDs for new series are handed out before the postings are locked. Thus
series are not indexed in order of their IDs, which could result in only
partially sorted postings list.
Iterating over those silently skipped elements as the sort invariant was
violated.
2017-09-21 09:38:18 +02:00
Fabian Reinartz 162a48e4f2 Create series with ID recorded in WAL when reading it back 2017-09-19 11:31:16 +02:00
Fabian Reinartz 7ada9cd805 Simplify series create logic in head 2017-09-18 12:38:39 +02:00
Fabian Reinartz ab8d9b9706 Add missing unlock on early return 2017-09-18 11:23:22 +02:00
Fabian Reinartz f904cd385f Do not build a superflous 'all' postings 2017-09-08 18:41:43 +02:00
Fabian Reinartz 6892fc6dcb Finish old WAL segment async, default to no fsync
We were still fsyncing while holding the write lock when we cut a new
segment. Given we cannot do anything but logging errors, we might just
as well complete segments asynchronously.

There's not realistic use case where one would fsync after every WAL
entry, thus make the default of a flush interval of 0 to never fsync
which is a much more likely use case.
2017-09-08 18:41:12 +02:00
Fabian Reinartz 1d5f85817d Fix various races 2017-09-08 08:48:19 +02:00
Fabian Reinartz 0db4c227b7 Fix min/max time handling and concurrent crc32 usage 2017-09-07 13:04:02 +02:00
Fabian Reinartz 81222849bc Filter WAL data in Head, misc fixes 2017-09-06 16:20:37 +02:00
Fabian Reinartz 33e9bdf403 WAL refactoring and truncation fixes and test 2017-09-06 14:59:25 +02:00
Fabian Reinartz c36d574290 Replace single head lock with granular locks
This adds various new locks to replace the single big lock on
the head. All parts now must be COW as they may be held by clients
after initial retrieval.
Series by ID and hashes are now held in a stripe lock to reduce
contention and total holding time during GC. This should reduce
starvation of readers.
2017-09-05 14:41:39 +02:00
Fabian Reinartz 1ddedf2b30 Change series ID from uint32 to uint64 2017-09-04 16:08:38 +02:00
Goutham Veeramachaneni 1698c516ad [WIP]: WAL implementation
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-09-04 14:52:40 +02:00
Fabian Reinartz 893b6ec506 Add tests for GC and chunk truncation 2017-09-01 14:38:49 +02:00
Fabian Reinartz 4f037da462 Remove defer statement in hot path 2017-09-01 12:09:29 +02:00
Fabian Reinartz 5cf2662074 Refactor WAL into Head and misc improvements 2017-09-01 11:50:58 +02:00
Fabian Reinartz 8209e3ec23 Add various metrics 2017-09-01 11:50:58 +02:00
Fabian Reinartz 3901b6e70b Remove multiple heads
This changes the structure to a single WAL backed by a single head
block.
Parts of the head block can be compacted. This relieves us from any head
amangement and greatly simplifies any consistency and isolation concerns
by just having a single head.
2017-09-01 11:50:58 +02:00
Goutham Veeramachaneni 7438ed7035 Expose Intervals type for use by TombstoneReader.
TombstoneReader is exposed but Intervals is not.

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-08-25 16:06:36 +05:30
Fabian Reinartz 905af27cf9 Refactor compactor 2017-08-09 11:10:29 +02:00
Fabian Reinartz 66ff7b12e9 Pool Chunk objects during compaction 2017-08-08 17:35:34 +02:00
Fabian Reinartz 2644c8665c Don't allocate ChunkMetas, reuse postings slices 2017-08-06 20:41:24 +02:00
Fabian Reinartz 96d7f540d4 Persist series without allocating the full set
Change index persistence for series to not be accumulated in memory
before being written as one large batch. `Labels` and `ChunkMeta`
objects are reused.
This cuts down memory spikes during compaction of multiple blocks
significantly.

As part of the the Index{Reader,Writer} now have an explicit notion of
symbols and series must be inserted in order.
2017-08-06 12:06:41 +02:00
Goutham Veeramachaneni f1ae239c20 Persist the right MaxTime when snapshotting
This is because we cut a new block from where the snapshotted block ends
if we restore from backups and highTimestamp would be where we should be
 starting from.

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-07-12 13:48:13 +02:00
Fabian Reinartz 1e74c155eb Return empty string to signal non-caching 2017-06-26 14:58:00 +02:00
Fabian Reinartz 3410559c1b Compact head block early
Let older head blocks be compacted once the newest once has samples at
50% of its total range. This allows the memory of the compacted blocks
to be released and garbage collected before a new head block gets
created. Thereby the number of head blocks is 1 or 2 instead of 2 or 3
and memory spikes are reduced.
2017-06-26 08:52:59 +02:00
Fabian Reinartz 9963a4c7c3 Merge pull request #95 from Gouthamve/wal-ahead
Fix race condition for 2 appenders having same ts
2017-06-12 11:17:49 +02:00
Goutham Veeramachaneni 73cc5bae51 Colocate defer statements near relevant functions
Signed-off-by: Goutham Veeramachaneni <goutham@boomerangcommerce.com>
2017-06-12 14:37:58 +05:30
Goutham Veeramachaneni b51a05044e
Fix race condition for 2 appenders having same ts
Race:
Suppose we have 100 existing series inside a HeadBlock.
Now we open two appenders in two routines A1, A2 and append 30 new series and
60 new series respectively with some common series.

Both try to commit at the same time and the following happens in the given order:

A2 executes createSeries()
A1 executes createSeries() (with its common series referencing the ids from A2)
A1 persists its newlabels, samples
A2 persists its newlabels, samples

Now when reading it back, we read A1's samples which reference A2's id and
thereby fail.

Ref: prometheus/promtheus#2795

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-08 16:26:25 +05:30
Fabian Reinartz 05e411a8eb Improve heuristic to spread chunks across block 2017-06-08 11:30:32 +02:00
Goutham Veeramachaneni a110a64abd
Add full Snapshot support
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-06 18:15:54 +05:30
Goutham Veeramachaneni a1c8425357
Initial implementation of HeadBlock Snapshots
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-05 13:48:31 +05:30
Goutham Veeramachaneni 29c73f05f2
Make sure that mint and maxt are not modified.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-27 21:59:49 +05:30
Goutham Veeramachaneni 44e9ae38b5
Incorporate PR feedback.
* Expose Stone as it is used in an exported method.
* Move from tombstoneReader to []Stone for the same reason as above.
* Make WAL reading a little cleaner.

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-26 21:26:31 +05:30
Goutham Veeramachaneni 6febabeb28
Final delete fixes.
* Make sure no reads happen on the block when delete is in progress.
* Fix bugs in compaction.

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-26 16:31:45 +05:30
Goutham Veeramachaneni c211ec4f49
Fix concurrent map access.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-24 16:58:04 +05:30
Goutham Veeramachaneni f29fb62fba
Make TombstoneReader a Getter.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-24 11:24:24 +05:30
Goutham Veeramachaneni 9bf7aa9af1
Misc. fixes incorporating feedback.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-23 18:13:30 +05:30
Goutham Veeramachaneni 31cf939448
Add NumTombstones to BlockMeta.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-23 17:37:04 +05:30
Goutham Veeramachaneni 3eb4119ab1
Make HeadBlock use WAL.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-23 16:15:16 +05:30
Goutham Veeramachaneni 244b73fce1
Rename for clarity and consistency.
Misc. changes for code cleanliness.

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-22 16:42:36 +05:30
Goutham Veeramachaneni 8434019ad9
Merge branch 'master' into deletes-1
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-22 12:58:38 +05:30
Goutham Veeramachaneni 662d8173fe
Make Appends after Delete visible.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-22 11:28:24 +05:30
Goutham Veeramachaneni d6bd64357b
Fix Delete on HeadBlock
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-19 22:54:29 +05:30
Goutham Veeramachaneni 45d3db4e9e
Use a *mapTombstoneReader instead of map
We need to recalculate the sorted ref list everytime we make a
Tombstones() call. This avoids that.

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-19 11:56:37 +05:30
Fabian Reinartz 39df7e2bba Switch blocks to ULID directories, drop sequenc numbers 2017-05-18 16:09:30 +02:00
Fabian Reinartz 285bc07030 Switch append refs to string 2017-05-18 10:56:57 +02:00
Goutham Veeramachaneni 22c1b5b492
Make SeriesSets use tombstones.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-17 14:49:42 +05:30
Goutham Veeramachaneni 34a86af3c6
Move tombstones to their own thing.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-17 08:36:56 +05:30
Goutham Veeramachaneni cea3c88f17
Add Tombstones() method to Block.
Also add Seek() to TombstoneReader

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-16 19:48:28 +05:30
Goutham Veeramachaneni 4f1d857590
Implement Delete on HeadBlock
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-15 23:28:14 +05:30
Goutham Veeramachaneni 5579efbd5b
Initial implentation of Deletes on persistedBlock
Very much a WIP

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-14 14:36:26 +05:30
Fabian Reinartz 8b51b7e2be Make WAL for HeadBlock composeable. 2017-05-13 18:14:18 +02:00
Fabian Reinartz 4862b261d0 Abstract WAL into interface 2017-05-13 17:09:26 +02:00
Fabian Reinartz 535532ca02 Export refdSample
The type was part of a exported method signatures and should therefore
be exported as well.
2017-05-12 17:06:26 +02:00
Fabian Reinartz 5534e6c53c Make HeadBlock impl public, make interface private 2017-05-12 16:34:41 +02:00
Goutham Veeramachaneni 2fa647f50b Fix missing postings in Merge and Intersect (#77)
* Test for a previous implematation of Intersect

Before we were moving the postings list everytime we create a new
chained `intersectPostings`. That was causing some postings to be
skipped. This test fails on the older version.

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>

* Advance on Seek only when valid.

Issue:
Before in mergedPostings and others we advance everytime we `Seek`,
which causes issues with `Intersect`.

Take the case, where we have a mergedPostings = m merging, a: {10, 20, 30} and
b: {15, 25, 35}. Everytime we `Seek`, we do a.Seek and b.Seek.

Now if we Intersect m with {21, 22, 23, 30}, we would do Seek({21,22,23}) which
would advance a and b beyond 30.

Fix:
Now we advance only when the seeking value is greater than the current
value, as the definition specifies.

Also, posting 0 will not be a valid posting and will be used to signal
finished or un-initialized PostingsList.

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>

* Add test for Merge+Intersect edgecase.

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>

* Add comments to trivial tests.

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-12 09:44:41 +02:00
Fabian Reinartz 291137781b Merge branch 'panic-fix2' of https://github.com/Gouthamve/tsdb into Gouthamve-panic-fix2 2017-05-09 16:22:19 +02:00