Commit graph

242 commits

Author SHA1 Message Date
Fabian Reinartz 82796db37b Ensure near-empty chunks end at correct boundary
We were determining a chunk's end time once it was one quarter full to
compute it so all chunks have uniform number of samples.
This accidentally skipped the case where series started near the end of
a chunk range/block and never reached that threshold. As a result they
got persisted but were continued across the range.

This resulted in corrupted persisted data.
2017-10-25 09:51:55 +02:00
Fabian Reinartz c93162c751 Merge pull request #185 from prometheus/walslice
Limit WAL sample processing batch size
2017-10-25 09:30:52 +02:00
Fabian Reinartz 820bd8b170 Merge pull request #181 from prometheus/gcchunk
Fix dangling chunk reference panic
2017-10-25 09:30:16 +02:00
Fabian Reinartz 7bc07d80b6 Merge pull request #180 from prometheus/fixbugz
Fix race in symbol table re-creation
2017-10-24 11:39:13 +02:00
Fabian Reinartz 9749aa2a3e head: limit WAL sample processing batch size 2017-10-23 16:22:24 +02:00
Fabian Reinartz d17104f1f0 Prefix all metrics with prometheus_* 2017-10-20 12:32:32 +02:00
Fabian Reinartz ea817e169b Return nop iterator for invalid chunk references 2017-10-20 09:43:52 +02:00
Fabian Reinartz 6dcca97755 Fix race in symbol table re-creation 2017-10-20 09:29:03 +02:00
Fabian Reinartz 7f8fa07cf7 Merge pull request #176 from prometheus/races
Races
2017-10-12 15:27:08 +02:00
Fabian Reinartz 065f42f58c head: track number of series not found errors in metric 2017-10-12 15:25:12 +02:00
Fabian Reinartz 88305e7612 Access chunk time range while holding lock 2017-10-11 10:17:59 +02:00
Fabian Reinartz 106eaf39d1 Ensure workers terminated fully before reading unknownRefs 2017-10-11 10:12:29 +02:00
Fabian Reinartz 665955da48 Clarify postings index semantics, handle staleness
The postings list index may point to series that no longer
exist during garbage collection. This clarifies that this is valid
behavior.
It would be possible, though more complex, to always keep them in sync.
However, series existance means nothing in itself as the queried time
range defines whether there's actual data. Thus our definition is sane
overall as long as drift is kept small.
2017-10-11 09:37:19 +02:00
Fabian Reinartz c3e502b194 Merge pull request #168 from prometheus/fasterwal
wal: decode and process in separate threads.
2017-10-10 18:11:44 +02:00
Fabian Reinartz fb9da52b11 Add more verbose error handling for closing, reduce locking
This commit introduces error returns in various places and is explicit
about closing persisted blocks.
{Index,Chunk,Tombstone}Readers are more consistent about their Close()
method. Whenever a reader is retrieved, the corresponding close method
must eventually be called. We use this to track pending readers against
persisted blocks.

Querier's against the DB no longer hold a read lock for their entire
lifecycle. This avoids long running queriers to starve new ones when we
have to acquire a write lock when reloading blocks.
2017-10-10 12:13:37 +02:00
Fabian Reinartz 7efb830d70 wal: parallelize sample processing 2017-10-09 15:22:38 +02:00
Simon Pasquier e858c0826c Fix innocuous typo in variable names
This change fixes the variable names holding the tsdb_head_max_time and
tsdb_head_min_time metrics. It is a cosmetic change to improve the
code readability as the metric values are taken from the correct
variables.
2017-10-09 12:24:53 +02:00
Fabian Reinartz d3682d701c wal: decode and process in separate threads. 2017-10-06 14:46:52 +02:00
Fabian Reinartz cd2e26b7fc Load postings in batch on startup
This allows to insert IDs to postings out of order until
a trigger function is called. This avoids the insertion sort we usually
do which can be very costly since WAL entries are more out of order than
regular adds.
2017-10-06 10:39:10 +02:00
Goutham Veeramachaneni 203012169a
snapshot: Remove truncation check to restore func.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-10-04 16:58:07 +05:30
Goutham Veeramachaneni c35d3a65bd
Add levels to all log lines.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-09-28 12:49:34 +05:30
Fabian Reinartz 69f105f4f9 Merge pull request #151 from prometheus/waltrunc
Use boolean function instead of postings to drop WAL series
2017-09-21 15:04:43 +02:00
Fabian Reinartz 1e88ba06b4 Use boolean function instead of postings to drop WAL series
There is not guarantee or requirement for WAL writers to only add
series entries in increasing order of IDs. A postings list cannot look
back and thus unordered WAL entries would skip over IDs to not truncate
from the WAL.
We replace it with a simple boolean check function that does not require
order.
2017-09-21 13:31:01 +02:00
Fabian Reinartz 6ee254e353 Ensure postings are always sorted
IDs for new series are handed out before the postings are locked. Thus
series are not indexed in order of their IDs, which could result in only
partially sorted postings list.
Iterating over those silently skipped elements as the sort invariant was
violated.
2017-09-21 09:38:18 +02:00
Fabian Reinartz 162a48e4f2 Create series with ID recorded in WAL when reading it back 2017-09-19 11:31:16 +02:00
Fabian Reinartz 7ada9cd805 Simplify series create logic in head 2017-09-18 12:38:39 +02:00
Fabian Reinartz ab8d9b9706 Add missing unlock on early return 2017-09-18 11:23:22 +02:00
Fabian Reinartz f904cd385f Do not build a superflous 'all' postings 2017-09-08 18:41:43 +02:00
Fabian Reinartz 6892fc6dcb Finish old WAL segment async, default to no fsync
We were still fsyncing while holding the write lock when we cut a new
segment. Given we cannot do anything but logging errors, we might just
as well complete segments asynchronously.

There's not realistic use case where one would fsync after every WAL
entry, thus make the default of a flush interval of 0 to never fsync
which is a much more likely use case.
2017-09-08 18:41:12 +02:00
Fabian Reinartz 1d5f85817d Fix various races 2017-09-08 08:48:19 +02:00
Fabian Reinartz 0db4c227b7 Fix min/max time handling and concurrent crc32 usage 2017-09-07 13:04:02 +02:00
Fabian Reinartz 81222849bc Filter WAL data in Head, misc fixes 2017-09-06 16:20:37 +02:00
Fabian Reinartz 33e9bdf403 WAL refactoring and truncation fixes and test 2017-09-06 14:59:25 +02:00
Fabian Reinartz c36d574290 Replace single head lock with granular locks
This adds various new locks to replace the single big lock on
the head. All parts now must be COW as they may be held by clients
after initial retrieval.
Series by ID and hashes are now held in a stripe lock to reduce
contention and total holding time during GC. This should reduce
starvation of readers.
2017-09-05 14:41:39 +02:00
Fabian Reinartz 1ddedf2b30 Change series ID from uint32 to uint64 2017-09-04 16:08:38 +02:00
Goutham Veeramachaneni 1698c516ad [WIP]: WAL implementation
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-09-04 14:52:40 +02:00
Fabian Reinartz 893b6ec506 Add tests for GC and chunk truncation 2017-09-01 14:38:49 +02:00
Fabian Reinartz 4f037da462 Remove defer statement in hot path 2017-09-01 12:09:29 +02:00
Fabian Reinartz 5cf2662074 Refactor WAL into Head and misc improvements 2017-09-01 11:50:58 +02:00
Fabian Reinartz 8209e3ec23 Add various metrics 2017-09-01 11:50:58 +02:00
Fabian Reinartz 3901b6e70b Remove multiple heads
This changes the structure to a single WAL backed by a single head
block.
Parts of the head block can be compacted. This relieves us from any head
amangement and greatly simplifies any consistency and isolation concerns
by just having a single head.
2017-09-01 11:50:58 +02:00
Goutham Veeramachaneni 7438ed7035 Expose Intervals type for use by TombstoneReader.
TombstoneReader is exposed but Intervals is not.

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-08-25 16:06:36 +05:30
Fabian Reinartz 905af27cf9 Refactor compactor 2017-08-09 11:10:29 +02:00
Fabian Reinartz 66ff7b12e9 Pool Chunk objects during compaction 2017-08-08 17:35:34 +02:00
Fabian Reinartz 2644c8665c Don't allocate ChunkMetas, reuse postings slices 2017-08-06 20:41:24 +02:00
Fabian Reinartz 96d7f540d4 Persist series without allocating the full set
Change index persistence for series to not be accumulated in memory
before being written as one large batch. `Labels` and `ChunkMeta`
objects are reused.
This cuts down memory spikes during compaction of multiple blocks
significantly.

As part of the the Index{Reader,Writer} now have an explicit notion of
symbols and series must be inserted in order.
2017-08-06 12:06:41 +02:00
Goutham Veeramachaneni f1ae239c20 Persist the right MaxTime when snapshotting
This is because we cut a new block from where the snapshotted block ends
if we restore from backups and highTimestamp would be where we should be
 starting from.

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-07-12 13:48:13 +02:00
Fabian Reinartz 1e74c155eb Return empty string to signal non-caching 2017-06-26 14:58:00 +02:00
Fabian Reinartz 3410559c1b Compact head block early
Let older head blocks be compacted once the newest once has samples at
50% of its total range. This allows the memory of the compacted blocks
to be released and garbage collected before a new head block gets
created. Thereby the number of head blocks is 1 or 2 instead of 2 or 3
and memory spikes are reduced.
2017-06-26 08:52:59 +02:00
Fabian Reinartz 9963a4c7c3 Merge pull request #95 from Gouthamve/wal-ahead
Fix race condition for 2 appenders having same ts
2017-06-12 11:17:49 +02:00
Goutham Veeramachaneni 73cc5bae51 Colocate defer statements near relevant functions
Signed-off-by: Goutham Veeramachaneni <goutham@boomerangcommerce.com>
2017-06-12 14:37:58 +05:30
Goutham Veeramachaneni b51a05044e
Fix race condition for 2 appenders having same ts
Race:
Suppose we have 100 existing series inside a HeadBlock.
Now we open two appenders in two routines A1, A2 and append 30 new series and
60 new series respectively with some common series.

Both try to commit at the same time and the following happens in the given order:

A2 executes createSeries()
A1 executes createSeries() (with its common series referencing the ids from A2)
A1 persists its newlabels, samples
A2 persists its newlabels, samples

Now when reading it back, we read A1's samples which reference A2's id and
thereby fail.

Ref: prometheus/promtheus#2795

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-08 16:26:25 +05:30
Fabian Reinartz 05e411a8eb Improve heuristic to spread chunks across block 2017-06-08 11:30:32 +02:00
Goutham Veeramachaneni a110a64abd
Add full Snapshot support
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-06 18:15:54 +05:30
Goutham Veeramachaneni a1c8425357
Initial implementation of HeadBlock Snapshots
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-05 13:48:31 +05:30
Goutham Veeramachaneni 29c73f05f2
Make sure that mint and maxt are not modified.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-27 21:59:49 +05:30
Goutham Veeramachaneni 44e9ae38b5
Incorporate PR feedback.
* Expose Stone as it is used in an exported method.
* Move from tombstoneReader to []Stone for the same reason as above.
* Make WAL reading a little cleaner.

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-26 21:26:31 +05:30
Goutham Veeramachaneni 6febabeb28
Final delete fixes.
* Make sure no reads happen on the block when delete is in progress.
* Fix bugs in compaction.

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-26 16:31:45 +05:30
Goutham Veeramachaneni c211ec4f49
Fix concurrent map access.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-24 16:58:04 +05:30
Goutham Veeramachaneni f29fb62fba
Make TombstoneReader a Getter.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-24 11:24:24 +05:30
Goutham Veeramachaneni 9bf7aa9af1
Misc. fixes incorporating feedback.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-23 18:13:30 +05:30
Goutham Veeramachaneni 31cf939448
Add NumTombstones to BlockMeta.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-23 17:37:04 +05:30
Goutham Veeramachaneni 3eb4119ab1
Make HeadBlock use WAL.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-23 16:15:16 +05:30
Goutham Veeramachaneni 244b73fce1
Rename for clarity and consistency.
Misc. changes for code cleanliness.

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-22 16:42:36 +05:30
Goutham Veeramachaneni 8434019ad9
Merge branch 'master' into deletes-1
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-22 12:58:38 +05:30
Goutham Veeramachaneni 662d8173fe
Make Appends after Delete visible.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-22 11:28:24 +05:30
Goutham Veeramachaneni d6bd64357b
Fix Delete on HeadBlock
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-19 22:54:29 +05:30
Goutham Veeramachaneni 45d3db4e9e
Use a *mapTombstoneReader instead of map
We need to recalculate the sorted ref list everytime we make a
Tombstones() call. This avoids that.

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-19 11:56:37 +05:30
Fabian Reinartz 39df7e2bba Switch blocks to ULID directories, drop sequenc numbers 2017-05-18 16:09:30 +02:00
Fabian Reinartz 285bc07030 Switch append refs to string 2017-05-18 10:56:57 +02:00
Goutham Veeramachaneni 22c1b5b492
Make SeriesSets use tombstones.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-17 14:49:42 +05:30
Goutham Veeramachaneni 34a86af3c6
Move tombstones to their own thing.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-17 08:36:56 +05:30
Goutham Veeramachaneni cea3c88f17
Add Tombstones() method to Block.
Also add Seek() to TombstoneReader

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-16 19:48:28 +05:30
Goutham Veeramachaneni 4f1d857590
Implement Delete on HeadBlock
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-15 23:28:14 +05:30
Goutham Veeramachaneni 5579efbd5b
Initial implentation of Deletes on persistedBlock
Very much a WIP

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-14 14:36:26 +05:30
Fabian Reinartz 8b51b7e2be Make WAL for HeadBlock composeable. 2017-05-13 18:14:18 +02:00
Fabian Reinartz 4862b261d0 Abstract WAL into interface 2017-05-13 17:09:26 +02:00
Fabian Reinartz 535532ca02 Export refdSample
The type was part of a exported method signatures and should therefore
be exported as well.
2017-05-12 17:06:26 +02:00
Fabian Reinartz 5534e6c53c Make HeadBlock impl public, make interface private 2017-05-12 16:34:41 +02:00
Goutham Veeramachaneni 2fa647f50b Fix missing postings in Merge and Intersect (#77)
* Test for a previous implematation of Intersect

Before we were moving the postings list everytime we create a new
chained `intersectPostings`. That was causing some postings to be
skipped. This test fails on the older version.

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>

* Advance on Seek only when valid.

Issue:
Before in mergedPostings and others we advance everytime we `Seek`,
which causes issues with `Intersect`.

Take the case, where we have a mergedPostings = m merging, a: {10, 20, 30} and
b: {15, 25, 35}. Everytime we `Seek`, we do a.Seek and b.Seek.

Now if we Intersect m with {21, 22, 23, 30}, we would do Seek({21,22,23}) which
would advance a and b beyond 30.

Fix:
Now we advance only when the seeking value is greater than the current
value, as the definition specifies.

Also, posting 0 will not be a valid posting and will be used to signal
finished or un-initialized PostingsList.

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>

* Add test for Merge+Intersect edgecase.

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>

* Add comments to trivial tests.

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-12 09:44:41 +02:00
Fabian Reinartz 291137781b Merge branch 'panic-fix2' of https://github.com/Gouthamve/tsdb into Gouthamve-panic-fix2 2017-05-09 16:22:19 +02:00
Fabian Reinartz 09cd2021de Merge pull request #75 from Gouthamve/head-gen
E2E test for headBlock
2017-05-05 18:56:53 +02:00
Goutham Veeramachaneni 8096d11e4e
Add bounds check to headBlockAppender
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-05 19:52:11 +05:30
Goutham Veeramachaneni adaf4d2099
Handle duplicate & out of order values in same txn
Add docs about not erroring out on exact dupes.
Moved tests to require.*

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-03 02:36:40 +05:30
Brian Brazil 72443bddfc Remove dead code. 2017-04-20 13:45:07 +01:00
Brian Brazil bceb5c1b16 When checking for amended points, do it in terms of bits.
NaN != NaN, so the previous code would incorrectly report
it as changed.

There's also plans to take advantage of the NaN payload,
so look at the entire value.
2017-04-12 16:25:32 +01:00
Fabian Reinartz 778103b450 Add liecence file and headers 2017-04-10 20:59:45 +02:00
Fabian Reinartz c73a397da2 Adjust maximum samples per chunk. 2017-04-07 10:58:37 +02:00
Goutham Veeramachaneni a51b2666d7 Fix Panic When Accessing Uncut memorySeries
When calling AddFast, we check the details of the head chunk of the
referred memorySeries. But it could happen that there are no chunks in
the series at all.

Currently, we are deferring chunk creation to when we actually append
samples, but we can be sure that there will be samples if the series is
created. We will be consuming no extra memory by cutting a chunk when we
create the series.

Ref: #28 comment 2

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-04-06 00:22:31 +05:30
Fabian Reinartz 10c7c9acbe Adjust import names to new repository organisation 2017-04-04 11:27:26 +02:00
Fabian Reinartz 87d48bf9de Merge branch 'master' of github.com:fabxc/tsdb 2017-03-27 19:07:27 +02:00
Fabian Reinartz a52980e0a8 Add workaround for deadlocks
This adds a workaround to avoid deadlocks for inconsistent write lock
order across headBlocks.
Things keep working if transactions only append data for the same
timestamp, which is generally the case for Prometheus.

Full behavior should be restored in a subsequent change.
2017-03-27 19:05:34 +02:00
Goutham Veeramachaneni 61f866bb94
Add Sample Back
The compilation and tests are broken as head.go requires sample which
has been moved to another package while moving BufferedSeriesIterator.

Duplication seemed better compared to exposing sample from tsdbutil.
2017-03-26 23:22:58 +05:30
Fabian Reinartz 3be4ef94ce Move BufferedSeriesIterator in own package
This functionality is useful for a lot of clients but not relevant to
the TSDB's core features.
2017-03-24 13:23:32 +01:00
Fabian Reinartz e478d0e3bc Actually close olds blocks in reloadBlocks
This fixes a bug leaking memory because blocks were not actually closed
as the closing call references the initial, empty slice
2017-03-23 18:27:20 +01:00
Fabian Reinartz 789e8224ff Fix wrong comparison in head block resorting 2017-03-21 12:12:33 +01:00
Fabian Reinartz 55ee4b5b3b Merge branch 'master' of github.com:fabxc/tsdb 2017-03-21 10:11:39 +01:00
Fabian Reinartz c18e055d7c Fix races and add comments on remaining ones 2017-03-21 10:11:23 +01:00
Fabian Reinartz e837034360 Merge pull request #14 from Gouthamve/log-update
Update kit/log To New API
2017-03-21 09:56:32 +01:00
Fabian Reinartz 9c93f8f2aa Fix various races
This fixes different race condition encoutnered when running Prometheus.
It reduces the overall performance in the synthetic benchmark a fair bit
but has no indiciations of impacting a real-world setup notably.
2017-03-20 14:45:27 +01:00