Commit graph

185 commits

Author SHA1 Message Date
Fabian Reinartz 6dcca97755 Fix race in symbol table re-creation 2017-10-20 09:29:03 +02:00
Fabian Reinartz 7f8fa07cf7 Merge pull request #176 from prometheus/races
Races
2017-10-12 15:27:08 +02:00
Fabian Reinartz 065f42f58c head: track number of series not found errors in metric 2017-10-12 15:25:12 +02:00
Fabian Reinartz 88305e7612 Access chunk time range while holding lock 2017-10-11 10:17:59 +02:00
Fabian Reinartz 106eaf39d1 Ensure workers terminated fully before reading unknownRefs 2017-10-11 10:12:29 +02:00
Fabian Reinartz 665955da48 Clarify postings index semantics, handle staleness
The postings list index may point to series that no longer
exist during garbage collection. This clarifies that this is valid
behavior.
It would be possible, though more complex, to always keep them in sync.
However, series existence means nothing in itself as the queried time
range defines whether there's actual data. Thus our definition is sane
overall as long as drift is kept small.
2017-10-11 09:37:19 +02:00
Fabian Reinartz c3e502b194 Merge pull request #168 from prometheus/fasterwal
wal: decode and process in separate threads.
2017-10-10 18:11:44 +02:00
Fabian Reinartz fb9da52b11 Add more verbose error handling for closing, reduce locking
This commit introduces error returns in various places and is explicit
about closing persisted blocks.
{Index,Chunk,Tombstone}Readers are more consistent about their Close()
method. Whenever a reader is retrieved, the corresponding close method
must eventually be called. We use this to track pending readers against
persisted blocks.

Queriers against the DB no longer hold a read lock for their entire
lifecycle. This prevents long-running queriers from starving new ones when
we have to acquire a write lock to reload blocks.
2017-10-10 12:13:37 +02:00
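The reader-tracking rule described in the commit above (every retrieved reader must eventually be closed, and pending readers are tracked against the block) can be sketched as follows. This is a minimal illustration only: the `block` and `reader` types and their methods are hypothetical names, and the use of `sync.WaitGroup` is an assumption, not necessarily how tsdb implements it.

```go
package main

import (
	"fmt"
	"sync"
)

// block is a hypothetical persisted block that counts readers handed out.
type block struct {
	pending sync.WaitGroup
}

// reader is a hypothetical handle onto a block.
type reader struct {
	b    *block
	once sync.Once
}

// Reader registers a pending reader against the block.
func (b *block) Reader() *reader {
	b.pending.Add(1)
	return &reader{b: b}
}

// Close releases the reader. The sync.Once makes repeated calls harmless.
func (r *reader) Close() error {
	r.once.Do(r.b.pending.Done)
	return nil
}

// Close blocks until every reader that was handed out has been closed.
func (b *block) Close() error {
	b.pending.Wait()
	return nil
}

func main() {
	b := &block{}
	r := b.Reader()
	go func() { _ = r.Close() }()
	// Returns only after the goroutine above has closed the reader.
	fmt.Println(b.Close())
}
```

With this shape a querier does not need to hold a lock for its whole lifetime; the block only needs to wait for outstanding readers before it is closed.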
Fabian Reinartz 7efb830d70 wal: parallelize sample processing 2017-10-09 15:22:38 +02:00
Simon Pasquier e858c0826c Fix innocuous typo in variable names
This change fixes the variable names holding the tsdb_head_max_time and
tsdb_head_min_time metrics. It is a cosmetic change to improve the
code readability as the metric values are taken from the correct
variables.
2017-10-09 12:24:53 +02:00
Fabian Reinartz d3682d701c wal: decode and process in separate threads. 2017-10-06 14:46:52 +02:00
Fabian Reinartz cd2e26b7fc Load postings in batch on startup
This allows IDs to be inserted into postings out of order until
a trigger function is called. This avoids the insertion sort we usually
do, which can be very costly since WAL entries are more out of order than
regular adds.
2017-10-06 10:39:10 +02:00
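The batch-loading idea in the commit above can be sketched as follows: IDs are appended unsorted while the WAL is replayed, and a trigger function sorts every list once. The type `memPostings` and the methods `add` and `ensureOrder` are hypothetical names chosen for this illustration and are not confirmed by the commit message.

```go
package main

import (
	"fmt"
	"sort"
)

// memPostings is a hypothetical in-memory index from a label key to series IDs.
type memPostings struct {
	m       map[string][]uint64
	ordered bool
}

// newUnorderedPostings returns postings that accept IDs in any order
// until ensureOrder is called.
func newUnorderedPostings() *memPostings {
	return &memPostings{m: map[string][]uint64{}}
}

// add appends id under key. Once the postings are ordered, it restores
// the sort invariant with a single insertion-sort pass.
func (p *memPostings) add(key string, id uint64) {
	l := append(p.m[key], id)
	p.m[key] = l
	if !p.ordered {
		return
	}
	for i := len(l) - 1; i >= 1 && l[i] < l[i-1]; i-- {
		l[i], l[i-1] = l[i-1], l[i]
	}
}

// ensureOrder is the trigger: it sorts every list once and switches
// the postings into ordered mode.
func (p *memPostings) ensureOrder() {
	if p.ordered {
		return
	}
	for _, l := range p.m {
		sort.Slice(l, func(i, j int) bool { return l[i] < l[j] })
	}
	p.ordered = true
}

func main() {
	p := newUnorderedPostings()
	for _, id := range []uint64{7, 3, 9, 1} { // out-of-order, as when replaying a WAL
		p.add("job=api", id)
	}
	p.ensureOrder()
	p.add("job=api", 4) // regular add after startup
	fmt.Println(p.m["job=api"]) // prints [1 3 4 7 9]
}
```

One sort per list costs O(n log n), whereas an insertion-sort pass per add can degrade to O(n^2) when the input is heavily out of order.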
Goutham Veeramachaneni 203012169a
snapshot: Remove truncation check to restore func.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-10-04 16:58:07 +05:30
Goutham Veeramachaneni c35d3a65bd
Add levels to all log lines.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-09-28 12:49:34 +05:30
Fabian Reinartz 69f105f4f9 Merge pull request #151 from prometheus/waltrunc
Use boolean function instead of postings to drop WAL series
2017-09-21 15:04:43 +02:00
Fabian Reinartz 1e88ba06b4 Use boolean function instead of postings to drop WAL series
There is no guarantee or requirement for WAL writers to only add
series entries in increasing order of IDs. A postings list cannot look
back and thus unordered WAL entries would skip over IDs to not truncate
from the WAL.
We replace it with a simple boolean check function that does not require
order.
2017-09-21 13:31:01 +02:00
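A boolean check of the kind described in the commit above might look like the sketch below. The names `seriesEntry`, `truncate`, and `keep` are hypothetical, and the WAL is reduced to a slice of entries for illustration. The point is that each entry is tested independently, so the order of IDs does not matter.

```go
package main

import "fmt"

// seriesEntry is a hypothetical WAL record that introduces a series.
type seriesEntry struct {
	ID uint64
}

// truncate returns the entries for which keep reports true.
// Each entry is checked on its own, so no ordering of IDs is required.
func truncate(entries []seriesEntry, keep func(id uint64) bool) []seriesEntry {
	var out []seriesEntry
	for _, e := range entries {
		if keep(e.ID) {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	// IDs of series that still exist (assumed for the example).
	live := map[uint64]struct{}{2: {}, 5: {}}
	keep := func(id uint64) bool {
		_, ok := live[id]
		return ok
	}

	// Entries deliberately not in increasing ID order.
	wal := []seriesEntry{{5}, {1}, {2}, {4}}
	fmt.Println(truncate(wal, keep)) // prints [{5} {2}]
}
```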
Fabian Reinartz 6ee254e353 Ensure postings are always sorted
IDs for new series are handed out before the postings are locked. Thus
series are not indexed in order of their IDs, which could result in only
partially sorted postings list.
Iterating over those silently skipped elements as the sort invariant was
violated.
2017-09-21 09:38:18 +02:00
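To illustrate how a violated sort invariant silently drops elements, here is a small sketch of a two-pointer intersection over ID lists. The function `intersect` is a generic illustration and an assumption of this example, not the iterator code from the repository.

```go
package main

import "fmt"

// intersect returns the IDs present in both lists.
// It is only correct if both inputs are sorted in increasing order.
func intersect(a, b []uint64) []uint64 {
	var out []uint64
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] == b[j]:
			out = append(out, a[i])
			i++
			j++
		case a[i] < b[j]:
			i++
		default:
			j++
		}
	}
	return out
}

func main() {
	sorted := []uint64{1, 2, 3}
	partial := []uint64{1, 3, 2} // only partially sorted

	fmt.Println(intersect(sorted, sorted))  // prints [1 2 3]
	fmt.Println(intersect(partial, sorted)) // prints [1 3]: ID 2 is silently skipped
}
```

No error is raised in the second call; the result is simply incomplete, which is why the invariant has to be enforced at insertion time.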
Fabian Reinartz 162a48e4f2 Create series with ID recorded in WAL when reading it back 2017-09-19 11:31:16 +02:00
Fabian Reinartz 7ada9cd805 Simplify series create logic in head 2017-09-18 12:38:39 +02:00
Fabian Reinartz ab8d9b9706 Add missing unlock on early return 2017-09-18 11:23:22 +02:00
Fabian Reinartz f904cd385f Do not build a superfluous 'all' postings 2017-09-08 18:41:43 +02:00
Fabian Reinartz 6892fc6dcb Finish old WAL segment async, default to no fsync
We were still fsyncing while holding the write lock when we cut a new
segment. Given we cannot do anything but log errors, we might just
as well complete segments asynchronously.

There's no realistic use case where one would fsync after every WAL
entry. Thus a flush interval of 0 now defaults to never fsyncing,
which is a much more likely use case.
2017-09-08 18:41:12 +02:00
Fabian Reinartz 1d5f85817d Fix various races 2017-09-08 08:48:19 +02:00
Fabian Reinartz 0db4c227b7 Fix min/max time handling and concurrent crc32 usage 2017-09-07 13:04:02 +02:00
Fabian Reinartz 81222849bc Filter WAL data in Head, misc fixes 2017-09-06 16:20:37 +02:00
Fabian Reinartz 33e9bdf403 WAL refactoring and truncation fixes and test 2017-09-06 14:59:25 +02:00
Fabian Reinartz c36d574290 Replace single head lock with granular locks
This adds various new locks to replace the single big lock on
the head. All parts now must be COW as they may be held by clients
after initial retrieval.
Series by ID and hashes are now held in a stripe lock to reduce
contention and total holding time during GC. This should reduce
starvation of readers.
2017-09-05 14:41:39 +02:00
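The stripe lock mentioned in the commit above can be sketched as a fixed array of independently locked maps. The stripe count, the type names, and the string payload are all assumptions made for this illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// stripeCount is a hypothetical size; a power of two allows masking.
const stripeCount = 16

// stripe guards one shard of the series map with its own lock.
type stripe struct {
	mtx    sync.RWMutex
	series map[uint64]string
}

// stripeSeries spreads series across stripes by ID.
type stripeSeries struct {
	stripes [stripeCount]stripe
}

func newStripeSeries() *stripeSeries {
	s := &stripeSeries{}
	for i := range s.stripes {
		s.stripes[i].series = map[uint64]string{}
	}
	return s
}

// set stores v under id, locking only the stripe that owns id.
func (s *stripeSeries) set(id uint64, v string) {
	st := &s.stripes[id&(stripeCount-1)]
	st.mtx.Lock()
	st.series[id] = v
	st.mtx.Unlock()
}

// get reads the value for id under that stripe's read lock.
func (s *stripeSeries) get(id uint64) (string, bool) {
	st := &s.stripes[id&(stripeCount-1)]
	st.mtx.RLock()
	v, ok := st.series[id]
	st.mtx.RUnlock()
	return v, ok
}

// gc deletes every series for which dead reports true. It holds one
// stripe lock at a time, so readers of other stripes are not blocked.
func (s *stripeSeries) gc(dead func(id uint64) bool) int {
	n := 0
	for i := range s.stripes {
		st := &s.stripes[i]
		st.mtx.Lock()
		for id := range st.series {
			if dead(id) {
				delete(st.series, id)
				n++
			}
		}
		st.mtx.Unlock()
	}
	return n
}

func main() {
	s := newStripeSeries()
	for id := uint64(0); id < 100; id++ {
		s.set(id, "series")
	}
	removed := s.gc(func(id uint64) bool { return id%2 == 0 })
	_, ok := s.get(3)
	fmt.Println(removed, ok) // prints 50 true
}
```

Compared with a single lock, garbage collection here never blocks more than one stripe's worth of readers at any moment, which is the reduction in holding time the commit describes.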
Fabian Reinartz 1ddedf2b30 Change series ID from uint32 to uint64 2017-09-04 16:08:38 +02:00
Goutham Veeramachaneni 1698c516ad [WIP]: WAL implementation
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-09-04 14:52:40 +02:00
Fabian Reinartz 893b6ec506 Add tests for GC and chunk truncation 2017-09-01 14:38:49 +02:00
Fabian Reinartz 4f037da462 Remove defer statement in hot path 2017-09-01 12:09:29 +02:00
Fabian Reinartz 5cf2662074 Refactor WAL into Head and misc improvements 2017-09-01 11:50:58 +02:00
Fabian Reinartz 8209e3ec23 Add various metrics 2017-09-01 11:50:58 +02:00
Fabian Reinartz 3901b6e70b Remove multiple heads
This changes the structure to a single WAL backed by a single head
block.
Parts of the head block can be compacted. This relieves us from any head
management and greatly simplifies any consistency and isolation concerns
by just having a single head.
2017-09-01 11:50:58 +02:00
Goutham Veeramachaneni 7438ed7035 Expose Intervals type for use by TombstoneReader.
TombstoneReader is exposed but Intervals is not.

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-08-25 16:06:36 +05:30
Fabian Reinartz 905af27cf9 Refactor compactor 2017-08-09 11:10:29 +02:00
Fabian Reinartz 66ff7b12e9 Pool Chunk objects during compaction 2017-08-08 17:35:34 +02:00
Fabian Reinartz 2644c8665c Don't allocate ChunkMetas, reuse postings slices 2017-08-06 20:41:24 +02:00
Fabian Reinartz 96d7f540d4 Persist series without allocating the full set
Change index persistence for series to not be accumulated in memory
before being written as one large batch. `Labels` and `ChunkMeta`
objects are reused.
This cuts down memory spikes during compaction of multiple blocks
significantly.

As part of this, the Index{Reader,Writer} now have an explicit notion of
symbols and series must be inserted in order.
2017-08-06 12:06:41 +02:00
Goutham Veeramachaneni f1ae239c20 Persist the right MaxTime when snapshotting
This is because we cut a new block from where the snapshotted block ends
if we restore from backups and highTimestamp would be where we should be
starting from.

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-07-12 13:48:13 +02:00
Fabian Reinartz 1e74c155eb Return empty string to signal non-caching 2017-06-26 14:58:00 +02:00
Fabian Reinartz 3410559c1b Compact head block early
Let older head blocks be compacted once the newest one has samples at
50% of its total range. This allows the memory of the compacted blocks
to be released and garbage collected before a new head block gets
created. Thereby the number of head blocks is 1 or 2 instead of 2 or 3
and memory spikes are reduced.
2017-06-26 08:52:59 +02:00
Fabian Reinartz 9963a4c7c3 Merge pull request #95 from Gouthamve/wal-ahead
Fix race condition for 2 appenders having same ts
2017-06-12 11:17:49 +02:00
Goutham Veeramachaneni 73cc5bae51 Colocate defer statements near relevant functions
Signed-off-by: Goutham Veeramachaneni <goutham@boomerangcommerce.com>
2017-06-12 14:37:58 +05:30
Goutham Veeramachaneni b51a05044e
Fix race condition for 2 appenders having same ts
Race:
Suppose we have 100 existing series inside a HeadBlock.
Now we open two appenders in two routines A1, A2 and append 30 new series and
60 new series respectively with some common series.

Both try to commit at the same time and the following happens in the given order:

A2 executes createSeries()
A1 executes createSeries() (with its common series referencing the ids from A2)
A1 persists its newlabels, samples
A2 persists its newlabels, samples

Now when reading it back, we read A1's samples which reference A2's id and
thereby fail.

Ref: prometheus/prometheus#2795

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-08 16:26:25 +05:30
Fabian Reinartz 05e411a8eb Improve heuristic to spread chunks across block 2017-06-08 11:30:32 +02:00
Goutham Veeramachaneni a110a64abd
Add full Snapshot support
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-06 18:15:54 +05:30
Goutham Veeramachaneni a1c8425357
Initial implementation of HeadBlock Snapshots
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-05 13:48:31 +05:30
Goutham Veeramachaneni 29c73f05f2
Make sure that mint and maxt are not modified.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-27 21:59:49 +05:30
Goutham Veeramachaneni 44e9ae38b5
Incorporate PR feedback.
* Expose Stone as it is used in an exported method.
* Move from tombstoneReader to []Stone for the same reason as above.
* Make WAL reading a little cleaner.

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-26 21:26:31 +05:30