This changes the structure to a single WAL backed by a single head
block.
Parts of the head block can be compacted. This relieves us from any head
amangement and greatly simplifies any consistency and isolation concerns
by just having a single head.
Change index persistence for series to not be accumulated in memory
before being written as one large batch. `Labels` and `ChunkMeta`
objects are reused.
This cuts down memory spikes during compaction of multiple blocks
significantly.
As part of the the Index{Reader,Writer} now have an explicit notion of
symbols and series must be inserted in order.
This is because we cut a new block from where the snapshotted block ends
if we restore from backups and highTimestamp would be where we should be
starting from.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
Let older head blocks be compacted once the newest once has samples at
50% of its total range. This allows the memory of the compacted blocks
to be released and garbage collected before a new head block gets
created. Thereby the number of head blocks is 1 or 2 instead of 2 or 3
and memory spikes are reduced.
Race:
Suppose we have 100 existing series inside a HeadBlock.
Now we open two appenders in two routines A1, A2 and append 30 new series and
60 new series respectively with some common series.
Both try to commit at the same time and the following happens in the given order:
A2 executes createSeries()
A1 executes createSeries() (with its common series referencing the ids from A2)
A1 persists its newlabels, samples
A2 persists its newlabels, samples
Now when reading it back, we read A1's samples which reference A2's id and
thereby fail.
Ref: prometheus/promtheus#2795
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
* Expose Stone as it is used in an exported method.
* Move from tombstoneReader to []Stone for the same reason as above.
* Make WAL reading a little cleaner.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
* Make sure no reads happen on the block when delete is in progress.
* Fix bugs in compaction.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
We need to recalculate the sorted ref list everytime we make a
Tombstones() call. This avoids that.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
* Test for a previous implematation of Intersect
Before we were moving the postings list everytime we create a new
chained `intersectPostings`. That was causing some postings to be
skipped. This test fails on the older version.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
* Advance on Seek only when valid.
Issue:
Before in mergedPostings and others we advance everytime we `Seek`,
which causes issues with `Intersect`.
Take the case, where we have a mergedPostings = m merging, a: {10, 20, 30} and
b: {15, 25, 35}. Everytime we `Seek`, we do a.Seek and b.Seek.
Now if we Intersect m with {21, 22, 23, 30}, we would do Seek({21,22,23}) which
would advance a and b beyond 30.
Fix:
Now we advance only when the seeking value is greater than the current
value, as the definition specifies.
Also, posting 0 will not be a valid posting and will be used to signal
finished or un-initialized PostingsList.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
* Add test for Merge+Intersect edgecase.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
* Add comments to trivial tests.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>