Change index persistence so that series are no longer accumulated in
memory before being written as one large batch. `Labels` and `ChunkMeta`
objects are reused.
This cuts down memory spikes during compaction of multiple blocks
significantly.
As part of this, the Index{Reader,Writer} now have an explicit notion of
symbols, and series must be inserted in order.
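
The ordering and symbol requirements can be illustrated with a minimal
sketch. All names here (`orderedWriter`, `newOrderedWriter`, `compare`,
`ErrOutOfOrder`) are hypothetical and not the actual Index{Reader,Writer}
API; the sketch only assumes that label sets are sorted by name and that
every label string must be registered as a symbol up front.

```go
package main

import (
	"errors"
	"fmt"
)

// Label and Labels are simplified stand-ins; Labels is assumed sorted by name.
type Label struct{ Name, Value string }
type Labels []Label

// compare orders two label sets lexicographically by name, then value.
func compare(a, b Labels) int {
	for i := 0; i < len(a) && i < len(b); i++ {
		if a[i].Name != b[i].Name {
			if a[i].Name < b[i].Name {
				return -1
			}
			return 1
		}
		if a[i].Value != b[i].Value {
			if a[i].Value < b[i].Value {
				return -1
			}
			return 1
		}
	}
	return len(a) - len(b)
}

var ErrOutOfOrder = errors.New("series inserted out of order")

// orderedWriter is a hypothetical writer that handles each series
// immediately instead of buffering all of them in memory.
type orderedWriter struct {
	symbols map[string]struct{}
	last    Labels // buffer reused across calls
	written int
}

func newOrderedWriter(symbols []string) *orderedWriter {
	w := &orderedWriter{symbols: make(map[string]struct{}, len(symbols))}
	for _, s := range symbols {
		w.symbols[s] = struct{}{}
	}
	return w
}

// AddSeries rejects unknown symbols and series that do not sort
// strictly after the previously added one.
func (w *orderedWriter) AddSeries(l Labels) error {
	for _, lb := range l {
		for _, s := range []string{lb.Name, lb.Value} {
			if _, ok := w.symbols[s]; !ok {
				return fmt.Errorf("unknown symbol %q", s)
			}
		}
	}
	if w.written > 0 && compare(l, w.last) <= 0 {
		return ErrOutOfOrder
	}
	// A real writer would encode and flush the series here.
	w.last = append(w.last[:0], l...) // reuse the buffer, no new allocation
	w.written++
	return nil
}

func main() {
	w := newOrderedWriter([]string{"job", "api", "db"})
	fmt.Println(w.AddSeries(Labels{{"job", "api"}}))
	fmt.Println(w.AddSeries(Labels{{"job", "db"}}))
	fmt.Println(w.AddSeries(Labels{{"job", "api"}})) // rejected: out of order
}
```

Because only the previous label set is kept for comparison, memory use in
this sketch stays constant regardless of how many series are written.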
Let older head blocks be compacted once the newest one has samples at
50% of its total range. This allows the memory of the compacted blocks
to be released and garbage collected before a new head block gets
created. Thereby the number of head blocks is 1 or 2 instead of 2 or 3
and memory spikes are reduced.
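
A rough sketch of the 50% trigger described above. The `headBlock` struct
and the `pastHalf` and `compactable` helpers are assumptions made for
illustration; the real compaction planning is more involved.

```go
package main

import "fmt"

// headBlock is a hypothetical view of a head block: it covers
// [MinTime, MaxTime) and MaxSample is the highest timestamp appended so far.
// For an empty block, MaxSample is assumed to be below MinTime.
type headBlock struct {
	MinTime, MaxTime, MaxSample int64
}

// pastHalf reports whether the block has samples at or beyond 50% of its range.
func pastHalf(h headBlock) bool {
	return h.MaxSample >= h.MinTime+(h.MaxTime-h.MinTime)/2
}

// compactable returns the older head blocks that may be compacted.
// heads is assumed to be ordered by time, newest last.
func compactable(heads []headBlock) []headBlock {
	if len(heads) < 2 {
		return nil
	}
	if !pastHalf(heads[len(heads)-1]) {
		return nil
	}
	return heads[:len(heads)-1]
}

func main() {
	heads := []headBlock{
		{MinTime: 0, MaxTime: 100, MaxSample: 99},
		{MinTime: 100, MaxTime: 200, MaxSample: 140},
	}
	fmt.Println(len(compactable(heads))) // newest block is below 50%
	heads[1].MaxSample = 150
	fmt.Println(len(compactable(heads))) // newest block reached 50%
}
```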
* Expose Stone as it is used in an exported method.
* Move from tombstoneReader to []Stone for the same reason as above.
* Make WAL reading a little cleaner.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
* Make sure no reads happen on the block when delete is in progress.
* Fix bugs in compaction.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
Previously, the sorted ref list had to be recalculated every time a
Tombstones() call was made. This avoids that.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
This adds the Queryable interface to the Block interface. Head and
persisted blocks now implement their own Querier() method and thus
isolate customization (e.g. remapPostings) more cleanly.
This adds more lower-level interfaces, which are composed into
the different Block interfaces.
The DB only uses interfaces instead of explicit persistedBlock and
headBlock. The headBlock generation property is dropped as the use-case
can be implemented using block sequence numbers.
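
The interface composition can be sketched as follows. The method sets
shown here are simplified assumptions and do not match the real
interfaces; sorting in the head block's Querier() is only a stand-in for
a customization such as remapPostings.

```go
package main

import (
	"fmt"
	"sort"
)

// Querier and Queryable are reduced to the bare minimum for illustration.
type Querier interface {
	Select() []string
}

type Queryable interface {
	Querier(mint, maxt int64) Querier
}

// Block is composed from lower-level interfaces.
type Block interface {
	Queryable
}

type sliceQuerier struct{ series []string }

func (q sliceQuerier) Select() []string { return q.series }

// persistedBlock returns its series as stored.
type persistedBlock struct{ series []string }

func (b persistedBlock) Querier(mint, maxt int64) Querier {
	return sliceQuerier{b.series}
}

// headBlock customizes its querier; here it sorts a copy of its series.
type headBlock struct{ series []string }

func (b headBlock) Querier(mint, maxt int64) Querier {
	s := append([]string(nil), b.series...)
	sort.Strings(s)
	return sliceQuerier{s}
}

// selectAll only sees the Block interface, never the concrete types.
func selectAll(blocks []Block, mint, maxt int64) []string {
	var out []string
	for _, b := range blocks {
		out = append(out, b.Querier(mint, maxt).Select()...)
	}
	return out
}

func main() {
	blocks := []Block{
		persistedBlock{[]string{"a", "b"}},
		headBlock{[]string{"d", "c"}},
	}
	fmt.Println(selectAll(blocks, 0, 10))
}
```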
This adds write path support for segmented chunk data files.
Files of 512MB are pre-allocated and written to. If the file size
is exceeded, the next file is started. On completion, files
are truncated to their final size.
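
A minimal sketch of the segmented write path, using a tiny segment size
in place of 512MB. The `segmentWriter` type, the file naming, and the
`runDemo` helper are hypothetical. `File.Truncate` is used as a portable
stand-in for pre-allocation; it typically creates a sparse file rather
than reserving disk blocks.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// segmentWriter writes chunks into fixed-size, pre-sized segment files.
type segmentWriter struct {
	dir     string
	segSize int64
	cur     *os.File
	off     int64 // bytes written to the current segment
	seq     int   // number of segments started so far
}

// finish truncates the current segment to its final size and closes it.
func (w *segmentWriter) finish() error {
	if w.cur == nil {
		return nil
	}
	f := w.cur
	w.cur = nil
	if err := f.Truncate(w.off); err != nil {
		f.Close()
		return err
	}
	return f.Close()
}

// cut finishes the current segment and starts the next one.
func (w *segmentWriter) cut() error {
	if err := w.finish(); err != nil {
		return err
	}
	f, err := os.Create(filepath.Join(w.dir, fmt.Sprintf("%06d", w.seq)))
	if err != nil {
		return err
	}
	if err := f.Truncate(w.segSize); err != nil { // size the file up front
		f.Close()
		return err
	}
	w.cur, w.off = f, 0
	w.seq++
	return nil
}

// Write appends b, starting a new segment if b would not fit.
// A chunk larger than segSize is not handled specially in this sketch.
func (w *segmentWriter) Write(b []byte) error {
	if w.cur == nil || w.off+int64(len(b)) > w.segSize {
		if err := w.cut(); err != nil {
			return err
		}
	}
	n, err := w.cur.WriteAt(b, w.off)
	w.off += int64(n)
	return err
}

func (w *segmentWriter) Close() error { return w.finish() }

// runDemo writes three 6-byte chunks into 16-byte segments in a temporary
// directory and returns the final size of each segment file.
func runDemo() ([]int64, error) {
	dir, err := os.MkdirTemp("", "segments")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	w := &segmentWriter{dir: dir, segSize: 16}
	for i := 0; i < 3; i++ {
		if err := w.Write([]byte("chunk!")); err != nil {
			return nil, err
		}
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	var sizes []int64
	for i := 0; i < w.seq; i++ {
		fi, err := os.Stat(filepath.Join(dir, fmt.Sprintf("%06d", i)))
		if err != nil {
			return nil, err
		}
		sizes = append(sizes, fi.Size())
	}
	return sizes, nil
}

func main() {
	fmt.Println(runDemo())
}
```

In this demo, two chunks fit into the first segment and the third one
forces a cut, so the first file ends at 12 bytes and the second at 6.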
File locks have a multitude of problems that make them hard to use
correctly. As they are just advisory, they are only meaningful to
prevent accidents like running the same process twice.
A simple PID file lock works reliably in those cases and is simpler.
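
One way such a PID file lock could look, as a sketch only. The `acquire`,
`release`, and `lockDemo` helpers are hypothetical, and detection of
stale lock files left behind by a crashed process is deliberately left
out.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"strconv"
)

var ErrLocked = errors.New("lock file already exists")

// acquire creates the lock file exclusively and writes the current PID.
// O_EXCL makes creation fail if the file is already present.
func acquire(path string) error {
	f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0644)
	if err != nil {
		if os.IsExist(err) {
			return ErrLocked
		}
		return err
	}
	defer f.Close()
	_, err = f.WriteString(strconv.Itoa(os.Getpid()))
	return err
}

// release removes the lock file.
func release(path string) error { return os.Remove(path) }

// lockDemo acquires the lock twice, releases it, and acquires it again,
// returning the result of each acquire attempt.
func lockDemo() (first, second, third error) {
	dir, err := os.MkdirTemp("", "pidlock")
	if err != nil {
		return err, err, err
	}
	defer os.RemoveAll(dir)
	p := filepath.Join(dir, "lock")
	first = acquire(p)
	second = acquire(p)
	if err := release(p); err != nil {
		return first, second, err
	}
	third = acquire(p)
	return first, second, third
}

func main() {
	fmt.Println(lockDemo())
}
```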
This is an initial (and hacky) pass at allowing
appends to multiple blocks simultaneously to avoid
dropping samples right after cutting a new head block.
It's also required for cases like the Pushgateway (PGW), where a scrape may
contain varying timestamps.
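
The idea of routing each sample to whichever block covers its timestamp
can be sketched like this. The `DB`, `headBlock`, and `Append` names are
simplified assumptions and not the actual appender implementation.

```go
package main

import (
	"errors"
	"fmt"
)

var ErrOutOfBounds = errors.New("timestamp outside all appendable blocks")

type sample struct {
	T int64
	V float64
}

// headBlock accepts samples with timestamps in [MinTime, MaxTime).
type headBlock struct {
	MinTime, MaxTime int64
	Samples          []sample
}

// DB keeps several appendable head blocks, ordered by time, newest last.
type DB struct {
	heads []*headBlock
}

// Append routes the sample to the head block covering its timestamp,
// so samples slightly older than the newest block are not dropped.
func (db *DB) Append(t int64, v float64) error {
	for i := len(db.heads) - 1; i >= 0; i-- {
		h := db.heads[i]
		if t >= h.MinTime && t < h.MaxTime {
			h.Samples = append(h.Samples, sample{t, v})
			return nil
		}
	}
	return ErrOutOfBounds
}

func main() {
	db := &DB{heads: []*headBlock{
		{MinTime: 0, MaxTime: 100},
		{MinTime: 100, MaxTime: 200},
	}}
	fmt.Println(db.Append(150, 1)) // lands in the newest block
	fmt.Println(db.Append(90, 2))  // still accepted by the previous block
	fmt.Println(db.Append(-5, 3))  // no block covers this timestamp
}
```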