IDs for new series are handed out before the postings are locked. Thus
series are not indexed in order of their IDs, which could result in only
partially sorted postings list.
Iterating over those silently skipped elements as the sort invariant was
violated.
This adds various new locks to replace the single big lock on
the head. All parts now must be COW as they may be held by clients
after initial retrieval.
Series by ID and hashes are now held in a stripe lock to reduce
contention and total holding time during GC. This should reduce
starvation of readers.
This changes the structure to a single WAL backed by a single head
block.
Parts of the head block can be compacted. This relieves us from any head
amangement and greatly simplifies any consistency and isolation concerns
by just having a single head.
Change index persistence for series to not be accumulated in memory
before being written as one large batch. `Labels` and `ChunkMeta`
objects are reused.
This cuts down memory spikes during compaction of multiple blocks
significantly.
As part of the the Index{Reader,Writer} now have an explicit notion of
symbols and series must be inserted in order.
* Expose Stone as it is used in an exported method.
* Move from tombstoneReader to []Stone for the same reason as above.
* Make WAL reading a little cleaner.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
* Make sure no reads happen on the block when delete is in progress.
* Fix bugs in compaction.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
This fixes different race condition encoutnered when running Prometheus.
It reduces the overall performance in the synthetic benchmark a fair bit
but has no indiciations of impacting a real-world setup notably.
This has been a common source of hard to debug issues. Its a premature
and unbenchmarked optimization and semantically, we want ChunkMetas to
be references in all changed cases.
This adds write path support for segmented chunk data files.
Files of 512MB are pre-allocated and written to. If the file size
is exceeded, the next file is started. On completion, files
are truncated to their final size.
File locks have a multitude of problems that make them hard to use
correctly. As they are just advisory, they are only meaningful to
prevent accidents like running the same process twice.
A simple PID file lock works reliably in those cases and is simpler.
This adds naive compaction that tries to compact three
blocks of roughly equal size.
It decides based on samples present in a block and has no
safety measures considering the actual file size.
This adds a position mapper that takes series from a head block
in the order they were appended and creates a mapping representing
them in order of their label sets.
Write-repair of the postings list would cause very expensive writing.
Hence, we keep them as they are and only apply the postition mapping
at the very end, after a postings list has been sufficienctly reduced
through intersections etc.
This adds a basic compactor that will merge two persisted blocks into
one. It simply fully rewrites the index and concatenates the chunk
lists.
It just writes into the current working dir and doesn't properly handle
which blocks to compact for now.