The postings list index may point to series that no longer
exist during garbage collection. This clarifies that this is valid
behavior.
It would be possible, though more complex, to always keep them in sync.
However, series existance means nothing in itself as the queried time
range defines whether there's actual data. Thus our definition is sane
overall as long as drift is kept small.
This commit introduces error returns in various places and is explicit
about closing persisted blocks.
{Index,Chunk,Tombstone}Readers are more consistent about their Close()
method. Whenever a reader is retrieved, the corresponding close method
must eventually be called. We use this to track pending readers against
persisted blocks.
Querier's against the DB no longer hold a read lock for their entire
lifecycle. This avoids long running queriers to starve new ones when we
have to acquire a write lock when reloading blocks.
This change loads the full symbol table when we open a persisted block
and allocates a string for each. This ensures that strings retrieved
through the index can be used after the block was closed.
Before we backed the strings by the mmap'd byte regions which would
segfault in this case.
Also remove an inconsistency in the disk format and move both offset
tables to the end (breaking change).
When decoding data from mmaped blocks, we would like to retrieve
a string backed by the mmaped region. As the underlying byte slice
never changes, this is safe.
Change index persistence for series to not be accumulated in memory
before being written as one large batch. `Labels` and `ChunkMeta`
objects are reused.
This cuts down memory spikes during compaction of multiple blocks
significantly.
As part of the the Index{Reader,Writer} now have an explicit notion of
symbols and series must be inserted in order.
This removes the flag from the label value index and serializes it using
encbufs.
TODO: move CRC32 checksum into label value index hash table for
referntial integrity.
This has been a common source of hard to debug issues. Its a premature
and unbenchmarked optimization and semantically, we want ChunkMetas to
be references in all changed cases.