60 commits

Author SHA1 Message Date
Fabian Reinartz ac5bd71d8f Doc fixes 2017-11-10 14:10:20 +00:00
Fabian Reinartz b7c3cfecbf index: abstract ByteSlice and adjust indexReader
This replaces the builtin byte slice with an interface for the index
reader. This allows the complex decoding of the index file format
to be used against more generalized implementations.
2017-11-09 17:38:32 +00:00
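A minimal sketch of the kind of abstraction commit b7c3cfecbf describes: the decoder depends on a small interface instead of the builtin byte slice. The method set (`Len`, `Range`), the `realByteSlice` adapter, and the `readMagic` helper are assumptions made for illustration; the commit message does not spell them out.

```go
package main

import "fmt"

// ByteSlice abstracts read access to a contiguous byte region.
// The method set is an assumption for illustration.
type ByteSlice interface {
	Len() int
	Range(start, end int) []byte
}

// realByteSlice adapts a builtin []byte (for example an mmap'd file).
type realByteSlice []byte

func (b realByteSlice) Len() int                    { return len(b) }
func (b realByteSlice) Range(start, end int) []byte { return b[start:end] }

// readMagic is a hypothetical decoder that depends only on ByteSlice,
// so it works against any implementation of the interface.
func readMagic(b ByteSlice) ([]byte, error) {
	if b.Len() < 4 {
		return nil, fmt.Errorf("short read: %d bytes", b.Len())
	}
	return b.Range(0, 4), nil
}

func main() {
	m, err := readMagic(realByteSlice{0x01, 0x02, 0x03, 0x04, 0x05})
	fmt.Println(m, err)
}
```

Any other backing store (a network-fetched range, an in-memory test fixture) could satisfy the same interface without changes to the decoding code.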
Fabian Reinartz c354d6bd59 index: simplify checksum validation 2017-11-09 15:58:36 +00:00
Fabian Reinartz 798f2bdb0a
Merge pull request #189 from sunhay/checksum-checks
Index checksum validation on reads
2017-11-09 15:35:09 +00:00
Nipun Talukdar 791a2dda4d Fixed a problem of adding padding of 4 zero bytes in some cases (#194)
* Fixed a problem of adding padding of 4 zero bytes in some cases

* Incorporated review comments
2017-11-03 20:16:19 +01:00
ranbochen a27cf34a36 Fix bugs on Windows so that all test cases pass (#192)
* Fix bugs on Windows so that all test cases pass

* Clean up code
2017-10-31 15:37:41 +01:00
Sunny Klair aecac3b521 Reuse single CRC for index checksum validation 2017-10-27 12:29:59 -04:00
Sunny Klair b65dd43c5b Validate index series/postings/symbol table checksums on read 2017-10-26 15:34:31 -04:00
Sunny Klair ab02ea4de4 Validate index offset table checksums on read 2017-10-26 13:48:31 -04:00
Sunny Klair 4fdf9b195c Validate index TOC checksum on read 2017-10-25 18:12:13 -04:00
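The four checksum commits above (4fdf9b195c through aecac3b521) validate section checksums on read and then reuse a single CRC across validations. The following is a rough sketch of that pattern, not the project's actual code. It assumes each section is laid out as a payload followed by a 4-byte big-endian CRC32 (Castagnoli polynomial); the layout and the helper names `verifySection` and `appendChecksum` are hypothetical.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"hash"
	"hash/crc32"
)

// Assumption: the Castagnoli polynomial is used for section checksums.
var castagnoli = crc32.MakeTable(crc32.Castagnoli)

var errInvalidChecksum = errors.New("invalid checksum")

// newCRC32 returns a hash that can be reused across many sections.
func newCRC32() hash.Hash32 { return crc32.New(castagnoli) }

// verifySection assumes the layout <payload><4-byte big-endian CRC32>.
// The caller passes in one hash instance; Reset makes it reusable.
func verifySection(h hash.Hash32, section []byte) error {
	if len(section) < 4 {
		return errors.New("section too short")
	}
	payload := section[:len(section)-4]
	want := binary.BigEndian.Uint32(section[len(section)-4:])
	h.Reset()
	h.Write(payload) // Write on a hash.Hash never returns an error
	if h.Sum32() != want {
		return errInvalidChecksum
	}
	return nil
}

// appendChecksum builds a section in the assumed layout (used for the demo).
func appendChecksum(payload []byte) []byte {
	var sum [4]byte
	binary.BigEndian.PutUint32(sum[:], crc32.Checksum(payload, castagnoli))
	return append(append([]byte{}, payload...), sum[:]...)
}

func main() {
	h := newCRC32()
	sec := appendChecksum([]byte("toc"))
	fmt.Println(verifySection(h, sec)) // intact section
	sec[0] ^= 0xff                     // corrupt one payload byte
	fmt.Println(verifySection(h, sec))
}
```

Passing the hash in rather than allocating one per call is what makes a single CRC instance reusable across the TOC, offset tables, and the series, postings, and symbol sections.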
Fabian Reinartz 665955da48 Clarify postings index semantics, handle staleness
The postings list index may point to series that no longer
exist during garbage collection. This clarifies that this is valid
behavior.
It would be possible, though more complex, to always keep them in sync.
However, series existence means nothing in itself as the queried time
range defines whether there's actual data. Thus our definition is sane
overall as long as drift is kept small.
2017-10-11 09:37:19 +02:00
Fabian Reinartz fb9da52b11 Add more verbose error handling for closing, reduce locking
This commit introduces error returns in various places and is explicit
about closing persisted blocks.
{Index,Chunk,Tombstone}Readers are more consistent about their Close()
method. Whenever a reader is retrieved, the corresponding close method
must eventually be called. We use this to track pending readers against
persisted blocks.

Queriers against the DB no longer hold a read lock for their entire
lifecycle. This prevents long-running queriers from starving new ones when
we have to acquire a write lock when reloading blocks.
2017-10-10 12:13:37 +02:00
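One way to picture the pending-reader tracking described in commit fb9da52b11 is a counter that every retrieved reader increments and every `Close()` decrements, so that closing the block waits for outstanding readers. The sketch below uses `sync.WaitGroup` for this; the `block` and `blockReader` types and their fields are hypothetical and are not taken from the commit.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// block is a hypothetical persisted block that tracks pending readers.
type block struct {
	mtx     sync.Mutex
	closing bool
	pending sync.WaitGroup
}

// blockReader stands in for an Index, Chunk, or Tombstone reader.
type blockReader struct {
	b    *block
	once sync.Once
}

// Reader hands out a reader and registers it as pending.
func (b *block) Reader() (*blockReader, error) {
	b.mtx.Lock()
	defer b.mtx.Unlock()
	if b.closing {
		return nil, errors.New("block is closing")
	}
	b.pending.Add(1)
	return &blockReader{b: b}, nil
}

// Close releases the reader; calling it more than once is harmless.
func (r *blockReader) Close() error {
	r.once.Do(r.b.pending.Done)
	return nil
}

// Close refuses new readers, then waits for all pending ones to be closed.
func (b *block) Close() error {
	b.mtx.Lock()
	b.closing = true
	b.mtx.Unlock()
	b.pending.Wait()
	return nil
}

func main() {
	b := &block{}
	r, err := b.Reader()
	if err != nil {
		fmt.Println(err)
		return
	}
	go func() { r.Close() }() // a querier finishing concurrently
	fmt.Println("block close:", b.Close())
	_, err = b.Reader()
	fmt.Println("reader after close:", err)
}
```

Because the counter, not a long-held read lock, keeps the block alive, a reload only needs the mutex briefly, which matches the stated goal of not starving new queriers.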
Goutham Veeramachaneni da565f975e Merge pull request #161 from prometheus/fileutil
Remove dependency on etcd/pkg/fileutil
2017-10-04 17:08:54 +05:30
Fabian Reinartz bbe72dccb9 Remove dependency on etcd/pkg/fileutil 2017-10-04 10:23:41 +02:00
Fabian Reinartz 78df406dac Allocate and cache strings for persisted blocks
This change loads the full symbol table when we open a persisted block
and allocates a string for each. This ensures that strings retrieved
through the index can be used after the block was closed.
Previously, the strings were backed by the mmap'd byte regions, which
would segfault in this case.

Also remove an inconsistency in the disk format and move both offset
tables to the end (breaking change).
2017-10-02 15:56:57 +02:00
Goutham Veeramachaneni 8919baef03
Expose NewIndexReader() and cleanups
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-09-13 13:47:20 +05:30
Fabian Reinartz 3870ec285c Merge pull request #140 from prometheus/locks
Fix various races
2017-09-11 10:41:33 +02:00
Fabian Reinartz b09d90c79c Add decoding method to retrieve unsafe strings
When decoding data from mmaped blocks, we would like to retrieve
a string backed by the mmaped region. As the underlying byte slice
never changes, this is safe.
2017-09-08 18:41:43 +02:00
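A zero-copy conversion of the kind commit b09d90c79c describes can be sketched as follows. The function name `unsafeString` is hypothetical. The returned string shares memory with the slice, so it is only valid while the backing region stays mapped and unmodified; the later commit 78df406dac in this log switches to allocated strings for exactly that reason.

```go
package main

import (
	"fmt"
	"unsafe"
)

// unsafeString reinterprets b as a string without copying.
// The result aliases b's memory: b must not be modified, and the
// string must not be used after the backing region is unmapped.
func unsafeString(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}

func main() {
	// Stand-in for a symbol read from an mmap'd index file.
	region := []byte("__name__")
	s := unsafeString(region)
	fmt.Println(s, len(s))
}
```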
Goutham Veeramachaneni afaf12fe45
Compress the series chunk details in index.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-09-08 20:25:19 +05:30
Fabian Reinartz 1ddedf2b30 Change series ID from uint32 to uint64 2017-09-04 16:08:38 +02:00
Matt Layher 78b15c3434
Add newCRC32 function to simplify hash initialization 2017-08-26 12:04:00 -04:00
Fabian Reinartz 2644c8665c Don't allocate ChunkMetas, reuse postings slices 2017-08-06 20:41:24 +02:00
Fabian Reinartz 96d7f540d4 Persist series without allocating the full set
Change index persistence for series to not be accumulated in memory
before being written as one large batch. `Labels` and `ChunkMeta`
objects are reused.
This cuts down memory spikes during compaction of multiple blocks
significantly.

As part of this, the Index{Reader,Writer} now have an explicit notion of
symbols, and series must be inserted in order.
2017-08-06 12:06:41 +02:00
Goutham Veeramachaneni a110a64abd
Add full Snapshot support
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-06 18:15:54 +05:30
Goutham Veeramachaneni 34a86af3c6
Move tombstones to their own thing.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-17 08:36:56 +05:30
Goutham Veeramachaneni 4f1d857590
Implement Delete on HeadBlock
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-15 23:28:14 +05:30
Goutham Veeramachaneni 5579efbd5b
Initial implementation of Deletes on persistedBlock
Very much a WIP

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-14 14:36:26 +05:30
Fabian Reinartz 2032a11d98 Add padding between fixed-sized index sections 2017-05-02 12:43:51 +02:00
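Padding between sections usually means rounding the write position up to the next multiple of an alignment. The sketch below assumes 4-byte alignment (suggested by the "4 zero bytes" mentioned in commit 791a2dda4d, but not confirmed by this log); `padLen` and `addPadding` are hypothetical helpers. The outer modulo matters: without it, an already aligned position would receive a full extra 4 bytes.

```go
package main

import "fmt"

// padLen returns how many zero bytes are needed to move pos to the
// next multiple of align. It returns 0 when pos is already aligned.
func padLen(pos, align uint64) uint64 {
	return (align - pos%align) % align
}

// addPadding appends zero bytes so that len(buf) becomes a multiple of align.
func addPadding(buf []byte, align uint64) []byte {
	return append(buf, make([]byte, padLen(uint64(len(buf)), align))...)
}

func main() {
	for _, pos := range []uint64{5, 8, 11} {
		fmt.Printf("pos=%d pad=%d\n", pos, padLen(pos, 4))
	}
	fmt.Println(len(addPadding(make([]byte, 5), 4)))
}
```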
Fabian Reinartz 34ba92eeeb Move CRC back to chunks file, alignment for fixed-sized ints 2017-04-30 10:18:07 +02:00
Fabian Reinartz a54f46d5e7 Migrate last IndexWriter pieces to decbuf 2017-04-30 10:18:07 +02:00
Fabian Reinartz 94f3fd9812 Move encoding helpers into separate file 2017-04-30 10:18:07 +02:00
Fabian Reinartz 35b62f001e Change offset table layout, add TOC, ... 2017-04-30 10:18:07 +02:00
Fabian Reinartz 8b1f514a2d index: validate current write stages 2017-04-30 10:18:07 +02:00
Fabian Reinartz 9b4eafcc4c Simplify and document postings serialization 2017-04-30 10:10:18 +02:00
Fabian Reinartz 0aad526d1a Simplify label value index
This removes the flag from the label value index and serializes it using
encbufs.
TODO: move CRC32 checksum into label value index hash table for
referential integrity.
2017-04-30 10:10:18 +02:00
Fabian Reinartz d30b181406 Switch series serialization to use encbufs 2017-04-30 10:10:18 +02:00
Fabian Reinartz 2ebaf1af4f Add encode buffer and simplify symbol serialization 2017-04-30 10:10:18 +02:00
Fabian Reinartz 433e73f865 Change series and symbol table format 2017-04-30 10:10:18 +02:00
Fabian Reinartz df96d97dab Move chunk checksum 2017-04-30 10:10:18 +02:00
Julius Volz 8d1fb4fa01 Minor comment fixes and additions. 2017-04-28 15:41:42 +02:00
Fabian Reinartz 778103b450 Add licence file and headers 2017-04-10 20:59:45 +02:00
Fabian Reinartz 7de2217011 Add fast-path for equality matching 2017-04-05 15:37:48 +02:00
Fabian Reinartz 10c7c9acbe Adjust import names to new repository organisation 2017-04-04 11:27:26 +02:00
Goutham Veeramachaneni 71e05a22c7
Add mockIndex And Refactor Tests To Use That 2017-03-30 04:48:41 +05:30
Goutham Veeramachaneni 7b94a4e17d
Rename bytePostings To bigEndianPostings
* To be more specific about the contents of the byte slice.
2017-03-27 14:04:42 +05:30
Goutham Veeramachaneni efb0dfe1be
Implement Postings Iterator Over Bytes
Closes fabxc/tsdb#18
2017-03-26 23:40:12 +05:30
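The two postings commits above (efb0dfe1be and its rename in 7b94a4e17d) describe an iterator that walks a byte slice of big-endian encoded references. A minimal sketch of that idea follows, assuming fixed-width 4-byte entries; the `Next`/`At` method names and the `collect` helper are illustrative choices, not confirmed by the log.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// bigEndianPostings iterates over a list of 4-byte big-endian references.
type bigEndianPostings struct {
	list []byte
	cur  uint32
}

func newBigEndianPostings(list []byte) *bigEndianPostings {
	return &bigEndianPostings{list: list}
}

// Next decodes the next reference and reports whether one was available.
func (it *bigEndianPostings) Next() bool {
	if len(it.list) < 4 {
		return false
	}
	it.cur = binary.BigEndian.Uint32(it.list)
	it.list = it.list[4:]
	return true
}

// At returns the reference decoded by the last successful Next call.
func (it *bigEndianPostings) At() uint32 { return it.cur }

// collect drains the iterator into a slice (helper for the demo).
func collect(it *bigEndianPostings) []uint32 {
	var out []uint32
	for it.Next() {
		out = append(out, it.At())
	}
	return out
}

func main() {
	raw := []byte{0, 0, 0, 1, 0, 0, 0, 5, 0, 0, 1, 0}
	fmt.Println(collect(newBigEndianPostings(raw)))
}
```

Iterating directly over the bytes avoids decoding the whole list into a slice of integers up front, which is useful when the bytes come straight from a memory-mapped file.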
Fabian Reinartz 2ef3682560 Hotfix erroneous "label index missing" error 2017-03-20 11:37:06 +01:00
Fabian Reinartz a8e8903350 Use ChunkMeta references for clarity
This has been a common source of hard-to-debug issues. It's a premature
and unbenchmarked optimization and semantically, we want ChunkMetas to
be references in all changed cases.
2017-03-14 15:40:16 +01:00
Fabian Reinartz 8a7addfc44 Split persistence by chunk/index instead of read/write 2017-03-07 12:48:52 +01:00
Fabian Reinartz 2a825f6c28 Consolidate mem index into HeadBlock 2016-12-22 01:12:28 +01:00