From 1a4e54cfbb536fec93f174449ea3f8f80a85ea1b Mon Sep 17 00:00:00 2001 From: beorn7 Date: Mon, 18 Oct 2021 17:49:28 +0200 Subject: [PATCH] tsdb: Complete chunk format documentation This also tweaks and fixes a few things done previously. Signed-off-by: beorn7 --- tsdb/chunkenc/histogram.go | 2 -- tsdb/docs/format/chunks.md | 71 ++++++++++++++++++++++++++++++++------ 2 files changed, 60 insertions(+), 13 deletions(-) diff --git a/tsdb/chunkenc/histogram.go b/tsdb/chunkenc/histogram.go index 39bbbf221..e68677129 100644 --- a/tsdb/chunkenc/histogram.go +++ b/tsdb/chunkenc/histogram.go @@ -27,8 +27,6 @@ const () // HistogramChunk holds encoded sample data for a sparse, high-resolution // histogram. // -// TODO(beorn7): Document the layout of chunk metadata. -// // Each sample has multiple "fields", stored in the following way (raw = store // number directly, delta = store delta to the previous number, dod = store // delta of the delta to the previous number, xor = what we do for regular diff --git a/tsdb/docs/format/chunks.md b/tsdb/docs/format/chunks.md index 30b9cd6f1..8318e0a54 100644 --- a/tsdb/docs/format/chunks.md +++ b/tsdb/docs/format/chunks.md @@ -42,9 +42,9 @@ Notes: ## XOR chunk data ``` -┌──────────────────────┬───────────────┬───────────────┬──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┬─────┐ -│ num_samples │ ts_0 │ v_0 │ ts_1_delta │ v_1_xor │ ts_n_dod │ v_n_xor │ ... │ -└──────────────────────┴───────────────┴───────────────┴──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┴─────┘ +┌──────────────────────┬───────────────┬───────────────┬──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┬─────┬──────────────────────┬──────────────────────┬──────────────────┐ +│ num_samples │ ts_0 │ v_0 │ ts_1_delta │ v_1_xor │ ts_2_dod │ v_2_xor │ ... 
│ ts_n_dod │ v_n_xor │ padding │ +└──────────────────────┴───────────────┴───────────────┴──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┴─────┴──────────────────────┴──────────────────────┴──────────────────┘ ``` ### Notes: @@ -55,41 +55,90 @@ Notes: * `` and `` have 1 to 10 bytes each. * `ts_1_delta` is `ts_1` – `ts_0`. * `ts_n_dod` is the “delta of deltas” of timestamps, i.e. (`ts_n` – `ts_n-1`) – (`ts_n-1` – `ts_n-2`). -* `` is the result of `v_n` XOR `v_n-1`. +* `v_n_xor` is the result of `v_n` XOR `v_n-1`. * `` is a specific variable bitwidth encoding of the result of XORing the current and the previous value. It has between 1 bit and 77 bits. See [code for details](https://github.com/prometheus/prometheus/blob/7309c20e7e5774e7838f183ec97c65baa4362edc/tsdb/chunkenc/xor.go#L220-L253). * `` is a specific variable bitwidth encoding for the “delta of deltas” of timestamps (signed integers that are ideally small). It has between 1 and 68 bits. see [code for details](https://github.com/prometheus/prometheus/blob/7309c20e7e5774e7838f183ec97c65baa4362edc/tsdb/chunkenc/xor.go#L179-L205). +* `padding` of 0 to 7 bits so that the whole chunk data is byte-aligned. +* The chunk can have as few as one sample, i.e. `ts_1`, `v_1`, etc. are optional. 
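The timestamp and value notes above can be illustrated with a small sketch. It is not taken from `xor.go`; the helper names `deltaOfDeltas` and `xorValue` are hypothetical, and the sketch only computes the numbers that would then be passed to the variable bitwidth encoders, not the encoded bits themselves.

```go
package main

import (
	"fmt"
	"math"
)

// deltaOfDeltas is a hypothetical helper. For every timestamp from index 2 on
// it returns (ts[n] - ts[n-1]) - (ts[n-1] - ts[n-2]), as described in the notes.
func deltaOfDeltas(ts []int64) []int64 {
	if len(ts) < 3 {
		return nil // a dod needs at least two preceding timestamps
	}
	dods := make([]int64, 0, len(ts)-2)
	for n := 2; n < len(ts); n++ {
		dods = append(dods, (ts[n]-ts[n-1])-(ts[n-1]-ts[n-2]))
	}
	return dods
}

// xorValue is a hypothetical helper that XORs the IEEE 754 bit patterns of
// the current and the previous sample value.
func xorValue(cur, prev float64) uint64 {
	return math.Float64bits(cur) ^ math.Float64bits(prev)
}

func main() {
	// Scrapes roughly every 15 units with a little jitter.
	ts := []int64{1000, 1015, 1030, 1046, 1061}
	fmt.Println("ts_1_delta:", ts[1]-ts[0]) // 15
	fmt.Println("dods:", deltaOfDeltas(ts)) // [0 1 -1]
	// Identical consecutive values XOR to zero.
	fmt.Println("xor:", xorValue(1.5, 1.5)) // 0
}
```

With regular scrape intervals the delta of deltas stays close to zero, and an unchanged value XORs to zero, which is presumably why both fields can be stored in very few bits.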
## Histogram chunk data ``` -┌──────────────────────┬───────────────────────────────┬─────────────────────┬──────────────────┬──────────────────┬────────────────┐ -│ num_samples │ zero_threshold <1 or 9 bytes> │ schema │ pos_spans │ neg_spans │ samples │ -└──────────────────────┴───────────────────────────────┴─────────────────────┴──────────────────┴──────────────────┴────────────────┘ +┌──────────────────────┬──────────────────────────┬───────────────────────────────┬─────────────────────┬──────────────────┬──────────────────┬────────────────┬──────────────────┐ +│ num_samples │ histogram_flags <1 byte> │ zero_threshold <1 or 9 bytes> │ schema │ pos_spans │ neg_spans │ samples │ padding │ +└──────────────────────┴──────────────────────────┴───────────────────────────────┴─────────────────────┴──────────────────┴──────────────────┴────────────────┴──────────────────┘ ``` ### Positive and negative spans data: ``` -┌───────────────────┬────────────────────────┬───────────────────────┬─────┬──────────────────────────┬─────────────────────────┐ -│ num │ length_1 │ offset_1 │ ... │ length_num │ offset_num │ -└───────────────────┴────────────────────────┴───────────────────────┴─────┴──────────────────────────┴─────────────────────────┘ +┌─────────────────────────┬────────────────────────┬───────────────────────┬────────────────────────┬───────────────────────┬─────┬────────────────────────┬───────────────────────┐ +│ num_spans │ length_0 │ offset_0 │ length_1 │ offset_1 │ ... │ length_n │ offset_n │ +└─────────────────────────┴────────────────────────┴───────────────────────┴────────────────────────┴───────────────────────┴─────┴────────────────────────┴───────────────────────┘ ``` ### Samples data: ``` -TODO +┌──────────────────────────┐ +│ sample_0 │ +├──────────────────────────┤ +│ sample_1 │ +├──────────────────────────┤ +│ sample_2 │ +├──────────────────────────┤ +│ ... 
│ +├──────────────────────────┤ +│ Sample_n │ +└──────────────────────────┘ +``` + +#### Sample 0 data: + +``` +┌─────────────────┬─────────────────────┬──────────────────────────┬───────────────┬───────────────────────────┬─────┬───────────────────────────┬───────────────────────────┬─────┬───────────────────────────┐ +│ ts │ count │ zero_count │ sum │ pos_bucket_0 │ ... │ pos_bucket_n │ neg_bucket_0 │ ... │ neg_bucket_n │ +└─────────────────┴─────────────────────┴──────────────────────────┴───────────────┴───────────────────────────┴─────┴───────────────────────────┴───────────────────────────┴─────┴───────────────────────────┘ +``` + +#### Sample 1 data: + +``` +┌────────────────────────┬───────────────────────────┬────────────────────────────────┬──────────────────────┬─────────────────────────────────┬─────┬─────────────────────────────────┬─────────────────────────────────┬─────┬─────────────────────────────────┐ +│ ts_delta │ count_delta │ zero_count_delta │ sum_xor │ pos_bucket_0_delta │ ... │ pos_bucket_n_delta │ neg_bucket_0_delta │ ... │ neg_bucket_n_delta │ +└────────────────────────┴───────────────────────────┴────────────────────────────────┴──────────────────────┴─────────────────────────────────┴─────┴─────────────────────────────────┴─────────────────────────────────┴─────┴─────────────────────────────────┘ +``` + +#### Sample 2 data and following: + +``` +┌─────────────────────┬────────────────────────┬─────────────────────────────┬──────────────────────┬───────────────────────────────┬─────┬───────────────────────────────┬───────────────────────────────┬─────┬───────────────────────────────┐ +│ ts_dod │ count_dod │ zero_count_dod │ sum_xor │ pos_bucket_0_dod │ ... │ pos_bucket_n_dod │ neg_bucket_0_dod │ ... 
│ neg_bucket_n_dod │ +└─────────────────────┴────────────────────────┴─────────────────────────────┴──────────────────────┴───────────────────────────────┴─────┴───────────────────────────────┴───────────────────────────────┴─────┴───────────────────────────────┘ ``` ### Notes: +* `histogram_flags` is a byte of which currently only the first two bits are used: + * `10`: Counter reset between the previous chunk and this one. + * `01`: No counter reset between the previous chunk and this one. + * `00`: Counter reset status unknown. + * `11`: Chunk is part of a gauge histogram, no counter resets are happening. * `zero_threshold` has a specific encoding: * If 0, it is a single zero byte. * If a power of two between 2^-243 and 2^10, it is a single byte between 1 and 254. * Otherwise, it is a byte with all bits set (255), followed by a float64, resulting in 9 bytes length. * `schema` is a specific value defined by the exposition format. Currently valid values are -4 <= n <= 8. * `` is a variable bitwidth encoding for signed integers, optimized for “delta of deltas” of bucket deltas. It has between 1 bit and 9 bytes. + See [code for details](https://github.com/prometheus/prometheus/blob/8c1507ebaa4ca552958ffb60c2d1b21afb7150e4/tsdb/chunkenc/varbit.go#L31-L60). * `` is a variable bitwidth encoding for unsigned integers with the same bit-bucketing as ``. + See [code for details](https://github.com/prometheus/prometheus/blob/8c1507ebaa4ca552958ffb60c2d1b21afb7150e4/tsdb/chunkenc/varbit.go#L136-L165). +* `` is a specific variable bitwidth encoding of the result of XORing the current and the previous value. It has between 1 bit and 77 bits. + See [code for details](https://github.com/prometheus/prometheus/blob/8c1507ebaa4ca552958ffb60c2d1b21afb7150e4/tsdb/chunkenc/histogram.go#L538-L574). +* `padding` of 0 to 7 bits so that the whole chunk data is byte-aligned. +* Note that buckets are inherently deltas between the current bucket and the previous bucket. 
Only `bucket_0` is an absolute count.
+* The chunk can have as few as one sample, i.e. sample 1 and following are optional.
+* Similarly, there could be down to zero spans and down to zero buckets.