mirror of
https://github.com/prometheus/prometheus.git
synced 2024-11-13 17:14:05 -08:00
Merge pull request #15135 from prometheus/beorn7/docs
docs: Update chunk layout for NHCB
commit
720a2599a6
@@ -29,14 +29,15 @@ in-file offset (lower 4 bytes) and segment sequence number (upper 4 bytes).
 # Chunk

 ```
-┌───────────────┬───────────────────┬──────────────┬────────────────┐
-│ len <uvarint> │ encoding <1 byte> │ data <bytes> │ CRC32 <4 byte> │
-└───────────────┴───────────────────┴──────────────┴────────────────┘
+┌───────────────┬───────────────────┬─────────────┬────────────────┐
+│ len <uvarint> │ encoding <1 byte> │ data <data> │ CRC32 <4 byte> │
+└───────────────┴───────────────────┴─────────────┴────────────────┘
 ```

 Notes:
 * `<uvarint>` has 1 to 10 bytes.
-* `encoding`: Currently either `XOR`, `histogram`, or `float histogram`.
+* `encoding`: Currently either `XOR`, `histogram`, or `floathistogram`, see
+[code for numerical values](https://github.com/prometheus/prometheus/blob/02d0de9987ad99dee5de21853715954fadb3239f/tsdb/chunkenc/chunk.go#L28-L47).
 * `data`: See below for each encoding.

 ## XOR chunk data
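Reviewer note: the record layout in the hunk above can be split into its four fields with a few lines of Go. This is a minimal sketch, not the Prometheus reader. The names `rawChunk` and `splitChunk` are hypothetical, and the sketch assumes that `len` counts only the `data` bytes and that the CRC32 is stored big-endian. It does not verify the checksum, because the excerpt does not say which bytes it covers.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// rawChunk holds the fields of one chunk record.
// Hypothetical type for illustration; not a Prometheus type.
type rawChunk struct {
	Encoding byte
	Data     []byte
	CRC32    uint32
}

// splitChunk reads one record from the start of b and returns it together
// with the number of bytes consumed.
// Assumptions: len counts only the data bytes; CRC32 is big-endian.
func splitChunk(b []byte) (rawChunk, int, error) {
	dataLen, n := binary.Uvarint(b) // n is 1..10 on success, <= 0 on failure
	if n <= 0 {
		return rawChunk{}, 0, errors.New("invalid uvarint length")
	}
	rest := b[n:]
	// Need 1 encoding byte + dataLen data bytes + 4 checksum bytes.
	if uint64(len(rest)) < 5 || uint64(len(rest))-5 < dataLen {
		return rawChunk{}, 0, errors.New("truncated chunk")
	}
	end := 1 + int(dataLen)
	c := rawChunk{
		Encoding: rest[0],
		Data:     rest[1:end],
		CRC32:    binary.BigEndian.Uint32(rest[end : end+4]),
	}
	return c, n + end + 4, nil
}

func main() {
	// len=3, encoding=1, three data bytes, four checksum bytes.
	rec := []byte{0x03, 0x01, 0xAA, 0xBB, 0xCC, 0xDE, 0xAD, 0xBE, 0xEF}
	c, used, err := splitChunk(rec)
	fmt.Println(c.Encoding, len(c.Data), used, err) // prints "1 3 9 <nil>"
	fmt.Printf("%#x\n", c.CRC32)                    // prints "0xdeadbeef"
}
```

Returning the consumed byte count lets a caller step through consecutive records in a segment without re-parsing the length prefix.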
@@ -67,9 +68,9 @@ Notes:
 ## Histogram chunk data

 ```
-┌──────────────────────┬──────────────────────────┬───────────────────────────────┬─────────────────────┬──────────────────┬──────────────────┬────────────────┬──────────────────┐
-│ num_samples <uint16> │ histogram_flags <1 byte> │ zero_threshold <1 or 9 bytes> │ schema <varbit_int> │ pos_spans <data> │ neg_spans <data> │ samples <data> │ padding <x bits> │
-└──────────────────────┴──────────────────────────┴───────────────────────────────┴─────────────────────┴──────────────────┴──────────────────┴────────────────┴──────────────────┘
+┌──────────────────────┬──────────────────────────┬───────────────────────────────┬─────────────────────┬──────────────────┬──────────────────┬──────────────────────┬────────────────┬──────────────────┐
+│ num_samples <uint16> │ histogram_flags <1 byte> │ zero_threshold <1 or 9 bytes> │ schema <varbit_int> │ pos_spans <data> │ neg_spans <data> │ custom_values <data> │ samples <data> │ padding <x bits> │
+└──────────────────────┴──────────────────────────┴───────────────────────────────┴─────────────────────┴──────────────────┴──────────────────┴──────────────────────┴────────────────┴──────────────────┘
 ```

 ### Positive and negative spans data:
@@ -80,6 +81,16 @@ Notes:
 └─────────────────────────┴────────────────────────┴───────────────────────┴────────────────────────┴───────────────────────┴─────┴────────────────────────┴───────────────────────┘
 ```

+### Custom values data:
+
+The `custom_values` data is currently only used for schema -53 (custom bucket boundaries). For other schemas, it is empty (length of zero).
+
+```
+┌──────────────────────────┬──────────────────┬──────────────────┬─────┬──────────────────┐
+│ num_values <varbit_uint> │ value_0 <custom> │ value_1 <custom> │ ... │ value_n <custom> │
+└──────────────────────────┴──────────────────┴──────────────────┴─────┴──────────────────┘
+```
+
 ### Samples data:

 ```
@@ -131,7 +142,9 @@ Notes:
 * If 0, it is a single zero byte.
 * If a power of two between 2^-243 and 2^10, it is a single byte between 1 and 254.
 * Otherwise, it is a byte with all bits set (255), followed by a float64, resulting in 9 bytes length.
-* `schema` is a specific value defined by the exposition format. Currently valid values are -4 <= n <= 8.
+* `schema` is a specific value defined by the exposition format. Currently
+valid values are either -4 <= n <= 8 (standard exponential schemas) or -53
+(custom bucket boundaries).
 * `<varbit_int>` is a variable bitwidth encoding for signed integers, optimized for “delta of deltas” of bucket deltas. It has between 1 bit and 9 bytes.
 See [code for details](https://github.com/prometheus/prometheus/blob/8c1507ebaa4ca552958ffb60c2d1b21afb7150e4/tsdb/chunkenc/varbit.go#L31-L60).
 * `<varbit_uint>` is a variable bitwidth encoding for unsigned integers with the same bit-bucketing as `<varbit_int>`.
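Reviewer note: the three `zero_threshold` cases in the context lines above could be sketched in Go as follows. The function name `encodeZeroThreshold` is hypothetical. The text only gives the end points of the single-byte range, so the sketch assumes the byte is the base-2 exponent plus 244, the one linear mapping that sends 2^-243 to 1 and 2^10 to 254. It also assumes the trailing float64 is written big-endian.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// encodeZeroThreshold returns the 1- or 9-byte form of a zero threshold.
// Assumptions: single-byte value = log2(t) + 244; float64 is big-endian.
func encodeZeroThreshold(t float64) []byte {
	if t == 0 {
		return []byte{0}
	}
	// Frexp returns t = frac * 2^exp with frac in [0.5, 1), so an exact
	// power of two 2^k shows up as frac == 0.5 and exp == k+1.
	frac, exp := math.Frexp(t)
	if frac == 0.5 && exp >= -242 && exp <= 11 {
		return []byte{byte(exp + 243)}
	}
	// Everything else: marker byte 255 followed by the raw float64.
	buf := make([]byte, 9)
	buf[0] = 255
	binary.BigEndian.PutUint64(buf[1:], math.Float64bits(t))
	return buf
}

func main() {
	for _, t := range []float64{0, 0x1p-243, 1024, 2048, 3} {
		fmt.Printf("%g -> %d byte(s), first byte %d\n",
			t, len(encodeZeroThreshold(t)), encodeZeroThreshold(t)[0])
	}
}
```

Negative values and NaN fail the `frac == 0.5` check and therefore fall through to the 9-byte form.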
@@ -143,6 +156,28 @@ Notes:
 * The chunk can have as few as one sample, i.e. sample 1 and following are optional.
 * Similarly, there could be down to zero spans and down to zero buckets.

+The `<custom>` encoding within the custom values data depends on the schema.
+For schema -53 (custom bucket boundaries, currently the only use case for
+custom values), the values to encode are bucket boundaries in the form of
+floats. The encoding of a given float value _x_ works as follows:
+
+1. Create an intermediate value _y_ = _x_ * 1000.
+2. If 0 ≤ _y_ ≤ 33554430 _and_ _y_ is an integer, store
+_y_ + 1 as `<varbit_uint>`.
+3. Otherwise, store a 0 bit, followed by the 64 bits of the original _x_
+encoded as plain `<float64>`.
+
+Note that values stored as per (2) will always start with a 1 bit, which allows
+decoders to recognize this case in contrast to values stored as per (3), which
+always start with a 0 bit.
+
+The rationale behind this encoding is that most custom bucket boundaries are set
+by humans as decimal numbers with not very many decimal places. In most cases,
+the encoding will therefore result in a short varbit representation. The upper
+bound of 33554430 is picked so that the varbit encoded value will take at most
+4 bytes.
+
+
 ## Float histogram chunk data

 Float histograms have the same layout as histograms apart from the encoding of samples.
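Reviewer note: the three-step `<custom>` rule added in the last hunk is easiest to check with a small decision function. The Go sketch below only decides which branch a boundary takes and what payload it would carry. It does not emit the actual bit stream, since the `<varbit_uint>` bit layout is defined elsewhere. The names `customValue`, `classify` and `decode` are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// maxScaled is the upper bound for y = x * 1000 given in the text.
const maxScaled = 33554430

// customValue describes how one bucket boundary would be stored.
// Hypothetical helper type for illustration; not a Prometheus type.
type customValue struct {
	asVarbit bool   // true: step 2 (varbit_uint), false: step 3 (0 bit + float64)
	varbit   uint64 // y + 1, only meaningful if asVarbit is true
	raw      uint64 // IEEE 754 bits of x, only meaningful if asVarbit is false
}

// classify applies steps 1 to 3 to a single boundary x.
func classify(x float64) customValue {
	y := x * 1000
	if y >= 0 && y <= maxScaled && y == math.Trunc(y) {
		// y + 1 is never zero, which matches the note that this case
		// always starts with a 1 bit.
		return customValue{asVarbit: true, varbit: uint64(y) + 1}
	}
	return customValue{raw: math.Float64bits(x)}
}

// decode reverses classify.
func decode(v customValue) float64 {
	if v.asVarbit {
		return float64(v.varbit-1) / 1000
	}
	return math.Float64frombits(v.raw)
}

func main() {
	for _, x := range []float64{0, 0.005, 2.5, 0.0001, -1, 33555} {
		v := classify(x)
		fmt.Printf("x=%g varbit=%t payload=%d decoded=%g\n",
			x, v.asVarbit, v.varbit, decode(v))
	}
}
```

Typical hand-written boundaries such as 0.005 or 2.5 take the short branch. Values with more than three decimal places, negative values, and values above 33554.43 fall back to the full float64.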