Merge branch 'prometheus:main' into openstack-loadbalancer-discovery

This commit is contained in:
Paulo Dias 2024-12-09 14:43:23 +00:00 committed by GitHub
commit c5bb06586d
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
2 changed files with 125 additions and 15 deletions

View file

@ -1688,7 +1688,7 @@ func (s *shards) updateMetrics(_ context.Context, err error, sampleCount, exempl
s.enqueuedHistograms.Sub(int64(histogramCount))
}
// sendSamples to the remote storage with backoff for recoverable errors.
// sendSamplesWithBackoff to the remote storage with backoff for recoverable errors.
func (s *shards) sendSamplesWithBackoff(ctx context.Context, samples []prompb.TimeSeries, sampleCount, exemplarCount, histogramCount, metadataCount int, pBuf *proto.Buffer, buf *[]byte, enc Compression) (WriteResponseStats, error) {
// Build the WriteRequest with no metadata.
req, highest, lowest, err := buildWriteRequest(s.qm.logger, samples, nil, pBuf, buf, nil, enc)
@ -1802,7 +1802,7 @@ func (s *shards) sendSamplesWithBackoff(ctx context.Context, samples []prompb.Ti
return accumulatedStats, err
}
// sendV2Samples to the remote storage with backoff for recoverable errors.
// sendV2SamplesWithBackoff to the remote storage with backoff for recoverable errors.
func (s *shards) sendV2SamplesWithBackoff(ctx context.Context, samples []writev2.TimeSeries, labels []string, sampleCount, exemplarCount, histogramCount, metadataCount int, pBuf, buf *[]byte, enc Compression) (WriteResponseStats, error) {
// Build the WriteRequest with no metadata.
req, highest, lowest, err := buildV2WriteRequest(s.qm.logger, samples, labels, pBuf, buf, nil, enc)

View file

@ -1,15 +1,32 @@
# WAL Disk Format
This document describes the official Prometheus WAL format.
The write ahead log operates in segments that are numbered and sequential,
e.g. `000000`, `000001`, `000002`, etc., and are limited to 128MB by default.
A segment is written to in pages of 32KB. Only the last page of the most recent segment
and are limited to 128MB by default.
## Segment filename
The sequence number is captured in the segment filename,
e.g. `000000`, `000001`, `000002`, etc. The first unsigned integer represents
the sequence number of the segment, typically encoded with six digits.
## Segment encoding
This section describes the segment encoding.
A segment encodes an array of records. It does not contain any header. A segment
is written to pages of 32KB. Only the last page of the most recent segment
may be partial. A WAL record is an opaque byte slice that gets split up into sub-records
should it exceed the remaining space of the current page. Records are never split across
segment boundaries. If a single record exceeds the default segment size, a segment with
a larger size will be created.
The encoding of pages is largely borrowed from [LevelDB's/RocksDB's write ahead log.](https://github.com/facebook/rocksdb/wiki/Write-Ahead-Log-File-Format)
Notable deviations are that the record fragment is encoded as:
### Records encoding
Each record fragment is encoded as:
```
┌───────────┬──────────┬────────────┬──────────────┐
@ -17,7 +34,8 @@ Notable deviations are that the record fragment is encoded as:
└───────────┴──────────┴────────────┴──────────────┘
```
The initial type byte is made up of three components: a 3-bit reserved field, a 1-bit zstd compression flag, a 1-bit snappy compression flag, and a 3-bit type flag.
The initial type byte is made up of three components: a 3-bit reserved field,
a 1-bit zstd compression flag, a 1-bit snappy compression flag, and a 3-bit type flag.
```
┌─────────────────┬──────────────────┬────────────────────┬──────────────────┐
@ -25,7 +43,7 @@ The initial type byte is made up of three components: a 3-bit reserved field, a
└─────────────────┴──────────────────┴────────────────────┴──────────────────┘
```
The lowest 3 bits within this flag represent the record type as follows:
The lowest 3 bits within the type flag represent the record type as follows:
* `0`: rest of page will be empty
* `1`: a full record encoded in a single fragment
@ -33,11 +51,16 @@ The lowest 3 bits within this flag represent the record type as follows:
* `3`: middle fragment of a record
* `4`: final fragment of a record
## Record encoding
After the type byte, 2-byte length and then 4-byte checksum of the following data are encoded.
The records written to the write ahead log are encoded as follows:
All float values are represented using the [IEEE 754 format](https://en.wikipedia.org/wiki/IEEE_754).
### Series records
### Record types
In the following sections, all the known record types are described. New types,
can be added in the future.
#### Series records
Series records encode the labels that identifies a series and its unique ID.
@ -58,7 +81,7 @@ Series records encode the labels that identifies a series and its unique ID.
└────────────────────────────────────────────┘
```
### Sample records
#### Sample records
Sample records encode samples as a list of triples `(series_id, timestamp, value)`.
Series reference and timestamp are encoded as deltas w.r.t the first sample.
@ -79,7 +102,7 @@ The first sample record begins at the second row.
└──────────────────────────────────────────────────────────────────┘
```
### Tombstone records
#### Tombstone records
Tombstone records encode tombstones as a list of triples `(series_id, min_time, max_time)`
and specify an interval for which samples of a series got deleted.
@ -95,7 +118,7 @@ and specify an interval for which samples of a series got deleted.
└─────────────────────────────────────────────────────┘
```
### Exemplar records
#### Exemplar records
Exemplar records encode exemplars as a list of triples `(series_id, timestamp, value)`
plus the length of the labels list, and all the labels.
@ -127,7 +150,7 @@ See: https://github.com/OpenObservability/OpenMetrics/blob/main/specification/Op
└──────────────────────────────────────────────────────────────────┘
```
### Metadata records
#### Metadata records
Metadata records encode the metadata updates associated with a series.
@ -156,3 +179,90 @@ Metadata records encode the metadata updates associated with a series.
└────────────────────────────────────────────┘
```
#### Histogram records
Histogram records encode the integer and float native histogram samples.
A record with the integer native histograms with the exponential bucketing:
```
┌───────────────────────────────────────────────────────────────────────┐
│ type = 7 <1b>
├───────────────────────────────────────────────────────────────────────┤
│ ┌────────────────────┬───────────────────────────┐ │
│ │ id <8b> │ timestamp <8b> │ │
│ └────────────────────┴───────────────────────────┘ │
│ ┌────────────────────┬──────────────────────────────────────────────┐ │
│ │ id_delta <uvarint> │ timestamp_delta <uvarint> │ │
│ ├────────────────────┴────┬─────────────────────────────────────────┤ │
│ │ counter_reset_hint <1b> │ schema <varint> │ │
│ ├─────────────────────────┴────┬────────────────────────────────────┤ │
│ │ zero_threshold (float) <8b> │ zero_count <uvarint> │ │
│ ├─────────────────┬────────────┴────────────────────────────────────┤ │
│ │ count <uvarint> │ sum (float) <8b> │ │
│ ├─────────────────┴─────────────────────────────────────────────────┤ │
│ │ positive_spans_num <uvarint> │ │
│ ├─────────────────────────────────┬─────────────────────────────────┤ │
│ │ positive_span_offset_1 <varint> │ positive_span_len_1 <uvarint32> │ │
│ ├─────────────────────────────────┴─────────────────────────────────┤ │
│ │ . . . │ │
│ ├───────────────────────────────────────────────────────────────────┤ │
│ │ negative_spans_num <uvarint> │ │
│ ├───────────────────────────────┬───────────────────────────────────┤ │
│ │ negative_span_offset <varint> │ negative_span_len <uvarint32> │ │
│ ├───────────────────────────────┴───────────────────────────────────┤ │
│ │ . . . │ │
│ ├───────────────────────────────────────────────────────────────────┤ │
│ │ positive_bkts_num <uvarint> │ │
│ ├─────────────────────────┬───────┬─────────────────────────────────┤ │
│ │ positive_bkt_1 <varint> │ . . . │ positive_bkt_n <varint> │ │
│ ├─────────────────────────┴───────┴─────────────────────────────────┤ │
│ │ negative_bkts_num <uvarint> │ │
│ ├─────────────────────────┬───────┬─────────────────────────────────┤ │
│ │ negative_bkt_1 <varint> │ . . . │ negative_bkt_n <varint> │ │
│ └─────────────────────────┴───────┴─────────────────────────────────┘ │
│ . . . │
└───────────────────────────────────────────────────────────────────────┘
```
A records with the Float histograms:
```
┌───────────────────────────────────────────────────────────────────────┐
│ type = 8 <1b>
├───────────────────────────────────────────────────────────────────────┤
│ ┌────────────────────┬───────────────────────────┐ │
│ │ id <8b> │ timestamp <8b> │ │
│ └────────────────────┴───────────────────────────┘ │
│ ┌────────────────────┬──────────────────────────────────────────────┐ │
│ │ id_delta <uvarint> │ timestamp_delta <uvarint> │ │
│ ├────────────────────┴────┬─────────────────────────────────────────┤ │
│ │ counter_reset_hint <1b> │ schema <varint> │ │
│ ├─────────────────────────┴────┬────────────────────────────────────┤ │
│ │ zero_threshold (float) <8b> │ zero_count (float) <8b> │ │
│ ├────────────────────┬─────────┴────────────────────────────────────┤ │
│ │ count (float) <8b> │ sum (float) <8b> │ │
│ ├────────────────────┴──────────────────────────────────────────────┤ │
│ │ positive_spans_num <uvarint> │ │
│ ├─────────────────────────────────┬─────────────────────────────────┤ │
│ │ positive_span_offset_1 <varint> │ positive_span_len_1 <uvarint32> │ │
│ ├─────────────────────────────────┴─────────────────────────────────┤ │
│ │ . . . │ │
│ ├───────────────────────────────────────────────────────────────────┤ │
│ │ negative_spans_num <uvarint> │ │
│ ├───────────────────────────────┬───────────────────────────────────┤ │
│ │ negative_span_offset <varint> │ negative_span_len <uvarint32> │ │
│ ├───────────────────────────────┴───────────────────────────────────┤ │
│ │ . . . │ │
│ ├───────────────────────────────────────────────────────────────────┤ │
│ │ positive_bkts_num <uvarint> │ │
│ ├─────────────────────────────┬───────┬─────────────────────────────┤ │
│ │ positive_bkt_1 (float) <8b> │ . . . │ positive_bkt_n (float) <8b> │ │
│ ├─────────────────────────────┴───────┴─────────────────────────────┤ │
│ │ negative_bkts_num <uvarint> │ │
│ ├─────────────────────────────┬───────┬─────────────────────────────┤ │
│ │ negative_bkt_1 (float) <8b> │ . . . │ negative_bkt_n (float) <8b> │ │
│ └─────────────────────────────┴───────┴─────────────────────────────┘ │
│ . . . │
└───────────────────────────────────────────────────────────────────────┘
```