mirror of
https://github.com/prometheus/prometheus.git
synced 2024-12-24 05:04:05 -08:00
docs: add new WAL format
Signed-off-by: Fabian Reinartz <freinartz@google.com>
parent
3f538817f8
commit
0ad2b8a349
72
docs/format/wal.md
Normal file
@@ -0,0 +1,72 @@
# WAL Disk Format

The write ahead log operates in segments that are numbered sequentially,
e.g. `000000`, `000001`, `000002`, etc., and are limited to 128MB by default.
A segment is written to in pages of 32KB. Only the last page of the most recent segment
may be partial. A WAL record is an opaque byte slice that gets split up into sub-records
should it exceed the remaining space of the current page. Records are never split across
segment boundaries.
The encoding of pages is largely borrowed from [LevelDB's/RocksDB's write ahead log][1].

A notable deviation is that each record fragment is encoded as:

    ┌───────────┬──────────┬────────────┬──────────────┐
    │ type <1b> │ len <2b> │ CRC32 <4b> │ data <bytes> │
    └───────────┴──────────┴────────────┴──────────────┘

## Record encoding

The records written to the write ahead log are encoded as follows:

### Series records

Series records encode the labels that identify a series and its unique ID.

    ┌────────────────────────────────────────────┐
    │ type = 1 <1b>                              │
    ├────────────────────────────────────────────┤
    │ ┌─────────┬──────────────────────────────┐ │
    │ │ id <8b> │ n = len(labels) <uvarint>    │ │
    │ ├─────────┴────────────┬─────────────────┤ │
    │ │ len(str_1) <uvarint> │ str_1 <bytes>   │ │
    │ ├──────────────────────┴─────────────────┤ │
    │ │                  ...                   │ │
    │ ├───────────────────────┬────────────────┤ │
    │ │ len(str_2n) <uvarint> │ str_2n <bytes> │ │
    │ └───────────────────────┴────────────────┘ │
    │                   . . .                    │
    └────────────────────────────────────────────┘

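The series layout can be sketched in Go as follows. This is a minimal sketch, not the actual implementation: the `Label` and `Series` types and the `encodeSeries` function are hypothetical, the 8-byte ID is assumed to be big-endian, and the `2n` strings are assumed to be the label names and values in alternating order.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Label and Series are hypothetical types used for illustration only.
type Label struct{ Name, Value string }

type Series struct {
	ID     uint64
	Labels []Label
}

// encodeSeries writes the type byte followed by one entry per series.
func encodeSeries(series []Series) []byte {
	buf := []byte{1} // type = 1
	for _, s := range series {
		buf = binary.BigEndian.AppendUint64(buf, s.ID) // id <8b>, big-endian assumed
		buf = binary.AppendUvarint(buf, uint64(len(s.Labels)))
		for _, l := range s.Labels {
			// Each label contributes two length-prefixed strings.
			buf = binary.AppendUvarint(buf, uint64(len(l.Name)))
			buf = append(buf, l.Name...)
			buf = binary.AppendUvarint(buf, uint64(len(l.Value)))
			buf = append(buf, l.Value...)
		}
	}
	return buf
}

func main() {
	rec := encodeSeries([]Series{{ID: 1, Labels: []Label{{"a", "b"}}}})
	fmt.Printf("% x\n", rec)
}
```
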
### Sample records

Sample records encode samples as a list of triples `(series_id, timestamp, value)`.
The series ID and timestamp of every sample after the first are encoded as deltas
relative to the first sample.

    ┌──────────────────────────────────────────────────────────────────┐
    │ type = 2 <1b>                                                    │
    ├──────────────────────────────────────────────────────────────────┤
    │ ┌────────────────────┬───────────────────────────┬─────────────┐ │
    │ │ id <8b>            │ timestamp <8b>            │ value <8b>  │ │
    │ └────────────────────┴───────────────────────────┴─────────────┘ │
    │ ┌────────────────────┬───────────────────────────┬─────────────┐ │
    │ │ id_delta <uvarint> │ timestamp_delta <uvarint> │ value <8b>  │ │
    │ └────────────────────┴───────────────────────────┴─────────────┘ │
    │                              . . .                               │
    └──────────────────────────────────────────────────────────────────┘

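A minimal Go sketch of this layout, following the diagram literally, might look like the code below. The `Sample` type and `encodeSamples` function are hypothetical. Fixed-width fields are assumed to be big-endian, and the value is assumed to be stored as its IEEE 754 bit pattern. Because the deltas are shown as unsigned varints, the sketch rejects any sample whose ID or timestamp is smaller than that of the first sample; how such input is handled in practice is not specified here.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"math"
)

// Sample is a hypothetical type used for illustration only.
type Sample struct {
	ID uint64
	T  int64
	V  float64
}

// encodeSamples writes the first sample in full and all later samples
// with ID and timestamp as deltas relative to the first one.
func encodeSamples(samples []Sample) ([]byte, error) {
	buf := []byte{2} // type = 2
	if len(samples) == 0 {
		return buf, nil
	}
	first := samples[0]
	buf = binary.BigEndian.AppendUint64(buf, first.ID)
	buf = binary.BigEndian.AppendUint64(buf, uint64(first.T))
	buf = binary.BigEndian.AppendUint64(buf, math.Float64bits(first.V))

	for _, s := range samples[1:] {
		if s.ID < first.ID || s.T < first.T {
			// Assumption: an unsigned delta cannot express this case.
			return nil, errors.New("negative delta relative to first sample")
		}
		buf = binary.AppendUvarint(buf, s.ID-first.ID)
		buf = binary.AppendUvarint(buf, uint64(s.T-first.T))
		buf = binary.BigEndian.AppendUint64(buf, math.Float64bits(s.V))
	}
	return buf, nil
}

func main() {
	rec, err := encodeSamples([]Sample{{10, 1000, 1.5}, {12, 1015, 2.5}})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d bytes: % x\n", len(rec), rec)
}
```

In this example the second sample needs only 10 bytes (two single-byte deltas plus the 8-byte value) instead of 24.
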
### Tombstone records

Tombstone records encode tombstones as a list of triples `(series_id, min_time, max_time)`.
Each triple specifies a time interval for which the samples of a series were deleted.

    ┌─────────────────────────────────────────────────────┐
    │ type = 3 <1b>                                       │
    ├─────────────────────────────────────────────────────┤
    │ ┌─────────┬───────────────────┬───────────────────┐ │
    │ │ id <8b> │ min_time <varint> │ max_time <varint> │ │
    │ └─────────┴───────────────────┴───────────────────┘ │
    │                        . . .                        │
    └─────────────────────────────────────────────────────┘

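The tombstone layout can be sketched as an encode/decode round trip in Go. The `Tombstone` type and both functions are hypothetical, the 8-byte ID is assumed to be big-endian, and the signed varints are assumed to use the zig-zag encoding of Go's `encoding/binary` package.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Tombstone is a hypothetical type used for illustration only.
type Tombstone struct {
	ID               uint64
	MinTime, MaxTime int64
}

func encodeTombstones(ts []Tombstone) []byte {
	buf := []byte{3} // type = 3
	for _, t := range ts {
		buf = binary.BigEndian.AppendUint64(buf, t.ID) // id <8b>, big-endian assumed
		buf = binary.AppendVarint(buf, t.MinTime)
		buf = binary.AppendVarint(buf, t.MaxTime)
	}
	return buf
}

func decodeTombstones(buf []byte) ([]Tombstone, error) {
	if len(buf) == 0 || buf[0] != 3 {
		return nil, errors.New("not a tombstone record")
	}
	buf = buf[1:]
	var out []Tombstone
	for len(buf) > 0 {
		if len(buf) < 8 {
			return nil, errors.New("truncated series ID")
		}
		t := Tombstone{ID: binary.BigEndian.Uint64(buf)}
		buf = buf[8:]
		var n int
		if t.MinTime, n = binary.Varint(buf); n <= 0 {
			return nil, errors.New("invalid min_time")
		}
		buf = buf[n:]
		if t.MaxTime, n = binary.Varint(buf); n <= 0 {
			return nil, errors.New("invalid max_time")
		}
		buf = buf[n:]
		out = append(out, t)
	}
	return out, nil
}

func main() {
	rec := encodeTombstones([]Tombstone{{ID: 7, MinTime: -1, MaxTime: 100}})
	ts, err := decodeTombstones(rec)
	fmt.Println(ts, err)
}
```
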
[1]: https://github.com/facebook/rocksdb/wiki/Write-Ahead-Log-File-Format