Add more details about retention to storage docs (#5842)

* Make compaction docs a little more clear, easy to find.
* Expand compaction docs slightly.
* Add notes about block cleanup to operational section.

Signed-off-by: Ben Kochie <superq@gmail.com>
Ben Kochie 2019-08-07 18:04:48 +02:00 committed by Brian Brazil
parent 8318aa2d5d
commit ff40de7ca6

@ -42,12 +42,17 @@ The directory structure of a Prometheus server's data directory will look someth
  └── checkpoint.000001
```
Note that local storage is neither clustered nor replicated. It is therefore not arbitrarily scalable or durable in the face of disk or node outages, and should be treated as more of an ephemeral sliding window of recent data. However, if your durability requirements are not strict, you may still succeed in storing up to years of data in the local storage.
For further details on file format, see [TSDB format](https://github.com/prometheus/tsdb/blob/master/docs/format/README.md).
## Compaction
The initial two-hour blocks are eventually compacted into longer blocks in the background.
Compaction will create larger blocks up to 10% of the retention time, or 21 days, whichever is smaller.
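As a worked example of the rule above, the following sketch computes the upper bound on a compacted block's time span for a given retention time. The function name `max_block_duration` and the constant `MAX_BLOCK_CAP` are hypothetical names chosen for illustration. This is not Prometheus's own code, only the arithmetic the sentence describes.

```python
from datetime import timedelta

# Cap stated in the text above (assumed constant name, for illustration only).
MAX_BLOCK_CAP = timedelta(days=21)


def max_block_duration(retention: timedelta) -> timedelta:
    """Largest block span: 10% of the retention time, capped at 21 days."""
    return min(retention / 10, MAX_BLOCK_CAP)


print(max_block_duration(timedelta(days=15)))   # 1 day, 12:00:00
print(max_block_duration(timedelta(days=365)))  # 21 days, 0:00:00
```

With a 15-day retention the 10% rule is the binding limit (36 hours). With a one-year retention, 10% would be 36.5 days, so the 21-day cap applies instead.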
## Operational aspects
Prometheus has several flags that allow configuring the local storage. The most important ones are:
@ -70,6 +75,8 @@ If your local storage becomes corrupted for whatever reason, your best bet is to
If both time and size retention policies are specified, whichever policy triggers first will be used at that instant.
Expired block cleanup happens on a background schedule. It may take up to two hours to remove expired blocks. A block is removed only once it has fully expired; a block that still contains data inside the retention window is kept whole.
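The interaction between the two retention policies and the "fully expired" rule can be illustrated with a simplified model. Everything here is an assumption made for illustration: the `Block` class, the `expired_blocks` function, and the use of plain hours as timestamps are hypothetical and do not mirror Prometheus's internal implementation.

```python
from dataclasses import dataclass


@dataclass
class Block:
    name: str
    min_time: int  # oldest sample in the block (hours since an arbitrary epoch)
    max_time: int  # newest sample in the block
    size: int      # bytes on disk


def expired_blocks(blocks, now, retention_hours, max_bytes):
    """Return the names of blocks that either policy would remove, oldest first."""
    cutoff = now - retention_hours
    expired = []
    total = 0
    # Walk from newest to oldest so the size budget is spent on recent data first.
    for b in sorted(blocks, key=lambda b: b.max_time, reverse=True):
        total += b.size
        too_old = b.max_time < cutoff  # every sample is past the cutoff
        too_big = total > max_bytes    # keeping this block exceeds the size budget
        if too_old or too_big:
            expired.append(b.name)
    return expired[::-1]


blocks = [
    Block("old", 0, 20, 40),
    Block("straddling", 20, 60, 40),
    Block("recent", 60, 100, 40),
]
print(expired_blocks(blocks, now=100, retention_hours=50, max_bytes=1000))  # ['old']
print(expired_blocks(blocks, now=100, retention_hours=50, max_bytes=70))    # ['old', 'straddling']
```

In the first call only the time policy triggers, and the `straddling` block survives even though part of its data is older than the cutoff, because it is not yet fully expired. In the second call the size policy triggers first for that block, so it is removed as well.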
## Remote storage integrations
Prometheus's local storage is limited to a single node's scalability and durability. Instead of trying to solve clustered storage in Prometheus itself, Prometheus offers a set of interfaces that allow integrating with remote storage systems.