Reword storage docs on corruption issues

Reword the section on what to do if major corruption happens. Users are being confused by the existing wording and cherry-picking the meaning from the single sentence about durability.

Signed-off-by: SuperQ <superq@gmail.com>
2025-03-05 20:59:13 -08:00 · 2024-08-09 10:48:55 +02:00 · b7dd209663
parent cf62fb5c44
commit b7dd209663
1 changed file with 6 additions and 7 deletions
--- a/docs/storage.md
+++ b/docs/storage.md
@@ -117,13 +117,12 @@ time series you scrape (fewer targets or fewer series per target), or you
 can increase the scrape interval. However, reducing the number of series is
 likely more effective, due to compression of samples within a series.

-If your local storage becomes corrupted for whatever reason, the best
-strategy to address the problem is to shut down Prometheus then remove the
-entire storage directory. You can also try removing individual block directories,
-or the WAL directory to resolve the problem. Note that this means losing
-approximately two hours data per block directory. Again, Prometheus's local
-storage is not intended to be durable long-term storage; external solutions
-offer extended retention and data durability.
+If your local storage becomes corrupted to the point where Prometheus will not
+start it is recommended to backup the storage directory and restore the
+corrupted block directories from your backups. If you do not have backups the
+last resort is to remove the corrupted files. For example you can try removing
+individual block directories or the write-ahead-log (wal) files. Note that this
+means losing the data for the time range those blocks or wal covers.

 CAUTION: Non-POSIX compliant filesystems are not supported for Prometheus'
 local storage as unrecoverable corruptions may happen. NFS filesystems