prometheus/tsdb
Łukasz Mierzwa b880cea613 Fix locks in db.reloadBlocks()
This partially reverts ae3d392aa9.

ae3d392aa9 added a call to db.mtx.Lock() that lasts for the entire duration of db.reloadBlocks();
previously, db.mtx was locked only during the critical part of db.reloadBlocks().
The motivation was to protect against races:
9e0351e161 (r555699794)
The 'reloads' being mentioned are (I think) reloadBlocks() calls, rather than db.reload() or other methods.
TestTombstoneCleanRetentionLimitsRace was added to catch this, but I was never able to get any error out of it, even after disabling all calls to db.mtx in reloadBlocks() and CleanTombstones().
To make things more complicated, CleanTombstones() itself calls reloadBlocks(), so it seems that the real issue is that we might have concurrent calls to reloadBlocks().

The problem with this change is that db.reloadBlocks() can take a very long time, because it might need to load very large blocks from disk, which is slow.
While db.mtx is locked, a large part of the db is blocked, including queries, since the db.mtx read lock is needed for the db.Querier() call.
One way this manifests itself is as a gap in all metrics and blocked queries just after a large block compaction happens.
When compaction merges multiple day-or-more blocks into a week-or-more block, it creates a single very big block.
After that block is written it needs to be loaded, and that seems to take many seconds (30-45), during which mtx is held and everything is blocked.

It turns out that there is another, more fine-grained lock aimed at this specific use case:

// cmtx ensures that compactions and deletions don't run simultaneously.
cmtx sync.Mutex

All calls to reloadBlocks() are wrapped inside the cmtx lock. The only exception is db.reload(), which this change fixes.
We can't take the cmtx lock inside reloadBlocks() itself because it's called by a number of functions, some of which are already holding cmtx.

Looking at the code, I think it is sufficient to hold cmtx and skip the reloadBlocks()-wide mtx lock.
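
The locking pattern described above can be sketched roughly as follows. This is a simplified illustration and not the actual db.go code: the db struct, the loadBlocks helper, the string-based block list and the method signatures are all assumptions made for the example. Only the roles of cmtx and mtx are taken from the description above.

```go
package main

import (
	"fmt"
	"sync"
)

// db is a hypothetical, heavily simplified stand-in for the TSDB DB type.
type db struct {
	// cmtx ensures that compactions and deletions don't run simultaneously.
	cmtx sync.Mutex
	// mtx guards the block list that queriers read.
	mtx    sync.RWMutex
	blocks []string
}

// loadBlocks is an assumed helper standing in for the slow step of
// opening (possibly very large) blocks from disk.
func loadBlocks(names []string) []string {
	out := make([]string, len(names))
	copy(out, names)
	return out
}

// reloadBlocks expects the caller to already hold cmtx. It cannot lock
// cmtx itself: sync.Mutex is not reentrant, so callers that already hold
// cmtx would deadlock.
func (d *db) reloadBlocks(names []string) {
	// Slow part: runs without mtx, so queriers are not blocked.
	loaded := loadBlocks(names)

	// Critical part: swap the block list under the write lock.
	d.mtx.Lock()
	d.blocks = loaded
	d.mtx.Unlock()
}

// reload serializes against other reloadBlocks callers by taking cmtx.
func (d *db) reload(names []string) {
	d.cmtx.Lock()
	defer d.cmtx.Unlock()
	d.reloadBlocks(names)
}

// querier only needs the read lock, and returns a copy of the block list.
func (d *db) querier() []string {
	d.mtx.RLock()
	defer d.mtx.RUnlock()
	return append([]string(nil), d.blocks...)
}

func main() {
	d := &db{}
	d.reload([]string{"block-a", "block-b"})
	fmt.Println(d.querier())
}
```

In this sketch a querier waits only for the short swap of the block list, while concurrent reloads are serialized by cmtx.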

Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
2025-01-09 17:05:39 +00:00
agent fix(test): do not run automatic WAL truncate during test 2024-12-10 17:30:46 +01:00
chunkenc tsdb/chunkenc: don't reuse custom value slices between histograms 2024-11-29 16:28:09 +11:00
chunks enable errorf rule from perfsprint linter 2024-11-06 16:50:36 +01:00
docs Merge branch 'main' into cedwards/nhcb-wal-wbl 2025-01-02 12:50:19 +01:00
encoding Attempt for record type 2024-12-05 09:21:47 -08:00
errors Enable default revive rules (#13068) 2023-11-29 17:23:34 +00:00
fileutil tests: remove err from message when testify prints it already 2024-02-01 14:18:01 +00:00
goversion remove obsolete build tag 2024-01-17 22:26:32 +08:00
index Expose ListPostings Length via Len() method (#15678) 2025-01-07 17:58:26 +01:00
record record_test.go: avoid captures, simply return test refs 2025-01-02 12:45:20 +01:00
testdata tsdb: Delete blocks atomically; Remove tmp blocks on start; Added test. (#7772) 2020-08-11 06:56:08 +01:00
tombstones chore!: adopt log/slog, remove go-kit/log 2024-10-07 15:58:50 -04:00
tsdbutil Merge branch 'main' into cedwards/nhcb-wal-wbl 2025-01-02 12:50:19 +01:00
wlog Merge branch 'main' into cedwards/nhcb-wal-wbl 2025-01-02 12:50:19 +01:00
.gitignore Moving tsdb into its own subdirectory 2019-08-13 13:58:49 +05:30
block.go [PERF] TSDB: Optimize inverse matching (#14144) 2024-11-19 15:49:01 +00:00
block_test.go feat: Allow customizing TSDB postings decoder (#13567) 2024-11-11 07:59:24 +01:00
blockwriter.go chore!: adopt log/slog, remove go-kit/log 2024-10-07 15:58:50 -04:00
blockwriter_test.go feat: Allow customizing TSDB postings decoder (#13567) 2024-11-11 07:59:24 +01:00
CHANGELOG.md Rename default branch to main 2021-02-22 20:28:02 +01:00
compact.go Merge pull request #14489 from harry671003/implement_metadata_limit 2024-11-19 17:32:16 +01:00
compact_test.go feat: Allow customizing TSDB postings decoder (#13567) 2024-11-11 07:59:24 +01:00
db.go Fix locks in db.reloadBlocks() 2025-01-09 17:05:39 +00:00
db_test.go Merge branch 'main' into cedwards/nhcb-wal-wbl 2025-01-02 12:50:19 +01:00
example_test.go Add context argument to Querier.Select (#12660) 2023-09-12 12:37:38 +02:00
exemplar.go tsdb.CircularExemplarStorage: Avoid racing (#15231) 2024-10-29 10:40:46 +01:00
exemplar_test.go tsdb.CircularExemplarStorage: Avoid racing (#15231) 2024-10-29 10:40:46 +01:00
head.go [ENHANCEMENT] TSDB: Improve calculation of space used by labels (#13880) 2024-12-16 09:42:52 +00:00
head_append.go Use new record type only for NHCB 2024-12-06 13:46:20 -08:00
head_bench_test.go Revert "Fix MemPostings.Add and MemPostings.Get data race (#15141)" 2024-11-03 12:30:34 +00:00
head_dedupelabels.go chore!: adopt log/slog, remove go-kit/log 2024-10-07 15:58:50 -04:00
head_other.go chore!: adopt log/slog, remove go-kit/log 2024-10-07 15:58:50 -04:00
head_read.go TSDB: Move merge of head postings into index 2024-12-20 19:22:30 +00:00
head_read_test.go TSDB: Simplify OOO Select by copying the head chunk (#14396) 2024-07-03 15:08:07 +01:00
head_test.go Merge branch 'main' into cedwards/nhcb-wal-wbl 2025-01-02 12:50:19 +01:00
head_wal.go Use new record type only for NHCB 2024-12-06 13:46:20 -08:00
isolation.go tsdb: create isolation transaction slice on demand 2023-10-21 13:45:47 +00:00
isolation_test.go tsdb: turn off transaction isolation for head compaction (#11317) 2022-09-27 19:31:23 +05:30
mocks_test.go tsdb: use Go standard errors 2023-12-11 12:18:54 +00:00
ooo_head.go TSDB: Remove code for querying OOO-head only 2024-08-14 13:41:13 +01:00
ooo_head_read.go [PERF] TSDB: Optimize inverse matching (#14144) 2024-11-19 15:49:01 +00:00
ooo_head_read_test.go Rename old histogram record type, use old names for new records 2024-12-05 09:21:47 -08:00
ooo_head_test.go fix TestOOOHeadChunkReader_Chunk on 32-bit 2024-12-16 10:45:07 -05:00
ooo_isolation.go Fix issue where queries can fail or omit OOO samples if OOO head compaction occurs between creating a querier and reading chunks (#13115) 2023-11-24 12:38:38 +01:00
ooo_isolation_test.go Fix issue where queries can fail or omit OOO samples if OOO head compaction occurs between creating a querier and reading chunks (#13115) 2023-11-24 12:38:38 +01:00
querier.go Fix bug in lbl!~".+" shortcut (#15684) 2024-12-17 17:34:24 +01:00
querier_bench_test.go feat: Allow customizing TSDB postings decoder (#13567) 2024-11-11 07:59:24 +01:00
querier_test.go Merge pull request #15548 from TinfoilSubmarine/fix/386-test-failures 2024-12-18 15:49:30 +01:00
README.md Fixed broken link in tsdb README.md 2022-10-07 16:20:20 +00:00
repair.go chore!: adopt log/slog, remove go-kit/log 2024-10-07 15:58:50 -04:00
repair_test.go feat: Allow customizing TSDB postings decoder (#13567) 2024-11-11 07:59:24 +01:00
testutil.go Merge branch 'main' into cedwards/nhcb-wal-wbl 2025-01-02 12:50:19 +01:00
tsdbblockutil.go enable errorf rule from perfsprint linter 2024-11-06 16:50:36 +01:00

TSDB


This directory contains the Prometheus TSDB (Time Series DataBase) library, which handles storage and querying of all Prometheus v2 data.

Documentation

External resources

A series of blog posts explaining different components of TSDB: