prometheus

mirror of https://github.com/prometheus/prometheus.git synced 2025-03-05 20:59:13 -08:00

Author	SHA1	Message	Date
Bryan Boreham	89bf6e1df9	tsdb: Tidy up some test code Use simpler utility function to create Labels objects, making fewer assumptions about the data structure. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2022-12-15 19:39:46 +00:00
Bryan Boreham	0853250695	Review feedback Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2022-12-15 18:32:45 +00:00
Bryan Boreham	463f5cafdd	storage: re-use iterators to save garbage Re-use previous memory if it is already of the correct type. In `NewListSeries` we hoist the conversion to an interface value out so it only allocates once. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2022-12-15 18:32:45 +00:00
Bryan Boreham	f0866c0774	tsdb: optimise block series iterators Re-use previous memory if it is already of the correct type. Also turn two levels of function closure into a single object that holds the required data. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2022-12-15 18:32:45 +00:00
Bryan Boreham	3c7de69059	storage: allow re-use of iterators Patterned after `Chunk.Iterator()`: pass the old iterator in so it can be re-used to avoid allocating a new object. (This commit does not do any re-use; it is just changing all the method signatures so re-use is possible in later commits.) Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2022-12-15 18:32:45 +00:00
Julien Pivotto	475cfe8a6b	Merge remote-tracking branch 'origin/release-2.40' Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>	2022-12-14 11:22:01 +01:00
Ganesh Vernekar	db99fc43e4	Merge pull request #11632 from bboreham/improve-bbss tsdb: improve blockBaseSeriesSet scan	2022-12-14 15:05:27 +05:30
Ganesh Vernekar	54739a1465	Merge pull request #11674 from bboreham/fix-tsdb-test-mem tsdb tests: allocate more reasonable sample slice	2022-12-14 15:01:04 +05:30
beorn7	5f366e9b62	histograms: Improve tests and fix exposed bugs This adds negative buckets and access of float histograms to TestHistogramChunkSameBuckets and TestHistogramChunkBucketChanges. It also exercises a specific pattern of reusing an iterator (one where no access has happened). This exposes two bugs (where entries for positive buckets were used where the corresponding entries for negative buckets should have been used). One was fixed in #11627 (not merged), which triggered the work in this commit. This commit fixes both issues, so #11627 can be closed. It also simplifies the code in the histogramIterator.Next method that aims to recycle existing slice capacity. Furthermore, this is on top of the release-2.40 branch because we should probably cut a bugfix release for this. Signed-off-by: beorn7 <beorn@grafana.com>	2022-12-12 00:08:23 +01:00
Julien Pivotto	0b302f8a39	Merge pull request #11662 from prometheus/release-2.40 Merge back release-2.40 branch again	2022-12-06 17:30:51 +01:00
Bryan Boreham	9853888f9b	tsdb tests: allocate more reasonable sample slice Typical parameters are one hour by 1 minute step, where the function would allocate a slice of 3.6 million samples instead of 60. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2022-12-05 17:15:02 +00:00
Ganesh Vernekar	72a48321da	Merge pull request #11633 from pstibrany/populate-error Enhance "cannot populate chunk" error message to include source block ID	2022-12-02 16:28:52 +05:30
Ganesh Vernekar	b8b0d45d69	Fix reset of a histogram chunk iterator Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2022-11-30 17:50:05 +05:30
Julien Pivotto	0372e259ba	Merge pull request #11634 from prometheus/release-2.40 Merge release-2.40 branch into main	2022-11-29 15:54:58 +01:00
Bryan Boreham	6bdecf377c	Switch from 'sanity' to more inclusive language (#9376) * Switch from 'sanity' to more inclusive language "Removing ableist language in code is important; it helps to create and maintain an environment that welcomes all developers of all backgrounds, while emphasizing that we as developers select the most articulate, precise, descriptive language we can rather than relying on metaphors. The phrase sanity check is ableist, and unnecessarily references mental health in our code bases. It denotes that people with mental illnesses are inferior, wrong, or incorrect, and the phrase sanity continues to be used by employers and other individuals to discriminate against these people." From https://gist.github.com/seanmhanson/fe370c2d8bd2b3228680e38899baf5cc Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2022-11-28 17:09:18 +00:00
Peter Štibraný	af838ccf83	Include source block in error message when loading chunk fails. Signed-off-by: Peter Štibraný <pstibrany@gmail.com>	2022-11-28 09:12:54 +01:00
Bryan Boreham	1226922ff5	tsdb: improve blockBaseSeriesSet scan Inverting the test for chunks deleted by tombstones makes all three rejections consistent, and also avoids the case where a chunk is excluded but still causes `trimFront` or `trimBack` to be set. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2022-11-26 15:23:02 +00:00
Bryan Boreham	0c05f95e92	tsdb: use smaller allocation in blockBaseSeriesSet This reduces garbage, hence goes faster, when a short time range is required compared to the amount of chunks in the block. For example recording rules and alerts often look only at the last few minutes. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2022-11-26 14:56:22 +00:00
Ganesh Vernekar	ad79fb9f25	Do not error on empty chunk during iteration in populateWithDelChunkSeriesIterator Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2022-11-23 17:32:28 +05:30
Ganesh Vernekar	d0e683e26d	Add TestCompactHeadWithDeletion to test compaction failure after deletion Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2022-11-23 17:31:18 +05:30
Ganesh Vernekar	42633bd05c	Merge pull request #11485 from t00350320/prometheus-office GetRefByhash() will query a label's ref with hash value rather than lset.Hash().	2022-11-16 15:09:49 +01:00
tanghengjian	982007ecab	GetRefByhash will query a label's ref with hash value rather than lset.Hash(). Signed-off-by: tanghengjian <1040104807@qq.com>	2022-11-16 14:13:59 +01:00
Oleg Zaytsev	8553a98267	Optimize postings offset table reading (#11535) * Add BenchmarkOpenBlock * Use specific types when reading offset table Instead of reading a generic-ish []string, we can read a generic type which would be specifically labels.Label. This avoids allocating a slice that escapes to the heap, making it both faster and more efficient in terms of memory management. * Update error message for unexpected number of keys * s/posting offset table/postings offset table/ * Remove useless lastKey assignment * Use two []bytes vars, simplify Applied PR feedback: removed generics, moved the label indices reading to that specific test as we're not using it in production anyway, we're just testing what we've just built. Also using two []bytes variables for name and value that use the backing buffer instead of using strings, this reduces allocations a lot as we only copy them when we store them (this is optimized by the compiler). * Fix the dumb bug Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> Co-authored-by: Marco Pracucci <marco@pracucci.com>	2022-11-14 17:48:16 +01:00
Julien Pivotto	739494d81b	Fix alignment of atomic int64 (#11547) * Fix atomic int64 placement * Test main for 386 Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>	2022-11-09 11:18:49 +01:00
Ganesh Vernekar	fa6e05903f	Merge pull request #11447 from prometheus/sparsehistogram Add Support for Native Histograms This PR merges all the coding work that has been done in the sparsehistogram branch over the last year into the main branch. Design doc on native histograms: https://docs.google.com/document/d/1cLNv3aufPZb3fNfaJgdaRBZsInZKKIHo9E6HinJVbpM/edit A sneak peek: https://www.youtube.com/watch?v=T2GvcYNth9U	2022-10-26 17:10:46 -04:00
Viacheslav Panasovets	3d2e18bad5	Fix time.Since() in defer. Wrap in anonymous function (#11489) Function arguments in a defer statement are evaluated when the defer statement is executed, not when the deferred call runs Signed-off-by: Slavik Panasovets <slavik@google.com> Signed-off-by: Slavik Panasovets <slavik@google.com>	2022-10-26 00:26:12 +02:00
Björn Rabenstein	503ffba49a	chunkenc: Slightly optimize xorWrite/xorRead (#11476) With these changes, fewer operations are needed on the "happy path", when the number of leading and trailing bits doesn't need an update. The change is probably very marginal (no change in the benchmark added here, but the benchmark also doesn't cover non-changing values), and an argument could be made that avoiding pointers also has its benefits. However, I think that reducing the number of return values improves readability, which convinced me that I should at least propose this. Signed-off-by: beorn7 <beorn@grafana.com>	2022-10-20 15:08:01 +05:30
Ganesh Vernekar	8ee4dfd40c	Fix the build after conflict resolution Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2022-10-12 17:59:42 +05:30
Ganesh Vernekar	648be89822	Merge remote-tracking branch 'upstream/main' into fix-conflict Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2022-10-12 14:20:02 +05:30
Ganesh Vernekar	8e29110949	Add/Improve unit tests for compaction with histogram (#11342 ) Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2022-10-12 13:31:12 +05:30
Ganesh Vernekar	507bfa46fd	Fix HistogramChunk's AtFloatHistogram() Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2022-10-12 10:38:13 +05:30
Signed-off-by: Jesus Vazquez	3362bf6d79	Fix merge conflicts Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>	2022-10-11 22:53:37 +05:30
Jesus Vazquez	775d90d5f8	TSDB: Rename wal package to wlog (#11352 ) The wlog.WL type can now be used to create a Write Ahead Log or a Write Behind Log. Before the prefix for wbl metrics was 'prometheus_tsdb_out_of_order_wal_' and has been replaced with 'prometheus_tsdb_out_of_order_wbl_'. Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> Signed-off-by: Jesus Vazquez <jesusvazquez@users.noreply.github.com> Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>	2022-10-10 20:38:46 +05:30
Sonali Rajput	9165aedb49	Fixed broken link in tsdb README.md Signed-off-by: Sonali Rajput <sonalirajput1088@gmail.com>	2022-10-07 16:20:20 +00:00
Jesus Vazquez	e934d0f011	Merge 'main' into sparsehistogram Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>	2022-10-05 22:14:49 +02:00
Ganesh Vernekar	d0a6488c74	Update metrics for histograms Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2022-10-03 13:48:59 +05:30
Bryan Boreham	9b31adc4e8	tsdb: fix up sort call with faster slices.Sort (#11380 ) This call was added by PR #11075 merged before #11318 which changed all similar calls to `sort.Sort` into a faster one. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2022-10-01 12:55:40 -04:00
Bryan Boreham	3330d85ba8	Replace sort.Strings and sort.Ints with faster slices.Sort (#11318) Use new experimental package `golang.org/x/exp/slices`. slices.Sort works on values that are directly comparable, like ints, so avoids the overhead of an interface call to `.Less()`. Left tests unchanged, because they don't need the speed and it may be a cross-check that slices.Sort gives the same answer. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2022-09-30 20:03:56 +05:30
Bryan Boreham	7f2374b703	tsdb: faster postings sort with generic slices.Sort (#11054 ) Use new experimental package `golang.org/x/exp/slices`. Some of the speedup comes from comparing SeriesRef (which is an int64) directly rather than through an interface `.Less()` call; some comes from exp/slices using "pattern-defeating quicksort(pdqsort)". Signed-off-by: Bryan Boreham <bjboreham@gmail.com> Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2022-09-30 20:01:32 +05:30
Ganesh Vernekar	83d738e263	Fix 'invalid magic number 0' bug (#11338 ) Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2022-09-28 21:43:58 +05:30
Ganesh Vernekar	f34aeefe6e	Allow overlapping blocks by default (#11331 ) Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2022-09-28 19:17:54 +05:30
Robert Fratto	448cfda6c1	tsdb/agent: fix validation of default options (#9876 ) * tsdb/agent: fix application of defaults MaxTS was being incorrectly constrained to the truncation interval * add more tests to check validation * force MaxWALTime = MinWALTime if min > max Signed-off-by: Robert Fratto <robertfratto@gmail.com>	2022-09-27 19:41:43 +05:30
Bryan Boreham	d166da7b59	tsdb: stop saving a copy of last 4 samples in memSeries (#11296 ) * TSDB chunks: remove race between writing and reading Because the data is stored as a bit-stream, the last byte in the stream could change if the stream is appended to after an Iterator is obtained. Copy the last byte when the Iterator is created, so we don't have to read it later. Clarify in comments that concurrent Iterator and Appender are allowed, but the chunk must not be modified while an Iterator is created. (This was already the case, in order to copy the bstream slice header.) * TSDB: stop saving last 4 samples in memSeries This extra copy of the last 4 samples was introduced to avoid a race condition between reading the last byte of the chunk and writing to it. But now we have fixed that by having `bstreamReader` copy the last byte, we don't need to copy the last 4 samples. This change saves 56 bytes per series, which is very worthwhile when you have millions or tens of millions of series. * TSDB: tidy up stopIterator re-use Previous changes have left this code duplicating some lines; pull them out to a separate function and tidy up. * TSDB head_test: stop checking when iterators are wrapped The behaviour has changed so chunk iterators are only wrapped when transaction isolation requires them to stop short of the end. This makes tests fail which are checking the type. Tests should check the observable behaviour, not the type. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>	2022-09-27 19:32:05 +05:30
Bryan Boreham	ff00dee262	tsdb: turn off transaction isolation for head compaction (#11317) * tsdb: add a basic test for read/write isolation * tsdb: store the min time with isolationAppender So that we can see when appending has moved past a certain point in time. * tsdb: allow RangeHead to have isolation disabled This will be used for head compaction. * tsdb: do head compaction with isolation disabled This saves a lot of work tracking appends done while compaction is ongoing. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2022-09-27 19:31:23 +05:30
Bryan Boreham	d0607435a2	tsdb: remove chunkRange and oooCapMax from memSeries (#11288 ) * tsdb: remove chunkRange from memSeries chunkRange is the (oddly-named) configured duration for the head block. We don't need a copy of this value per series. Pass it down where required, and remove the copy. The value in `Head` is only updated in `resetInMemoryState()`, which also discards all `memSeries`. * tsdb: remove oooCapMax from memSeries oooCapMax is the configured maximum capacity for an out-of-order chunk. Storing it per-series uses extra memory, and has surprising behaviour if users change the value in config - series created before the change will keep their old value. Instead, pass it down where required, and remove the per-series value. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>	2022-09-27 13:52:22 +05:30
Ganesh Vernekar	758e29258b	Add/Improve unit tests for compaction with histogram Part 2 (#11343 ) Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2022-09-23 14:01:10 +05:30
Jesus Vazquez	c1b669bf9b	Add out-of-order sample support to the TSDB (#11075) * Introduce out-of-order TSDB support This implementation is based on this design doc: https://docs.google.com/document/d/1Kppm7qL9C-BJB1j6yb6-9ObG3AbdZnFUBYPNNWwDBYM/edit?usp=sharing This commit adds support to accept out-of-order ("OOO") samples into the TSDB up to a configurable time allowance. If OOO is enabled, overlapping queries are automatically enabled. Most of the additions have been borrowed from https://github.com/grafana/mimir-prometheus/ Here is the list of the original commits cherry picked from mimir-prometheus into this branch: - `4b2198d7ec` - `2836e5513f` - `00b379c3a5` - `ff0dc75758` - `a632c73352` - `c6f3d4ab33` - `5e8406a1d4` - `abde1e0ba1` - `e70e769889` - `df59320886` Co-authored-by: Jesus Vazquez <jesus.vazquez@grafana.com> Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com> Co-authored-by: Dieter Plaetinck <dieter@grafana.com> Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * gofumpt files Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Add license header to missing files Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Fix OOO tests due to existing chunk disk mapper implementation Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Fix truncate int overflow Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Add Sync method to the WAL and update tests Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * remove useless sync Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Update minOOOTime after truncating Head * Update minOOOTime after truncating Head Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Fix lint Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Add a unit test Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Load OutOfOrderTimeWindow only once per appender Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Fix OOO Head LabelValues and PostingsForMatchers Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Fix replay of OOO mmap chunks Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Remove unnecessary err check Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Prevent panic with ApplyConfig Signed-off-by: Ganesh Vernekar 15064823+codesome@users.noreply.github.com Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Run OOO compaction after restart if there is OOO data from WBL Signed-off-by: Ganesh Vernekar 15064823+codesome@users.noreply.github.com Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Apply Bartek's suggestions Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com> Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Refactor OOO compaction Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Address comments and TODOs - Added a comment explaining why we need the allow overlapping compaction toggle - Clarified TSDBConfig OutOfOrderTimeWindow doc - Added an owner to all the TODOs in the code Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Run go format Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Fix remaining review comments Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Fix tests Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Change wbl reference when truncating ooo in TestHeadMinOOOTimeUpdate Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Fix TestWBLAndMmapReplay test failure on windows Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Address most of the feedback Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Refactor the block meta for out of order Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Fix windows error Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Fix review comments Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> Signed-off-by: Ganesh Vernekar 15064823+codesome@users.noreply.github.com Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com> Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com> Co-authored-by: Dieter Plaetinck <dieter@grafana.com> Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com> Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>	2022-09-20 22:35:50 +05:30
Bryan Boreham	af6167df58	WAL loading: don't send empty buffers over chan (#11319 ) If some shards did not get any samples mapped, the buffer will be empty so sending it over the chan to `processWALSamples()` is a waste of time. This is especially likely now we are checking `minValidTime` before sending. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2022-09-20 19:43:30 +05:30
Ganesh Vernekar	2474c6fb2c	Error on amending histograms on append (#11308 ) * Error on amending histograms on append Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Rename Matches to Equals Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2022-09-19 13:10:30 +05:30
Bryan Boreham	d2701be53a	tsdb: remove chunk pool from memSeries (#11280 ) The chunk pool belongs to the head not to the series. Pass it down where required, and remove the copy of the pointer that `memSeries` was holding. `safeChunk` also needs to hold it, because in scenarios where it is used we don't have a reference to the head. However it was already holding `chunkDiskMapper` for the same reason, so no big change. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2022-09-15 13:22:09 +05:30
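
Commit 3d2e18bad5 in the log above turns on a Go rule that is easy to miss: the arguments of a deferred call are evaluated when the `defer` statement executes, not when the deferred call finally runs. The following is a minimal sketch of that pitfall and of the anonymous-function fix the commit title describes. It is not the Prometheus code; `measureEager`, `measureLazy`, and `slowWork` are hypothetical names made up for this illustration.

```go
package main

import (
	"fmt"
	"time"
)

// slowWork is a hypothetical stand-in for the operation being timed.
func slowWork() {
	time.Sleep(20 * time.Millisecond)
}

// measureEager shows the bug. time.Since(start) is an argument of the
// deferred call, so it is evaluated as soon as the defer statement runs,
// which is before work() has started. The recorded duration is near zero.
func measureEager(work func()) (recorded time.Duration) {
	start := time.Now()
	record := func(d time.Duration) { recorded = d }
	defer record(time.Since(start))
	work()
	return
}

// measureLazy shows the fix. The deferred anonymous function takes no
// arguments, so time.Since(start) is only evaluated when the function
// body runs, after work() has finished.
func measureLazy(work func()) (recorded time.Duration) {
	start := time.Now()
	defer func() { recorded = time.Since(start) }()
	work()
	return
}

func main() {
	fmt.Println("eager:", measureEager(slowWork))
	fmt.Println("lazy: ", measureLazy(slowWork))
}
```

Both functions use a named result so that the deferred call can store the measurement; in real code the deferred call would more likely feed a metric or a log line.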


635 commits
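
Commits 3c7de69059 and 463f5cafdd above describe passing the previous iterator into the method that creates a new one, so that its memory can be re-used "if it is already of the correct type". A minimal sketch of that pattern follows. The `Iterator` interface, `sliceIterator`, `series`, and `sum` are hypothetical simplifications written for this example and are not the Prometheus storage API.

```go
package main

import "fmt"

// Iterator is a hypothetical, minimal iterator interface for this sketch.
type Iterator interface {
	Next() bool
	At() float64
}

// sliceIterator walks over a slice of values.
type sliceIterator struct {
	vals []float64
	idx  int
}

func (it *sliceIterator) Next() bool {
	it.idx++
	return it.idx < len(it.vals)
}

func (it *sliceIterator) At() float64 { return it.vals[it.idx] }

// reset points the iterator at new data and rewinds it.
func (it *sliceIterator) reset(vals []float64) {
	it.vals = vals
	it.idx = -1
}

// series is a hypothetical container of samples.
type series struct{ vals []float64 }

// Iterator returns an iterator over s. If old is a non-nil *sliceIterator,
// it is reset and returned instead of allocating a new object.
func (s series) Iterator(old Iterator) Iterator {
	if it, ok := old.(*sliceIterator); ok && it != nil {
		it.reset(s.vals)
		return it
	}
	it := &sliceIterator{}
	it.reset(s.vals)
	return it
}

// sum drains the iterator and adds up its values.
func sum(it Iterator) float64 {
	total := 0.0
	for it.Next() {
		total += it.At()
	}
	return total
}

func main() {
	all := []series{{vals: []float64{1, 2, 3}}, {vals: []float64{10}}}
	var it Iterator // nil on the first pass, re-used afterwards
	for _, s := range all {
		it = s.Iterator(it)
		fmt.Println(sum(it))
	}
}
```

The caller keeps a single iterator variable across the loop, so only the first call allocates; this is the garbage saving the commit messages refer to.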