Fixes https://github.com/prometheus/prometheus/issues/481
While doing so, clean up and fix a few other things:
- Fix `go vet` warnings (@fabxc to blame ;).
- Fix a racy problem with unarchiving: Whenever we unarchive a
series, we essentially want to do something with it. However, until
we have done something with it, it appears like a series that is
ready to be archived or even purged. So e.g. it would be ignored
during checkpointing. With this fix, we always load the chunkDescs
upon unarchiving. This is wasteful if we only want to add a new
sample to an archived time series, but the (presumably more common)
case where we access an archived time series in a query doesn't
become more expensive.
- The change above streamlined the getOrCreateSeries and
newMemorySeries flow. Also, the modTime is now always set correctly.
- Fix the leveldb-backed implementation of KeyValueStore.Delete. It
had the wrong behavior of still returning true, nil if a
non-existent key was passed in.
This change is conceptually very simple, although the diff is large. It
switches logging from "github.com/golang/glog" to
"github.com/prometheus/log", while not actually changing any log
messages. V(1)-style logging has been changed to be log.Debug*().
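For illustration, a V(1)-style call site changes roughly like this (the
function and message text are made up, not taken from the actual diff):

```go
package local

import "github.com/prometheus/log"

// logChunkDrop is a hypothetical call site; what used to be
// glog.V(1).Infof("Dropped %d chunks.", n) becomes a Debug-level call
// of the new package, with the message text itself unchanged.
func logChunkDrop(n int) {
	log.Debugf("Dropped %d chunks.", n)
}
```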
Also, clean up some things in the code (especially the introduction of
the chunkLenWithHeader constant to avoid repeating the same expression
all over the place).
Benchmark results:
BEFORE
BenchmarkLoadChunksSequentially 5000 283580 ns/op 152143 B/op 312 allocs/op
BenchmarkLoadChunksRandomly 20000 82936 ns/op 39310 B/op 99 allocs/op
BenchmarkLoadChunkDescs 10000 110833 ns/op 15092 B/op 345 allocs/op
AFTER
BenchmarkLoadChunksSequentially 10000 146785 ns/op 152285 B/op 315 allocs/op
BenchmarkLoadChunksRandomly 20000 67598 ns/op 39438 B/op 103 allocs/op
BenchmarkLoadChunkDescs 20000 99631 ns/op 12636 B/op 192 allocs/op
Note that everything is obviously loaded from the page cache (as the
benchmark runs thousands of times with very small series files). In a
real-world scenario, I expect a larger impact, as the disk operations
will more often actually hit the disk. To load ~50 sequential chunks,
this reduces the iops from 100 seeks and 100 reads to 1 seek and 1
read.
A number of mostly minor things:
- Rename chunk type -> chunk encoding.
- After all, do not carry around the chunk encoding to all parts of
the system, but just have one place where the encoding for new
chunks is set based on the flag. The new approach has caveats as
well, but the pollution of so many method signatures is worse.
- Use the default chunk encoding for new chunks of existing
series. (Previously, only new _series_ would get chunks with the
default encoding.)
- Use an enum for chunk encoding. (But keep the version number for the
flag, for reasons discussed previously.)
- Add encoding() to the chunk interface (so that a chunk knows its own
encoding - no need to have that in a different top-level function).
- Got rid of newFollowUpChunk (which would keep the existing encoding
for all chunks of a time series). Now only use newChunk(), which
creates a chunk with the encoding selected by the flag.
- Simplified transcodeAndAdd.
- Reordered methods of deltaEncodedChunk and doubleDeltaEncodedChunk
to match the order in the chunk interface.
- Only transcode if the chunk is not yet half full. If more than half
full, add a new chunk instead.
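To make the encoding-related points above a bit more concrete, here is a
rough sketch; all identifiers (chunkEncoding, delta, doubleDelta, maxLen,
maybeTranscode) are illustrative assumptions, not necessarily the real
names or signatures:

```go
package local

// chunkEncoding is the enum described above; the flag still takes the old
// version number, but internally an enum value is used.
type chunkEncoding byte

const (
	delta chunkEncoding = iota
	doubleDelta
)

// chunk sketches the relevant part of the chunk interface: each chunk knows
// its own encoding, so no separate top-level function is needed.
type chunk interface {
	encoding() chunkEncoding
	len() int    // samples currently stored (assumed helper)
	maxLen() int // maximum number of samples (assumed helper)
}

// defaultEncoding would be set in exactly one place, based on the flag.
var defaultEncoding = doubleDelta

// maybeTranscode applies the half-full rule: re-encode an existing chunk
// into the default encoding only while that is still cheap (less than half
// full); otherwise leave it alone and let a new chunk (via newChunk()) take
// over.
func maybeTranscode(c chunk, transcode func(chunk, chunkEncoding) chunk) (chunk, bool) {
	if c.encoding() == defaultEncoding {
		return c, false
	}
	if c.len() < c.maxLen()/2 {
		return transcode(c, defaultEncoding), true
	}
	return c, false
}
```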
In that commit, the 'maintainSeries' call was accidentally removed.
This commit refactors things a bit so that there is now a clean
'maintainMemorySeries' and a 'maintainArchivedSeries' call.
Straighten the nomenclature a bit (consistently use 'drop' for
chunks and 'purge' for series/metrics).
Remove the annoying 'Completed maintenance sweep through archived
fingerprints' message if there were no archived fingerprints to do
maintenance on.
This is done by bucketing chunks by fingerprint. If the persisting to
disk falls behind, more and more chunks are in the queue. As soon as
there are "double hits", we will now persist both chunks in one go,
doubling the disk throughput (assuming it is limited by disk
seeks). Should even more pile up so that we end up with "triple hits", we
will persist those first, and so on.
Even if we have millions of time series, this will still help,
assuming not all of them are growing with the same speed. Series that
get many samples and/or are not very compressible will accumulate
chunks faster, and they will soon get double- or triple-writes.
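A minimal, self-contained sketch of the bucketing idea (all types and
names are illustrative, not the actual persist-queue code):

```go
package main

import "fmt"

// fingerprint and chunk stand in for the real storage types; this only
// illustrates grouping queued chunks by fingerprint so that series with
// multiple pending chunks are persisted in one go.
type fingerprint uint64
type chunk []byte

type queuedChunk struct {
	fp fingerprint
	c  chunk
}

// bucketByFingerprint groups the pending queue entries per series.
func bucketByFingerprint(queue []queuedChunk) map[fingerprint][]chunk {
	buckets := make(map[fingerprint][]chunk)
	for _, qc := range queue {
		buckets[qc.fp] = append(buckets[qc.fp], qc.c)
	}
	return buckets
}

func main() {
	queue := []queuedChunk{{1, chunk("a")}, {2, chunk("b")}, {1, chunk("c")}}
	for fp, chunks := range bucketByFingerprint(queue) {
		// One seek per fingerprint, then write all of its chunks sequentially.
		fmt.Printf("fingerprint %d: persist %d chunk(s) in one go\n", fp, len(chunks))
	}
}
```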
To improve the chance of double writes,
-storage.local.persistence-queue-capacity could be set to a higher
value. However, that will slow down shutdown a lot (as the queue has
to be worked through). So we leave it to the user to decide whether to
set it to a really high value. A more fundamental solution would be to
checkpoint
not only head chunks, but also chunks still in the persist queue. That
would be quite complicated for a rather limited use-case (running many
time series with high ingestion rate on slow spinning disks).
Starting a goroutine takes 1-2µs on my laptop. From the "numbers every
Go programmer should know", I had 300ns for a channel send in my
mind. Turns out, on my laptop, it takes only 60ns. That's fast enough
to warrant the machinery of yet another channel with a fixed set of
worker goroutines feeding from it. The number chosen (8 for now) is
low enough to not add measurable overhead (a big
Prometheus server has >1000 goroutines running), but high enough to
not make sample ingestion a bottleneck.
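Sketched generically, the pattern is a buffered channel feeding a fixed
set of worker goroutines (the worker count matches the number mentioned
above; the task type and processing are placeholders):

```go
package main

import (
	"fmt"
	"sync"
)

const numWorkers = 8 // low enough to add negligible overhead, high enough not to bottleneck

func main() {
	tasks := make(chan int, 1024) // buffered channel feeding the fixed worker set
	var wg sync.WaitGroup

	for w := 0; w < numWorkers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for t := range tasks {
				// A channel send/receive costs ~60ns here, far cheaper than
				// spawning a goroutine (1-2µs) per task.
				fmt.Println("processing task", t)
			}
		}()
	}

	for i := 0; i < 10; i++ {
		tasks <- i
	}
	close(tasks)
	wg.Wait()
}
```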
persistence.go is way too long anyway, and a lot of code is just crash
recovery, which is not important for understanding normal operation.
Also, remove unused `exists` function.
Previously, it would return an error instead. Now we can distinguish
the cases 'error while deleting known key' vs. 'key not in index'
without testing for leveldb-internal kinds of errors.
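Roughly, the new Delete semantics can be sketched as follows (the wrapper
type and the goleveldb calls are assumptions for illustration, not the
exact code):

```go
package index

import "github.com/syndtr/goleveldb/leveldb"

// LevelDBKeyValueStore is an illustrative wrapper; the actual type and
// field names in the storage package may differ.
type LevelDBKeyValueStore struct {
	db *leveldb.DB
}

// Delete removes key and reports whether it was present. A missing key is
// reported as (false, nil) rather than as an error, so callers can tell
// "error while deleting a known key" from "key not in index".
func (s *LevelDBKeyValueStore) Delete(key []byte) (bool, error) {
	has, err := s.db.Has(key, nil)
	if err != nil {
		return false, err
	}
	if !has {
		return false, nil
	}
	if err := s.db.Delete(key, nil); err != nil {
		return false, err
	}
	return true, nil
}
```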
- Move CONTRIBUTORS.md to the more common AUTHORS.
- Added the required NOTICE file.
- Changed "Prometheus Team" to "The Prometheus Authors".
- Reverted the erroneous changes to the Apache License.
This mimics the locking leveldb is performing anyway. Advantages of
doing it separately:
- Should we ever replace the leveldb implementation by one without
double-start protection, we are still good.
- In contrast to leveldb, the new code creates a meaningful error
message.
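One way such double-start protection with a meaningful error message could
be sketched (the lock-file name, function, and flock mechanism are
assumptions, not necessarily what this change does):

```go
package local

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// lockDirectory takes an exclusive, non-blocking lock on a file inside the
// storage directory and returns a descriptive error if another instance
// already holds it. The returned file must stay open for the lifetime of
// the process to keep the lock.
func lockDirectory(dir string) (*os.File, error) {
	lockPath := filepath.Join(dir, "LOCK") // hypothetical lock-file name
	f, err := os.OpenFile(lockPath, os.O_CREATE|os.O_RDWR, 0644)
	if err != nil {
		return nil, err
	}
	if err := syscall.Flock(int(f.Fd()), syscall.LOCK_EX|syscall.LOCK_NB); err != nil {
		f.Close()
		return nil, fmt.Errorf("storage directory %s is already locked by another instance: %v", dir, err)
	}
	return f, nil
}
```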
Usually, if you unarchive a series, it is to add something to it,
which will create a new head chunk. However, if a series is
unarchived and, before anything is added to it, it is handled by the
maintenance loop, it will be archived again. In that case, we have to
load the chunkDescs to know the lastTime of the series to be
archived. Usually, this case will happen only rarely (as a race, it has
never happened so far, possibly because the locking around unarchiving
and the subsequent sample append is smart enough). However, during
crash recovery, we sometimes treat series as "freshly unarchived"
without directly appending a sample. We might add more cases of that
type later, so better deal with archiving properly and load chunkDescs
if required.
- Documented checkpoint file format.
- High-level description of series sanitation.
- Replace fp.LoadFromString panic with an error.
(Change in client_golang already submitted.)
- Introduced checks for series file size where appropriate.
- Removed two Law of Demeter violations.
Change-Id: I555d97a2c8f4769820c2fc8bf5d6f4e160222abc
- Delete unneeded file view_adapter.go.
- Assessed that we still need the fingerprints in nodes
(to create iterators).
- Turned numMemChunkDescs into a metric.
Change-Id: I29be963c795a075ec00c095f76bf26405535609d
Now only purge if there is something to purge.
Also, set savedFirstTime and archived time range appropriately.
(Which is needed for the optimization.)
Change-Id: Idcd33319a84def3ce0318d886f10c6800369e7f9
Fix the behavior when preload for a non-existent series is requested.
Instead of returning an error (which triggers a panic further up),
simply count those incidents. They can happen regularly; we just want
to know if they happen too frequently because that would mean the
indexing is behind or broken.
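Sketch of the changed behavior (the metric name and call site are
illustrative assumptions):

```go
package local

import "github.com/prometheus/client_golang/prometheus"

var nonExistentSeriesMatches = prometheus.NewCounter(prometheus.CounterOpts{
	Namespace: "prometheus",
	Subsystem: "local_storage",
	Name:      "non_existent_series_matches_total",
	Help:      "How often a non-existent series was matched during preload.",
})

// preloadChunksForRange sketches the new behavior: a missing series is
// counted rather than reported as an error.
func preloadChunksForRange(seriesKnown bool) error {
	if !seriesKnown {
		nonExistentSeriesMatches.Inc()
		return nil // not an error anymore, just record the incident
	}
	// ... actual preloading omitted ...
	return nil
}
```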
Change-Id: I4b2d1b93c4146eeea897d188063cb9574a270f8b
The root cause was that after chunkDesc eviction, the offset between
the in-memory representation of the chunk layout (via chunkDescs in
memory) was shifted against the chunks as laid out on disk. Keeping the
offset up to date is by no means trivial, so this commit is pretty
involved.
Also, found a race that for some reason didn't bite us so far:
Persisting chunks was completely unlocked, so if chunks were purged on
disk at the same time, disaster would strike. However, locking the
persisting of chunks revealed interesting deadlocks. Basically, never
queue under the fp lock.
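The offset bookkeeping in question, sketched with assumed field names (the
real structs and invariants may differ in detail):

```go
package local

type chunkDesc struct{}

// memorySeries illustrates the invariant that has to hold: the chunk behind
// chunkDescs[i] lives at position chunkDescsOffset+i in the series file.
type memorySeries struct {
	chunkDescsOffset int // number of persisted chunks preceding chunkDescs[0]
	chunkDescs       []*chunkDesc
}

// evictChunkDescs drops the first n chunkDescs from memory and keeps the
// memory-vs-disk offset in sync; forgetting the second step is exactly the
// kind of shift this fix is about.
func (s *memorySeries) evictChunkDescs(n int) {
	s.chunkDescs = s.chunkDescs[n:]
	s.chunkDescsOffset += n
}
```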
Change-Id: I1ea9e4e71024cabbc1f9601b28e74db0c5c55db8
Checkpointing interval is now a command line flag.
Along the way, several things were refactored.
- Restructure the way the storage is started and stopped.
- Number of series in checkpoint is now a uint64, not a varint.
(Breaks old checkpoints, needs wipe!)
- More consistent naming and order of methods.
Change-Id: I883d9170c9a608ee716bb0ab3d0ded8ca03760d9
Add gauge for chunks and chunkdescs in memory (backed by a global
variable to be used later not only for instrumentation but also for
memory management).
Refactored instrumentation code once more (instrumentation.go is back :).
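A sketch of what "gauge backed by a global variable" can look like
(identifiers and metric name are illustrative):

```go
package local

import (
	"sync/atomic"

	"github.com/prometheus/client_golang/prometheus"
)

// numMemChunks is the global counter updated by the storage code itself, so
// it can later also drive memory-management decisions, not just the metric.
var numMemChunks int64

var memChunksGauge = prometheus.NewGaugeFunc(
	prometheus.GaugeOpts{
		Namespace: "prometheus",
		Subsystem: "local_storage",
		Name:      "memory_chunks",
		Help:      "The current number of chunks in memory.",
	},
	func() float64 { return float64(atomic.LoadInt64(&numMemChunks)) },
)

func init() {
	prometheus.MustRegister(memChunksGauge)
}
```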
Change-Id: Ife39947e22a48cac4982db7369c231947f446e17
- Staleness delta is now a proper function parameter and no longer
replicated from package ast.
- Named type 'chunks' replaced by explicit '[]chunk' to avoid confusion.
- For the same reason, replaced 'chunkDescs' by '[]*chunkDesc'.
- Verified that math.Modf is not a speed enhancement over conversion
(actually 5x slower).
- Renamed firstTimeField, lastTimeField into chunkFirstTime and
chunkLastTime.
- Verified unpin() is sufficiently goroutine-safe.
- Decided not to update archivedFingerprintToTimeRange upon series
truncation and added a rationale why.
Change-Id: I863b8d785e5ad9f71eb63e229845eacf1bed8534
Some other improvements on the way, in particular codec -> codable
renaming and addition of LookupSet methods.
Change-Id: I978f8f3f84ca8e4d39a9d9f152ae0ad274bbf4e2
Most importantly, the heads file will now persist all the chunk descs,
too. Implicitly, it will serve as the persisted form of the
fp-to-series map.
Change-Id: Ic867e78f2714d54c3b5733939cc5aef43f7bd08d
BinaryMarshaler instead of encodable.
BinaryUnmarshaler instead of decodable.
Left 'codable' in place for lack of a better word.
Change-Id: I8a104be7d6db916e8dbc47ff95e6ff73b845ac22