prometheus

mirror of https://github.com/prometheus/prometheus.git synced 2024-11-10 23:54:05 -08:00

Author	SHA1	Message	Date
beorn7	af91fb8e31	Improve persisting chunks to disk. This is done by bucketing chunks by fingerprint. If the persisting to disk falls behind, more and more chunks are in the queue. As soon as there are "double hits", we will now persist both chunks in one go, doubling the disk throughput (assuming it is limited by disk seeks). Should even more pile up so that we end up with "triple hits", we will persist those first, and so on. Even if we have millions of time series, this will still help, assuming not all of them are growing at the same speed. Series that get many samples and/or are not very compressible will accumulate chunks faster, and they will soon get double- or triple-writes. To improve the chance of double writes, -storage.local.persistence-queue-capacity could be set to a higher value. However, that will slow down shutdown a lot (as the queue has to be worked through). So we leave it to the user to set it to a really high value. A more fundamental solution would be to checkpoint not only head chunks, but also chunks still in the persist queue. That would be quite complicated for a rather limited use-case (running many time series with high ingestion rate on slow spinning disks).	2015-02-17 16:02:09 +01:00
beorn7	e22f26bc58	Move to a queue model for appending samples after all. Starting a goroutine takes 1-2µs on my laptop. From the "numbers every Go programmer should know", I had 300ns for a channel send in my mind. Turns out, on my laptop, it takes only 60ns. That's fast enough to warrant the machinery of yet another channel with a fixed set of worker goroutines feeding from it. The number chosen (8 for now) is low enough to not really inflict a measurable overhead (a big Prometheus server has >1000 goroutines running), but high enough to not make sample ingestion a bottleneck.	2015-02-13 14:26:54 +01:00
Bjoern Rabenstein	c24bfdf701	Move crash related code into separate file. persistence.go is way too long anyway, and a lot of code is just crash recovery, which is not important to understand the normal operation. Also, remove unused `exists` function.	2015-01-29 13:13:16 +01:00
Bjoern Rabenstein	73f6dc4d44	Make KeyValueStore.Delete report if the key to delete was found. Previously, it would return an error instead. Now we can distinguish the cases 'error while deleting known key' vs. 'key not in index' without testing for leveldb-internal kinds of errors.	2015-01-29 12:57:50 +01:00
Bjoern Rabenstein	2c8d324ca4	Remove check that did not check anything.	2015-01-26 13:48:24 +01:00
Bjoern Rabenstein	5859b74f1b	Clean up license issues. - Move CONTRIBUTORS.md to the more common AUTHORS. - Added the required NOTICE file. - Changed "Prometheus Team" to "The Prometheus Authors". - Reverted the erroneous changes to the Apache License.	2015-01-21 20:07:45 +01:00
Bjoern Rabenstein	baca6faa1c	Add double-start protection. This mimics the locking leveldb is performing anyway. Advantages of doing it separately: - Should we ever replace the leveldb implementation by one without double-start protection, we are still good. - In contrast to leveldb, the new code creates a meaningful error message.	2015-01-14 17:13:42 +01:00
Bjoern Rabenstein	622e8350cd	Fix a bug handling freshly unarchived series. Usually, if you unarchive a series, it is to add something to it, which will create a new head chunk. However, if a series is unarchived, and before anything is added to it, it is handled by the maintenance loop, it will be archived again. In that case, we have to load the chunkDescs to know the lastTime of the series to be archived. Usually, this case will happen only rarely (as a race, has never happened so far, possibly because the locking around unarchiving and the subsequent sample append is smart enough). However, during crash recovery, we sometimes treat series as "freshly unarchived" without directly appending a sample. We might add more cases of that type later, so better deal with archiving properly and load chunkDescs if required.	2015-01-08 16:25:50 +01:00
Brian Brazil	e56786b221	Have scrape time as a pseudovariable, not a prometheus variable. This ensures it has the right timestamp, and is easier to work with. Switch sd variable away from 'outcome', using total/failed instead.	2014-12-27 00:39:33 +00:00
Bjoern Rabenstein	ff24070a03	Fix embarrassing bug in crash recovery. (And yes, we always knew we need tests for that. I have added a TODO now.) Change-Id: I9cf52bbf98e263e0b79404bda4c442beba9696a8	2014-12-17 17:18:04 +01:00
Bjoern Rabenstein	66c80b5ebd	Fix typo. Change-Id: I72608c7841c00145458807d3c3ee29db7b5ac2bc	2014-11-28 12:50:19 +01:00
Bjoern Rabenstein	674624f1c8	Completed more TODOs. - Documented checkpoint file format. - High-level description of series sanitation. - Replace fp.LoadFromString panic with an error. (Change in client_golang already submitted.) - Introduced checks for series file size where appropriate. - Removed two Law of Demeter violations. Change-Id: I555d97a2c8f4769820c2fc8bf5d6f4e160222abc	2014-11-27 20:46:45 +01:00
Bjoern Rabenstein	7d11019aa2	Squash a few trivial TODOs. - Delete unneeded file view_adapter.go. - Assessed that we still need the fingerprints in nodes (to create iterators). - Turned numMemChunkDescs into a metric. Change-Id: I29be963c795a075ec00c095f76bf26405535609d	2014-11-27 18:26:06 +01:00
Bjoern Rabenstein	14bda4180c	Changes after pair code review. Change-Id: Ib72d40f8e9027818cfbbd32a7a7201eebda07455	2014-11-25 17:12:59 +01:00
Bjoern Rabenstein	c087ee35f7	Remove archiveMtx. Change-Id: Ie8019f860bbda68621f74380c90a4e57930d3d7a	2014-11-25 17:10:30 +01:00
Bjoern Rabenstein	7af42eda65	Optimize purging. Now only purge if there is something to purge. Also, set savedFirstTime and archived time range appropriately. (Which is needed for the optimization.) Change-Id: Idcd33319a84def3ce0318d886f10c6800369e7f9	2014-11-25 17:10:30 +01:00
Bjoern Rabenstein	33b959b898	Persist savedFirstTime in checkpoint. Change-Id: Ibdfdea16fad0608ec104fbccc749e824a171f227	2014-11-25 17:10:30 +01:00
Bjoern Rabenstein	904acd43da	Add crash recovery. Fix the behavior if preload for non-existent series is requested. Instead of returning an error (which triggers a panic further up), simply count those incidents. They can happen regularly, we just want to know if they happen too frequently because that would mean the indexing is behind or broken. Change-Id: I4b2d1b93c4146eeea897d188063cb9574a270f8b	2014-11-25 17:09:43 +01:00
Bjoern Rabenstein	4efc60174b	Tweak and verify a few parameters. Remove TODOs accordingly. Change-Id: Ic062e13b6ae89a9135d3f14011114fe1cca1cef8	2014-11-25 17:09:43 +01:00
Bjoern Rabenstein	5f8e9617ef	Add more tests. Add an end-to-end fuzz and race test. Fix a race exposed by the above. Change-Id: Ifaa39a90cefbde8d4c29bda197cc92592ded21bb	2014-11-25 17:09:17 +01:00
Bjoern Rabenstein	d215e013b7	Fix the weird chunkDesc shuffling bug. The root cause was that after chunkDesc eviction, the offset between memory representation of chunk layout (via chunkDescs in memory) was shifted against chunks as laid out on disk. Keeping the offset up to date is by no means trivial, so this commit is pretty involved. Also, found a race that for some reason didn't bite us so far: Persisting chunks was completely unlocked, so if chunks were purged on disk at the same time, disaster would strike. However, locking the persisting of chunks revealed interesting deadlocks. Basically, never queue under the fp lock. Change-Id: I1ea9e4e71024cabbc1f9601b28e74db0c5c55db8	2014-11-25 17:09:17 +01:00
Bjoern Rabenstein	f1de5b0c4e	Run checkpointing of in-memory metrics and head chunks periodically. Checkpointing interval is now a command line flag. Along the way, several things were refactored. - Restructure the way the storage is started and stopped. - Number of series in checkpoint is now a uint64, not a varint. (Breaks old checkpoints, needs wipe!) - More consistent naming and order of methods. Change-Id: I883d9170c9a608ee716bb0ab3d0ded8ca03760d9	2014-11-25 17:09:04 +01:00
Bjoern Rabenstein	74c9b34a5e	Improve storage instrumentation even more. Add gauge for chunks and chunkdescs in memory (backed by a global variable to be used later not only for instrumentation but also for memory management). Refactored instrumentation code once more (instrumentation.go is back :). Change-Id: Ife39947e22a48cac4982db7369c231947f446e17	2014-11-25 17:09:04 +01:00
Bjoern Rabenstein	443dd33805	Improve instrumentation in storage. Also, fix some other minor bugs. Change-Id: If72f1c058b0f47d3e378fdf80228d7e9a8db06c7	2014-11-25 17:09:04 +01:00
Bjoern Rabenstein	95f392fb2c	Prevent an indexing death spiral. Change-Id: I86b20cd0830d02f87b2f020767257e2d3fb2033c	2014-11-25 17:09:04 +01:00
Bjoern Rabenstein	40354eaa29	Reduce directory depth by one. Change-Id: I7f89df61135ff19169ed97633a662685d414c448	2014-11-25 17:09:04 +01:00
Bjoern Rabenstein	096fa0f8b2	Squash a number of TODOs. - Staleness delta is now a proper function parameter and not replicated from package ast. - Named type 'chunks' replaced by explicit '[]chunk' to avoid confusion. - For the same reason, replaced 'chunkDescs' by '[]*chunkDescs'. - Verified that math.Modf is not a speed enhancement over conversion (actually 5x slower). - Renamed firstTimeField, lastTimeField into chunkFirstTime and chunkLastTime. - Verified unpin() is sufficiently goroutine-safe. - Decided not to update archivedFingerprintToTimeRange upon series truncation and added a rationale why. Change-Id: I863b8d785e5ad9f71eb63e229845eacf1bed8534	2014-11-25 17:09:04 +01:00
Julius Volz	7f5d3c2c29	Fix and improve the fp locker. Benchmark: $ go test -bench 'Fingerprint' -test.run 'Fingerprint' -test.cpu=1,2,4 OLD BenchmarkFingerprintLockerParallel 500000 3618 ns/op BenchmarkFingerprintLockerParallel-2 100000 12257 ns/op BenchmarkFingerprintLockerParallel-4 500000 10164 ns/op BenchmarkFingerprintLockerSerial 10000000 283 ns/op BenchmarkFingerprintLockerSerial-2 10000000 284 ns/op BenchmarkFingerprintLockerSerial-4 10000000 288 ns/op NEW BenchmarkFingerprintLockerParallel 1000000 1018 ns/op BenchmarkFingerprintLockerParallel-2 1000000 1164 ns/op BenchmarkFingerprintLockerParallel-4 2000000 910 ns/op BenchmarkFingerprintLockerSerial 50000000 56.0 ns/op BenchmarkFingerprintLockerSerial-2 50000000 47.9 ns/op BenchmarkFingerprintLockerSerial-4 50000000 54.5 ns/op Change-Id: I3c65a43822840e7e64c3c3cfe759e1de51272581	2014-11-25 17:07:45 +01:00
Bjoern Rabenstein	8fba3302bc	Bold changes to concurrency. (WIP. Probably doesn't work yet.) Change-Id: Id1537dfcca53831a1d428078a5863ece7bdf4875	2014-11-25 17:07:45 +01:00
Bjoern Rabenstein	7e6a03fbf9	Fix a few concurrency issues before starting to use the new fp locker. Change-Id: I8615e8816e79ef0882e123163ee590c739b79d12	2014-11-25 17:07:45 +01:00
Julius Volz	df1b2a2422	Fix indexing latency instrumentation. Change-Id: I532c170121cd2996d1a378adbb1fd551cd5a4e38	2014-11-25 17:07:44 +01:00
Julius Volz	a746fbb8bc	Instrument indexing: queue length, batch sizes and latencies. Change-Id: I60bcbd24b160e47d418a485d8cffa39344a257c6	2014-11-25 17:07:44 +01:00
Bjoern Rabenstein	e9ff29c547	Comment/code cleanup. Change-Id: I38736e3d0fec79759a2bafa35aecf914480ff810	2014-11-25 17:07:44 +01:00
Bjoern Rabenstein	0031a448e2	Add WaitForIndexing. Change-Id: I5a5c975c4246632f937413322c855bbe63d00802	2014-11-25 17:07:44 +01:00
Bjoern Rabenstein	c7aad110fb	Add an indexing queue and batch the ops. Some other improvements on the way, in particular codec -> codable renaming and addition of LookupSet methods. Change-Id: I978f8f3f84ca8e4d39a9d9f152ae0ad274bbf4e2	2014-11-25 17:07:44 +01:00
Bjoern Rabenstein	71206dbc06	More code cleanups. Add license text everywhere. And others.... Change-Id: I11ccde267a2ef7eb366c4788ba7aeae14ba7545c	2014-11-25 17:07:44 +01:00
Julius Volz	630b5a087a	Also consider on-disk fingerprints during purge. This reintroduces LevelDB iterators so that we can iterate through all the on-disk fingerprints. Change-Id: I007ee4638d038d2a4461bbda27f30fcaad411474	2014-11-25 17:07:35 +01:00
Bjoern Rabenstein	f5f9f3514a	Major code cleanup. - Make it go-vet and golint clean. - Add comments, TODOs, etc. Change-Id: If1392d96f3d5b4cdde597b10c8dff1769fcfabe2	2014-11-25 17:02:53 +01:00
Bjoern Rabenstein	bbf49200ab	Implement methods in persistence.go. Change-Id: I804cdd0b30420e171825fd86fe1281eca0d5e638	2014-11-25 17:02:23 +01:00
Bjoern Rabenstein	5a128a04a9	Major reorganization of the storage. Most important, the heads file will now persist all the chunk descs, too. Implicitly, it will serve as the persisted form of the fp-to-series map. Change-Id: Ic867e78f2714d54c3b5733939cc5aef43f7bd08d	2014-11-25 17:02:01 +01:00
Bjoern Rabenstein	4770cf76a4	Make index package more self-contained. Moved internals from diskPersistence into the indexer. TotalIndexer now called diskIndexer. Change-Id: I6c8c62cb171f12bbd8a5474773af7786d71ba388	2014-11-25 17:02:01 +01:00
Bjoern Rabenstein	89f10e8eb2	Move to using the standard library interfaces for encoding/decoding. BinaryMarshaler instead of encodable. BinaryUnmarshaler instead of decodable. Left 'codable' in place for lack of a better word. Change-Id: I8a104be7d6db916e8dbc47ff95e6ff73b845ac22	2014-11-25 17:02:01 +01:00
Julius Volz	7e85711df0	Beginnings of a tiered index implementation. This reintroduces a LevelDB-based metrics index. Change-Id: I4111540301c52255a07b2f570761707a32f72c05	2014-11-25 17:02:00 +01:00
Julius Volz	8dfaa5ecd2	Remove use of freelists for chunk bufs. Change-Id: Ib887fdb61e1d96da0cd32545817b925ba88831c1	2014-11-25 17:02:00 +01:00
Bjoern Rabenstein	ecdf5ab14f	Index-persistence switched from gob to a hand-coded solution. Change-Id: Ib4ec42535bd08df16d34d4774bb638e35c5a1841	2014-11-25 17:02:00 +01:00
Julius Volz	e7ed39c9a6	Initial experimental snapshot of next-gen storage. Change-Id: Ifb8709960dbedd1d9f5efd88cdd359ee9fa9d26d	2014-11-25 17:02:00 +01:00
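
The bucketing idea described in commit af91fb8e31 (persist "double hits" and "triple hits" first) can be illustrated with a minimal sketch. This is not the code from that commit: the types `chunk` and `persistQueue` and the methods `add` and `next` are hypothetical names chosen for illustration, and the linear scan over all buckets is a simplification that a real implementation would likely avoid.

```go
package main

import "fmt"

// chunk is a hypothetical stand-in for a chunk waiting to be persisted.
type chunk struct {
	fp   uint64 // fingerprint of the series the chunk belongs to
	data string
}

// persistQueue (hypothetical) buckets queued chunks by fingerprint.
type persistQueue struct {
	buckets map[uint64][]chunk
}

func newPersistQueue() *persistQueue {
	return &persistQueue{buckets: make(map[uint64][]chunk)}
}

// add appends a chunk to the bucket of its fingerprint.
func (q *persistQueue) add(c chunk) {
	q.buckets[c.fp] = append(q.buckets[c.fp], c)
}

// next removes and returns the largest bucket, so that series with the
// most queued chunks are written in one go. Ties are broken arbitrarily.
// It returns nil if the queue is empty.
func (q *persistQueue) next() []chunk {
	var (
		bestFP uint64
		best   []chunk
	)
	for fp, b := range q.buckets {
		if len(b) > len(best) {
			bestFP, best = fp, b
		}
	}
	if best != nil {
		delete(q.buckets, bestFP)
	}
	return best
}

func main() {
	q := newPersistQueue()
	q.add(chunk{fp: 1, data: "a"})
	q.add(chunk{fp: 2, data: "b"})
	q.add(chunk{fp: 2, data: "c"})
	for batch := q.next(); batch != nil; batch = q.next() {
		fmt.Printf("persisting %d chunk(s) for fingerprint %d\n", len(batch), batch[0].fp)
	}
}
```

Under the commit's assumption that throughput is limited by disk seeks, writing a whole bucket per seek is what yields the gain; the sketch only models the selection order, not the disk I/O.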

46 commits