prometheus

mirror of https://github.com/prometheus/prometheus.git synced 2025-03-05 20:59:13 -08:00

Author	SHA1	Message	Date
Julius Volz	c6e9f085a3	Update used Go version to 1.3. Go downloads moved to a different URL and require following redirects (curl's '-L' option) now. Go 1.3 deliberately randomizes ranges over maps, which uncovered some bugs in our tests. These are fixed too. Change-Id: Id2d9e185d8d2379a9b7b8ad5ba680024565d15f4	2014-11-25 17:02:00 +01:00
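A minimal sketch of the kind of fix that randomized map iteration usually calls for in tests: collect the keys, sort them, and iterate in that order. The helper name `sortedKeys` and the sample map are illustrative assumptions, not code from this commit.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys returns the keys of m in ascending order, so callers can
// iterate deterministically regardless of Go's randomized map order.
// (Hypothetical helper, not taken from the Prometheus code base.)
func sortedKeys(m map[string]int) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	m := map[string]int{"b": 2, "a": 1, "c": 3}
	for _, k := range sortedKeys(m) {
		fmt.Println(k, m[k])
	}
}
```

A test that compares against a fixed expected sequence can then rely on the sorted order rather than on whatever order `range` happens to produce.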
Bjoern Rabenstein	1909686789	Make metrics exported by the Prometheus server itself more consistent. - Always spell out the time unit (e.g. milliseconds instead of ms). - Remove "_total" from the names of metrics that are not counters. - Make use of the "Namespace" and "Subsystem" fields in the options. - Removed the "capacity" facet from all metrics about channels/queues. These are all fixed via command line flags and will never change during the runtime of a process. Also, they should not be part of the same metric family. I have added separate metrics for the capacity of queues as convenience. (They will never change and are only set once.) - I left "metric_disk_latency_microseconds" unchanged, although that metric measures the latency of the storage device, even if it is not a spinning disk. "SSD" is read by many as "solid state disk", so it's not too far off. (It should be "solid state drive", of course, but "metric_drive_latency_microseconds" is probably confusing.) - Brian suggested to not mix "failure" and "success" outcome in the same metric family (distinguished by labels). For now, I left it as it is. We are touching some bigger issue here, especially as other parts in the Prometheus ecosystem are following the same principle. We still need to come to terms here and then change things consistently everywhere. Change-Id: If799458b450d18f78500f05990301c12525197d3	2014-11-25 17:02:00 +01:00
Julius Volz	80b3d3bf34	Speed up disk flushes by removing unnecessary sort. The first sort in groupByFingerprint already ensures that all resulting sample lists contain only one fingerprint. We also already assume that all samples passed into AppendSamples (and thus groupByFingerprint) are chronologically sorted within each fingerprint. The extra chronological sort is thus superfluous. Furthermore, this second sort didn't only sort chronologically, but also compared all metric fingerprints again (although we already know that we're only sorting within samples for the same fingerprint). This caused a huge memory and runtime overhead. In a heavily loaded real Prometheus, this brought down disk flush times from ~9 minutes to ~1 minute. OLD: BenchmarkLevelDBAppendRepeatingValues 5 331391808 ns/op 44542953 B/op 597788 allocs/op BenchmarkLevelDBAppendsRepeatingValues 5 329893512 ns/op 46968288 B/op 3104373 allocs/op NEW: BenchmarkLevelDBAppendRepeatingValues 5 299298635 ns/op 43329497 B/op 567616 allocs/op BenchmarkLevelDBAppendsRepeatingValues 20 92204601 ns/op 1779454 B/op 70975 allocs/op Change-Id: Ie2d8db3569b0102a18010f9e106e391fda7f7883	2014-11-25 17:01:59 +01:00
Julius Volz	21cafe6cd7	Only evict memory series after they are on disk. This fixes the problem where samples become temporarily unavailable for queries while they are being flushed to disk. Although the entire flushing code could use some major refactoring, I'm explicitly trying to do the minimal change to fix the problem since there's a whole new storage implementation in the pipeline. Change-Id: I0f5393a30b88654c73567456aeaea62f8b3756d9	2014-11-25 17:01:59 +01:00
Bjoern Rabenstein	8956faeccb	Migrate to new client_golang. This change will only be submitted when the new client_golang has been moved to the new version. Change-Id: Ifceb59333072a08286a8ac910709a8ba2e3a1581	2014-11-25 17:01:59 +01:00
Brian Brazil	e041c0cd46	Add console and alert templates with access to all data. Move rulemanager to its own package to break circular dependency. Make NewTestTieredStorage available to tests, remove duplication. Change-Id: I33b321245a44aa727bfc3614a7c9ae5005b34e03	2014-05-30 16:24:56 +01:00
Bjoern Rabenstein	ca6a4fccef	Weed out our homegrown test.Tester. The Go stdlib has testing.TB now, which fulfills the exact same purpose. Change-Id: I0db9c73400e208ca376b932a02b7e3402234b87c	2014-05-21 19:27:24 +02:00
Julius Volz	4df5c7ab18	Optimize label matcher memory and runtime behavior. This optimizes the runtime and memory allocation behavior for label matchers other than type "Equal". Instead of creating a new set for every union of fingerprints, this simply adds new fingerprints to the existing set to achieve the same effect. The current behavior made a production Prometheus unresponsive when running a NotEqual match against the "instance" label (a label with high value cardinality). BEFORE: BenchmarkGetFingerprintsForNotEqualMatcher 10 170430297 ns/op 39229944 B/op 40709 allocs/op AFTER: BenchmarkGetFingerprintsForNotEqualMatcher 5000 706260 ns/op 217717 B/op 1116 allocs/op Change-Id: Ifd78e81e7dfbf5d7249e50ad1903a5d9c42c347a	2014-05-05 11:29:17 -04:00
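The difference between the two union strategies described above can be sketched as follows. The type `fingerprintSet` and both functions are hypothetical stand-ins; the real code uses its own set and fingerprint types.

```go
package main

import "fmt"

// fingerprintSet is a hypothetical stand-in for a set of series fingerprints.
type fingerprintSet map[uint64]struct{}

// unionCopy allocates a fresh set for every union. Calling it once per
// label value makes allocations grow with the label's cardinality.
func unionCopy(a, b fingerprintSet) fingerprintSet {
	out := make(fingerprintSet, len(a)+len(b))
	for fp := range a {
		out[fp] = struct{}{}
	}
	for fp := range b {
		out[fp] = struct{}{}
	}
	return out
}

// addAll merges src into dst in place, so no new set is created per union.
func addAll(dst, src fingerprintSet) {
	for fp := range src {
		dst[fp] = struct{}{}
	}
}

func main() {
	// One small set per matching label value (made-up data).
	perValue := []fingerprintSet{{1: {}, 2: {}}, {2: {}, 3: {}}, {4: {}}}
	result := fingerprintSet{}
	for _, s := range perValue {
		addAll(result, s)
	}
	fmt.Println(len(result))
}
```

Both functions produce the same membership; only the allocation behavior differs, which is what the benchmark numbers in the commit message measure.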
Bjoern Rabenstein	de9a88b964	Ensure temporal order in streams. BenchmarkAppendSample.* before this change: BenchmarkAppendSample1 1000000 1142 ns/op --- BENCH: BenchmarkAppendSample1 memory_test.go:81: 1 cycles with 9992.000000 bytes per cycle, totalling 9992 memory_test.go:81: 100 cycles with 250.399994 bytes per cycle, totalling 25040 memory_test.go:81: 10000 cycles with 239.428802 bytes per cycle, totalling 2394288 memory_test.go:81: 1000000 cycles with 255.504684 bytes per cycle, totalling 255504688 BenchmarkAppendSample10 500000 3823 ns/op --- BENCH: BenchmarkAppendSample10 memory_test.go:81: 1 cycles with 15536.000000 bytes per cycle, totalling 15536 memory_test.go:81: 100 cycles with 662.239990 bytes per cycle, totalling 66224 memory_test.go:81: 10000 cycles with 601.937622 bytes per cycle, totalling 6019376 memory_test.go:81: 500000 cycles with 598.582764 bytes per cycle, totalling 299291408 BenchmarkAppendSample100 50000 41111 ns/op --- BENCH: BenchmarkAppendSample100 memory_test.go:81: 1 cycles with 79824.000000 bytes per cycle, totalling 79824 memory_test.go:81: 100 cycles with 4924.479980 bytes per cycle, totalling 492448 memory_test.go:81: 10000 cycles with 4278.019043 bytes per cycle, totalling 42780192 memory_test.go:81: 50000 cycles with 4275.242676 bytes per cycle, totalling 213762144 BenchmarkAppendSample1000 5000 533933 ns/op --- BENCH: BenchmarkAppendSample1000 memory_test.go:81: 1 cycles with 840224.000000 bytes per cycle, totalling 840224 memory_test.go:81: 100 cycles with 62789.281250 bytes per cycle, totalling 6278928 memory_test.go:81: 5000 cycles with 55208.601562 bytes per cycle, totalling 276043008 ok github.com/prometheus/prometheus/storage/metric/tiered 27.828s BenchmarkAppendSample.* after this change: BenchmarkAppendSample1 1000000 1109 ns/op --- BENCH: BenchmarkAppendSample1 memory_test.go:131: 1 cycles with 9992.000000 bytes per cycle, totalling 9992 memory_test.go:131: 100 cycles with 250.399994 bytes per cycle, totalling 25040 
memory_test.go:131: 10000 cycles with 239.220795 bytes per cycle, totalling 2392208 memory_test.go:131: 1000000 cycles with 255.492630 bytes per cycle, totalling 255492624 BenchmarkAppendSample10 500000 3663 ns/op --- BENCH: BenchmarkAppendSample10 memory_test.go:131: 1 cycles with 15536.000000 bytes per cycle, totalling 15536 memory_test.go:131: 100 cycles with 662.239990 bytes per cycle, totalling 66224 memory_test.go:131: 10000 cycles with 601.889587 bytes per cycle, totalling 6018896 memory_test.go:131: 500000 cycles with 598.550903 bytes per cycle, totalling 299275472 BenchmarkAppendSample100 50000 40694 ns/op --- BENCH: BenchmarkAppendSample100 memory_test.go:131: 1 cycles with 78976.000000 bytes per cycle, totalling 78976 memory_test.go:131: 100 cycles with 4928.319824 bytes per cycle, totalling 492832 memory_test.go:131: 10000 cycles with 4277.961426 bytes per cycle, totalling 42779616 memory_test.go:131: 50000 cycles with 4275.054199 bytes per cycle, totalling 213752720 BenchmarkAppendSample1000 5000 530744 ns/op --- BENCH: BenchmarkAppendSample1000 memory_test.go:131: 1 cycles with 842192.000000 bytes per cycle, totalling 842192 memory_test.go:131: 100 cycles with 62765.441406 bytes per cycle, totalling 6276544 memory_test.go:131: 5000 cycles with 55209.812500 bytes per cycle, totalling 276049056 ok github.com/prometheus/prometheus/storage/metric/tiered 27.468s Change-Id: Idaa339cd83539b5e4391614541a2c3a04002d66d	2014-04-22 15:22:54 +02:00
Julius Volz	1b29975865	Fix RWLock memory storage deadlock. This fixes https://github.com/prometheus/prometheus/issues/390 The cause for the deadlock was a lock semantic in Go that wasn't obvious to me when introducing this bug: http://golang.org/pkg/sync/#RWMutex.Lock Key phrase: "To ensure that the lock eventually becomes available, a blocked Lock call excludes new readers from acquiring the lock." In the memory series storage, we have one function (GetFingerprintsForLabelMatchers) acquiring an RLock(), which calls another function also acquiring the same RLock() (GetLabelValuesForLabelName). That normally doesn't deadlock, unless a Lock() call from another goroutine happens right in between the two RLock() calls, blocking both the Lock() and the second RLock() call from ever completing. GoRoutine 1 GoRoutine 2 ====================================== RLock() ... Lock() [DEADLOCK] RLock() [DEADLOCK] Unlock() RUnlock() RUnlock() Testing deadlocks is tricky, but the regression test I added does reliably detect the deadlock in the original code on my machine within a normal concurrent reader/writer run duration of 250ms. Change-Id: Ib34c2bb8df1a80af44550cc2bf5007055cdef413	2014-04-17 13:43:13 +02:00
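One common way to avoid this class of deadlock is to take the read lock only in exported entry points and have them call helpers that assume the lock is already held. The sketch below illustrates that pattern with a hypothetical `store` type; the commit message does not say which fix was actually applied, so treat this as an assumption.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a hypothetical, much simplified stand-in for the memory series storage.
type store struct {
	mu     sync.RWMutex
	values map[string][]string
}

// LabelValues is a public entry point; it takes the read lock exactly once.
func (s *store) LabelValues(name string) []string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.labelValuesLocked(name)
}

// labelValuesLocked assumes the caller already holds s.mu.
func (s *store) labelValuesLocked(name string) []string {
	return append([]string(nil), s.values[name]...)
}

// CountValues needs label values too, but calls the helper instead of
// LabelValues, so RLock is never acquired recursively.
func (s *store) CountValues(name string) int {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return len(s.labelValuesLocked(name))
}

// Add takes the write lock; a pending Lock blocks new readers.
func (s *store) Add(name, value string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.values[name] = append(s.values[name], value)
}

func main() {
	s := &store{values: map[string][]string{}}
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(2)
		go func(i int) { defer wg.Done(); s.Add("instance", fmt.Sprint(i)) }(i)
		go func() { defer wg.Done(); _ = s.CountValues("instance") }()
	}
	wg.Wait()
	fmt.Println(s.CountValues("instance"))
}
```

Because no goroutine ever holds one RLock while waiting for a second, a writer arriving in between can no longer wedge both sides.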
Julius Volz	01f652cb4c	Separate storage implementation from interfaces. This was initially motivated by wanting to distribute the rule checker tool under `tools/rule_checker`. However, this was not possible without also distributing the LevelDB dynamic libraries because the tool transitively depended on Levigo: rule checker -> query layer -> tiered storage layer -> leveldb This change separates external storage interfaces from the implementation (tiered storage, leveldb storage, memory storage) by putting them into separate packages: - storage/metric: public, implementation-agnostic interfaces - storage/metric/tiered: tiered storage implementation, including memory and LevelDB storage. I initially also considered splitting up the implementation into separate packages for tiered storage, memory storage, and LevelDB storage, but these are currently so intertwined that it would be another major project in itself. The query layers and most other parts of Prometheus now have no notion of the storage implementation anymore and just use whatever implementation they get passed in via interfaces. The rule_checker is now a static binary :) Change-Id: I793bbf631a8648ca31790e7e772ecf9c2b92f7a0	2014-04-16 13:30:19 +02:00
Matt T. Proud	3e969a8ca2	Parameterize the buffer for marshal/unmarshal. We are not reusing buffers yet. This could introduce problems, so the behavior is disabled for now. Cursory benchmark data: - Marshal for 10,000 samples: -30% overhead. - Unmarshal for 10,000 samples: -15% overhead. Change-Id: Ib006bdc656af45dca2b92de08a8f905d8d728cac	2014-04-16 12:16:59 +02:00
Matt T. Proud	58ef638e72	Merge "Use idiomatic one-to-many one-time signal pattern."	2014-04-15 21:26:31 +02:00
Matt T. Proud	6ec72393c4	Correct size of unmarshalling destination buffer. The format header size is not deducted from the size of the byte stream when calculating the output buffer size for samples. I have yet to notice problems directly as a result of this, but it is good to fix. Change-Id: Icb07a0718366c04ddac975d738a6305687773af0	2014-04-15 11:55:44 +02:00
Matt T. Proud	81367893fd	Use idiomatic one-to-many one-time signal pattern. The idiomatic pattern for signalling a one-time message to multiple consumers from a single producer is as follows: ``` c := make(chan struct{}) w := new(sync.WaitGroup) // Boilerplate to ensure synchronization. for i := 0; i < 1000; i++ { w.Add(1) go func() { defer w.Done() for { select { case _, ok := <- c: if !ok { return } default: // Do something here. } } }() } close(c) // Signal the one-to-many single-use message. w.Wait() ``` Change-Id: I755f73ba4c70a923afd342a4dea63365bdf2144b	2014-04-15 10:15:25 +02:00
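A self-contained variant of the pattern above, for workers that only need to wait for the signal rather than poll it in a `select` with a `default` branch. The function name `broadcastStop` and the counter are illustrative assumptions.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// broadcastStop starts n workers that block until done is closed and
// returns how many of them observed the signal.
// (Hypothetical helper written for this illustration.)
func broadcastStop(n int) int64 {
	done := make(chan struct{})
	var wg sync.WaitGroup
	var stopped int64
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-done // A receive on a closed channel returns immediately.
			atomic.AddInt64(&stopped, 1)
		}()
	}
	close(done) // A single close wakes every receiver.
	wg.Wait()
	return atomic.LoadInt64(&stopped)
}

func main() {
	fmt.Println(broadcastStop(1000))
}
```

The channel carries no data (`struct{}`), and it can only be closed once, which is what makes the signal one-time.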
Julius Volz	c7c0b33d0b	Add regex-matching support for labels. There are four label-matching ops for selecting timeseries now: - Equal: = - NotEqual: != - RegexMatch: =~ - RegexNoMatch: !~ Instead of looking up labels by a simple clientmodel.LabelSet (basically an equals op for every key/value pair in the set), timeseries fingerprint selection is now done via a list of metric.LabelMatchers. Change-Id: I510a83f761198e80946146770ebb64e4abc3bb96	2014-04-01 14:24:53 +02:00
Julius Volz	ae30453214	Add label names -> label values index. Change-Id: Ie39b4044558afc4d1aa937de7dcf8df61f821fb4	2014-03-28 15:16:37 +01:00
Julius Volz	7a577b86b7	Fix interval op special case. In the case that a getValuesAtIntervalOp's ExtractSamples() is called with a current time after the last chunk time, we return without extracting any further values beyond the last one in the chunk (correct), but also without advancing the op's time (incorrect). This leads to an infinite loop in renderView(), since the op is called repeatedly without ever being advanced and consumed. This adds handling for this special case. When detecting this case, we immediately set the op to be consumed, since we would always get a value after the current time passed in if there was one. Change-Id: Id99149e07b5188d655331382b8b6a461b677005c	2014-03-26 13:29:03 +01:00
Bjoern Rabenstein	257b720e87	Fix typo. Change-Id: I6e7edcb48ace7fe4d6de4ff16519da5bb326b6ce	2014-03-25 12:22:18 +01:00
Bjoern Rabenstein	caf47b2fbc	New encoding for OpenTSDB tag values (and metric names). Change-Id: I0f4393f638c6e2bb2b2ce14e58e38b49ce456da8	2014-03-21 17:18:44 +01:00
Julius Volz	9d5c367745	Fix incorrect interval op advancement. This fixes a bug where an interval op might advance too far past the end of the currently extracted chunk, effectively skipping over relevant (to-be-extracted) values in the subsequent chunk. The result: missing samples at chunk boundaries in the resulting view. Change-Id: Iebf5d086293a277d330039c69f78e1eaf084b3c8	2014-03-18 16:22:50 +01:00
Julius Volz	cc04238a85	Switch to new "__name__" metric name label. This also fixes the compaction test, which before worked only because the input sample sorting was accidentally equal to the resulting on-disk sample sorting. Change-Id: I2a21c4b46ba562424b27058fc02eba84fa6a6006	2014-03-14 16:52:37 +01:00
Bjoern Rabenstein	c3b282bd14	Add regression tests for 'loop until op is consumed' bug. - Most of this is the actual regression test in tiered_test.go. - Working on that regression test uncovered problems in tiered_test.go that are fixed in this commit. - The 'op.consumed = false' line added to freelist.go was actually not fixing a bug. Instead, there was no bug at all. So this commit removes that line again, but adds a regression test to make sure that the assumed bug is indeed not there (cf. freelist_test.go). - Removed more code duplication in operation.go (following the same approach as before, i.e. embedding op type A into op type B if everything in A is the same as in B with the exception of String() and ExtractSample()). (This change makes struct literals for ops more clunky, but that only affects tests. No code change whatsoever was necessary in the actual code after this refactoring.) - Fix another op leak in tiered.go. Change-Id: Ia165c52e33290ad4f6aba9c83d92318d4f583517	2014-03-12 18:40:24 +01:00
Julius Volz	86fc13a52e	Convert metric.Values to slice of values. The initial impetus for this was that it made unmarshalling sample values much faster. Other relevant benchmark changes in ns/op: Benchmark old new speedup ================================================================== BenchmarkMarshal 179170 127996 1.4x BenchmarkUnmarshal 404984 132186 3.1x BenchmarkMemoryGetValueAtTime 57801 50050 1.2x BenchmarkMemoryGetBoundaryValues 64496 53194 1.2x BenchmarkMemoryGetRangeValues 66585 54065 1.2x BenchmarkStreamAdd 45.0 75.3 0.6x BenchmarkAppendSample1 1157 1587 0.7x BenchmarkAppendSample10 4090 4284 0.95x BenchmarkAppendSample100 45660 44066 1.0x BenchmarkAppendSample1000 579084 582380 1.0x BenchmarkMemoryAppendRepeatingValues 22796594 22005502 1.0x Overall, this gives us good speedups in the areas where they matter most: decoding values from disk and accessing the memory storage (which is also used for views). Some of the smaller append examples take minimally longer, but the cost seems to get amortized over larger appends, so I'm not worried about these. Also, we're currently not bottlenecked on the write path and have plenty of other optimizations available in that area if it becomes necessary. Memory allocations during appends don't change measurably at all. Change-Id: I7dc7394edea09506976765551f35b138518db9e8	2014-03-11 18:23:37 +01:00
Julius Volz	a7d0973fe3	Add version field to LevelDB sample format. This doesn't add complex discriminator logic yet, but adds a single version byte to the beginning of each samples chunk. If we ever need to change the disk format again, this will make it easy to do so without having to wipe the entire database. Change-Id: I60c39274256f790bc2da83167a1effaa174588fe	2014-03-11 14:08:40 +01:00
Julius Volz	1eee448bc1	Store samples in custom binary encoding. This has been shown to provide immense decoding speed benefits. See also: https://groups.google.com/forum/#!topic/prometheus-developers/FeGl_qzGrYs Change-Id: I7d45b4650e44ddecaa91dad9d7fdb3cd0b9f15fe	2014-03-09 22:31:38 +01:00
Julius Volz	c2a2a20f36	Remove obsolete scanjobs timer. Change-Id: Ifb29b4d93c9c1c6cacb8b098d5237866925c9fac	2014-03-07 17:10:28 +01:00
Julius Volz	dd4892dcad	Ensure no ops are leaked in renderView(). Change-Id: I6970a9098be305fcd010d46443b040d864d9740a	2014-03-07 14:33:13 +01:00
Julius Volz	5745ce0a60	Fixups for single-op-per-fingerprint view rendering. Change-Id: Ie496d4529b65a3819c6042f43d7cf99e0e1ac60b	2014-03-07 00:54:28 +01:00
Björn Rabenstein	8b43497002	Merge "Fix memory series indexing bug."	2014-03-06 11:53:10 +01:00
Björn Rabenstein	0bb33b6525	Merge "Remove unused labelname -> fingerprints index."	2014-03-06 11:40:09 +01:00
Julius Volz	d6827b6898	Fix memory series indexing bug. This fixes https://github.com/prometheus/prometheus/issues/381. For any stale series we dropped from memory, this bug caused us to also drop any other series from the labelpair->fingerprints memory index if they had any label/value-pairs in common with the intentionally dropped series. To fix this issue more easily, I converted the labelpair->fingerprints index map values to a utility.Set of clientmodel.Fingerprints. This makes handling this index much easier in general. Change-Id: If5e81e202e8c542261bbd9797aa1257376c5c074	2014-03-06 01:23:22 +01:00
Julius Volz	c6013ff309	Remove unused labelname -> fingerprints index. Change-Id: Ie4ccea3a230532e670030ca64ede9435b1b3e506	2014-03-05 23:49:33 +01:00
Bjoern Rabenstein	9ea9189dd1	Remove the multi-op-per-fingerprint capability. Currently, rendering a view is capable of handling multiple ops for the same fingerprint efficiently. However, this capability requires a lot of complexity in the code, which we are not using at all because the way we assemble a viewRequest will never have more than one operation per fingerprint. This commit weeds out the said capability, along with all the code needed for it. It is still possible to have more than one operation for the same fingerprint; it will just be handled in a less efficient way (as proven by the unit tests). As a result, scanjob.go could be removed entirely. This commit also contains a few related refactorings and removals of dead code in operation.go, view.go, and freelist.go. Also, the docstrings received some love. Change-Id: I032b976e0880151c3f3fdb3234fb65e484f0e2e5	2014-03-04 16:29:56 +01:00
Bjoern Rabenstein	e11e8c7a23	Unify LevelDB.Options. We have seven different types all called like LevelDB.Options. One of them is the plain LevelDBOptions. All others are just wrapping that type without adding anything except clunkier handling. If there ever was a plan to add more specific options to the various LevelDB.*Options types, history has proven that nothing like that is going to happen anytime soon. To keep the code a bit shorter and more focused on the real (quite significant) complexities we have to deal with here, this commit reduces all uses of LevelDBOptions to the actual LevelDBOptions type. 1576 fewer characters to read... Change-Id: I3d7a2b7ffed78b337aa37f812c53c058329ecaa6	2014-02-27 16:03:58 +01:00
Bjoern Rabenstein	6bc083f38b	Major code cleanup in storage. - Mostly docstring fixes/additions. (Please review these carefully, since most of them were missing, I had to guess them from an outsider's perspective. (Which on the other hand proves how desperately required many of these docstrings are.)) - Removed all uses of new(...) to meet our own style guide (draft). - Fixed all other 'go vet' and 'golint' issues (except those that are not fixable (i.e. caused by bugs in or by design of 'go vet' and 'golint')). - Some trivial refactorings, like reorder functions, minor renames, ... - Some slightly less trivial refactoring, mostly to reduce code duplication by embedding types instead of writing many explicit forwarders. - Cleaned up the interface structure a bit. (Most significant probably the removal of the View-like methods from MetricPersistence. Now they are only in View and not duplicated anymore.) - Removed dead code. (Probably not all of it, but it's a first step...) - Fixed a leftover in storage/metric/end_to_end_test.go (that made some parts of the code never execute (incidentally, those parts were broken (and I fixed them, too))). Change-Id: Ibcac069940d118a88f783314f5b4595dce6641d5	2014-02-27 15:22:37 +01:00
Björn Rabenstein	59febe771a	Merge "Minor code cleanups."	2014-02-13 15:29:16 +01:00
Julius Volz	c4adfc4f25	Minor code cleanups. Change-Id: Ib3729cf38b107b7f2186ccf410a745e0472e3630	2014-02-13 15:24:43 +01:00
Julius Volz	8cadae6102	Merge "Fix LevelDB closing order."	2014-02-03 23:22:30 +01:00
Julius Volz	94666e20b7	Minor test error reporting cleanup. Change-Id: Ie11c16b4e60de7c179c6d2a86e063f4432e2000f	2014-02-03 12:27:01 +01:00
Julius Volz	fd2158e746	Store copy of metric during fingerprint caching Problem description: ==================== If a rule evaluation referencing a metric/timeseries M happens at a time when M doesn't have a memory timeseries yet, looking up the fingerprint for M (via TieredStorage.GetMetricForFingerprint()) will create a new Metric object for M which gets both: a) attached to a new empty memory timeseries (so we don't have to ask disk for the Metric's fingerprint next time), and b) returned to the rule evaluation layer. However, the rule evaluation layer replaces the name label (and possibly other labels) of the metric with the name of the recorded rule. Since both the rule evaluator and the memory storage share a reference to the same Metric object, the original memory timeseries will now also be incorrectly renamed. Fix: ==== Instead of storing a reference to a shared metric object, take a copy of the object when creating an empty memory timeseries for caching purposes. Change-Id: I9f2172696c16c10b377e6708553a46ef29390f1e	2014-02-02 17:11:08 +01:00
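The aliasing problem and its fix can be reproduced in a few lines. This is a sketch under assumptions: `Metric` is modeled here as a plain label map, and `cloneMetric`, the label key `name`, and the sample values are made up for the illustration.

```go
package main

import "fmt"

// Metric is a hypothetical simplified label map. Go maps are reference
// types, so assigning one shares the underlying data.
type Metric map[string]string

// cloneMetric returns an independent copy of m.
func cloneMetric(m Metric) Metric {
	c := make(Metric, len(m))
	for k, v := range m {
		c[k] = v
	}
	return c
}

func main() {
	// Buggy pattern: cache and caller share one map.
	cached := Metric{"name": "http_requests", "job": "api"}
	returned := cached
	returned["name"] = "recorded_rule" // Also renames the cached series.
	fmt.Println(cached["name"])

	// Fixed pattern: the cache keeps its own copy.
	cached = cloneMetric(Metric{"name": "http_requests", "job": "api"})
	returned = cloneMetric(cached)
	returned["name"] = "recorded_rule"
	fmt.Println(cached["name"])
}
```

The first print shows the renamed value leaking into the cache; the second shows the cached metric unaffected once a copy is stored.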
Julius Volz	718ad2224b	Fix LevelDB closing order. The storage itself should be closed before any of the objects passed into it are closed (otherwise closing the storage can randomly freeze). Defers are executed in reverse order, so closing the storage should be the last of the defer statements. Change-Id: Id920318b876f5b94767ed48c81221b3456770620	2014-01-28 15:16:06 +01:00
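The ordering rule can be checked with a short sketch. The names `closeOrder`, "storage", and "dependency" are placeholders; the real code defers `Close` calls on actual LevelDB-backed objects.

```go
package main

import "fmt"

// closeOrder records the order in which deferred closers run.
// (Hypothetical helper written for this illustration.)
func closeOrder() []string {
	var order []string
	func() {
		// Deferred first, so it runs last: the dependency outlives the storage.
		defer func() { order = append(order, "dependency") }()
		// Deferred last, so it runs first: the storage is closed before
		// the objects that were passed into it.
		defer func() { order = append(order, "storage") }()
	}()
	return order
}

func main() {
	fmt.Println(closeOrder())
}
```

Deferred calls run in last-in, first-out order when the surrounding function returns, which is why the storage's defer has to come last in the source.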
Bjoern Rabenstein	c342ad33a0	Fix OperatorError. This used to work with Go 1.1, but only because of a compiler bug. The bug is fixed in Go 1.2, so we have to fix our code now. Change-Id: I5a9f3a15878afd750e848be33e90b05f3aa055e1	2014-01-21 16:49:51 +01:00
Julius Volz	d5ef0c64dc	Merge "Add optional sample replication to OpenTSDB."	2014-01-08 17:45:08 +01:00
Julius Volz	61d26e8445	Add optional sample replication to OpenTSDB. Prometheus needs long-term storage. Since we don't have enough resources to build our own timeseries storage from scratch on top of Riak, Cassandra or a similar distributed datastore at the moment, we're planning on using OpenTSDB as long-term storage for Prometheus. Its data model is roughly compatible with that of Prometheus, with some caveats. As a first step, this adds write-only replication from Prometheus to OpenTSDB, with the following things worth noting: 1) I tried to keep the integration lightweight, meaning that anything related to OpenTSDB is isolated to its own package and only main knows about it (essentially it tees all samples to both the existing storage and TSDB). It's not touching the existing TieredStorage at all to avoid more complexity in that area. This might change in the future, especially if we decide to implement a read path for OpenTSDB through Prometheus as well. 2) Backpressure while sending to OpenTSDB is handled by simply dropping samples on the floor when the in-memory queue of samples destined for OpenTSDB runs full. Prometheus also only attempts to send samples once, rather than implementing a complex retry algorithm. Thus, replication to OpenTSDB is best-effort for now. If needed, this may be extended in the future. 3) Samples are sent in batches of limited size to OpenTSDB. The optimal batch size, timeout parameters, etc. may need to be adjusted in the future. 4) OpenTSDB has different rules for legal characters in tag (label) values. While Prometheus allows any characters in label values, OpenTSDB limits them to a to z, A to Z, 0 to 9, -, _, . and /. Currently any illegal characters in Prometheus label values are simply replaced by an underscore.
Especially when integrating OpenTSDB with the read path in Prometheus, we'll need to reconsider this: either we'll need to introduce the same limitations for Prometheus labels or escape/encode illegal characters in OpenTSDB in such a way that they are fully decodable again when reading through Prometheus, so that corresponding timeseries in both systems match in their labelsets. Change-Id: I8394c9c55dbac3946a0fa497f566d5e6e2d600b5	2014-01-02 18:21:38 +01:00
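The replacement rule from point 4 can be sketched as below. The function names are assumptions made for this illustration, and the character set is taken directly from the commit message rather than from the OpenTSDB documentation.

```go
package main

import (
	"fmt"
	"strings"
)

// isLegal reports whether r is in the set the commit message lists as
// allowed in OpenTSDB tag values: a to z, A to Z, 0 to 9, -, _, . and /.
func isLegal(r rune) bool {
	switch {
	case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z', r >= '0' && r <= '9':
		return true
	}
	return r == '-' || r == '_' || r == '.' || r == '/'
}

// sanitizeTagValue replaces every illegal character with an underscore.
// (Hypothetical helper; the mapping is lossy and cannot be reversed.)
func sanitizeTagValue(v string) string {
	return strings.Map(func(r rune) rune {
		if isLegal(r) {
			return r
		}
		return '_'
	}, v)
}

func main() {
	fmt.Println(sanitizeTagValue("localhost:9090"))
}
```

Because distinct label values such as `a:b` and `a b` collapse to the same tag value, this scheme cannot be decoded on a read path, which is the limitation the commit message raises.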
Stuart Nelson	0c58e388f6	rename curation metrics to prometheus_curation Change-Id: I6a0bf277e88ea8eb737670b7e865ae20f2cbfb91	2013-12-13 17:45:01 -05:00
Stuart Nelson	28f59edf16	Added telemetry for counting stored samples Change-Id: I0f36f7c2738d070ca2f107fcb315f98e46803af3	2013-12-12 10:06:41 -05:00
Tobias Schmidt	6947ee9bc9	Try to create metrics root directory if missing. This change tries to be nice and create the metrics directory first before erroring out. Change-Id: I72691cdc32469708cd671c6ef1fb7db55fe60430	2013-12-03 18:16:13 +07:00
Julius Volz	740d448983	Use custom timestamp type for sample timestamps and related code. So far we've been using Go's native time.Time for anything related to sample timestamps. Since the range of time.Time is much bigger than what we need, this has created two problems: - there could be time.Time values which were out of the range/precision of the time type that we persist to disk, therefore causing incorrectly ordered keys. One bug caused by this was: https://github.com/prometheus/prometheus/issues/367 It would be good to use a timestamp type that's more closely aligned with what the underlying storage supports. - sizeof(time.Time) is 192, while Prometheus should be ok with a single 64-bit Unix timestamp (possibly even a 32-bit one). Since we store samples in large numbers, this seriously affects memory usage. Furthermore, copying/working with the data will be faster if it's smaller. MEMORY USAGE RESULTS Initial memory usage comparisons for a running Prometheus with 1 timeseries and 100,000 samples show roughly a 13% decrease in total (VIRT) memory usage. In my tests, this advantage for some reason decreased a bit the more samples the timeseries had (to 5-7% for millions of samples). This I can't fully explain, but perhaps garbage collection issues were involved. WHEN TO USE THE NEW TIMESTAMP TYPE The new clientmodel.Timestamp type should be used whenever time calculations are either directly or indirectly related to sample timestamps. For example: - the timestamp of a sample itself - all kinds of watermarks - anything that may become or is compared to a sample timestamp (like the timestamp passed into Target.Scrape()). When to still use time.Time: - for measuring durations/times not related to sample timestamps, like duration telemetry exporting, timers that indicate how frequently to execute some action, etc. NOTE ON OPERATOR OPTIMIZATION TESTS We don't use operator optimization code anymore, but it still lives in the code as dead code. 
It still has tests, but I couldn't get all of them to pass with the new timestamp format. I commented out the failing cases for now, but we should probably remove the dead code soon. I just didn't want to do that in the same change as this. Change-Id: I821787414b0debe85c9fffaeb57abd453727af0f	2013-12-03 09:11:28 +01:00
Julius Volz	6b7de31a3c	Upgrade to LevelDB 1.14.0 to fix LevelDB bugs. This tentatively fixes https://github.com/prometheus/prometheus/issues/368 due to an upstream bugfix in snapshotted LevelDB iterator handling, which got fixed in LevelDB 1.14.0: https://code.google.com/p/leveldb/issues/detail?id=200 Change-Id: Ib0cc67b7d3dc33913a1c16736eff32ef702c63bf	2013-12-03 09:07:15 +01:00

321 commits