prometheus

mirror of https://github.com/prometheus/prometheus.git synced 2024-11-14 09:34:05 -08:00

Author	SHA1	Message	Date
Julius Volz	01f652cb4c	Separate storage implementation from interfaces. This was initially motivated by wanting to distribute the rule checker tool under `tools/rule_checker`. However, this was not possible without also distributing the LevelDB dynamic libraries because the tool transitively depended on Levigo: rule checker -> query layer -> tiered storage layer -> leveldb. This change separates external storage interfaces from the implementation (tiered storage, leveldb storage, memory storage) by putting them into separate packages: - storage/metric: public, implementation-agnostic interfaces - storage/metric/tiered: tiered storage implementation, including memory and LevelDB storage. I initially also considered splitting up the implementation into separate packages for tiered storage, memory storage, and LevelDB storage, but these are currently so intertwined that it would be another major project in itself. The query layers and most other parts of Prometheus now have no notion of the storage implementation anymore and just use whatever implementation they get passed in via interfaces. The rule_checker is now a static binary :) Change-Id: I793bbf631a8648ca31790e7e772ecf9c2b92f7a0	2014-04-16 13:30:19 +02:00
Matt T. Proud	3e969a8ca2	Parameterize the buffer for marshal/unmarshal. We are not reusing buffers yet. This could introduce problems, so the behavior is disabled for now. Cursory benchmark data: - Marshal for 10,000 samples: -30% overhead. - Unmarshal for 10,000 samples: -15% overhead. Change-Id: Ib006bdc656af45dca2b92de08a8f905d8d728cac	2014-04-16 12:16:59 +02:00
Julius Volz	c7c0b33d0b	Add regex-matching support for labels. There are four label-matching ops for selecting timeseries now: - Equal: = - NotEqual: != - RegexMatch: =~ - RegexNoMatch: !~ Instead of looking up labels by a simple clientmodel.LabelSet (basically an equals op for every key/value pair in the set), timeseries fingerprint selection is now done via a list of metric.LabelMatchers. Change-Id: I510a83f761198e80946146770ebb64e4abc3bb96	2014-04-01 14:24:53 +02:00
Julius Volz	ae30453214	Add label names -> label values index. Change-Id: Ie39b4044558afc4d1aa937de7dcf8df61f821fb4	2014-03-28 15:16:37 +01:00
Julius Volz	86fc13a52e	Convert metric.Values to slice of values. The initial impetus for this was that it made unmarshalling sample values much faster. Other relevant benchmark changes in ns/op: Benchmark old new speedup ================================================================== BenchmarkMarshal 179170 127996 1.4x BenchmarkUnmarshal 404984 132186 3.1x BenchmarkMemoryGetValueAtTime 57801 50050 1.2x BenchmarkMemoryGetBoundaryValues 64496 53194 1.2x BenchmarkMemoryGetRangeValues 66585 54065 1.2x BenchmarkStreamAdd 45.0 75.3 0.6x BenchmarkAppendSample1 1157 1587 0.7x BenchmarkAppendSample10 4090 4284 0.95x BenchmarkAppendSample100 45660 44066 1.0x BenchmarkAppendSample1000 579084 582380 1.0x BenchmarkMemoryAppendRepeatingValues 22796594 22005502 1.0x Overall, this gives us good speedups in the areas where they matter most: decoding values from disk and accessing the memory storage (which is also used for views). Some of the smaller append examples take minimally longer, but the cost seems to get amortized over larger appends, so I'm not worried about these. Also, we're currently not bottlenecked on the write path and have plenty of other optimizations available in that area if it becomes necessary. Memory allocations during appends don't change measurably at all. Change-Id: I7dc7394edea09506976765551f35b138518db9e8	2014-03-11 18:23:37 +01:00
Julius Volz	1eee448bc1	Store samples in custom binary encoding. This has been shown to provide immense decoding speed benefits. See also: https://groups.google.com/forum/#!topic/prometheus-developers/FeGl_qzGrYs Change-Id: I7d45b4650e44ddecaa91dad9d7fdb3cd0b9f15fe	2014-03-09 22:31:38 +01:00
Julius Volz	c6013ff309	Remove unused labelname -> fingerprints index. Change-Id: Ie4ccea3a230532e670030ca64ede9435b1b3e506	2014-03-05 23:49:33 +01:00
Bjoern Rabenstein	e11e8c7a23	Unify LevelDB.Options. We have seven different types all called like LevelDB.Options. One of them is the plain LevelDBOptions. All others are just wrapping that type without adding anything except clunkier handling. If there ever was a plan to add more specific options to the various LevelDB.*Options types, history has proven that nothing like that is going to happen anytime soon. To keep the code a bit shorter and more focused on the real (quite significant) complexities we have to deal with here, this commit reduces all uses of LevelDBOptions to the actual LevelDBOptions type. 1576 fewer characters to read... Change-Id: I3d7a2b7ffed78b337aa37f812c53c058329ecaa6	2014-02-27 16:03:58 +01:00
Bjoern Rabenstein	6bc083f38b	Major code cleanup in storage. - Mostly docstring fixes/additions. (Please review these carefully, since most of them were missing, I had to guess them from an outsider's perspective. (Which on the other hand proves how desperately required many of these docstrings are.)) - Removed all uses of new(...) to meet our own style guide (draft). - Fixed all other 'go vet' and 'golint' issues (except those that are not fixable (i.e. caused by bugs in or by design of 'go vet' and 'golint')). - Some trivial refactorings, like reorder functions, minor renames, ... - Some slightly less trivial refactoring, mostly to reduce code duplication by embedding types instead of writing many explicit forwarders. - Cleaned up the interface structure a bit. (Most significant probably the removal of the View-like methods from MetricPersistence. Now they are only in View and not duplicated anymore.) - Removed dead code. (Probably not all of it, but it's a first step...) - Fixed a leftover in storage/metric/end_to_end_test.go (that made some parts of the code never execute (incidentally, those parts were broken (and I fixed them, too))). Change-Id: Ibcac069940d118a88f783314f5b4595dce6641d5	2014-02-27 15:22:37 +01:00
Julius Volz	740d448983	Use custom timestamp type for sample timestamps and related code. So far we've been using Go's native time.Time for anything related to sample timestamps. Since the range of time.Time is much bigger than what we need, this has created two problems: - there could be time.Time values which were out of the range/precision of the time type that we persist to disk, therefore causing incorrectly ordered keys. One bug caused by this was: https://github.com/prometheus/prometheus/issues/367 It would be good to use a timestamp type that's more closely aligned with what the underlying storage supports. - sizeof(time.Time) is 192, while Prometheus should be ok with a single 64-bit Unix timestamp (possibly even a 32-bit one). Since we store samples in large numbers, this seriously affects memory usage. Furthermore, copying/working with the data will be faster if it's smaller. MEMORY USAGE RESULTS Initial memory usage comparisons for a running Prometheus with 1 timeseries and 100,000 samples show roughly a 13% decrease in total (VIRT) memory usage. In my tests, this advantage for some reason decreased a bit the more samples the timeseries had (to 5-7% for millions of samples). This I can't fully explain, but perhaps garbage collection issues were involved. WHEN TO USE THE NEW TIMESTAMP TYPE The new clientmodel.Timestamp type should be used whenever time calculations are either directly or indirectly related to sample timestamps. For example: - the timestamp of a sample itself - all kinds of watermarks - anything that may become or is compared to a sample timestamp (like the timestamp passed into Target.Scrape()). When to still use time.Time: - for measuring durations/times not related to sample timestamps, like duration telemetry exporting, timers that indicate how frequently to execute some action, etc. NOTE ON OPERATOR OPTIMIZATION TESTS We don't use operator optimization code anymore, but it still lives in the code as dead code. It still has tests, but I couldn't get all of them to pass with the new timestamp format. I commented out the failing cases for now, but we should probably remove the dead code soon. I just didn't want to do that in the same change as this. Change-Id: I821787414b0debe85c9fffaeb57abd453727af0f	2013-12-03 09:11:28 +01:00
Julius Volz	eb461a707d	Add chunk sanity checking to dumper tool. Also, move codecs/filters to common location so they can be used in subsequent test. Change-Id: I3ffeb09188b8f4552e42683cbc9279645f45b32e	2013-10-23 01:06:49 +02:00
Matt T. Proud	86fcbe5bde	Retain DTO on each cycle. Change-Id: Ifc6f68f98eacb01097771d0dbf043c98bba1d518	2013-09-05 10:14:34 +02:00
Matt T. Proud	4a87c002e8	Update low-level i'faces to reflect wireformats. This commit fixes a critique of the old storage API design, whereby the input parameters were always as raw bytes and never Protocol Buffer messages that encapsulated the data, meaning every place a read or mutation was conducted needed to manually perform said translations on its own. This is taxing. Change-Id: I4786938d0d207cefb7782bd2bd96a517eead186f	2013-09-04 17:13:58 +02:00
Matt T. Proud	1ceb25b701	Publication of LevelDBMetricPersistence Fields. This will enable us to break down the onerous construction method. Change-Id: Ia89337ba39d6745af6757180af2485ec8a990a3b	2013-08-13 00:36:12 +02:00
Julius Volz	0003027dce	Add needed trailing spaces in logs.	2013-08-12 18:22:48 +02:00
Julius Volz	aa5d251f8d	Use github.com/golang/glog for all logging.	2013-08-12 17:54:36 +02:00
Matt T. Proud	a5141e4d0a	Depointerize storage conf. and chain ingester. The storage builders need to work with the assumption that they have a copy of the underlying configuration data if any mutations are made.	2013-08-12 17:07:03 +02:00
Matt T. Proud	a3bf2efdd5	Replace index writes with wrapped interface. This commit is the first of several and should not be regarded as the desired end state for these cleanups. What this one does, however, is wrap the query index writing behind an interface type that can be injected into the storage stack and have its lifecycle managed separately as needed. It also would mean we can swap out underlying implementations to support remote indexing, buffering, no-op indexing very easily. In the future, most of the individual index interface members in the tiered storage will go away in favor of agents that can query and resolve what they need from the datastore without the user knowing how and why they work.	2013-08-07 12:15:48 +02:00
Matt T. Proud	cc989c68e1	Replace direct curation table access with wrapper.	2013-08-06 12:02:52 +02:00
Matt T. Proud	07ac921aec	Code Review: First pass.	2013-08-05 17:31:49 +02:00
Matt T. Proud	d8792cfd86	Extract HighWatermarking. Clean up the rest.	2013-08-05 11:03:03 +02:00
Matt T. Proud	f4669a812c	Extract index storage into separate types.	2013-08-04 15:31:52 +02:00
Matt T. Proud	772d3d6b11	Consolidate LevelDB storage construction. There are too many parameters to constructing a LevelDB storage instance for a construction method, so I've opted to take an idiomatic approach of embedding them in a struct for easier mediation and versioning.	2013-08-03 17:25:03 +02:00
Matt T. Proud	30b1cf80b5	WIP - Snapshot of Moving to Client Model.	2013-06-25 15:52:42 +02:00
juliusv	42198c1f1c	Merge pull request #311 from prometheus/fix/watermarking/on-first-write Ensure new metrics are watermarked early.	2013-06-25 03:13:58 -07:00
Matt T. Proud	4137c75523	Shrink default LRU cache sizes. Observing Prometheus in production confirms we can lower these values safely.	2013-06-24 12:09:16 +02:00
Matt T. Proud	ee840904d2	Code Review: !Before -> After.	2013-06-21 18:26:40 +02:00
Matt T. Proud	a1a23fbaf8	Ensure new metrics are watermarked early. With the checking of fingerprint freshness to cull stale metrics from queries, we should write watermarks early to aid in more accurate responses.	2013-06-21 16:38:46 +02:00
Matt T. Proud	a73f061d3c	Persist solely Protocol Buffers. A design question that was open for me in the beginning was whether to serialize other types to disk, but Protocol Buffers quickly won out, which allows us to drop support for other types. This is a good start to cleaning up a lot of cruft in the storage stack and can let us eventually decouple the various moving parts into separate subsystems for easier reasoning. This commit is not strictly required, but it is a start to making the rest a lot more enjoyable to interact with.	2013-06-08 11:02:35 +02:00
Julius Volz	7b9ee95030	Minor LevelDB watermark handling cleanups.	2013-06-06 23:56:31 +02:00
Julius Volz	750f862d9a	Use GetBoundaryValues() for non-counter deltas.	2013-05-22 19:13:47 +02:00
Matt T. Proud	c07abf8521	Initial move away from skiplist.	2013-05-22 17:59:53 +02:00
Matt T. Proud	74a66fd938	Spawn grouping of fingerprints with free semaphore. The previous implementation spawned N goroutines to group samples together and would not start work until the semaphore unblocked. While this didn't leak, it polluted the scheduling space. Thusly, the routine only starts after a semaphore has been acquired.	2013-05-21 16:11:35 +02:00
Julius Volz	5b105c77fc	Repointerize fingerprints.	2013-05-21 14:28:14 +02:00
Matt T. Proud	8f4c7ece92	Destroy naked returns in half of corpus. The use of naked return values is frowned upon. This is the first of two bulk updates to remove them.	2013-05-16 10:53:25 +03:00
Julius Volz	83c60ad43a	Fix GetMetricForFingerprint() metric mutability. Some users of GetMetricForFingerprint() end up modifying the returned metric labelset. Since the memory storage's implementation of GetMetricForFingerprint() returned a pointer to the metric (and maps are reference types anyways), the external mutation propagated back into the memory storage. The fix is to make a copy of the metric before returning it.	2013-05-14 16:46:30 +02:00
Matt T. Proud	b224251981	Simplify compaction and expose database sizes. This commit simplifies the way that compactions across a database's keyspace occur due to reading the LevelDB internals. Secondarily it introduces the database size estimation mechanisms. Include database health and help interfaces. Add database statistics; remove status goroutines. This commit kills the use of Go routines to expose status throughout the web components of Prometheus. It also dumps raw LevelDB status on a separate /databases endpoint.	2013-05-14 12:29:53 +02:00
Matt T. Proud	1f7f89b4e3	Simplify compaction and expose database sizes. This commit simplifies the way that compactions across a database's keyspace occur due to reading the LevelDB internals. Secondarily it introduces the database size estimation mechanisms.	2013-05-13 13:15:35 +02:00
Matt T. Proud	fa6a1f97d0	Expose interfaces for pruner and make pruner tool. In order to run database cleanups and diagnostics, we should have a means for pruning a database---even if LevelDB does this for us.	2013-05-10 17:07:03 +02:00
Bernerd Schaefer	5eb9840ed7	Fix goroutine leak in leveldb.AppendSamples The error channels in AppendSamples need to be buffered, since in the presence of errors their values may not be consumed.	2013-05-03 12:13:05 +02:00
Matt T. Proud	a3f1d81e24	Publicize a few storage components for curation. This commit introduces the publicization of Stop and other components, which the compaction curator shall take advantage of.	2013-05-02 13:16:04 +02:00
Matt T. Proud	561974308d	Add curation remark table and refactor error mgmt. The curator requires the existence of a curator remark table, which stores the progress for a given curation policy. The tests for the curator create an ad hoc table, but core Prometheus presently lacks said table, which this commit adds. Secondarily, the error handling for the LevelDB lifecycle functions in the metric persistence have been wrapped into an UncertaintyGroup, which mirrors some of the functions of sync.WaitGroup but adds error capturing capability to the mix.	2013-04-28 17:26:34 +02:00
Matt T. Proud	b3e34c6658	Implement batch database sample curator. This commit introduces to Prometheus a batch database sample curator, which corroborates the high watermarks for sample series against the curation watermark table to see whether a curator of a given type needs to be run. The curator is an abstract executor, which runs various curation strategies across the database. It remarks the progress for each type of curation processor that runs for a given sample series. A curation processor is responsible for effectuating the underlying batch changes that are requested. In this commit, we introduce the CompactionProcessor, which takes several bits of runtime metadata and combines sparse sample entries in the database together to form larger groups. For instance, for a given series it would be possible to have the curator effectuate the following grouping: - Samples Older than Two Weeks: Grouped into Bunches of 10000 - Samples Older than One Week: Grouped into Bunches of 1000 - Samples Older than One Day: Grouped into Bunches of 100 - Samples Older than One Hour: Grouped into Bunches of 10 The benefits of such a compaction are 1. a smaller search space in the database keyspace, 2. better employment of compression for repetitious values, and 3. reduced seek times.	2013-04-27 17:38:18 +02:00
Julius Volz	2202cd71c9	Track alerts over time and write out alert timeseries.	2013-04-26 14:35:21 +02:00
Matt T. Proud	b1a8e51b07	Extract dto.SampleValueSeries into model.Values.	2013-04-22 13:31:11 +02:00
Matt T. Proud	db4ffbb262	Wrap dto.SampleKey with business logic type. The curator work can be done easier if dto.SampleKey is no longer directly accessed but rather has a higher level type around it that captures a certain modicum of business logic. This doesn't look terribly interesting today, but it will get more so.	2013-04-21 20:38:39 +02:00
Matt T. Proud	f9e99bd08a	Refresh SampleValue to 64-bit floating point. We always knew that this needed to be fixed.	2013-04-21 20:31:50 +02:00
Julius Volz	99dcbe0f94	Integrate memory and disk layers in view rendering.	2013-04-19 16:01:27 +02:00
Julius Volz	63625bd244	Make view use memory persistence, remove obsolete code. This makes the memory persistence the backing store for views and adjusts the MetricPersistence interface accordingly. It also removes unused Get* method implementations from the LevelDB persistence so they don't need to be adapted to the new interface. In the future, we should rethink these interfaces. All staleness and interpolation handling is now removed from the storage layer and will be handled only by the query layer in the future.	2013-04-18 22:26:29 +02:00
Matt T. Proud	a55602df4a	Validate diskFrontier domain for series candidate. It is the case with the benchmark tool that we thought that we generated multiple series and saved them to the disk as such, when in reality, we overwrote the fields of the outgoing metrics via Go map reference behavior. This was accidental. In the course of diagnosing this, a few errors were found: 1. ``newSeriesFrontier`` should check to see if the candidate fingerprint is within the given domain of the ``diskFrontier``. If not, as the contract in the docstring stipulates, a ``nil`` ``seriesFrontier`` should be emitted. 2. In the interests of aiding debugging, the raw LevelDB ``levigoIterator`` type now includes a helpful forensics ``String()`` method. This work produced additional cleanups: 1. ``Close() error`` with the storage stack is technically incorrect, since nowhere in the bowels of it does an error actually occur. The interface has been simplified to remove this for now.	2013-04-09 11:47:16 +02:00
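
Commit c7c0b33d0b above describes four label-matching operators (=, !=, =~, !~) and selection via a list of matchers rather than a plain label set. The following is only a minimal sketch of that idea, not the code from the commit: the `MatchOp`, `LabelMatcher`, `NewLabelMatcher`, and `MatchesAll` names are hypothetical, and anchoring the regular expression to the whole label value is an assumption made for this illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// MatchOp is a hypothetical enumeration of the four operators named in the commit.
type MatchOp int

const (
	Equal        MatchOp = iota // =
	NotEqual                    // !=
	RegexMatch                  // =~
	RegexNoMatch                // !~
)

// LabelMatcher is an illustrative stand-in, not the real metric.LabelMatcher.
type LabelMatcher struct {
	Op    MatchOp
	Name  string
	Value string
	re    *regexp.Regexp // compiled only for the two regex operators
}

// NewLabelMatcher compiles the pattern once for regex operators.
// Assumption: the pattern must match the whole label value.
func NewLabelMatcher(op MatchOp, name, value string) (*LabelMatcher, error) {
	m := &LabelMatcher{Op: op, Name: name, Value: value}
	if op == RegexMatch || op == RegexNoMatch {
		re, err := regexp.Compile("^(?:" + value + ")$")
		if err != nil {
			return nil, err
		}
		m.re = re
	}
	return m, nil
}

// Matches reports whether a single label value satisfies the matcher.
func (m *LabelMatcher) Matches(v string) bool {
	switch m.Op {
	case Equal:
		return v == m.Value
	case NotEqual:
		return v != m.Value
	case RegexMatch:
		return m.re.MatchString(v)
	case RegexNoMatch:
		return !m.re.MatchString(v)
	}
	return false
}

// MatchesAll reports whether a label set satisfies every matcher in the list.
// A label missing from the set is treated as the empty string.
func MatchesAll(labels map[string]string, matchers []*LabelMatcher) bool {
	for _, m := range matchers {
		if !m.Matches(labels[m.Name]) {
			return false
		}
	}
	return true
}

func main() {
	job, err := NewLabelMatcher(RegexMatch, "job", "api-.*")
	if err != nil {
		panic(err)
	}
	env, err := NewLabelMatcher(NotEqual, "env", "test")
	if err != nil {
		panic(err)
	}
	labels := map[string]string{"job": "api-server", "env": "prod"}
	fmt.Println(MatchesAll(labels, []*LabelMatcher{job, env})) // prints "true"
}
```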


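Commit 83c60ad43a above explains that returning a stored metric directly lets callers mutate the memory storage, since Go maps are reference types, and that the fix is to copy the metric before returning it. The sketch below illustrates that defensive copy under assumed types: `Metric` and `memoryStore` are hypothetical stand-ins, and only the method name GetMetricForFingerprint comes from the commit message.

```go
package main

import "fmt"

// Metric is a hypothetical label set; the real type lives in the client model.
type Metric map[string]string

// memoryStore is an illustrative stand-in for the memory storage.
type memoryStore struct {
	metrics map[uint64]Metric
}

// GetMetricForFingerprint returns a copy of the stored metric so that
// callers can modify the result without affecting the store.
func (s *memoryStore) GetMetricForFingerprint(fp uint64) Metric {
	m, ok := s.metrics[fp]
	if !ok {
		return nil
	}
	c := make(Metric, len(m))
	for k, v := range m {
		c[k] = v
	}
	return c
}

func main() {
	s := &memoryStore{metrics: map[uint64]Metric{1: {"job": "api"}}}

	m := s.GetMetricForFingerprint(1)
	m["job"] = "changed" // mutates only the caller's copy

	fmt.Println(s.metrics[1]["job"]) // prints "api"
}
```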
72 commits