prometheus

mirror of https://github.com/prometheus/prometheus.git synced 2024-12-28 06:59:40 -08:00

Author	SHA1	Message	Date
Julius Volz	c7c0b33d0b	Add regex-matching support for labels. There are four label-matching ops for selecting timeseries now: - Equal: = - NotEqual: != - RegexMatch: =~ - RegexNoMatch: !~ Instead of looking up labels by a simple clientmodel.LabelSet (basically an equals op for every key/value pair in the set), timeseries fingerprint selection is now done via a list of metric.LabelMatchers. Change-Id: I510a83f761198e80946146770ebb64e4abc3bb96	2014-04-01 14:24:53 +02:00
Bjoern Rabenstein	c3b282bd14	Add regression tests for 'loop until op is consumed' bug. - Most of this is the actual regression test in tiered_test.go. - Working on that regression test uncovered problems in tiered_test.go that are fixed in this commit. - The 'op.consumed = false' line added to freelist.go was actually not fixing a bug. Instead, there was no bug at all. So this commit removes that line again, but adds a regression test to make sure that the assumed bug is indeed not there (cf. freelist_test.go). - Removed more code duplication in operation.go (following the same approach as before, i.e. embedding op type A into op type B if everything in A is the same as in B with the exception of String() and ExtractSample()). (This change makes struct literals for ops more clunky, but that only affects tests. No code change whatsoever was necessary in the actual code after this refactoring.) - Fix another op leak in tiered.go. Change-Id: Ia165c52e33290ad4f6aba9c83d92318d4f583517	2014-03-12 18:40:24 +01:00
Julius Volz	1eee448bc1	Store samples in custom binary encoding. This has been shown to provide immense decoding speed benefits. See also: https://groups.google.com/forum/#!topic/prometheus-developers/FeGl_qzGrYs Change-Id: I7d45b4650e44ddecaa91dad9d7fdb3cd0b9f15fe	2014-03-09 22:31:38 +01:00
Julius Volz	c2a2a20f36	Remove obsolete scanjobs timer. Change-Id: Ifb29b4d93c9c1c6cacb8b098d5237866925c9fac	2014-03-07 17:10:28 +01:00
Julius Volz	dd4892dcad	Ensure no ops are leaked in renderView(). Change-Id: I6970a9098be305fcd010d46443b040d864d9740a	2014-03-07 14:33:13 +01:00
Julius Volz	5745ce0a60	Fixups for single-op-per-fingerprint view rendering. Change-Id: Ie496d4529b65a3819c6042f43d7cf99e0e1ac60b	2014-03-07 00:54:28 +01:00
Bjoern Rabenstein	9ea9189dd1	Remove the multi-op-per-fingerprint capability. Currently, rendering a view is capable of handling multiple ops for the same fingerprint efficiently. However, this capability requires a lot of complexity in the code, which we are not using at all because the way we assemble a viewRequest will never have more than one operation per fingerprint. This commit weeds out the said capability, along with all the code needed for it. It is still possible to have more than one operation for the same fingerprint; it will just be handled in a less efficient way (as proven by the unit tests). As a result, scanjob.go could be removed entirely. This commit also contains a few related refactorings and removals of dead code in operation.go, view.go, and freelist.go. Also, the docstrings received some love. Change-Id: I032b976e0880151c3f3fdb3234fb65e484f0e2e5	2014-03-04 16:29:56 +01:00
Bjoern Rabenstein	6bc083f38b	Major code cleanup in storage. - Mostly docstring fixes/additions. (Please review these carefully, since most of them were missing, I had to guess them from an outsider's perspective. (Which on the other hand proves how desperately required many of these docstrings are.)) - Removed all uses of new(...) to meet our own style guide (draft). - Fixed all other 'go vet' and 'golint' issues (except those that are not fixable (i.e. caused by bugs in or by design of 'go vet' and 'golint')). - Some trivial refactorings, like reorder functions, minor renames, ... - Some slightly less trivial refactoring, mostly to reduce code duplication by embedding types instead of writing many explicit forwarders. - Cleaned up the interface structure a bit. (Most significant probably the removal of the View-like methods from MetricPersistence. Now they are only in View and not duplicated anymore.) - Removed dead code. (Probably not all of it, but it's a first step...) - Fixed a leftover in storage/metric/end_to_end_test.go (that made some parts of the code never execute (incidentally, those parts were broken (and I fixed them, too))). Change-Id: Ibcac069940d118a88f783314f5b4595dce6641d5	2014-02-27 15:22:37 +01:00
Julius Volz	c4adfc4f25	Minor code cleanups. Change-Id: Ib3729cf38b107b7f2186ccf410a745e0472e3630	2014-02-13 15:24:43 +01:00
Stuart Nelson	28f59edf16	Added telemetry for counting stored samples Change-Id: I0f36f7c2738d070ca2f107fcb315f98e46803af3	2013-12-12 10:06:41 -05:00
Tobias Schmidt	6947ee9bc9	Try to create metrics root directory if missing This change tries to be nice and create the metrics directory first before erroring out. Change-Id: I72691cdc32469708cd671c6ef1fb7db55fe60430	2013-12-03 18:16:13 +07:00
Julius Volz	740d448983	Use custom timestamp type for sample timestamps and related code. So far we've been using Go's native time.Time for anything related to sample timestamps. Since the range of time.Time is much bigger than what we need, this has created two problems: - there could be time.Time values which were out of the range/precision of the time type that we persist to disk, therefore causing incorrectly ordered keys. One bug caused by this was: https://github.com/prometheus/prometheus/issues/367 It would be good to use a timestamp type that's more closely aligned with what the underlying storage supports. - sizeof(time.Time) is 192, while Prometheus should be ok with a single 64-bit Unix timestamp (possibly even a 32-bit one). Since we store samples in large numbers, this seriously affects memory usage. Furthermore, copying/working with the data will be faster if it's smaller. MEMORY USAGE RESULTS Initial memory usage comparisons for a running Prometheus with 1 timeseries and 100,000 samples show roughly a 13% decrease in total (VIRT) memory usage. In my tests, this advantage for some reason decreased a bit the more samples the timeseries had (to 5-7% for millions of samples). This I can't fully explain, but perhaps garbage collection issues were involved. WHEN TO USE THE NEW TIMESTAMP TYPE The new clientmodel.Timestamp type should be used whenever time calculations are either directly or indirectly related to sample timestamps. For example: - the timestamp of a sample itself - all kinds of watermarks - anything that may become or is compared to a sample timestamp (like the timestamp passed into Target.Scrape()). When to still use time.Time: - for measuring durations/times not related to sample timestamps, like duration telemetry exporting, timers that indicate how frequently to execute some action, etc. NOTE ON OPERATOR OPTIMIZATION TESTS We don't use operator optimization code anymore, but it still lives in the code as dead code. It still has tests, but I couldn't get all of them to pass with the new timestamp format. I commented out the failing cases for now, but we should probably remove the dead code soon. I just didn't want to do that in the same change as this. Change-Id: I821787414b0debe85c9fffaeb57abd453727af0f	2013-12-03 09:11:28 +01:00
Conor Hennessy	9a48010cec	Add a check for metrics directory existence. Previously on startup the program would just quit without stating explicitly why. Change-Id: I833b85eb74d2dd27cdc3f0f2e65d7bb1c42caa39	2013-10-22 20:54:34 +02:00
Matt T. Proud	4a87c002e8	Update low-level i'faces to reflect wireformats. This commit fixes a critique of the old storage API design, whereby the input parameters were always as raw bytes and never Protocol Buffer messages that encapsulated the data, meaning every place a read or mutation was conducted needed to manually perform said translations on its own. This is taxing. Change-Id: I4786938d0d207cefb7782bd2bd96a517eead186f	2013-09-04 17:13:58 +02:00
Matt T. Proud	7910f6e863	Prevent total storage locking during memory flush. While a hack, this change should allow us to serve queries expeditiously during a flush operation. Change-Id: I9a483fd1dd2b0638ab24ace960df08773c4a5079	2013-08-29 11:33:38 +02:00
Matt T. Proud	12d5e6ca5a	Curation should not starve user-interactive ops. The background curation should be staggered to ensure that disk I/O yields to user-interactive operations in a timely manner. The lack of routine prioritization necessitates this. Change-Id: I9b498a74ccd933ffb856e06fedc167430e521d86	2013-08-26 19:40:55 +02:00
Matt T. Proud	2b42fd0068	Snapshot of no more frontier. Change-Id: Icd52da3f52bfe4529829ea70b4865ed7c9f6c446	2013-08-23 17:13:58 +02:00
Matt T. Proud	7db518d3a0	Abstract high watermark cache into standard LRU. Conflicts: storage/metric/memory.go storage/metric/tiered.go storage/metric/watermark.go Change-Id: Iab2aedbd8f83dc4ce633421bd4a55990fa026b85	2013-08-19 12:26:55 +02:00
Julius Volz	aa5d251f8d	Use github.com/golang/glog for all logging.	2013-08-12 17:54:36 +02:00
Matt T. Proud	a3bf2efdd5	Replace index writes with wrapped interface. This commit is the first of several and should not be regarded as the desired end state for these cleanups. What this one does, however, is wrap the query index writing behind an interface type that can be injected into the storage stack and have its lifecycle managed separately as needed. It also would mean we can swap out underlying implementations to support remote indexing, buffering, no-op indexing very easily. In the future, most of the individual index interface members in the tiered storage will go away in favor of agents that can query and resolve what they need from the datastore without the user knowing how and why they work.	2013-08-07 12:15:48 +02:00
Matt T. Proud	52664f701a	Hot Fix: Use extracted time.	2013-08-06 14:18:02 +02:00
Matt T. Proud	d8792cfd86	Extract HighWatermarking. Clean up the rest.	2013-08-05 11:03:03 +02:00
Julius Volz	a76a797f3f	Always treat series without watermarks as too old. Current series always get watermarks written out upon append now. This drops support for old series without any watermarks by always reporting them as too old (stale) during queries.	2013-06-27 17:10:06 +02:00
Julius Volz	d2da21121c	Implement getValueRangeAtIntervalOp for faster range queries. This also short-circuits optimize() for now, since it is complex to implement for the new operator, and ops generated by the query layer already fulfill the needed invariants. We should still investigate later whether to completely delete operator optimization code or extend it to support getValueRangeAtIntervalOp operators.	2013-06-26 18:10:36 +02:00
Matt T. Proud	30b1cf80b5	WIP - Snapshot of Moving to Client Model.	2013-06-25 15:52:42 +02:00
Matt T. Proud	ecb9c7bb9d	Code Review: Swap ordering of elements.	2013-06-21 21:17:50 +02:00
Matt T. Proud	5daa0a09ea	Code Review: Swap ordering of watermark getting. A test for Julius.	2013-06-21 18:34:08 +02:00
Matt T. Proud	2d5de99fbf	Regard in-memory series as new. This commit ensures that series that exist only in-memory and not on-disk are not regarded as too old for operation exclusion.	2013-06-21 18:26:39 +02:00
Julius Volz	ba8c122147	Log correct sample count when appending to disk.	2013-06-21 12:23:27 +02:00
Julius Volz	16364eda37	Drop empty series from memory after flushing.	2013-06-19 12:14:23 +02:00
Julius Volz	71199e2c93	Cache disk fingerprint->metric lookups in memory.	2013-06-18 14:08:58 +02:00
Matt T. Proud	a73f061d3c	Persist solely Protocol Buffers. A design question that was open for me in the beginning was whether to serialize other types to disk, but Protocol Buffers quickly won out, which allows us to drop support for other types. This is a good start to cleaning up a lot of cruft in the storage stack and can let us eventually decouple the various moving parts into separate subsystems for easier reasoning. This commit is not strictly required, but it is a start to making the rest a lot more enjoyable to interact with.	2013-06-08 11:02:35 +02:00
Julius Volz	558281890b	Minor "go tool vet" cleanups	2013-06-07 15:34:41 +02:00
Julius Volz	84741b227d	Use LRU cache to avoid querying stale series.	2013-06-06 23:56:19 +02:00
Matt T. Proud	ef1d5fd8a2	Introduce semaphores for tiered storage. This commit wraps the tiered storage access components in semaphores, since we can handle several concurrent memory reads.	2013-06-06 18:16:18 +02:00
Matt T. Proud	819045541e	Code Review: Make double-drain a panic.	2013-06-06 12:40:06 +02:00
Matt T. Proud	beaaf386e7	Add storage state guards and transition callbacks. To ensure that we access tiered storage in the proper way, we have guards now.	2013-06-06 11:52:09 +02:00
Matt T. Proud	2c3df44af6	Ensure database access waits until it is started. This commit introduces a channel message to ensure serving state has been reached with the storage stack before anything attempts to use it.	2013-06-06 10:42:21 +02:00
Julius Volz	51689d965d	Add debug timers to instant and range queries. This adds timers around several query-relevant code blocks. For now, the query timer stats are only logged for queries initiated through the UI. In other cases (rule evaluations), the stats are simply thrown away. My hope is that this helps us understand where queries spend time, especially in cases where they sometimes hang for unusual amounts of time.	2013-06-05 18:32:54 +02:00
Matt T. Proud	8339a189cb	Code Review: Fix seriesPresent scope. The seriesPresent scope should be constrained to the scope of a scanJob, since this is keyed to given series.	2013-06-04 13:16:59 +02:00
Matt T. Proud	fe41ce0b19	Conditionalize disk initializations. This commit conditionalizes the creation of the diskFrontier and seriesFrontier along with the iterator such that they are provisioned once something is actually required from disk.	2013-06-04 12:53:57 +02:00
Julius Volz	a8468a2e5e	Fix reversed disk flush cutoff behavior.	2013-05-28 16:14:30 +02:00
Julius Volz	eb1f956909	Revert "Revert "Ensure that all extracted samples are added to view."" This reverts commit `4b30fb86b4`.	2013-05-28 14:36:03 +02:00
Matt T. Proud	4b30fb86b4	Revert "Ensure that all extracted samples are added to view." This reverts commit `008314b5a8`. By running an automated git bisection described in https://gist.github.com/matttproud-soundcloud/22a371a8d2cba382ea64 this commit was found.	2013-05-23 13:36:22 +02:00
Julius Volz	008314b5a8	Ensure that all extracted samples are added to view. The current behavior only adds those samples to the view that are extracted by the last pass of the last processed op and throws other ones away. This is a bug. We need to append all samples that are extracted by each op pass. This also makes view.appendSamples() take an array of samples.	2013-05-22 18:14:37 +02:00
Matt T. Proud	b586801830	Code Review: Fix to-disk queue infinite growth. We discovered a bug while manually testing this branch on a live instance, whereby the to-disk queue was never actually dumped to disk.	2013-05-22 17:59:53 +02:00
Matt T. Proud	c07abf8521	Initial move away from skiplist.	2013-05-22 17:59:53 +02:00
Julius Volz	5b105c77fc	Repointerize fingerprints.	2013-05-21 14:28:14 +02:00
Matt T. Proud	e5ac91222b	Benchmark memory arena; simplify map generation. The one-off keys have been replaced with ``model.LabelPair``, which is indexable. The performance impact is negligible, but it represents a cognitive simplification.	2013-05-21 09:39:12 +02:00
juliusv	360477f66c	Merge pull request #257 from prometheus/feature/better-memory-behaviors Pointerize memorySeriesArena.	2013-05-16 07:36:40 -07:00

102 commits