prometheus

mirror of https://github.com/prometheus/prometheus.git synced 2024-11-11 08:04:04 -08:00

Author	SHA1	Message	Date
Matt T. Proud	972e856d9b	Kill the curation state channel. The use of the channels for curation state was always unidiomatic. Change-Id: I1cb1d7175ebfb4faf28dff84201066278d6a0d92	2013-08-13 17:20:22 +02:00
Matt T. Proud	1ceb25b701	Publication of LevelDBMetricPersistence Fields. This will enable us to break down the onerous construction method. Change-Id: Ia89337ba39d6745af6757180af2485ec8a990a3b	2013-08-13 00:36:12 +02:00
Julius Volz	0003027dce	Add needed trailing spaces in logs.	2013-08-12 18:22:48 +02:00
Julius Volz	aa5d251f8d	Use github.com/golang/glog for all logging.	2013-08-12 17:54:36 +02:00
Matt T. Proud	a5141e4d0a	Depointerize storage conf. and chain ingester. The storage builders need to work with the assumption that they have a copy of the underlying configuration data if any mutations are made.	2013-08-12 17:07:03 +02:00
Matt T. Proud	820e551988	Code Review: Nits.	2013-08-07 13:29:10 +02:00
Matt T. Proud	a3bf2efdd5	Replace index writes with wrapped interface. This commit is the first of several and should not be regarded as the desired end state for these cleanups. What this one does, however, is wrap the query index writing behind an interface type that can be injected into the storage stack and have its lifecycle managed separately as needed. It also would mean we can swap out underlying implementations to support remote indexing, buffering, no-op indexing very easily. In the future, most of the individual index interface members in the tiered storage will go away in favor of agents that can query and resolve what they need from the datastore without the user knowing how and why they work.	2013-08-07 12:15:48 +02:00
Matt T. Proud	52664f701a	Hot Fix: Use extracted time.	2013-08-06 14:18:02 +02:00
Matt T. Proud	38dac35b3e	Code Review: Short name consistency.	2013-08-06 12:38:35 +02:00
Matt T. Proud	a00f18d78b	Code Review: Manual re-alignment.	2013-08-06 12:23:06 +02:00
Matt T. Proud	cc989c68e1	Replace direct curation table access with wrapper.	2013-08-06 12:02:52 +02:00
Matt T. Proud	07ac921aec	Code Review: First pass.	2013-08-05 17:31:49 +02:00
Matt T. Proud	d8792cfd86	Extract HighWatermarking. Clean up the rest.	2013-08-05 11:03:03 +02:00
Matt T. Proud	f4669a812c	Extract index storage into separate types.	2013-08-04 15:31:52 +02:00
Matt T. Proud	772d3d6b11	Consolidate LevelDB storage construction. There are too many parameters to constructing a LevelDB storage instance for a construction method, so I've opted to take an idiomatic approach of embedding them in a struct for easier mediation and versioning.	2013-08-03 17:25:03 +02:00
Julius Volz	e3415e953f	Add notifications telemetry.	2013-07-31 12:40:56 +02:00
juliusv	927435d68e	Merge pull request #333 from prometheus/round-time Round time to nearest second in memory storage.	2013-07-16 05:52:31 -07:00
Julius Volz	5d88e8cc45	Round time to nearest second in memory storage. When samples get flushed to disk, they lose sub-second precision anyways. By already dropping sub-second precision, data fetched from memory vs. disk will behave the same. Later, we should consider also storing a more compact representation than time.Time in memory if we're not going to use its full precision.	2013-07-16 14:51:54 +02:00
Matt T. Proud	f7704af4f8	Code Review: Formatting comments.	2013-07-15 15:12:01 +02:00
Julius Volz	a76a797f3f	Always treat series without watermarks as too old. Current series always get watermarks written out upon append now. This drops support for old series without any watermarks by always reporting them as too old (stale) during queries.	2013-06-27 17:10:06 +02:00
Julius Volz	d2da21121c	Implement getValueRangeAtIntervalOp for faster range queries. This also short-circuits optimize() for now, since it is complex to implement for the new operator, and ops generated by the query layer already fulfill the needed invariants. We should still investigate later whether to completely delete operator optimization code or extend it to support getValueRangeAtIntervalOp operators.	2013-06-26 18:10:36 +02:00
Julius Volz	e7f049c85b	Fix expunging of empty memory series (loop var pointerization bug)	2013-06-26 18:00:47 +02:00
Julius Volz	baa5b07829	Fix condition for dropping empty memory series.	2013-06-25 17:57:35 +02:00
Matt T. Proud	30b1cf80b5	WIP - Snapshot of Moving to Client Model.	2013-06-25 15:52:42 +02:00
juliusv	42198c1f1c	Merge pull request #311 from prometheus/fix/watermarking/on-first-write Ensure new metrics are watermarked early.	2013-06-25 03:13:58 -07:00
Matt T. Proud	b811ccc161	Disable paranoid checks and expose max FDs option. We shouldn't need paranoid checks now. We also shouldn't need too many FDs being open due to rule evaluator hitting in-memory values stream.	2013-06-24 12:10:14 +02:00
Matt T. Proud	4137c75523	Shrink default LRU cache sizes. Observing Prometheus in production confirms we can lower these values safely.	2013-06-24 12:09:16 +02:00
Matt T. Proud	ecb9c7bb9d	Code Review: Swap ordering of elements.	2013-06-21 21:17:50 +02:00
Matt T. Proud	5daa0a09ea	Code Review: Swap ordering of watermark getting. A test for Julius.	2013-06-21 18:34:08 +02:00
Matt T. Proud	ee840904d2	Code Review: !Before -> After.	2013-06-21 18:26:40 +02:00
Matt T. Proud	2d5de99fbf	Regard in-memory series as new. This commit ensures that series that exist only in-memory and not on-disk are not regarded as too old for operation exclusion.	2013-06-21 18:26:39 +02:00
Matt T. Proud	81c406630a	Merge pull request #312 from prometheus/fix/sample-append-logging Log correct sample count when appending to disk.	2013-06-21 08:55:51 -07:00
Matt T. Proud	a1a23fbaf8	Ensure new metrics are watermarked early. With the checking of fingerprint freshness to cull stale metrics from queries, we should write watermarks early to aid in more accurate responses.	2013-06-21 16:38:46 +02:00
Julius Volz	ba8c122147	Log correct sample count when appending to disk.	2013-06-21 12:23:27 +02:00
Julius Volz	f2b4067b7b	Speedup and clean up operation optimization.	2013-06-20 03:01:13 +02:00
Julius Volz	008bc09da8	Move check for empty memory series to separate method.	2013-06-19 14:19:53 +02:00
Julius Volz	16364eda37	Drop empty series from memory after flushing.	2013-06-19 12:14:23 +02:00
Julius Volz	71199e2c93	Cache disk fingerprint->metric lookups in memory.	2013-06-18 14:08:58 +02:00
Matt T. Proud	a73f061d3c	Persist solely Protocol Buffers. A design question that was open for me in the beginning was whether to serialize other types to disk, but Protocol Buffers quickly won out, which allows us to drop support for other types. This is a good start to cleaning up a lot of cruft in the storage stack and can let us eventually decouple the various moving parts into separate subsystems for easier reasoning. This commit is not strictly required, but it is a start to making the rest a lot more enjoyable to interact with.	2013-06-08 11:02:35 +02:00
juliusv	95400cb785	Merge pull request #290 from prometheus/fix/go-vet Minor "go tool vet" cleanups	2013-06-07 06:52:48 -07:00
Julius Volz	558281890b	Minor "go tool vet" cleanups	2013-06-07 15:34:41 +02:00
juliusv	615972dd01	Merge pull request #288 from prometheus/fix/curator/fallthrough-compaction-ordering Fix fallthrough compaction value ordering.	2013-06-07 05:46:15 -07:00
Matt T. Proud	86f63b078b	Fix fallthrough compaction value ordering. We discovered a regression whereby data chunks could be appended out of order if the fallthrough case was hit.	2013-06-07 14:41:00 +02:00
Julius Volz	7b9ee95030	Minor LevelDB watermark handling cleanups.	2013-06-06 23:56:31 +02:00
Julius Volz	84741b227d	Use LRU cache to avoid querying stale series.	2013-06-06 23:56:19 +02:00
Julius Volz	f98853d7b7	Fix type error in watermark list handling.	2013-06-06 23:56:14 +02:00
Matt T. Proud	ef1d5fd8a2	Introduce semaphores for tiered storage. This commit wraps the tiered storage access components in semaphores, since we can handle several concurrent memory reads.	2013-06-06 18:16:18 +02:00
Matt T. Proud	819045541e	Code Review: Make double-drain a panic.	2013-06-06 12:40:06 +02:00
Matt T. Proud	e217a9fb41	Race Work: Make memory arena locks more coarse. We can optimize these as needed later.	2013-06-06 12:08:20 +02:00
Matt T. Proud	beaaf386e7	Add storage state guards and transition callbacks. To ensure that we access tiered storage in the proper way, we have guards now.	2013-06-06 11:52:09 +02:00
Matt T. Proud	abb5353ade	Merge pull request #283 from prometheus/feature/storage/consult-watermark Include LRU cache for fingerprint watermarks.	2013-06-06 02:33:45 -07:00
Matt T. Proud	2c3df44af6	Ensure database access waits until it is started. This commit introduces a channel message to ensure serving state has been reached with the storage stack before anything attempts to use it.	2013-06-06 10:42:21 +02:00
Matt T. Proud	cbe2f3a7b1	Include LRU cache for fingerprint watermarks.	2013-06-06 10:13:18 +02:00
Julius Volz	51689d965d	Add debug timers to instant and range queries. This adds timers around several query-relevant code blocks. For now, the query timer stats are only logged for queries initiated through the UI. In other cases (rule evaluations), the stats are simply thrown away. My hope is that this helps us understand where queries spend time, especially in cases where they sometimes hang for unusual amounts of time.	2013-06-05 18:32:54 +02:00
Matt T. Proud	8339a189cb	Code Review: Fix seriesPresent scope. The seriesPresent scope should be constrained to the scope of a scanJob, since this is keyed to given series.	2013-06-04 13:16:59 +02:00
Matt T. Proud	fe41ce0b19	Conditionalize disk initializations. This commit conditionalizes the creation of the diskFrontier and seriesFrontier along with the iterator such that they are provisioned once something is actually required from disk.	2013-06-04 12:53:57 +02:00
Julius Volz	a8468a2e5e	Fix reversed disk flush cutoff behavior.	2013-05-28 16:14:30 +02:00
Julius Volz	eb1f956909	Revert "Revert "Ensure that all extracted samples are added to view."" This reverts commit `4b30fb86b4`.	2013-05-28 14:36:03 +02:00
Matt T. Proud	4b30fb86b4	Revert "Ensure that all extracted samples are added to view." This reverts commit `008314b5a8`. By running an automated git bisection described in https://gist.github.com/matttproud-soundcloud/22a371a8d2cba382ea64 this commit was found.	2013-05-23 13:36:22 +02:00
Julius Volz	750f862d9a	Use GetBoundaryValues() for non-counter deltas.	2013-05-22 19:13:47 +02:00
Julius Volz	f2b48b8c4a	Make getValuesAtIntervalOp consume all chunk data in one pass. This is mainly a small performance improvement, since we skip past the last extracted time immediately if it was also the last sample in the chunk, instead of trying to extract non-existent values before the chunk end again and again and only gradually approaching the end of the chunk.	2013-05-22 18:14:45 +02:00
Julius Volz	83d60bed89	extractValuesAroundTime() code simplification.	2013-05-22 18:14:45 +02:00
Julius Volz	008314b5a8	Ensure that all extracted samples are added to view. The current behavior only adds those samples to the view that are extracted by the last pass of the last processed op and throws other ones away. This is a bug. We need to append all samples that are extracted by each op pass. This also makes view.appendSamples() take an array of samples.	2013-05-22 18:14:37 +02:00
Matt T. Proud	b586801830	Code Review: Fix to-disk queue infinite growth. We discovered a bug while manually testing this branch on a live instance, whereby the to-disk queue was never actually dumped to disk.	2013-05-22 17:59:53 +02:00
Matt T. Proud	285a8b701b	Code Review: Extend lock.	2013-05-22 17:59:53 +02:00
Matt T. Proud	2526ab8c81	Code Review: Extend lock scope for appending.	2013-05-22 17:59:53 +02:00
Matt T. Proud	f994482d15	Code Review: Avenues for future improvement noted.	2013-05-22 17:59:53 +02:00
Matt T. Proud	298a90c143	Code Review: Initial arena size name.	2013-05-22 17:59:53 +02:00
Matt T. Proud	c07abf8521	Initial move away from skiplist.	2013-05-22 17:59:53 +02:00
Matt T. Proud	74a66fd938	Spawn grouping of fingerprints with free semaphore. The previous implementation spawned N goroutines to group samples together and would not start work until the semaphore unblocked. While this didn't leak, it polluted the scheduling space. Thusly, the routine only starts after a semaphore has been acquired.	2013-05-21 16:11:35 +02:00
Julius Volz	5b105c77fc	Repointerize fingerprints.	2013-05-21 14:28:14 +02:00
Matt T. Proud	ec5b5bae28	Fuck you, Travis.	2013-05-21 09:42:00 +02:00
Matt T. Proud	e5ac91222b	Benchmark memory arena; simplify map generation. The one-off keys have been replaced with ``model.LabelPair``, which is indexable. The performance impact is negligible, but it represents a cognitive simplification.	2013-05-21 09:39:12 +02:00
juliusv	360477f66c	Merge pull request #257 from prometheus/feature/better-memory-behaviors Pointerize memorySeriesArena.	2013-05-16 07:36:40 -07:00
Matt T. Proud	e1f20de2e9	Pointerize memorySeriesArena.	2013-05-16 17:09:28 +03:00
Matt T. Proud	8f4c7ece92	Destroy naked returns in half of corpus. The use of naked return values is frowned upon. This is the first of two bulk updates to remove them.	2013-05-16 10:53:25 +03:00
Matt T. Proud	4e0c932a4f	Simplify Encoder's encoding signature. The reality is that if we ever try to encode a Protocol Buffer and it fails, it's likely that such an error is ultimately not a runtime error and should be fixed forthwith. Thusly, we should rename ``Encoder.Encode`` to ``Encoder.MustEncode`` and drop the error return value.	2013-05-16 00:54:18 +03:00
juliusv	516101f015	Merge pull request #250 from prometheus/refactor/drop-unused-storage-setting Drop unused writeMemoryInterval	2013-05-14 08:45:59 -07:00
juliusv	9ff00b651d	Merge pull request #251 from prometheus/fix/memory-metric-mutability Fix GetMetricForFingerprint() metric mutability.	2013-05-14 08:12:45 -07:00
Bernerd Schaefer	63d9988b9c	Drop unused writeMemoryInterval	2013-05-14 17:03:03 +02:00
Julius Volz	83c60ad43a	Fix GetMetricForFingerprint() metric mutability. Some users of GetMetricForFingerprint() end up modifying the returned metric labelset. Since the memory storage's implementation of GetMetricForFingerprint() returned a pointer to the metric (and maps are reference types anyways), the external mutation propagated back into the memory storage. The fix is to make a copy of the metric before returning it.	2013-05-14 16:46:30 +02:00
Bernerd Schaefer	428d91c86f	Rename test helper files to helpers_test.go This ensures that these files are properly included only in testing.	2013-05-14 16:30:47 +02:00
juliusv	98e512d755	Merge pull request #246 from prometheus/fix/interval-value-extraction Fix and optimize getValuesAtIntervalOp data extraction.	2013-05-14 05:55:22 -07:00
Julius Volz	71a3172abb	Fix and optimize getValuesAtIntervalOp data extraction. - only the data extracted in the last loop iteration of ExtractSamples() was emitted as output - if e.g. op interval < sample interval, there were situations where the same sample was added multiple times to the output	2013-05-14 13:55:17 +02:00
Matt T. Proud	244a4a9cdb	Update to go1.1. This commit updates the documentation, Makefiles, formatting, and code semantics to support the 1.1 runtime, which includes ... 1. ``make advice``, 2. ``make format``, and 3. ``go fix`` on various targets.	2013-05-14 12:39:08 +02:00
Matt T. Proud	b224251981	Simplify compaction and expose database sizes. This commit simplifies the way that compactions across a database's keyspace occur due to reading the LevelDB internals. Secondarily it introduces the database size estimation mechanisms. Include database health and help interfaces. Add database statistics; remove status goroutines. This commit kills the use of Go routines to expose status throughout the web components of Prometheus. It also dumps raw LevelDB status on a separate /databases endpoint.	2013-05-14 12:29:53 +02:00
juliusv	92ad65ff13	Merge pull request #232 from prometheus/optimize/granular-storage-locking Synchronous memory appends and more fine-grained storage locks.	2013-05-13 10:11:57 -07:00
Matt T. Proud	1f7f89b4e3	Simplify compaction and expose database sizes. This commit simplifies the way that compactions across a database's keyspace occur due to reading the LevelDB internals. Secondarily it introduces the database size estimation mechanisms.	2013-05-13 13:15:35 +02:00
Matt T. Proud	d538b0382f	Include long-tail data deletion mechanism. This commit introduces the long-tail deletion mechanism, which will automatically cull old sample values. It is an acceptable hold-over until we get a resampling pipeline implemented. Kill legacy OS X documentation, too.	2013-05-13 10:54:36 +02:00
Julius Volz	ce1ee444f1	Synchronous memory appends and more fine-grained storage locks. This does two things: 1) Make TieredStorage.AppendSamples() write directly to memory instead of buffering to a channel first. This is needed in cases where a rule might immediately need the data generated by a previous rule. 2) Replace the single storage mutex by two new ones: - memoryMutex - needs to be locked at any time that two concurrent goroutines could be accessing (via read or write) the TieredStorage memoryArena. - memoryDeleteMutex - used to prevent any deletion of samples from memoryArena as long as renderView is running and assembling data from it. The LevelDB disk storage does not need to be protected by a mutex when rendering a view since renderView works off a LevelDB snapshot. The rationale against adding memoryMutex directly to the memory storage: taking a mutex does come with a small inherent time cost, and taking it is only required in few places. In fact, no locking is required for the memory storage instance which is part of a view (and not the TieredStorage).	2013-05-10 17:15:52 +02:00
Matt T. Proud	fa6a1f97d0	Expose interfaces for pruner and make pruner tool. In order to run database cleanups and diagnostics, we should have a means for pruning a database---even if LevelDB does this for us.	2013-05-10 17:07:03 +02:00
Matt T. Proud	161c8fbf9b	Include deletion processor for long-tail values. This commit extracts the model.Values truncation behavior into the actual tiered storage, which uses it and behaves in a peculiar way—notably the retention of previous elements if the chunk were to ever go empty. This is done to enable interpolation between sparse sample values in the evaluation cycle. Nothing necessarily new here—just an extraction. Now, the model.Values TruncateBefore functionality would do what a user would expect without any surprises, which is required for the DeletionProcessor, which may decide to split a large chunk in two if it determines that the chunk contains the cut-off time.	2013-05-10 12:19:12 +02:00
Matt Proud	7f0d816574	Schedule the background compactors to run. This commit introduces three background compactors, which compact sparse samples together. 1. Older than five minutes is grouped together into chunks of 50 every 30 minutes. 2. Older than 60 minutes is grouped together into chunks of 250 every 50 minutes. 3. Older than one day is grouped together into chunks of 5000 every 70 minutes.	2013-05-07 17:14:04 +02:00
Julius Volz	caab131ada	Repointerize TieredStorage method receiver types.	2013-05-07 15:12:33 +02:00
juliusv	89de116ea9	Merge pull request #225 from prometheus/refactor/fmt-cleanups Slice expression simplifications.	2013-05-07 04:27:27 -07:00
Julius Volz	05afa970d2	Slice expression simplifications.	2013-05-07 13:22:29 +02:00
Matt T. Proud	f897164bcf	Expose TieredStorage.DiskStorage.	2013-05-07 10:26:28 +02:00
Matt T. Proud	ce45787dbf	Storage interface to TieredStorage. This commit drops the Storage interface and just replaces it with a publicized TieredStorage type. Storage had been anticipated to be used as a wrapper for testability but just was not used due to practicality. Merely overengineered. My bad. Anyway, we will eventually instantiate the TieredStorage dependencies in main.go and pass them in for more intelligent lifecycle management. These changes will pave the way for managing the curators without Law of Demeter violations.	2013-05-03 15:54:14 +02:00
Bernerd Schaefer	5eb9840ed7	Fix goroutine leak in leveldb.AppendSamples The error channels in AppendSamples need to be buffered, since in the presence of errors their values may not be consumed.	2013-05-03 12:13:05 +02:00
Matt T. Proud	a3f1d81e24	Publicize a few storage components for curation. This commit introduces the publicization of Stop and other components, which the compaction curator shall take advantage of.	2013-05-02 13:16:04 +02:00
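
Commit 5eb9840ed7 above notes that the error channels in AppendSamples need to be buffered, because their values may never be consumed once an error has occurred. The following is a minimal sketch of that general pattern, not the code from the commit: `appendAll`, `errAppend`, and the worker function are hypothetical names chosen for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// errAppend is a hypothetical error used only for this illustration.
var errAppend = errors.New("append failed")

// appendAll is a hypothetical stand-in for a routine that fans work out to
// several goroutines and returns as soon as it receives the first error.
func appendAll(workers int, work func(id int) error) error {
	// One buffer slot per worker: every send completes even if the caller
	// has already returned, so no goroutine stays blocked on a send that
	// nobody will ever receive. With an unbuffered channel, the workers
	// that report after an early return would block forever.
	errs := make(chan error, workers)

	for i := 0; i < workers; i++ {
		go func(id int) {
			errs <- work(id)
		}(i)
	}

	for i := 0; i < workers; i++ {
		if err := <-errs; err != nil {
			return err // Early return: the remaining results are never read.
		}
	}
	return nil
}

func main() {
	err := appendAll(3, func(id int) error {
		if id == 1 {
			return errAppend
		}
		return nil
	})
	fmt.Println(err)
}
```

Sizing the buffer to the number of senders is the simplest way to guarantee that no send blocks; an alternative would be to drain the channel before returning, at the cost of waiting for every worker.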

301 commits
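
Commit 83c60ad43a in the log explains that returning a stored metric directly let callers mutate the memory storage's own label set, since maps are reference types, and that the fix was to copy the metric before returning it. A minimal sketch of that defensive-copy idea follows; the `Metric` and `memoryStore` types and the `MetricFor` method are hypothetical simplifications, not the types used in Prometheus.

```go
package main

import "fmt"

// Metric is a hypothetical, simplified label set. Like every Go map it is a
// reference type: assigning or returning it shares the underlying storage.
type Metric map[string]string

// memoryStore is a hypothetical in-memory index from fingerprint to metric.
type memoryStore struct {
	metrics map[uint64]Metric
}

// MetricFor returns a copy of the stored metric, so callers can modify the
// result without the change propagating back into the store.
func (s *memoryStore) MetricFor(fp uint64) Metric {
	stored, ok := s.metrics[fp]
	if !ok {
		return nil
	}
	copied := make(Metric, len(stored))
	for name, value := range stored {
		copied[name] = value
	}
	return copied
}

func main() {
	store := &memoryStore{metrics: map[uint64]Metric{
		1: {"job": "api"},
	}}

	m := store.MetricFor(1)
	m["instance"] = "host-1" // Mutates only the caller's copy.

	fmt.Println(len(m), len(store.metrics[1]))
}
```

The copy costs one allocation per lookup, which is the trade-off for keeping the stored label set immune to external mutation.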