prometheus

mirror of https://github.com/prometheus/prometheus.git synced 2024-09-23 09:17:32 -07:00

Author	SHA1	Message	Date
Julius Volz	9b33cfc457	Fix/unify context-based remote storage timeouts	2017-03-20 14:17:06 +01:00
Julius Volz	815762a4ad	Move retrieval.NewHTTPClient -> httputil.NewClientFromConfig	2017-03-20 14:17:04 +01:00
Fabian Reinartz	397f001ac5	Merge branch 'master' into dev-2.0	2017-03-20 14:12:11 +01:00
Julius Volz	eb14678a25	Make remote read/write use config.HTTPClientConfig	2017-03-20 13:37:50 +01:00
Julius Volz	406b65d0dc	Rename remote.Storage to remote.Writer	2017-03-20 13:15:28 +01:00
Julius Volz	02395a224d	[WIP] Remote Read	2017-03-20 13:13:44 +01:00
Julius Volz	40e41a4776	Merge pull request #2494 from tomwilkie/remote-write-sharding Dynamically reshard the QueueManager based on observed load.	2017-03-20 12:45:17 +01:00
Fabian Reinartz	b586781283	*: update tsdb vendoring and add retention flag	2017-03-17 16:06:04 +01:00
beorn7	48d221c11e	storage: Fix typo in comment	2017-03-16 11:49:41 +01:00
Fabian Reinartz	0ecd205794	promql: Use buffer pool for matrix allocations	2017-03-14 10:57:34 +01:00
Tom Wilkie	75bb0f3253	Review feedback	2017-03-13 21:24:49 +00:00
Tom Wilkie	77cce900b8	Fix tests	2017-03-13 15:21:59 +00:00
Tom Wilkie	b48799a01e	Add license stanza	2017-03-13 14:50:15 +00:00
Tom Wilkie	9d22f030cf	Dynamically reshard the QueueManager based on observed load.	2017-03-13 14:41:16 +00:00
Fabian Reinartz	8a8eb12985	storage/tsdb: don't use partitioned DB.	2017-03-07 11:51:30 +01:00
Fabian Reinartz	9eb1d6c927	remote: take code from master	2017-03-07 11:43:32 +01:00
Fabian Reinartz	9304179ef7	Merge branch 'master' into dev-2.0	2017-03-02 08:16:58 +01:00
Fabian Reinartz	4397b4d508	*: pass Prometheus registry into storage	2017-02-28 09:33:14 +01:00
Tom Wilkie	1ab893c6ec	Limit 'discarding sample' logs to 1 every 10s (#2446) * Limit 'discarding sample' logs to 1 every 10s * Include the vendored library * Review feedback	2017-02-23 19:20:39 +01:00
Julius Volz	2f39dbc8b3	Rename StorageQueueManager -> QueueManager	2017-02-21 21:45:43 +01:00
Julius Volz	e9476b35d5	Re-add multiple remote writers Each remote write endpoint gets its own set of relabeling rules. This is based on the (yet-to-be-merged) https://github.com/prometheus/prometheus/pull/2419, which removes legacy remote write implementations.	2017-02-20 13:23:12 +01:00
Björn Rabenstein	089dc1076b	Merge pull request #2435 from jmeulemans/open-chunks-gauge Adding gauge for number of open head chunks.	2017-02-17 16:02:06 +01:00
Jeremy Meulemans	025c828976	Changed to open_head_chunks to address review. Now incrementing numHeadChunks directly.	2017-02-17 07:10:13 -06:00
Jeremy Meulemans	074050b8c0	Updating for failed codeclimate check.	2017-02-16 18:04:28 -06:00
Jeremy Meulemans	f70b52d0b6	Adding gauge for number of open head chunks. Fixes #1710	2017-02-16 17:56:45 -06:00
Julius Volz	beb3c4b389	Remove legacy remote storage implementations This removes legacy support for specific remote storage systems in favor of only offering the generic remote write protocol. An example bridge application that translates from the generic protocol to each of those legacy backends is still provided at: documentation/examples/remote_storage/remote_storage_bridge See also https://github.com/prometheus/prometheus/issues/10 The next step in the plan is to re-add support for multiple remote storages.	2017-02-14 17:52:05 +01:00
beorn7	d771185a43	storage: Fix chunkIndexToStartSeek calculation With a high enough shrink ratio and enough chunks to persist, the cutoff point could be _outside_ of the file, which wreaks havoc in the storage.	2017-02-10 11:42:59 +01:00
beorn7	73bd5e4dff	Merge branch 'beorn7/storage' into beorn7/storage3	2017-02-09 14:44:10 +01:00
beorn7	46a0837816	storage: Fix offset returned by dropAndPersistChunks This is another corner-case that was previously never exercised because the rewriting of a series file was never prevented by the shrink ratio. Scenario: There is an existing series on disk, which is archived. If a new sample comes in for that file, a new chunk in memory is created, and the chunkDescsOffset is set to -1. If series maintenance happens before the series has at least one chunk to persist _and_ an insufficient number of chunks on disk is old enough for purging (so that the shrink ratio kicks in), dropAndPersistChunks would return 0, but it should return the chunk length of the series file.	2017-02-09 14:35:07 +01:00
beorn7	9d12204da5	Merge branch 'release-1.5'	2017-02-09 13:11:53 +01:00
beorn7	bed4934224	storage: One more persist error code path discovered Also, in that code path, set chunkDescsOffset to 0 rather than -1 in case of "dropped more chunks from persistence than from memory" so that no other weird things happen before the series is quarantined for good.	2017-02-09 11:51:40 +01:00
beorn7	242d8edcb5	Merge branch 'release-1.5'	2017-02-08 17:28:09 +01:00
beorn7	8c8baaa558	storage: writeMemorySeries needs to return true for quarantined series This is another fallout of my bug hunt.	2017-02-08 16:28:56 +01:00
Mitsuhiro Tanda	be8b1eb656	storage: optimize dropping chunks by using minShrinkRatio (#2397) storage: prevent unnecessary chunk header reading if minShrinkRatio > 0	2017-02-07 17:33:54 +01:00
beorn7	2363a90adc	storage: Do not throw away fully persisted memory series in checkpointing	2017-02-06 17:39:59 +01:00
Fabian Reinartz	ea3ba338dd	main: add flags for new storage	2017-02-05 18:22:06 +01:00
beorn7	244a65fb29	storage: Increase persist watermark before calling append The append call may reuse cds, and thus change its len. (In practice, this wouldn't happen as cds should have len==cap. Still, the previous order of lines was problematic.)	2017-02-05 02:25:09 +01:00
beorn7	75282b27ba	storage: Added checks for invariants	2017-02-04 23:40:22 +01:00
beorn7	31e9db7f0c	storage: Simplify evictChunkDesc method	2017-02-04 22:29:37 +01:00
Fabian Reinartz	5772f1a7ba	retrieval/storage: adapt to new interface This simplifies the interface to two add methods for appends with labels or faster reference numbers.	2017-02-02 13:05:46 +01:00
beorn7	65dc8f44d3	storage: Test for errors returned by MaybePopulateLastTime	2017-02-01 23:43:58 +01:00
beorn7	752fac60ae	storage: Remove race condition from TestLoop	2017-02-01 23:43:58 +01:00
beorn7	4ccfc93dcf	storage: Set shrink ratio in the constructor.	2017-02-01 15:37:16 +01:00
beorn7	b2f086c6c4	storage: Expose bug of not setting the shrink ratio in the constructor	2017-02-01 15:37:10 +01:00
Brian Brazil	c1b547a90e	Only checkpoint chunkdescs and series that need persisting. (#2340) This decreases checkpoint size by not checkpointing things that don't actually need checkpointing. This is fully compatible with the v2 checkpoint format, as it makes series appear as though the only chunkdescs in memory are those that need persisting.	2017-01-17 00:59:38 +00:00
Fabian Reinartz	c691895a0f	retrieval: cache series references, use pkg/textparse With this change the scraping caches series references and only allocates label sets if it has to retrieve a new reference. pkg/textparse is used to do the conditional parsing and reduce allocations from 900B/sample to 0 in the general case.	2017-01-16 12:03:57 +01:00
Brian Brazil	f64c231dad	Allow checkpoints and maintenance to happen concurrently. (#2321) This is essential on larger Prometheus servers, as otherwise checkpoints prevent sufficient persisting of chunks to disk.	2017-01-13 17:24:19 +00:00
Fabian Reinartz	ad9bc62e4c	storage: extend appender and adapt it	2017-01-13 14:48:01 +01:00
Brian Brazil	1dcb7637f5	Add various persistence related metrics (#2333) Add metrics around checkpointing and persistence * Add a metric to say if checkpointing is happening, and another to track total checkpoint time and count. This breaks the existing prometheus_local_storage_checkpoint_duration_seconds by renaming it to prometheus_local_storage_checkpoint_last_duration_seconds as the former name is more appropriate for a summary. * Add metric for last checkpoint size. * Add metric for series/chunks processed by checkpoints. For long checkpoints it'd be useful to see how they're progressing. * Add metric for dirty series * Add metric for number of chunks persisted per series. You can get the number of chunks from chunk_ops, but not the matching number of series. This helps determine the size of the writes being made. * Add metric for chunks queued for persistence Chunks created includes both chunks that'll need persistence and chunks read in for queries. This only includes chunks created for persistence. * Code review comments on new persistence metrics.	2017-01-11 15:11:19 +00:00
Fabian Reinartz	304cae9928	tsdb: Use PartitionedDB constructor	2017-01-06 12:34:54 +01:00
Brian Brazil	f9e581907a	Make index queue bigger. (#2322) When a large Prometheus starts up fresh it can take many minutes to warmup and clear out the index queue. A larger queue means less blocking, bigger batches and cuts down startup time by ~50%.	2017-01-05 17:57:42 +00:00
Fabian Reinartz	bc20d93f0a	storage: rename iterator value getters to At()	2017-01-02 13:33:37 +01:00
Fabian Reinartz	7322c46b8e	storage: add mock iterator for test	2016-12-30 10:45:56 +01:00
Fabian Reinartz	f8fc1f5bb2	*: migrate ingestion to new batch Appender	2016-12-29 11:03:56 +01:00
Fabian Reinartz	71fe0c58a8	promql: misc fixes	2016-12-28 11:32:15 +01:00
Mitsuhiro Tanda	7e369b9318	expose max memory chunks metrics (#2303) * expose max memory chunks metrics	2016-12-27 18:34:07 +00:00
Fabian Reinartz	fecf9532b9	*: fix misc compile errors	2016-12-25 11:42:57 +01:00
Fabian Reinartz	622ece6273	*: fix recording tests, migrate matcher types	2016-12-25 11:12:57 +01:00
Fabian Reinartz	0492ddbd4d	*: fully decouple tsdb, add new storage interfaces	2016-12-25 01:43:22 +01:00
Fabian Reinartz	d17b5be48a	storage/metric: remove package	2016-12-25 00:42:52 +01:00
Fabian Reinartz	8b84ee5ee6	storage: remove old storage This removes all old storage files and only keeps interfaces to still allow the code to compile.	2016-12-22 23:33:32 +01:00
Fabian Reinartz	11a731ba82	remote: remove hard-coded remote storages This commit removes the flag-configured remote storage integrations in favor of the generic remote write path.	2016-12-22 23:17:35 +01:00
Brian Brazil	93b70ee4ea	Evict chunk descs of all unloaded chunks during maintenance. (#2297) Keeping these around has two problems: 1) Each desc takes 64 bytes, 10 of them is 640B. This is a lot of overhead on a 1024 byte chunk. 2) It can take well over a week to reach a point where this and thus Prometheus memory usage as a whole enters steady state. This makes RAM estimation very hard for users, and makes it difficult to investigate things like memory fragmentation. Instead we'll wipe them during each memory series maintenance cycle, and if a query pulls them in they'll hang around as cache until the next cycle.	2016-12-22 13:49:03 +00:00
Brian Brazil	1b8a474612	Don't clone the metric if there's no remote writes. The metric clone can't be further optimised, and is a non-trivial memory allocation cost so fast path it if there's no remote writes configured.	2016-12-21 11:34:48 +00:00
Tristan Colgate	30be8e0b8a	ignore dotfiles in data directory	2016-12-15 11:48:23 +00:00
Björn Rabenstein	45570e5972	Merge pull request #2277 from prometheus/beorn7/storage2 storage: Sanity-check number of loaded chunk descs	2016-12-14 02:59:10 +01:00
beorn7	253be23c00	storage: Sanity-check number of loaded chunk descs Two cases: - An unarchived metric must have at least one chunk desc loaded upon unarchival. Otherwise, the file is gone or has size 0, which is an inconsistency (because the series is still indexed in the archive index). Hence, quarantining is triggered. - If loading the chunk descs of a series with a known chunkDescsOffset (i.e. != -1), the number of chunks loaded must be equal to chunkDescsOffset. If not, there is a data corruption. An error is returned, which leads to quarantining. In any case, there is a guard added to not access the 1st element of an empty chunkDescs slice. (That's what triggered the crashes in issue 2249.) A time series with unknown chunkDescsOffset and no chunks in memory and no chunks on disk either could trigger that case. I would assume such a "null series" doesn't exist, but it's not entirely unthinkable and unreasonable to happen (perhaps in future uses of the storage). (Create a series, and then something tries to preload chunks before the first sample is added.)	2016-12-13 23:19:39 +01:00
Björn Rabenstein	5f0c0e43cf	Merge pull request #2276 from prometheus/beorn7/storage storage: Catch data corruption that leads to division by zero	2016-12-13 23:13:39 +01:00
beorn7	837c029b16	storage: Fix linter issue Go style tries to avoid indented `else` blocks.	2016-12-13 19:05:30 +01:00
beorn7	4719482f5f	storage: Make tests go-vet and golint clean	2016-12-13 17:07:27 +01:00
beorn7	485ac8dff7	storage: Verify validity of byte length when unmarshalling (double)delta chunks This makes sure a division-by-zero crash cannot happen in the Len() method. Fixes #2773	2016-12-13 17:07:27 +01:00
tattsun	e714079cf2	storage: fix error message (#2270) * storage: add error message	2016-12-09 22:36:27 +00:00
Christopher M. Luciano	148b006e25	Clarify error message when Prometheus data dir finds unexpected files	2016-12-05 10:51:57 -05:00
Julius Volz	127332c56f	Merge pull request #2168 from tomwilkie/chunk-len Add call to estimate number of samples in a chunk to the API	2016-11-17 23:13:50 -08:00
Tom Wilkie	585878cdb2	Add call to estimate number of samples in a chunk to the API	2016-11-17 19:09:59 +00:00
Björn Rabenstein	036715370f	Merge pull request #2184 from huydx/master Fix possible memory leak by defer inside loop	2016-11-14 15:26:39 +01:00
huydx	c999902761	Fix possible memory leak by defer inside loop	2016-11-14 14:08:08 +09:00
Fabian Reinartz	856de30c09	Check error before defer closing If an error is returned the file might be nil and a Close call would cause a panic.	2016-11-13 18:16:02 +01:00
Fabian Reinartz	6703404cb4	Merge remote-tracking branch 'origin/release-1.2'	2016-11-01 16:35:22 +01:00
beorn7	c5bd178b93	Protect exported Querier interface method against negative time ranges	2016-11-01 15:05:01 +01:00
beorn7	5b16d6bd6e	Merge branch 'release-1.2'	2016-10-31 00:06:23 +01:00
beorn7	876e5da4f8	Add guard against non-monotonic samples in series This can only happen due to data corruption.	2016-10-25 14:59:33 +02:00
Dominik Schulz	182e17958a	Trivial spelling corrections and a small comment.	2016-10-18 20:14:38 +02:00
Fabian Reinartz	8fa18d564a	storage: enhance Querier interface usage This extracts Querier as an instantiateable and closeable object rather than just defining extending methods of the storage interface. This improves composability and allows abstracting query transactions, which can be useful for transaction-level caches, consistent data views, and encapsulating teardown.	2016-10-16 10:39:29 +02:00
beorn7	719508752b	Re-add counting of evict chunk ops and decrementing NumMemChunks Also, modify test to expose the regression.	2016-10-10 16:30:10 +02:00
Julius Volz	cb02f017ee	Clean up some doc comments	2016-10-06 21:53:40 +02:00
Julius Volz	c212ef0326	Add Chunk.Utilization() methods When using the chunking code in other projects (both Weave Prism and ChronixDB ingester), you sometimes want to know how well you are utilizing your chunks when closing/storing them.	2016-10-06 16:31:59 +02:00
Julius Volz	c7932aa009	Remove gRPC leftovers in protobuf definitions	2016-10-05 17:31:04 +02:00
Björn Rabenstein	1e2f03f668	Merge pull request #2005 from redbaron/microoptimise-matching Microoptimise matching	2016-10-05 17:26:56 +02:00
Maxim Ivanov	e6db9f8159	New fpsForLabelMatchers and seriesForLabelMatchers methods These more specific methods have replaced `metricForLabelMatchers` in cases where its `map[fingerprint]metric` result type was not necessary or was used as an intermediate step Avoids duplicated calls to `seriesForRange` from `QueryRange` and `QueryInstant` methods.	2016-10-05 15:15:54 +01:00
Brian Brazil	6e8f87a37f	Merge pull request #2047 from prometheus/write-relabel Add support for remote write relabelling.	2016-10-05 07:47:49 +01:00
Brian Brazil	77605649a9	Add support for remote write relabelling. Switch back to a single remote writer, as we were only ever meant to have one and the relabel semantics are clearer that way.	2016-10-05 07:43:19 +01:00
Julius Volz	c9d4526428	Unpublish accidentally published series methods There were some more accidentally published methods of the memorySeries type which I didn't notice when reviewing https://github.com/prometheus/prometheus/pull/2011	2016-10-03 00:04:56 +02:00
Maxim Ivanov	4978a65495	Extract initial FP candidate build logic into candidateFPsForLabelMatchers method No functional changes otherwise	2016-10-02 17:35:02 +01:00
Maxim Ivanov	c048a0cde8	Add metrics to result after checking all matchers Should be marginally faster and somewhat more GC friendly	2016-10-02 17:35:02 +01:00
Maxim Ivanov	bedc0eda1f	Added BenchmarkQueryRange	2016-10-02 17:35:02 +01:00
Julius Volz	c25f0de5ae	Remove local.ZeroSample{,Pair}, use model definitions	2016-09-28 23:42:45 +02:00
Julius Volz	044ebce779	Review fixups.	2016-09-28 23:42:44 +02:00
Julius Volz	d30a3c7c0f	Fix accidental publishing of memorySeries.firstTime()	2016-09-26 13:25:27 +02:00
Julius Volz	ab80ced756	storage: separate chunk package, publish more names This is a followup to https://github.com/prometheus/prometheus/pull/2011. This publishes more of the methods and other names of the chunk code and moves the chunk code to its own package. There's some unavoidable ugliness: the chunk and chunkDesc metrics are used by both packages, so I had to move them to the chunk package. That isn't great, but I don't see how to do it better without a larger redesign of everything. Same for the evict requests and some other types.	2016-09-26 13:25:11 +02:00

820 commits