Commit graph

4021 commits

Author SHA1 Message Date
beorn7 244a65fb29 storage: Increase persist watermark before calling append
The append call may reuse cds, and thus change its len.
(In practice, this wouldn't happen as cds should have len==cap.
Still, the previous order of lines was problematic.)
2017-02-05 02:25:09 +01:00
beorn7 75282b27ba storage: Added checks for invariants 2017-02-04 23:40:22 +01:00
beorn7 31e9db7f0c storage: Simplify evictChunkDesc method 2017-02-04 22:29:37 +01:00
Fabian Reinartz 87bea50b85 web: fix /targets for new label types 2017-02-02 13:18:17 +01:00
Fabian Reinartz 5772f1a7ba retrieval/storage: adapt to new interface
This simplifies the interface to two add methods for
appends with labels or faster reference numbers.
2017-02-02 13:05:46 +01:00
beorn7 65dc8f44d3 storage: Test for errors returned by MaybePopulateLastTime 2017-02-01 23:43:58 +01:00
beorn7 752fac60ae storage: Remove race condition from TestLoop 2017-02-01 23:43:58 +01:00
beorn7 4daffbef12 Merge branch 'release-1.5'
This merges forward the bug-fixes from the release-1.5 branch.
2017-02-01 23:43:05 +01:00
Brian Brazil 34767c2221 Clone lset before relabelling. (#2386)
We need to not change the lset passed into populateLabels, as that
is kept around by the SDs.

Fixes 2377
2017-02-01 19:49:50 +00:00
Björn Rabenstein 7db4447390 Merge pull request #2385 from prometheus/beorn7/storage
Fix embarrassing bug of not setting the shrink ratio
2017-02-01 16:58:56 +01:00
beorn7 4ccfc93dcf storage: Set shrink ratio in the constructor. 2017-02-01 15:37:16 +01:00
beorn7 b2f086c6c4 storage: Expose bug of not setting the shrink ratio in the constructor 2017-02-01 15:37:10 +01:00
Julius Volz d5f6079029 Merge pull request #2381 from prometheus/remote-storage-bridge-example
Add standalone remote storage bridge example
2017-02-01 13:23:06 +01:00
Julius Volz b16371595d Add standalone remote storage bridge example
In preparation for removing specific remote storage implementations,
this offers an example of how to achieve the same in a separate process.
Rather than having three separate bridges for OpenTSDB, InfluxDB, and
Graphite, I decided to support all in one binary.

For now, this is in the example documentation directory, but perhaps we
will want to make a first-class project / repository out of it.
2017-02-01 13:22:41 +01:00
Fabian Reinartz 1d3cdd0d67 Merge branch 'master' into dev-2.0-rebase 2017-01-30 17:43:01 +01:00
Julius Volz 5e985f24de Merge pull request #2179 from prometheus/update-mailing-list-ref
Replace mailing list / IRC mention with link to Community page
2017-01-26 17:08:16 +01:00
Julius Volz 2e1d8dd6bd Replace mailing list / IRC mention with link to Community page 2017-01-26 17:07:27 +01:00
Björn Rabenstein 22a8fb4bc9 Merge pull request #2361 from larkinscott/patch-1
Update .codeclimate.yml
2017-01-24 11:51:51 +01:00
Scott Larkin 5319e1da09 Update .codeclimate.yml
Changed the vendor/ path in the exclude paths node.
2017-01-23 14:58:53 -05:00
Frederic Branczyk d840f2c400 Merge pull request #2359 from brancz/cut-1.5.0
*: cut 1.5.0
2017-01-23 14:05:51 +01:00
Frederic Branczyk fb17493f66 *: cut 1.5.0 2017-01-23 12:59:01 +01:00
Björn Rabenstein 9688a312ed Merge pull request #2355 from prometheus/beorn7/lint
Remove auto-generated protobuf code from codeclimate
2017-01-20 11:31:51 +01:00
Fabian Reinartz 035976b275 retrieval: handle not found error correctly 2017-01-20 11:27:01 +01:00
beorn7 4392aa43d4 Remove auto-generated protobuf code from codeclimate 2017-01-20 11:07:20 +01:00
Björn Rabenstein d717175104 Merge pull request #2354 from prometheus/beorn7/lint
Documentation: Add Code Climate badges to README.md
2017-01-20 10:51:05 +01:00
beorn7 0c8b753f6e Documentation: Add Code Climate badges to README.md 2017-01-19 23:22:22 +01:00
Scott Larkin e5a75b2b30 Code Climate config (#2351)
Created a Code Climate config with gofmt, golint, and govet enabled
2017-01-19 22:19:32 +01:00
Alex Somesan b22eb65d0f Cleaner separation between ServiceAccount and custom authentication in K8S SD (#2348)
* Canonical usage of cluster service-account in K8S SD

* Early validation for opt-in custom auth in K8S SD

* Fix typo in condition
2017-01-19 10:52:52 +01:00
Fabian Reinartz 7eb849e6a8 Merge pull request #2307 from joyent/triton_discovery
Add Joyent Triton discovery
2017-01-18 05:08:11 +01:00
Richard Kiene f3d9692d09 Add Joyent Triton discovery 2017-01-17 20:34:32 +00:00
Fabian Reinartz 598e2f01c0 retrieval: don't erroneously break appending 2017-01-17 08:39:18 +01:00
Fabian Reinartz d80a3de235 pkg/textparse: add documentation 2017-01-17 08:16:47 +01:00
Brian Brazil c1b547a90e Only checkpoint chunkdescs and series that need persisting. (#2340)
This decreases checkpoint size by not checkpointing things
that don't actually need checkpointing.

This is fully compatible with the v2 checkpoint format,
as it makes series appear as though the only chunkdescs
in memory are those that need persisting.
2017-01-17 00:59:38 +00:00
Fabian Reinartz 5418a42965 Merge pull request #2345 from Bplotka/fixed-alertmanager-flag-auth
Fixed regression in `-alertmanager.url` flag. Basic auth was ignored.
2017-01-16 18:29:51 +01:00
Bartek Plotka 579e33f19a Fixed style issues. 2017-01-16 16:45:58 +00:00
Bartek Plotka d7febe97fa Fixed regression in -alertmanager.url flag. Basic auth was ignored.
- Included basic auth parsing while parsing to AlertmanagerConfig
- Added test case

Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
2017-01-16 16:39:20 +00:00
Fabian Reinartz db48726a6b pkg/textparse: allocate single string per metric 2017-01-16 17:24:00 +01:00
Fabian Reinartz 157e698958 web/api: fix min/max timestamps to valid range 2017-01-16 14:09:59 +01:00
Fabian Reinartz 990e40c959 Merge pull request #2338 from brancz/alertmanager-api
web/api: add alertmanager api
2017-01-16 12:08:14 +01:00
Fabian Reinartz c691895a0f retrieval: cache series references, use pkg/textparse
With this change the scraping caches series references and only
allocates label sets if it has to retrieve a new reference.
pkg/textparse is used to do the conditional parsing and reduce
allocations from 900B/sample to 0 in the general case.
2017-01-16 12:03:57 +01:00
Frederic Branczyk bd92571bdd web/api: make target and alertmanager api responses consistent 2017-01-16 11:53:00 +01:00
Fabian Reinartz 022714b60a Merge pull request #2341 from mattbostock/patch-1
Correct notifications_dropped description
2017-01-16 09:23:46 +01:00
Fabian Reinartz fb3ab9bdb7 pkg/textparse: add more benchmarking, align lex defs 2017-01-15 17:32:57 +01:00
Fabian Reinartz e44d80314d pkg/textparse: add tests and method to retrieve full labels 2017-01-14 19:30:19 +01:00
Fabian Reinartz 091a7f2395 pkg/textparse: add initial text parser 2017-01-14 16:39:04 +01:00
Matt Bostock 4160892109 Correct notifications_dropped description
The current description does not accurately describe when the metric is incremented.

Aside from Alertmanager missing from the configuration, `prometheus_notifications_dropped_total` is incremented when errors occur while sending alert notifications to Alertmanager, when the notifications queue is full, or when the number of notifications to be sent exceeds the queue capacity.

I think calling these cases 'errors' in a generic sense is more useful than the current description.
2017-01-13 23:36:00 +00:00
Brian Brazil f64c231dad Allow checkpoints and maintenance to happen concurrently. (#2321)
This is essential on larger Prometheus servers, as otherwise
checkpoints prevent sufficient persisting of chunks to disk.
2017-01-13 17:24:19 +00:00
Frederic Branczyk 389c6d0043 web/api: add alertmanager api 2017-01-13 15:30:20 +01:00
Fabian Reinartz ad9bc62e4c storage: extend appender and adapt it 2017-01-13 14:48:01 +01:00
Brian Brazil 1dcb7637f5 Add various persistence related metrics (#2333)
Add metrics around checkpointing and persistence

* Add a metric to say if checkpointing is happening,
and another to track total checkpoint time and count.

This breaks the existing prometheus_local_storage_checkpoint_duration_seconds
by renaming it to prometheus_local_storage_checkpoint_last_duration_seconds
as the former name is more appropriate for a summary.

* Add metric for last checkpoint size.

* Add metric for series/chunks processed by checkpoints.

For long checkpoints it'd be useful to see how they're progressing.

* Add metric for dirty series

* Add metric for number of chunks persisted per series.

You can get the number of chunks from chunk_ops,
but not the matching number of series. This helps determine
the size of the writes being made.

* Add metric for chunks queued for persistence

Chunks created includes both chunks that'll need persistence
and chunks read in for queries. This only includes chunks created
for persistence.

* Code review comments on new persistence metrics.
2017-01-11 15:11:19 +00:00