Commit graph

7154 commits

Author SHA1 Message Date
Fabian Reinartz 67d185ceb9 Compact based on compaction generation 2017-01-19 19:45:52 +01:00
Fabian Reinartz 472c618c39 Drop out-of-bound samples 2017-01-19 15:03:57 +01:00
Fabian Reinartz d4779b374c Properly track and write meta file 2017-01-19 14:01:38 +01:00
Fabian Reinartz 9ddbd64d00 Move stats into meta.json file, cleanup, docs 2017-01-19 11:22:47 +01:00
Alex Somesan b22eb65d0f Cleaner separation between ServiceAccount and custom authentication in K8S SD (#2348)
* Canonical usage of cluster service-account in K8S SD

* Early validation for opt-in custom auth in K8S SD

* Fix typo in condition
2017-01-19 10:52:52 +01:00
Fabian Reinartz 2f02f86b62 Fix WAL tests 2017-01-19 08:48:11 +01:00
Fabian Reinartz e006bc6dc6 Improve error messages, create regular dir for block 2017-01-19 08:40:15 +01:00
Fabian Reinartz d2322f6095 Improve compaction processing 2017-01-18 06:18:32 +01:00
Fabian Reinartz 7eb849e6a8 Merge pull request #2307 from joyent/triton_discovery
Add Joyent Triton discovery
2017-01-18 05:08:11 +01:00
Richard Kiene f3d9692d09 Add Joyent Triton discovery 2017-01-17 20:34:32 +00:00
Fabian Reinartz 5ceca3c810 Write to WAL before appending to memory storage 2017-01-17 16:33:58 +01:00
Fabian Reinartz 343dd9d94c Fix wrong byte size in WAL base ref 2017-01-17 08:40:31 +01:00
Fabian Reinartz 598e2f01c0 retrieval: don't erronously break appending 2017-01-17 08:39:18 +01:00
Fabian Reinartz d80a3de235 pkg/textparse: add documentation 2017-01-17 08:16:47 +01:00
Brian Brazil c1b547a90e Only checkpoint chunkdescs and series that need persisting. (#2340)
This decreases checkpoint size by not checkpointing things
that don't actually need checkpointing.

This is fully compatible with the v2 checkpoint format,
as it makes series appear as though the only chunksdescs
in memory are those that need persisting.
2017-01-17 00:59:38 +00:00
Fabian Reinartz 5fb01d41aa Use new Prometheus text format parser 2017-01-16 21:29:53 +01:00
Fabian Reinartz 5418a42965 Merge pull request #2345 from Bplotka/fixed-alertmanager-flag-auth
Fixed regression in `-alertmanager.url flag`. Basic auth was ignored.
2017-01-16 18:29:51 +01:00
Bartek Plotka 579e33f19a Fixed style issues. 2017-01-16 16:45:58 +00:00
Bartek Plotka d7febe97fa Fixed regression in -alertmanager.url flag. Basic auth was ignored.
- Included basic auth parsing while parsing to AlertmanagerConfig
- Added test case

Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
2017-01-16 16:39:20 +00:00
Fabian Reinartz db48726a6b pkg/textparse: allocate single string per metric 2017-01-16 17:24:00 +01:00
Fabian Reinartz dd0b69fe1b Export ErrNotFound 2017-01-16 14:18:32 +01:00
Fabian Reinartz 9cf49f68e9 wal: use larger buffer 2017-01-16 14:18:25 +01:00
Fabian Reinartz 157e698958 web/api: fix min/max timestamps to valid range 2017-01-16 14:09:59 +01:00
Fabian Reinartz 990e40c959 Merge pull request #2338 from brancz/alertmanager-api
web/api: add alertmanager api
2017-01-16 12:08:14 +01:00
Fabian Reinartz c691895a0f retrieval: cache series references, use pkg/textparse
With this change the scraping caches series references and only
allocates label sets if it has to retrieve a new reference.
pkg/textparse is used to do the conditional parsing and reduce
allocations from 900B/sample to 0 in the general case.
2017-01-16 12:03:57 +01:00
Frederic Branczyk bd92571bdd
web/api: make target and alertmanager api responses consistent 2017-01-16 11:53:00 +01:00
Fabian Reinartz 022714b60a Merge pull request #2341 from mattbostock/patch-1
Correct notifications_dropped description
2017-01-16 09:23:46 +01:00
Fabian Reinartz fb3ab9bdb7 pkg/textparse: add more benchmarking, align lex defs 2017-01-15 17:32:57 +01:00
Fabian Reinartz e44d80314d pkg/textparse: add tests and method to retrieve full labels 2017-01-14 19:30:19 +01:00
Fabian Reinartz 091a7f2395 pkg/textparse: add initial text parser 2017-01-14 16:39:04 +01:00
Matt Bostock 4160892109 Correct notifications_dropped description
The current description does not accurately describe when the metric is incremented.

Aside from Alertmanger missing from the configuration, `prometheus_notifications_dropped_total` is incremented when errors occur while sending alert notifications to Alertmanager, or because the notifications queue is full, or because the number of notifications to be sent exceeds the queue capacity.

I think calling these cases 'errors' in a generic sense is more useful than the current description.
2017-01-13 23:36:00 +00:00
Brian Brazil f64c231dad Allow checkpoints and maintenance to happen concurrently. (#2321)
This is essential on larger Prometheus servers, as otherwise
checkpoints prevent sufficient persisting of chunks to disk.
2017-01-13 17:24:19 +00:00
Fabian Reinartz 1c80c33e72 Fix bug of unsorted postings lists being created
The former approach created unordered postings list by either
map iteration of new series being unsorted (fixable) or concurrent
writers creating new series interleaved.

We switch back to generating ephemeral references for a single batch.
Newly created series have to be re-set upon the next insert.
2017-01-13 16:22:20 +01:00
Frederic Branczyk 389c6d0043
web/api: add alertmanager api 2017-01-13 15:30:20 +01:00
Fabian Reinartz c7f5590a71 Ensure order of postings when adding new series 2017-01-13 15:25:11 +01:00
Fabian Reinartz ad9bc62e4c storage: extend appender and adapt it 2017-01-13 14:48:01 +01:00
Fabian Reinartz d970f0256a Add Rollback() and docs to Appender interface 2017-01-12 20:17:49 +01:00
Fabian Reinartz 22db9c3413 Remove old appendBatch methods 2017-01-12 20:04:49 +01:00
Fabian Reinartz fde69dab49 Use buffer pool for head appenders 2017-01-12 20:03:44 +01:00
Fabian Reinartz a317f252b9 Expose series references to clients
This exposes a reference number of a series represented by a label set
to clients.
Subsequent samples can be directly added via the reference rather than
repeatedly passing in the full labels. This drasitcally speeds up the
append process.

The appender chain uses different sections of the reference number for
assignment to child appenders and invalidating reference numbers as
necessary.

Clients can either pass out reference numbers themselves or have their
own optimized lookup, i.e. by directly associating unparsed metric
descriptors strings with reference numbers.
2017-01-12 20:00:54 +01:00
Fabian Reinartz 5e028710d5 Add fast past to validation after lock switch 2017-01-12 15:51:08 +01:00
Brian Brazil 1dcb7637f5 Add various persistence related metrics (#2333)
Add metrics around checkpointing and persistence

* Add a metric to say if checkpointing is happening,
and another to track total checkpoint time and count.

This breaks the existing prometheus_local_storage_checkpoint_duration_seconds
by renaming it to prometheus_local_storage_checkpoint_last_duration_seconds
as the former name is more appropriate for a summary.

* Add metric for last checkpoint size.

* Add metric for series/chunks processed by checkpoints.

For long checkpoints it'd be useful to see how they're progressing.

* Add metric for dirty series

* Add metric for number of chunks persisted per series.

You can get the number of chunks from chunk_ops,
but not the matching number of series. This helps determine
the size of the writes being made.

* Add metric for chunks queued for persistence

Chunks created includes both chunks that'll need persistence
and chunks read in for queries. This only includes chunks created
for persistence.

* Code review comments on new persistence metrics.
2017-01-11 15:11:19 +00:00
Fabian Reinartz 1b39887baa Revalidate series existance after lock switch 2017-01-11 14:05:58 +01:00
Fabian Reinartz ca5791efbc Simplify creation of new series 2017-01-11 13:58:26 +01:00
Fabian Reinartz 0ca755b4ae Replace single head chunk per series with memSeries
This adds a memory series holding several chunk to replace
the single head chunk per series so far.
This is necessary for uniform maximum chunk sizes in cases
where some series have higher frequency samples than others.
2017-01-11 13:02:38 +01:00
Fabian Reinartz 80affd98a8 Add barrier to benchmark writer
This adds a barrier to avoid issues with unfair goroutine scheduling
that causes some fake scrapers to run away from the other ones.
2017-01-11 13:01:30 +01:00
Fabian Reinartz c32a94d409 Unexport HeadBlock, export Block interface 2017-01-10 15:41:57 +01:00
Fabian Reinartz d86e8a63c7 Report correct number of appended samples 2017-01-10 11:17:37 +01:00
Fabian Reinartz 29883a18fc Add own Appender() method for DB 2017-01-09 22:54:08 +01:00
Fabian Reinartz 4c4e0c614e Simplify position mapper updating 2017-01-09 19:24:05 +01:00