Commit graph

7154 commits

Author SHA1 Message Date
Fabian Reinartz 9c6a72aadd Load head with WALs correctly 2016-12-22 15:54:39 +01:00
Fabian Reinartz 1dde3b6d31 Add WAL decoder+loading and benchmarks 2016-12-22 15:18:33 +01:00
Brian Brazil 93b70ee4ea Evict chunk descs of all unloaded chunks during maintenance. (#2297)
Keeping these around has two problems:
1) Each desc takes 64 bytes, 10 of them is 640B. This is a lot of
overhead on a 1024 byte chunk.
2) It can take well over a week to reach a point where this and thus
Prometheus memory usage as a whole enters steady state. This makes RAM
estimation very hard for users, and makes it difficult to investigate
things like memory fragmentation.

Instead we'll wipe them during each memory series maintenance cycle, and
if a query pulls them in they'll hang around as cache until the next
cycle.
2016-12-22 13:49:03 +00:00
Fabian Reinartz 0b8c77361e Add initial WAL writing 2016-12-22 12:05:24 +01:00
Fabian Reinartz 2a825f6c28 Consolidate mem index into HeadBlock 2016-12-22 01:12:28 +01:00
Fabian Reinartz 869cccf080 Test and fixes for buffered iterator 2016-12-21 16:06:33 +01:00
Fabian Reinartz 0a94f58f1a Fix test import of labels, simplify constructor names 2016-12-21 15:12:26 +01:00
Brian Brazil bed4635802 Use irate consistently in console template examples. (#2296)
I must have forgotten my 'g' when switching these.
2016-12-21 13:19:23 +00:00
Fabian Reinartz da2beb3e6d Fix zero division, add buffer series iterator 2016-12-21 13:04:51 +01:00
Fabian Reinartz d6d03a966f Merge pull request #2295 from prometheus/fast-path-remote
Don't clone the metric if there's no remote writes.
2016-12-21 12:36:41 +01:00
Brian Brazil 1b8a474612 Don't clone the metric if there's no remote writes.
The metric clone can't be further optimised, and is a
non-trivial memory allocation cost so fast path it
if there's no remote writes configured.
2016-12-21 11:34:48 +00:00
Brian Brazil 6c07453ec1 Only clone the metric in the one place relabelling needs it. (#2292)
This cuts ~17% off memory allocations related to ingesting data
in a basic setup.
2016-12-21 10:00:33 +00:00
Fabian Reinartz dbca3453fb Add label clone benchmark 2016-12-21 10:37:38 +01:00
Fabian Reinartz ede733ab6c Extract labels package 2016-12-21 09:39:01 +01:00
Fabian Reinartz ee217adc7e Redfine append interface, remove old Prometheus storage from bench 2016-12-21 00:02:37 +01:00
Fabian Reinartz cddc29fa17 Fix labels comparison, fetch correct labels 2016-12-20 14:54:52 +01:00
Fabian Reinartz ce7f4106c2 Reda correct label number, fix buffered iterator panic 2016-12-20 14:21:50 +01:00
Fabian Reinartz d9ca4b47f5 Fix offset errors, fix persisted postings order 2016-12-20 13:14:55 +01:00
Fabian Reinartz 1b23d62e3f Properly close files before reopening 2016-12-19 22:37:03 +01:00
Fabian Reinartz 00a503129b Use contextualized and traced errors in reader 2016-12-19 22:29:49 +01:00
Fabian Reinartz 282d9ae6e2 Implement label value queries in all layers. 2016-12-19 12:26:25 +01:00
Fabian Reinartz aabb21f4b9 Add shard series set test 2016-12-19 11:44:11 +01:00
Brian Brazil 2e3b42ad6c Correctly handle the end time being 0 in the URL. (#2290) 2016-12-18 19:30:52 +00:00
Fabian Reinartz bad93d8d57 Extract head serialization into Head method 2016-12-18 14:43:27 +01:00
Brian Brazil f421ce0636 Remove label from prometheus_target_skipped_scrapes_total (#2289)
This avoids it not being intialised, and breaking out by
interval wasn't partiuclarly useful.

Fixes #2269
2016-12-16 18:00:52 +00:00
Brian Brazil 30448286c7 Add sample_limit to scrape config.
This imposes a hard limit on the number of samples ingested from the
target. This is counted after metric relabelling, to allow dropping of
problemtic metrics.

This is intended as a very blunt tool to prevent overload due to
misbehaving targets that suddenly jump in sample count (e.g. adding
a label containing email addresses).

Add metric to track how often this happens.

Fixes #2137
2016-12-16 15:10:09 +00:00
Fabian Reinartz b08f82fa4e Pre-select relevant chunks on series access.
This adds interval metadata to indexed chunks. The queried interval
is used to filter chunks when queried from the index to save
unnecessary accesses of the chunks file.

This is especially relevant for series that come and go often and larger
files.
2016-12-16 12:13:17 +01:00
Fabian Reinartz bd77103a49 Add stats serialization, load querier of all blocks 2016-12-15 16:14:33 +01:00
Fabian Reinartz 1a35e54450 Add chained iterator, skipping seek added 2016-12-15 15:23:15 +01:00
Björn Rabenstein f3f798fbcf Merge pull request #2283 from tcolgate/ignoredots
ignore dotfiles in data directory
2016-12-15 13:32:03 +01:00
Tristan Colgate 30be8e0b8a ignore dotfiles in data directory 2016-12-15 11:48:23 +00:00
Fabian Reinartz 205edd2da9 Add byte/string equality check benchmark 2016-12-15 12:15:54 +01:00
Fabian Reinartz b2f1db5666 Add unsafe string and slice conversions 2016-12-15 11:56:41 +01:00
Fabian Reinartz 9ceed5378e Fix missing bound checks, off-by-ones, typos 2016-12-15 11:24:20 +01:00
Fabian Reinartz 5424a0cf75 Rename SeriesShard to Shard 2016-12-15 08:36:09 +01:00
Fabian Reinartz 9873e18b75 Add loading of persisted blocks 2016-12-15 08:36:07 +01:00
Fabian Reinartz d56b281006 Rename Iterator to Postings 2016-12-15 07:05:29 +01:00
Fabian Reinartz c1acd3fe85 Advance buffered iterator correctly on seek 2016-12-14 21:43:27 +01:00
Fabian Reinartz e561c91d53 Implement proper buffered iterator
This adds a proper duration based lookback buffer for series iterators
to allow advancing sequentially while remaining able to calculate time
aggregating functions such as `rate` backwards.

It uses an array ring buffer to minimize heap allocations for
potentially hundreds of thousands of series for a single query.
2016-12-14 21:14:44 +01:00
Fabian Reinartz ca89080128 Misc fixes for initial Prometheus integration 2016-12-14 18:38:46 +01:00
Fabian Reinartz 725385ea05 Fix compareLabels, add test 2016-12-14 15:47:05 +01:00
Fabian Reinartz fc992fafc2 Change querier interface, initial implementations 2016-12-14 15:39:23 +01:00
Tristan Colgate-McFarlane 4d9134e6d8 Add labeldrop and labelkeep actions. (#2279)
Introduce two new relabel actions. labeldrop, and labelkeep.
These can be used to filter the set of labels by matching regex

- labeldrop: drops all labels that match the regex
- labelkeep: drops all labels that do not match the regex
2016-12-14 10:17:42 +00:00
Björn Rabenstein 45570e5972 Merge pull request #2277 from prometheus/beorn7/storage2
storage: Sanity-check number of loaded chunk descs
2016-12-14 02:59:10 +01:00
beorn7 253be23c00 storage: Sanity-check number of loaded chunk descs
Two cases:

- An unarchived metric must have at least one chunk desc loaded upon
  unarchival. Otherwise, the file is gone or has size 0, which is an
  inconsistency (because the series is still indexed in the archive
  index). Hence, quarantining is triggered.

- If loading the chunk descs of a series with a known chunkDescsOffset
  (i.e. != -1), the number of chunks loaded must be equal to
  chunkDescsOffset. If not, there is a data corruption. An error is
  returned, which leads to qurantining.

In any case, there is a guard added to not access the 1st element of
an empty chunkDescs slice. (That's what triggered the crashes in issue
2249.)  A time series with unknown chunkDescsOffset and no chunks in
memory and no chunks on disk either could trigger that case. I would
assume such a "null series" doesn't exist, but it's not entirely
unthinkable and unreasonable to happen (perhaps in future uses of the
storage). (Create a series, and then something tries to preload chunks
before the first sample is added.)
2016-12-13 23:19:39 +01:00
Björn Rabenstein 5f0c0e43cf Merge pull request #2276 from prometheus/beorn7/storage
storage: Catch data corruption that leads to division by zero
2016-12-13 23:13:39 +01:00
Björn Rabenstein a4c8292232 Merge pull request #2278 from prometheus/beorn7/style
storage: Fix linter issue
2016-12-13 23:13:05 +01:00
beorn7 837c029b16 storage: Fix linter issue
Go style tries to avoid indented `else` blocks.
2016-12-13 19:05:30 +01:00
Brian Brazil c8de1484d5 Add scrape_samples_post_metric_relabeling
This reports the number of samples post any keep/drop
from metric relabelling.
2016-12-13 17:32:11 +00:00
Brian Brazil 06b9df65ec Refactor and add unittests to scrape result handling. 2016-12-13 16:49:17 +00:00