Commit graph

7160 commits

Author SHA1 Message Date
beorn7 253be23c00 storage: Sanity-check number of loaded chunk descs
Two cases:

- An unarchived metric must have at least one chunk desc loaded upon
  unarchival. Otherwise, the file is gone or has size 0, which is an
  inconsistency (because the series is still indexed in the archive
  index). Hence, quarantining is triggered.

- If loading the chunk descs of a series with a known chunkDescsOffset
  (i.e. != -1), the number of chunks loaded must be equal to
  chunkDescsOffset. If not, there is a data corruption. An error is
  returned, which leads to quarantining.

In any case, there is a guard added to not access the 1st element of
an empty chunkDescs slice. (That's what triggered the crashes in issue
2249.) A time series with unknown chunkDescsOffset, no chunks in
memory, and no chunks on disk could trigger that case. I would
assume such a "null series" doesn't exist, but it's not entirely
unthinkable that one could occur (perhaps in future uses of the
storage), e.g. if a series is created and something then tries to
preload chunks before the first sample is added.
2016-12-13 23:19:39 +01:00
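The two checks and the empty-slice guard described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the chunkDesc type, the unknownOffset constant, the errQuarantine error, and the functions checkLoaded and firstTimeOf are all hypothetical names chosen for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// chunkDesc is a hypothetical stand-in for the storage's chunk descriptor.
type chunkDesc struct {
	firstTime int64
}

// unknownOffset marks a series whose chunkDescsOffset is not known (assumed -1).
const unknownOffset = -1

// errQuarantine is a hypothetical error that the caller would react to by
// quarantining the series.
var errQuarantine = errors.New("inconsistent chunk descs, series should be quarantined")

// checkLoaded applies the two sanity checks from the commit message.
// unarchived reports whether the series has just been unarchived.
func checkLoaded(loaded []chunkDesc, chunkDescsOffset int, unarchived bool) error {
	// Case 1: an unarchived series must have at least one chunk desc on disk.
	if unarchived && len(loaded) == 0 {
		return fmt.Errorf("%w: no chunk descs loaded upon unarchival", errQuarantine)
	}
	// Case 2: with a known offset, the number loaded must match it exactly.
	if chunkDescsOffset != unknownOffset && len(loaded) != chunkDescsOffset {
		return fmt.Errorf("%w: loaded %d chunk descs, expected %d",
			errQuarantine, len(loaded), chunkDescsOffset)
	}
	return nil
}

// firstTimeOf guards against indexing into an empty chunkDescs slice.
func firstTimeOf(cds []chunkDesc) (int64, bool) {
	if len(cds) == 0 {
		return 0, false
	}
	return cds[0].firstTime, true
}

func main() {
	fmt.Println(checkLoaded(nil, unknownOffset, true))
	fmt.Println(checkLoaded(make([]chunkDesc, 2), 3, false))
	fmt.Println(checkLoaded(make([]chunkDesc, 3), 3, false))
	_, ok := firstTimeOf(nil)
	fmt.Println(ok)
}
```

Returning a boolean from firstTimeOf, rather than panicking, keeps the "null series" case explicit at the call site.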
Björn Rabenstein 5f0c0e43cf Merge pull request #2276 from prometheus/beorn7/storage
storage: Catch data corruption that leads to division by zero
2016-12-13 23:13:39 +01:00
Björn Rabenstein a4c8292232 Merge pull request #2278 from prometheus/beorn7/style
storage: Fix linter issue
2016-12-13 23:13:05 +01:00
beorn7 837c029b16 storage: Fix linter issue
Go style tries to avoid indented `else` blocks.
2016-12-13 19:05:30 +01:00
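As a generic illustration of the style point in this commit (not the actual change), the usual rewrite drops the `else` after a branch that returns. The function names below are hypothetical.

```go
package main

import "fmt"

// describeIndented shows the pattern that golint flags: the else block
// is unnecessary because the if branch already returns.
func describeIndented(n int) string {
	if n < 0 {
		return "negative"
	} else {
		return "non-negative"
	}
}

// describe is the same logic with the else block removed and its
// contents outdented.
func describe(n int) string {
	if n < 0 {
		return "negative"
	}
	return "non-negative"
}

func main() {
	fmt.Println(describeIndented(-1), describe(-1))
	fmt.Println(describeIndented(1), describe(1))
}
```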
Brian Brazil c8de1484d5 Add scrape_samples_post_metric_relabeling
This reports the number of samples post any keep/drop
from metric relabelling.
2016-12-13 17:32:11 +00:00
Brian Brazil 06b9df65ec Refactor and add unittests to scrape result handling. 2016-12-13 16:49:17 +00:00
Björn Rabenstein 568fd8a8cb Merge pull request #2155 from prometheus/beorn7/vendoring2
Update vendoring for Azure
2016-12-13 17:10:59 +01:00
beorn7 4719482f5f storage: Make tests go-vet and golint clean 2016-12-13 17:07:27 +01:00
beorn7 485ac8dff7 storage: Verify validity of byte length when unmarshalling (double)delta chunks
This makes sure a division-by-zero crash cannot happen in the Len()
method.

Fixes #2773
2016-12-13 17:07:27 +01:00
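The idea of validating lengths at unmarshal time so that Len() cannot divide by zero can be sketched as below. The two-byte header layout (timestamp width, value width) is an assumption made up for this example and is not the actual (double)delta chunk format; deltaChunk, headerLen, and unmarshal are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// headerLen is the size of the hypothetical header:
// byte 0 = bytes per timestamp, byte 1 = bytes per value.
const headerLen = 2

type deltaChunk []byte

// sampleSize is the number of bytes one sample occupies.
func (c deltaChunk) sampleSize() int { return int(c[0]) + int(c[1]) }

// Len divides by sampleSize. It is safe only because unmarshal
// rejects chunks whose sample size is zero.
func (c deltaChunk) Len() int { return (len(c) - headerLen) / c.sampleSize() }

// unmarshal validates the byte lengths before handing out a chunk.
func unmarshal(buf []byte) (deltaChunk, error) {
	if len(buf) < headerLen {
		return nil, errors.New("corrupt chunk: shorter than header")
	}
	c := deltaChunk(buf)
	size := c.sampleSize()
	if size == 0 {
		return nil, errors.New("corrupt chunk: sample size is zero")
	}
	if payload := len(buf) - headerLen; payload%size != 0 {
		return nil, fmt.Errorf("corrupt chunk: %d payload bytes is not a multiple of sample size %d",
			payload, size)
	}
	return c, nil
}

func main() {
	c, err := unmarshal([]byte{1, 1, 10, 20, 11, 21})
	fmt.Println(c.Len(), err)

	_, err = unmarshal([]byte{0, 0, 10, 20})
	fmt.Println(err)
}
```

Rejecting the corrupt input once, when the chunk is loaded, means every later Len() call can stay free of error handling.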
Brian Brazil b5ded43594 Allow buffering of scraped samples before sending them to storage. 2016-12-13 15:01:35 +00:00
Fabian Reinartz 6eeb0ef01c Add queriers and partial mocks 2016-12-13 15:26:58 +01:00
beorn7 906c3a2237 Update vendoring for Azure
Also, actually record the vendored version in vendor.json.
2016-12-13 14:21:16 +01:00
Fabian Reinartz 9b400b4c58 Add chunk based series iterator 2016-12-12 19:12:55 +01:00
Fabian Reinartz b334c3ade8 Write chunk skiplist and add series reader 2016-12-12 15:39:55 +01:00
Fabian Reinartz ae379f385b Fix label index write and add read path 2016-12-12 11:38:43 +01:00
Fabian Reinartz 10943b6d88 Add initial index reader implementation 2016-12-12 08:12:19 +01:00
Fabian Reinartz 70a0224f19 Change chunk sample number to BigEndian 2016-12-12 08:11:53 +01:00
Fabian Reinartz 81b4d570ad Add series file reader 2016-12-11 15:54:25 +01:00
Fabian Reinartz 5e02e28f9c Add proper mmap calls 2016-12-11 15:49:36 +01:00
Fabian Reinartz 8425df035d Fix hashmap serialization 2016-12-11 15:49:24 +01:00
Fabian Reinartz 14dbc59f2b cleanup and switching removal of unsafe calls. 2016-12-10 18:09:57 +01:00
Fabian Reinartz eb9af096f9 Write hashmap pointers, simplify section writer 2016-12-10 10:13:54 +01:00
Fabian Reinartz 3a528c3078 Write plain postings list index 2016-12-10 09:44:00 +01:00
tattsun e714079cf2 storage: fix error message (#2270)
* storage: add error message
2016-12-09 22:36:27 +00:00
Fabian Reinartz 4eba874b04 Factor out section writer 2016-12-09 22:36:31 +01:00
Fabian Reinartz 0b77a3dafc Write series references into index 2016-12-09 22:27:43 +01:00
Fabian Reinartz 55b36ab413 Index persistence fixes, write label index hash table 2016-12-09 22:12:16 +01:00
Fabian Reinartz 8cbc95c316 Write label value indices 2016-12-09 21:40:38 +01:00
Fabian Reinartz 1e0edf367b Write index with symbol table 2016-12-09 21:23:34 +01:00
Fabian Reinartz 40a451694f Refactor persistence into interfaces 2016-12-09 20:45:46 +01:00
Fabian Reinartz 62f9dc311c misc 2016-12-09 16:54:38 +01:00
Fabian Reinartz 74f8dfd95d Persist blocks periodically 2016-12-09 13:41:38 +01:00
Fabian Reinartz 0cf8bb9e53 Move sub-indexes into single index structure 2016-12-09 10:41:51 +01:00
Fabian Reinartz 8aa99a3ebd misc 2016-12-09 10:00:14 +01:00
Fabian Reinartz 2c34a15fe6 Add initial serialization of block data 2016-12-08 17:43:10 +01:00
Fabian Reinartz 3ef7da33c8 Restructure files 2016-12-08 12:21:03 +01:00
Fabian Reinartz 63b887eb62 Add Makefile 2016-12-08 12:00:05 +01:00
Fabian Reinartz b845f8d3a1 Reduce test data allocations 2016-12-08 11:59:54 +01:00
Fabian Reinartz ce82bdb71a Add write benchmark utility 2016-12-07 17:30:10 +01:00
Fabian Reinartz 52276c6966 Bucket samples before appending.
This pre-sorts samples into buckets before appending them to reduce
locking of shards.
2016-12-07 17:10:49 +01:00
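A rough sketch of the bucketing idea: group a batch by target shard first, then take each shard's lock once per batch rather than once per sample. The sample and shard types, the FNV-based shardIndex, and appendBatch are hypothetical and only illustrate the technique; the commit's actual sharding scheme may differ.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

// sample is a hypothetical sample identified by its series name.
type sample struct {
	series string
	t      int64
	v      float64
}

// shard holds the samples of a subset of all series behind one lock.
type shard struct {
	mtx     sync.Mutex
	samples []sample
}

// shardIndex maps a series to a shard (assumed: FNV-1a hash modulo n).
func shardIndex(series string, n int) int {
	h := fnv.New32a()
	h.Write([]byte(series))
	return int(h.Sum32() % uint32(n))
}

// appendBatch pre-sorts the batch into one bucket per shard and then
// locks each non-empty shard exactly once.
func appendBatch(shards []*shard, batch []sample) {
	buckets := make([][]sample, len(shards))
	for _, s := range batch {
		i := shardIndex(s.series, len(shards))
		buckets[i] = append(buckets[i], s)
	}
	for i, b := range buckets {
		if len(b) == 0 {
			continue
		}
		sh := shards[i]
		sh.mtx.Lock()
		sh.samples = append(sh.samples, b...)
		sh.mtx.Unlock()
	}
}

func main() {
	shards := []*shard{{}, {}, {}, {}}
	appendBatch(shards, []sample{
		{"cpu", 1, 0.5}, {"mem", 1, 0.7}, {"cpu", 2, 0.6}, {"disk", 1, 0.1},
	})
	for i, sh := range shards {
		fmt.Printf("shard %d: %d samples\n", i, len(sh.samples))
	}
}
```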
Fabian Reinartz c5945177fb chunks: helper for bit range 2016-12-07 15:37:37 +01:00
Fabian Reinartz 9ecea36ef9 Merge pull request #2259 from prometheus/federationerr
web: don't return federation errors over HTTP
2016-12-06 16:18:03 +01:00
Fabian Reinartz cef2e04aa3 web: add error counter for federation responses 2016-12-06 16:09:50 +01:00
Fabian Reinartz 0ea0a19848 Merge pull request #2240 from agaoglu/read-timeout
Set read-timeout for http.Server
2016-12-06 16:01:45 +01:00
Fabian Reinartz 9d68e81b32 web: don't return federation errors over HTTP
Federation responses are streamed, so once the first byte has been
written, the status header is fixed. We cannot return an HTTP error
for an intermediate error; we should just abort and log instead.
2016-12-06 15:52:50 +01:00
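The behaviour described here can be sketched with net/http: once part of the body has been written, the handler counts and logs the error and stops, instead of calling http.Error. The handler, the encode function, the federationErrors counter, and the serve helper are hypothetical names for this illustration and not the code in the web package.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// federationErrors is a hypothetical stand-in for an error counter metric.
var federationErrors int64

func errorCount() int64 { return atomic.LoadInt64(&federationErrors) }

// encode is a hypothetical encoder that fails on one specific item.
func encode(item string) (string, error) {
	if item == "bad" {
		return "", errors.New("cannot encode item")
	}
	return item, nil
}

// streamHandler writes items one by one. The first write implicitly sends
// the 200 status header, so a later failure cannot change the status code.
func streamHandler(items []string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		for _, it := range items {
			line, err := encode(it)
			if err != nil {
				// Too late for http.Error: count, log, and abort.
				atomic.AddInt64(&federationErrors, 1)
				log.Printf("federation failed: %v", err)
				return
			}
			fmt.Fprintln(w, line)
		}
	}
}

// serve runs the handler against an in-memory recorder.
func serve(items []string) (int, string) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest("GET", "/federate", nil)
	streamHandler(items).ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := serve([]string{"a", "bad", "c"})
	fmt.Printf("status=%d body=%q errors=%d\n", code, body, errorCount())
}
```

The client sees a truncated 200 response in this case, which is why a counter and a log line are the only reliable signals of the failure.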
Erdem Agaoglu 054f8ebbfb Increase default max-connections 2016-12-06 17:45:19 +03:00
Erdem Agaoglu 2260079c12 Vendor x/net/netutil 2016-12-06 12:52:29 +03:00
Erdem Agaoglu e487477a17 LimitListener to limit max number of connections
This also drops tcp keep-alive in ListenAndServe but it's no longer
necessary since we now close idle connections long before that.
2016-12-06 12:45:59 +03:00
Fabian Reinartz 893390e0c6 Merge pull request #2248 from msiebuhr/cwd-in-status
web: Display current working directory on status-page
2016-12-05 21:41:37 +01:00
Fabian Reinartz 9b459458d0 Docs and interface definitions 2016-12-05 21:26:19 +01:00