180 commits

Author SHA1 Message Date
beorn7 699946bf32 Fix chunk desc loading.
If all samples in consecutive chunks have the same timestamp, the way
we used to load chunks will fail. With this change, the persist
watermark is used to load the right amount of chunkDescs from disk.

This bug is a possible reason for the rare storage corruption we have
observed.
2015-07-16 13:09:20 +02:00
beorn7 4203849c92 Test chunkDesc eviction and loading 2015-07-16 13:09:13 +02:00
beorn7 37e12df9ff Improve TestAppendOutOfOrder 2015-07-16 12:48:33 +02:00
beorn7 502aa9ded5 Use Has instead of Get for existence test. 2015-07-16 12:26:50 +02:00
beorn7 ff08f0b6fe storage: ensure timestamp monotonicity within series.
Fixes https://github.com/prometheus/prometheus/issues/481

While doing so, clean up and fix a few other things:

- Fix `go vet` warnings (@fabxc to blame ;).

- Fix a racy problem with unarchiving: Whenever we unarchive a
  series, we essentially want to do something with it. However, until
  we have done something with it, it appears like a series that is
  ready to be archived or even purged. So e.g. it would be ignored
  during checkpointing. With this fix, we always load the chunkDescs
  upon unarchiving. This is wasteful if we only want to add a new
  sample to an archived time series, but the (presumably more common)
  case where we access an archived time series in a query doesn't
  become more expensive.

- The change above streamlined the getOrCreateSeries and
  newMemorySeries flow. Also, the modTime is now always set correctly.

- Fix the leveldb-backed implementation of KeyValueStore.Delete. It
  wrongly returned true, nil even when a non-existing key was
  passed in.
2015-07-15 18:56:53 +02:00
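
The monotonicity rule from ff08f0b6fe can be illustrated with a minimal sketch. It is not the Prometheus implementation: the `series` and `sample` types, the `appendSample` method, and `errOutOfOrder` are hypothetical names, and the sketch assumes that a sample is rejected unless its timestamp is strictly later than the last stored one.

```go
package main

import (
	"errors"
	"fmt"
)

// errOutOfOrder is a hypothetical error used only in this sketch.
var errOutOfOrder = errors.New("sample timestamp is not after the last sample in the series")

type sample struct {
	timestamp int64 // assumed to be milliseconds
	value     float64
}

// series is a hypothetical in-memory series holding samples in append order.
type series struct {
	samples []sample
}

// appendSample stores smp only if its timestamp is strictly greater than the
// timestamp of the last stored sample, so timestamps stay monotonic.
func (s *series) appendSample(smp sample) error {
	if n := len(s.samples); n > 0 && smp.timestamp <= s.samples[n-1].timestamp {
		return errOutOfOrder
	}
	s.samples = append(s.samples, smp)
	return nil
}

func main() {
	s := &series{}
	for _, smp := range []sample{{1000, 1}, {2000, 2}, {1500, 3}} {
		if err := s.appendSample(smp); err != nil {
			fmt.Printf("dropped sample at %d: %v\n", smp.timestamp, err)
		}
	}
	fmt.Println("stored samples:", len(s.samples))
}
```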
Julius Volz acbc2b8cb6 storage: Fix float->uint conversions on some compilers.
See https://github.com/prometheus/prometheus/issues/887, which will at
least be partially fixed by this.

From the spec https://golang.org/ref/spec#Conversions:

"In all non-constant conversions involving floating-point or complex
values, if the result type cannot represent the value the conversion
succeeds but the result value is implementation-dependent."

This ended up setting the converted values to 0 on Debian's Go 1.4.2
compiler, at least on 32-bit Debians.
2015-07-13 11:19:11 +02:00
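
One way to avoid the implementation-dependent conversion quoted in acbc2b8cb6 is to check the range before converting. The sketch below is an illustration under that assumption, not the fix that was actually committed; `floatToUint32` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"math"
)

// floatToUint32 is a hypothetical helper. It converts f only when the result
// is representable as a uint32, so the outcome never depends on the compiler.
// Negative inputs are rejected conservatively, including those in (-1, 0).
func floatToUint32(f float64) (uint32, bool) {
	if math.IsNaN(f) || f < 0 || f > math.MaxUint32 {
		return 0, false
	}
	return uint32(f), true // truncates toward zero
}

func main() {
	for _, f := range []float64{42.9, -1, 5e9, math.NaN()} {
		v, ok := floatToUint32(f)
		fmt.Printf("input=%v value=%d ok=%t\n", f, v, ok)
	}
}
```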
beorn7 8c196c1028 Minor doc fixes. 2015-06-23 17:07:18 +02:00
Fabian Reinartz 6bfb4549a6 storage: add LastSamplePairForFingerprint method 2015-06-23 13:45:15 +02:00
Fabian Reinartz dc7d27ab9a retrieval: add honor label handling and parametrized querying.
This commit adds the honor_labels and params arguments to the scrape
config. They allow specifying the query parameters used by the
scrapers and letting scraped labels take precedence.
2015-06-23 13:45:14 +02:00
beorn7 9016917d1c Increment dirty counter only if setDirty(true) is called.
Currently, we increment the counter even if setDirty(false) is called,
which sets the storage clean.
2015-06-22 18:12:55 +02:00
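
The behaviour described in 9016917d1c can be sketched as follows. The `persistence` struct here is reduced to two fields for illustration, and `dirtyCounter` is a hypothetical plain integer standing in for whatever metric the real code increments.

```go
package main

import "fmt"

// persistence is a stripped-down stand-in for the real struct.
type persistence struct {
	dirty        bool
	dirtyCounter int // hypothetical stand-in for a metrics counter
}

// setDirty records the new state and counts only transitions requested with
// dirty == true; marking the storage clean leaves the counter untouched.
func (p *persistence) setDirty(dirty bool) {
	if dirty {
		p.dirtyCounter++
	}
	p.dirty = dirty
}

func main() {
	p := &persistence{}
	p.setDirty(true)
	p.setDirty(false)
	fmt.Println("dirty:", p.dirty, "counter:", p.dirtyCounter)
}
```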
Fabian Reinartz 1eff186555 Merge pull request #810 from prometheus/fabxc/lmatch
Match empty labels.
2015-06-22 15:45:50 +02:00
Fabian Reinartz 5b91ea9b36 storage: improve label matching and allow unset matching.
Matching of empty labels now also matches metrics where the label
was not explicitly set to the empty string.
2015-06-22 15:33:44 +02:00
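
The matching semantics from 5b91ea9b36 can be sketched with a plain Go map, where reading a missing key yields the empty string. The `metric` type and `matchesEqual` function are hypothetical and cover only equality matching; the real matcher types are not shown in this log.

```go
package main

import "fmt"

// metric is a hypothetical label set for this sketch.
type metric map[string]string

// matchesEqual reports whether label name has the given value in m.
// A label that is not set reads as "" from the map, so matching against ""
// also matches metrics where the label was never set.
func matchesEqual(m metric, name, value string) bool {
	return m[name] == value
}

func main() {
	unset := metric{"job": "api"}
	explicit := metric{"job": "api", "env": ""}
	set := metric{"job": "api", "env": "prod"}

	fmt.Println(matchesEqual(unset, "env", ""))
	fmt.Println(matchesEqual(explicit, "env", ""))
	fmt.Println(matchesEqual(set, "env", ""))
}
```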
Fabian Reinartz 46df1fd5ea storage/local: add benchmark for label matching. 2015-06-22 15:33:44 +02:00
Fabian Reinartz b105e26f4d storage: remove global flags 2015-06-15 19:01:06 +02:00
Fabian Reinartz 5c6c0e2faa Add storage method to delete time series 2015-06-01 21:23:32 +02:00
Fabian Reinartz 0de6edbdfc Move pkg/ to util/ 2015-06-01 21:12:32 +02:00
Fabian Reinartz 2317b001d0 Move flock package to pkg/flock 2015-06-01 21:12:31 +02:00
Fabian Reinartz 3c8fbf1e15 Move test package to pkg/testutil 2015-06-01 21:12:31 +02:00
Fabian Reinartz aff01e29c3 Limit retrievable samples to retention window.
The storage does not delete data immediately after the retention period.
We don't want to retrieve this data as it causes artifacts.
2015-05-27 13:13:59 +02:00
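
The idea in aff01e29c3 amounts to clamping the start of a requested range to the retention cutoff. A minimal sketch, assuming millisecond timestamps as `int64`; `clampToRetention` is a hypothetical helper, not a function from the Prometheus code base.

```go
package main

import "fmt"

// clampToRetention returns the earliest timestamp that may be served for a
// request starting at from. Data older than now-retention may still be on
// disk, but it is never returned.
func clampToRetention(from, now, retention int64) int64 {
	if cutoff := now - retention; from < cutoff {
		return cutoff
	}
	return from
}

func main() {
	const now, retention = 10000, 3000
	fmt.Println(clampToRetention(0, now, retention))
	fmt.Println(clampToRetention(8000, now, retention))
}
```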
Fabian Reinartz a92134a947 Merge pull request #724 from prometheus/fabxc/storage-startup
Read from indexing queue during crash recovery.
2015-05-23 16:50:47 +02:00
Fabian Reinartz 6e319532cf Read from indexing queue during crash recovery.
Change #704 introduced a regression that started reading the queue only
after potential crash recovery. When more than the queue capacity was
indexed, Prometheus deadlocked.
2015-05-23 15:32:35 +02:00
beorn7 dbcb3d9333 Use an RW lock to checkpoint fingerprint mappings.
This has to be backported to 0.13.x.
2015-05-23 14:05:05 +02:00
beorn7 3b9ab546e6 Add metrics to count inconsistencies and fp collisions. 2015-05-21 18:46:20 +02:00
Björn Rabenstein c44e7cd105 Merge pull request #706 from prometheus/beorn7/persistence2
Improve iterator performance.
2015-05-21 13:48:52 +02:00
Fabian Reinartz 112a778922 Align int64s for atomic operations 2015-05-21 01:38:50 +02:00
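
Background for 112a778922: the `sync/atomic` documentation states that on 32-bit platforms such as 386 and ARM the caller must ensure 64-bit alignment for values used with the 64-bit atomic functions, and that the first word of an allocated struct can be relied upon to be aligned. A common approach is therefore to place such fields first. The sketch below shows that layout; `storageStats` and `countConcurrently` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// storageStats is a hypothetical struct. The int64 accessed atomically is the
// first field, so it is 64-bit aligned even on 32-bit platforms.
type storageStats struct {
	numChunksToPersist int64 // must stay first
	started            bool
}

// countConcurrently increments the counter from n goroutines and returns the
// final value.
func countConcurrently(n int) int64 {
	stats := &storageStats{started: true}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			atomic.AddInt64(&stats.numChunksToPersist, 1)
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&stats.numChunksToPersist)
}

func main() {
	fmt.Println("chunks to persist:", countConcurrently(100))
}
```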
beorn7 3b9c421a69 Weed out all the [Gg]et* method names.
The only exception is getNumChunksToPersist to avoid naming the struct
member numChunksToPersist in a weird way.
2015-05-20 19:13:06 +02:00
Julius Volz 267fd34156 Switch Prometheus to use github.com/prometheus/log.
This change is conceptually very simple, although the diff is large. It
switches logging from "github.com/golang/glog" to
"github.com/prometheus/log", while not actually changing any log
messages. V(1)-style logging has been changed to be log.Debug*().
2015-05-20 18:19:32 +02:00
beorn7 81b190bf45 Remove locking from series iterator. Cache chunk iterators. 2015-05-20 16:19:34 +02:00
beorn7 cd5574bf8a Make chunk and series iterators more efficient. 2015-05-20 16:19:34 +02:00
beorn7 f79c694be5 Add benchmarks for series iterator methods. 2015-05-20 16:19:34 +02:00
Fabian Reinartz f59a449a24 Fix storage test 2015-05-20 16:12:07 +02:00
Fabian Reinartz d8440d75f1 Do not start storage processing before Start() is called. 2015-05-19 13:51:45 +02:00
beorn7 d1a93655a1 Fix typo. 2015-05-11 17:15:30 +02:00
beorn7 7c6466d476 Reserve only ~1M FPs for the mapping.
That reduces the chance of having a fingerprint in the reserved area.
2015-05-08 18:10:56 +02:00
beorn7 ac75dc2812 Avoid archive lookup for known mapped FPs. 2015-05-08 16:39:26 +02:00
beorn7 ed810b45bf Improvements after review. 2015-05-08 13:35:39 +02:00
beorn7 c36e0e05f1 Add crash recovery of fingerprint mappings. 2015-05-07 18:58:14 +02:00
beorn7 2235cec175 Handle fingerprint collisions. 2015-05-07 18:17:59 +02:00
beorn7 9820e5fe99 Use FastFingerprint where appropriate. 2015-05-06 12:00:58 +02:00
Scott Worley e5f92d35fe Fix storage/local tests for 32-bit systems 2015-04-30 14:19:48 -07:00
beorn7 a052d32609 Comment improvement. 2015-04-14 10:49:43 +02:00
beorn7 66fc61f9b7 Make bufPool a member of the persistence struct. 2015-04-14 10:43:09 +02:00
beorn7 b02d900e61 Improve chunk and chunkDesc loading.
Also, clean up some things in the code (especially introduction of the
chunkLenWithHeader constant to avoid the same expression all over the place).

Benchmark results:

BEFORE
BenchmarkLoadChunksSequentially     5000            283580 ns/op          152143 B/op        312 allocs/op
BenchmarkLoadChunksRandomly        20000             82936 ns/op           39310 B/op         99 allocs/op
BenchmarkLoadChunkDescs            10000            110833 ns/op           15092 B/op        345 allocs/op

AFTER
BenchmarkLoadChunksSequentially    10000            146785 ns/op          152285 B/op        315 allocs/op
BenchmarkLoadChunksRandomly        20000             67598 ns/op           39438 B/op        103 allocs/op
BenchmarkLoadChunkDescs            20000             99631 ns/op           12636 B/op        192 allocs/op

Note that everything is obviously loaded from the page cache (as the
benchmark runs thousands of times with very small series files). In a
real-world scenario, I expect a larger impact, as the disk operations
will more often actually hit the disk. To load ~50 sequential chunks,
this reduces the iops from 100 seeks and 100 reads to 1 seek and 1
read.
2015-04-13 21:06:04 +02:00
beorn7 c563398c68 Remove obsolete debug message. 2015-04-13 16:59:52 +02:00
beorn7 c5fa0b90c3 Fix the case where a series in memory has 0 chunks, but chunks on disk.
This is actually completely normal for a freshly unarchived series.

Test added to expose the problem.
2015-04-09 15:57:11 +02:00
beorn7 3035b8bfdd Adaptively reduce the wait time for memory series maintenance.
This speeds up in-memory series maintenance as more chunks wait for
persistence.
2015-04-01 17:52:03 +02:00
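
The commit message for 3035b8bfdd does not give the formula. As a purely hypothetical illustration of the idea, the wait time could shrink linearly with the number of chunks waiting for persistence; `maintenanceWait` and its parameters are assumptions made for this sketch.

```go
package main

import (
	"fmt"
	"time"
)

// maintenanceWait scales base down linearly as chunksWaiting approaches
// maxChunksWaiting. This linear rule is an assumption for illustration.
func maintenanceWait(base time.Duration, chunksWaiting, maxChunksWaiting int) time.Duration {
	if maxChunksWaiting <= 0 {
		return base // no limit configured, nothing to adapt to
	}
	if chunksWaiting >= maxChunksWaiting {
		return 0 // fully backed up: do not wait at all
	}
	if chunksWaiting < 0 {
		chunksWaiting = 0
	}
	free := time.Duration(maxChunksWaiting - chunksWaiting)
	return base * free / time.Duration(maxChunksWaiting)
}

func main() {
	base := 10 * time.Second
	for _, waiting := range []int{0, 250, 500, 1000} {
		fmt.Printf("%d chunks waiting: wait %v\n", waiting, maintenanceWait(base, waiting, 1000))
	}
}
```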
beorn7 fbc44d8f95 Add benchmark for loading chunks and chunk descs. 2015-03-19 19:28:21 +01:00
beorn7 6a21f73898 Fixes after review. 2015-03-19 17:54:59 +01:00
beorn7 51d35f4481 Instrument series maintenance durations. 2015-03-19 17:06:16 +01:00
beorn7 12ae6e9203 Increase resilience of the storage against data corruption - step 4.
Step 4: Add a configurable sync'ing of series files after modification.
2015-03-19 15:58:02 +01:00
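
A configurable sync after modification, as described in 12ae6e9203, might look like the sketch below. All names are hypothetical (`syncStrategy`, `syncWriter`, `appendChunk`, `memFile`), and only two strategies are assumed. The in-memory `memFile` keeps the example free of disk access; `*os.File` provides the same `Write` and `Sync` methods.

```go
package main

import (
	"fmt"
	"io"
)

// syncStrategy is a hypothetical setting that controls syncing.
type syncStrategy int

const (
	syncNever syncStrategy = iota
	syncAlways
)

// syncWriter is the subset of file behaviour needed here.
type syncWriter interface {
	io.Writer
	Sync() error
}

// appendChunk writes data and, depending on the strategy, syncs afterwards.
func appendChunk(f syncWriter, data []byte, strategy syncStrategy) error {
	if _, err := f.Write(data); err != nil {
		return err
	}
	if strategy == syncAlways {
		return f.Sync()
	}
	return nil
}

// memFile is an in-memory stand-in for a file that counts Sync calls.
type memFile struct {
	data  []byte
	syncs int
}

func (m *memFile) Write(p []byte) (int, error) {
	m.data = append(m.data, p...)
	return len(p), nil
}

func (m *memFile) Sync() error {
	m.syncs++
	return nil
}

func main() {
	f := &memFile{}
	if err := appendChunk(f, []byte("chunk-1"), syncNever); err != nil {
		fmt.Println("write failed:", err)
	}
	if err := appendChunk(f, []byte("chunk-2"), syncAlways); err != nil {
		fmt.Println("write failed:", err)
	}
	fmt.Printf("bytes written: %d, syncs: %d\n", len(f.data), f.syncs)
}
```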