Commit graph

810 commits

Author SHA1 Message Date
Tom Wilkie 75bb0f3253 Review feedback 2017-03-13 21:24:49 +00:00
Tom Wilkie 77cce900b8 Fix tests 2017-03-13 15:21:59 +00:00
Tom Wilkie b48799a01e Add license stanza 2017-03-13 14:50:15 +00:00
Tom Wilkie 9d22f030cf Dynamically reshard the QueueManager based on observed load. 2017-03-13 14:41:16 +00:00
Fabian Reinartz 8a8eb12985 storage/tsdb: don't use partitioned DB. 2017-03-07 11:51:30 +01:00
Fabian Reinartz 9eb1d6c927 remote: take code from master 2017-03-07 11:43:32 +01:00
Fabian Reinartz 9304179ef7 Merge branch 'master' into dev-2.0 2017-03-02 08:16:58 +01:00
Fabian Reinartz 4397b4d508 *: pass Prometheus registry into storage 2017-02-28 09:33:14 +01:00
Tom Wilkie 1ab893c6ec Limit 'discarding sample' logs to 1 every 10s (#2446)
* Limit 'discarding sample' logs to 1 every 10s

* Include the vendored library

* Review feedback
2017-02-23 19:20:39 +01:00
Julius Volz 2f39dbc8b3 Rename StorageQueueManager -> QueueManager 2017-02-21 21:45:43 +01:00
Julius Volz e9476b35d5 Re-add multiple remote writers
Each remote write endpoint gets its own set of relabeling rules.

This is based on the (yet-to-be-merged)
https://github.com/prometheus/prometheus/pull/2419, which removes legacy
remote write implementations.
2017-02-20 13:23:12 +01:00
Björn Rabenstein 089dc1076b Merge pull request #2435 from jmeulemans/open-chunks-gauge
Adding gauge for number of open head chunks.
2017-02-17 16:02:06 +01:00
Jeremy Meulemans 025c828976 Changed to open_head_chunks to address review.
Now incrementing numHeadChunks directly.
2017-02-17 07:10:13 -06:00
Jeremy Meulemans 074050b8c0 Updating for failed codeclimate check. 2017-02-16 18:04:28 -06:00
Jeremy Meulemans f70b52d0b6 Adding gauge for number of open head chunks.
Fixes #1710
2017-02-16 17:56:45 -06:00
Julius Volz beb3c4b389 Remove legacy remote storage implementations
This removes legacy support for specific remote storage systems in favor
of only offering the generic remote write protocol. An example bridge
application that translates from the generic protocol to each of those
legacy backends is still provided at:

documentation/examples/remote_storage/remote_storage_bridge

See also https://github.com/prometheus/prometheus/issues/10

The next step in the plan is to re-add support for multiple remote
storages.
2017-02-14 17:52:05 +01:00
beorn7 d771185a43 storage: Fix chunkIndexToStartSeek calculation
With a high enough shrink ratio and enough chunks to persist, the
cutoff point could be _outside_ of the file, which wreaks havoc in the
storage.
2017-02-10 11:42:59 +01:00
beorn7 73bd5e4dff Merge branch 'beorn7/storage' into beorn7/storage3 2017-02-09 14:44:10 +01:00
beorn7 46a0837816 storage: Fix offset returned by dropAndPersistChunks
This is another corner-case that was previously never exercised
because the rewriting of a series file was never prevented by the
shrink ratio.

Scenario: There is an existing series on disk, which is archived. If a
new sample comes in for that file, a new chunk in memory is created,
and the chunkDescsOffset is set to -1. If series maintenance happens
before the series has at least one chunk to persist _and_ an
insufficient chunks on disk is old enough for purging (so that the
shrink ratio kicks in), dropAndPersistChunks would return 0, but it
should return the chunk length of the series file.
2017-02-09 14:35:07 +01:00
beorn7 9d12204da5 Merge branch 'release-1.5' 2017-02-09 13:11:53 +01:00
beorn7 bed4934224 storage: One more persist error code path discovered
Also, in that code path, set chunkDescsOffset to 0 rather than -1 in
case of "dropped more chunks from persistence than from memory" so
that no other weird things happen before the series is quarantined for
good.
2017-02-09 11:51:40 +01:00
beorn7 242d8edcb5 Merge branch 'release-1.5' 2017-02-08 17:28:09 +01:00
beorn7 8c8baaa558 storage: writeMemorySeries needs to return true for quarantined series
This is another fallout of my bug hunt.
2017-02-08 16:28:56 +01:00
Mitsuhiro Tanda be8b1eb656 storage: optimize dropping chunks by using minShrinkRatio (#2397)
storage: prevent unnecessary chunk header reading if minShrinkRatio > 0
2017-02-07 17:33:54 +01:00
beorn7 2363a90adc storage: Do not throw away fully persisted memory series in checkpointing 2017-02-06 17:39:59 +01:00
Fabian Reinartz ea3ba338dd main: add flags for new storage 2017-02-05 18:22:06 +01:00
beorn7 244a65fb29 storage: Increase persist watermark before calling append
The append call may reuse cds, and thus change its len.
(In practice, this wouldn't happen as cds should have len==cap.
Still, the previous order of lines was problematic.)
2017-02-05 02:25:09 +01:00
beorn7 75282b27ba storage: Added checks for invariants 2017-02-04 23:40:22 +01:00
beorn7 31e9db7f0c storage: Simplify evictChunkDesc method 2017-02-04 22:29:37 +01:00
Fabian Reinartz 5772f1a7ba retrieval/storage: adapt to new interface
This simplifies the interface to two add methods for
appends with labels or faster reference numbers.
2017-02-02 13:05:46 +01:00
beorn7 65dc8f44d3 storage: Test for errors returned by MaybePopulateLastTime 2017-02-01 23:43:58 +01:00
beorn7 752fac60ae storage: Remove race condition from TestLoop 2017-02-01 23:43:58 +01:00
beorn7 4ccfc93dcf storage: Set shrink ratio in the constructor. 2017-02-01 15:37:16 +01:00
beorn7 b2f086c6c4 storage: Expose bug of not setting the shrink ratio in the contstructor 2017-02-01 15:37:10 +01:00
Brian Brazil c1b547a90e Only checkpoint chunkdescs and series that need persisting. (#2340)
This decreases checkpoint size by not checkpointing things
that don't actually need checkpointing.

This is fully compatible with the v2 checkpoint format,
as it makes series appear as though the only chunksdescs
in memory are those that need persisting.
2017-01-17 00:59:38 +00:00
Fabian Reinartz c691895a0f retrieval: cache series references, use pkg/textparse
With this change the scraping caches series references and only
allocates label sets if it has to retrieve a new reference.
pkg/textparse is used to do the conditional parsing and reduce
allocations from 900B/sample to 0 in the general case.
2017-01-16 12:03:57 +01:00
Brian Brazil f64c231dad Allow checkpoints and maintenance to happen concurrently. (#2321)
This is essential on larger Prometheus servers, as otherwise
checkpoints prevent sufficient persisting of chunks to disk.
2017-01-13 17:24:19 +00:00
Fabian Reinartz ad9bc62e4c storage: extend appender and adapt it 2017-01-13 14:48:01 +01:00
Brian Brazil 1dcb7637f5 Add various persistence related metrics (#2333)
Add metrics around checkpointing and persistence

* Add a metric to say if checkpointing is happening,
and another to track total checkpoint time and count.

This breaks the existing prometheus_local_storage_checkpoint_duration_seconds
by renaming it to prometheus_local_storage_checkpoint_last_duration_seconds
as the former name is more appropriate for a summary.

* Add metric for last checkpoint size.

* Add metric for series/chunks processed by checkpoints.

For long checkpoints it'd be useful to see how they're progressing.

* Add metric for dirty series

* Add metric for number of chunks persisted per series.

You can get the number of chunks from chunk_ops,
but not the matching number of series. This helps determine
the size of the writes being made.

* Add metric for chunks queued for persistence

Chunks created includes both chunks that'll need persistence
and chunks read in for queries. This only includes chunks created
for persistence.

* Code review comments on new persistence metrics.
2017-01-11 15:11:19 +00:00
Fabian Reinartz 304cae9928 tsdb: Use PartitionedDB constructor 2017-01-06 12:34:54 +01:00
Brian Brazil f9e581907a Make index queue bigger. (#2322)
When a large Prometheus starts up fresh it can take many minutes
to warmup and clear out the index queue. A larger queue means less
blocking, bigger batches and cuts down startup time by ~50%.
2017-01-05 17:57:42 +00:00
Fabian Reinartz bc20d93f0a storage: rename iterator value getters to At() 2017-01-02 13:33:37 +01:00
Fabian Reinartz 7322c46b8e storage: add mock iterator for test 2016-12-30 10:45:56 +01:00
Fabian Reinartz f8fc1f5bb2 *: migrate ingestion to new batch Appender 2016-12-29 11:03:56 +01:00
Fabian Reinartz 71fe0c58a8 promql: misc fixes 2016-12-28 11:32:15 +01:00
Mitsuhiro Tanda 7e369b9318 expose max memory chunks metrics (#2303)
* expose max memory chunks metrics
2016-12-27 18:34:07 +00:00
Fabian Reinartz fecf9532b9 *: fix misc compile errors 2016-12-25 11:42:57 +01:00
Fabian Reinartz 622ece6273 *: fix recording tests, migrate matcher types 2016-12-25 11:12:57 +01:00
Fabian Reinartz 0492ddbd4d *: fully decouple tsdb, add new storage interfaces 2016-12-25 01:43:22 +01:00
Fabian Reinartz d17b5be48a storage/metric: remove package 2016-12-25 00:42:52 +01:00
Fabian Reinartz 8b84ee5ee6 storage: remove old storage
This removes all old storage files and only keeps interfaces
to still allow the code to compile.
2016-12-22 23:33:32 +01:00
Fabian Reinartz 11a731ba82 remote: remove hard-coded remote storages
This commit removes the flag-configured remote storage integrations
in favor of the generic remote write path.
2016-12-22 23:17:35 +01:00
Brian Brazil 93b70ee4ea Evict chunk descs of all unloaded chunks during maintenance. (#2297)
Keeping these around has two problems:
1) Each desc takes 64 bytes, 10 of them is 640B. This is a lot of
overhead on a 1024 byte chunk.
2) It can take well over a week to reach a point where this and thus
Prometheus memory usage as a whole enters steady state. This makes RAM
estimation very hard for users, and makes it difficult to investigate
things like memory fragmentation.

Instead we'll wipe them during each memory series maintenance cycle, and
if a query pulls them in they'll hang around as cache until the next
cycle.
2016-12-22 13:49:03 +00:00
Brian Brazil 1b8a474612 Don't clone the metric if there's no remote writes.
The metric clone can't be further optimised, and is a
non-trivial memory allocation cost so fast path it
if there's no remote writes configured.
2016-12-21 11:34:48 +00:00
Tristan Colgate 30be8e0b8a ignore dotfiles in data directory 2016-12-15 11:48:23 +00:00
Björn Rabenstein 45570e5972 Merge pull request #2277 from prometheus/beorn7/storage2
storage: Sanity-check number of loaded chunk descs
2016-12-14 02:59:10 +01:00
beorn7 253be23c00 storage: Sanity-check number of loaded chunk descs
Two cases:

- An unarchived metric must have at least one chunk desc loaded upon
  unarchival. Otherwise, the file is gone or has size 0, which is an
  inconsistency (because the series is still indexed in the archive
  index). Hence, quarantining is triggered.

- If loading the chunk descs of a series with a known chunkDescsOffset
  (i.e. != -1), the number of chunks loaded must be equal to
  chunkDescsOffset. If not, there is a data corruption. An error is
  returned, which leads to qurantining.

In any case, there is a guard added to not access the 1st element of
an empty chunkDescs slice. (That's what triggered the crashes in issue
2249.)  A time series with unknown chunkDescsOffset and no chunks in
memory and no chunks on disk either could trigger that case. I would
assume such a "null series" doesn't exist, but it's not entirely
unthinkable and unreasonable to happen (perhaps in future uses of the
storage). (Create a series, and then something tries to preload chunks
before the first sample is added.)
2016-12-13 23:19:39 +01:00
Björn Rabenstein 5f0c0e43cf Merge pull request #2276 from prometheus/beorn7/storage
storage: Catch data corruption that leads to division by zero
2016-12-13 23:13:39 +01:00
beorn7 837c029b16 storage: Fix linter issue
Go style tries to avoid indented `else` blocks.
2016-12-13 19:05:30 +01:00
beorn7 4719482f5f storage: Make tests go-vet and golint clean 2016-12-13 17:07:27 +01:00
beorn7 485ac8dff7 storage: Verify validity of byte length when unmarshalling (double)delta chunks
This makes sure a division-by-zero crash cannot happen in the Len()
method.

Fixes #2773
2016-12-13 17:07:27 +01:00
tattsun e714079cf2 storage: fix error message (#2270)
* storage: add error message
2016-12-09 22:36:27 +00:00
Christopher M. Luciano 148b006e25 Clarify error message when Prometheus data dir finds unexpected files 2016-12-05 10:51:57 -05:00
Julius Volz 127332c56f Merge pull request #2168 from tomwilkie/chunk-len
Add call to estimate number of samples in a chunk to the API
2016-11-17 23:13:50 -08:00
Tom Wilkie 585878cdb2 Add call to estimate number of samples in a chunk to the API 2016-11-17 19:09:59 +00:00
Björn Rabenstein 036715370f Merge pull request #2184 from huydx/master
Fix possible memory leak by defer inside loop
2016-11-14 15:26:39 +01:00
huydx c999902761 Fix possible memory leak by defer inside loop 2016-11-14 14:08:08 +09:00
Fabian Reinartz 856de30c09 Check error before defer closing
If an error is returned the file might be nil and a Close call
would cause a panic.
2016-11-13 18:16:02 +01:00
Fabian Reinartz 6703404cb4 Merge remote-tracking branch 'origin/release-1.2' 2016-11-01 16:35:22 +01:00
beorn7 c5bd178b93 Protect exported Querier interface method against negative time ranges 2016-11-01 15:05:01 +01:00
beorn7 5b16d6bd6e Merge branch 'release-1.2' 2016-10-31 00:06:23 +01:00
beorn7 876e5da4f8 Add guard against non-monotonic samples in series
This can only happen due to data corruption.
2016-10-25 14:59:33 +02:00
Dominik Schulz 182e17958a Trivial spelling corrections and a small comment. 2016-10-18 20:14:38 +02:00
Fabian Reinartz 8fa18d564a storage: enhance Querier interface usage
This extracts Querier as an instantiateable and closeable object
rather than just defining extending methods of the storage interface.
This improves composability and allows abstracting query transactions,
which can be useful for transaction-level caches, consistent data views,
and encapsulating teardown.
2016-10-16 10:39:29 +02:00
beorn7 719508752b Re-add counting of evict chunk ops and decrementing NumMemChunks
Also, modify test to expose the regression.
2016-10-10 16:30:10 +02:00
Julius Volz cb02f017ee Clean up some doc comments 2016-10-06 21:53:40 +02:00
Julius Volz c212ef0326 Add Chunk.Utilization() methods
When using the chunking code in other projects (both Weave Prism and
ChronixDB ingester), you sometimes want to know how well you are
utilizing your chunks when closing/storing them.
2016-10-06 16:31:59 +02:00
Julius Volz c7932aa009 Remove gRPC leftovers in protobuf definitions 2016-10-05 17:31:04 +02:00
Björn Rabenstein 1e2f03f668 Merge pull request #2005 from redbaron/microoptimise-matching
Microoptimise matching
2016-10-05 17:26:56 +02:00
Maxim Ivanov e6db9f8159 New fpsForLabelMatchers and seriesForLabelMatchers methods
These more specific methods have replaced `metricForLabelMatchers`
in cases where its  `map[fingerprint]metric` result type was
not necessary or was used as an intermediate step

Avoids duplicated calls to `seriesForRange` from
`QueryRange` and `QueryInstant` methods.
2016-10-05 15:15:54 +01:00
Brian Brazil 6e8f87a37f Merge pull request #2047 from prometheus/write-relabel
Add support for remote write relabelling.
2016-10-05 07:47:49 +01:00
Brian Brazil 77605649a9 Add support for remote write relabelling.
Switch back to a single remote writer, as we were only ever meant to
have one and the relabel semantics are clearer that way.
2016-10-05 07:43:19 +01:00
Julius Volz c9d4526428 Unpublish accidentally published series methods
There were some more accidentally published methods of the memorySeries
type which I didn't notice when reviewing https://github.com/prometheus/prometheus/pull/2011
2016-10-03 00:04:56 +02:00
Maxim Ivanov 4978a65495 Extract initial FP candidate build logic into candidateFPsForLabelMatchers method
No functional changes otherwise
2016-10-02 17:35:02 +01:00
Maxim Ivanov c048a0cde8 Add metrics to result after checking all matchers
Should be marginally faster and somewhat more GC friendly
2016-10-02 17:35:02 +01:00
Maxim Ivanov bedc0eda1f Added BenchmarkQueryRange 2016-10-02 17:35:02 +01:00
Julius Volz c25f0de5ae Remove local.ZeroSample{,Pair}, use model definitions 2016-09-28 23:42:45 +02:00
Julius Volz 044ebce779 Review fixups. 2016-09-28 23:42:44 +02:00
Julius Volz d30a3c7c0f Fix accidental publishing of memorySeries.firstTime() 2016-09-26 13:25:27 +02:00
Julius Volz ab80ced756 storage: separate chunk package, publish more names
This is a followup to https://github.com/prometheus/prometheus/pull/2011.

This publishes more of the methods and other names of the chunk code and
moves the chunk code to its own package. There's some unavoidable
ugliness: the chunk and chunkDesc metrics are used by both packages, so
I had to move them to the chunk package. That isn't great, but I don't
see how to do it better without a larger redesign of everything. Same
for the evict requests and some other types.
2016-09-26 13:25:11 +02:00
Julius Volz 42c05dd3a2 Merge pull request #2011 from mattkanwisher/chuck-public
Make Chunk and ChunkIterator public for reuse
2016-09-23 14:45:35 +02:00
beorn7 ca98382943 Avoid defer in seriesMap.get
This is related to https://github.com/golang/go/issues/14939 .
It's probably the only occurrence where it matters.
2016-09-22 17:50:58 +02:00
Matthew Campbell 67d76e3a5d timeseries: store varbit encoded data into cassandra 2016-09-21 17:56:55 +02:00
Tom Wilkie 4520e12440 Add HTTP Basic Auth & TLS support to the generic write path. (#1957)
* Add config, HTTP Basic Auth and TLS support to the generic write path.

- Move generic write path configuration to the config file
- Factor out config.TLSConfig -> tlf.Config translation
- Support TLSConfig for generic remote storage
- Rename Run to Start, and make it non-blocking.
- Dedupe code in httputil for TLS config.
- Make remote queue metrics global.
2016-09-19 22:47:51 +02:00
Julius Volz c187308366 storage: Contextify storage interfaces.
This is based on https://github.com/prometheus/prometheus/pull/1997.

This adds contexts to the relevant Storage methods and already passes
PromQL's new per-query context into the storage's query methods.
The immediate motivation supporting multi-tenancy in Frankenstein, but
this could also be used by Prometheus's normal local storage to support
cancellations and timeouts at some point.
2016-09-19 16:29:07 +02:00
Maxim Ivanov bdc53098fc Avoid having contended mutexes on same cacheline
CPUs have to serialise write access to a single cache line
effectively reducing level of possible parallelism. Placing
mutexes on different cache lines avoids this problem.

Most gains will be seen on NUMA servers where CPU interconnect
traffic is especially expensive

Before:
go test . -run none -bench BenchmarkFingerprintLocker
BenchmarkFingerprintLockerParallel-4   	 2000000	       932 ns/op
BenchmarkFingerprintLockerSerial-4     	30000000	        49.6 ns/op

After:
go test . -run none -bench BenchmarkFingerprintLocker
BenchmarkFingerprintLockerParallel-4   	 3000000	       569 ns/op
BenchmarkFingerprintLockerSerial-4     	30000000	        51.0 ns/op
2016-09-18 23:32:55 +01:00
Julius Volz 5f5a78e807 Merge pull request #1974 from prometheus/disable-local-storage
Allow disabling local storage.
2016-09-17 18:40:01 +02:00
Tom Wilkie d83879210c Switch back to protos over HTTP, instead of GRPC.
My aim is to support the new grpc generic write path in Frankenstein.  On the surface this seems easy - however I've hit a number of problems that make me think it might be better to not use grpc just yet.

The explanation of the problems requires a little background.  At weave, traffic to frankenstein need to go through a couple of services first, for SSL and to be authenticated.  So traffic goes:

    internet -> frontend -> authfe -> frankenstein

- The frontend is Nginx, and adds/removes SSL.  Its done this way for legacy reasons, so the certs can be managed in one place, although eventually we imagine we'll merge it with authfe.  All traffic from frontend is sent to authfe.
- Authfe checks the auth tokens / cookie etc and then picks the service to forward the RPC to.
- Frankenstein accepts the reads and does the right thing with them.

First problem I hit was Nginx won't proxy http2 requests - it can accept them, but all calls downstream are http1 (see https://trac.nginx.org/nginx/ticket/923).  This wasn't such a big deal, so it now looks like:

    internet --(grpc/http2)--> frontend --(grpc/http1)--> authfe --(grpc/http1)--> frankenstein

Next problem was golang grpc server won't accept http1 requests (see https://groups.google.com/forum/#!topic/grpc-io/JnjCYGPMUms).  It is possible to link a grpc server in with a normal go http mux, as long as the mux server is serving over SSL, as the golang http client & server won't do http2 over anything other than an SSL connection.  This would require making all our service to service comms SSL.  So I had a go a writing a grpc http1 server, and got pretty far.  But is was a bit of a mess.

So finally I thought I'd make a separate grpc frontend for this, running in parallel with the frontend/authfe combo on a different port - and first up I'd need a grpc reverse proxy.  Ideally we'd have some nice, generic reverse proxy that only knew about a map from service names -> downstream service, and didn't need to decode & re-encode every request as it went through.  It seems like this can't be done with golang's grpc library - see https://github.com/mwitkow/grpc-proxy/issues/1.

And then I was surprised to find you can't do grpc from browsers! See http://www.grpc.io/faq/ - not important to us, but I'm starting to question why we decided to use grpc in the first place?

It would seem we could have most of the benefits of grpc with protos over HTTP, and this wouldn't preclude moving to grpc when its a bit more mature?  In fact, the grcp FAQ even admits as much:

> Why is gRPC better than any binary blob over HTTP/2?
> This is largely what gRPC is on the wire.
2016-09-15 23:21:54 +01:00
Tobias Schmidt 29ced0090f Fix common english misspellings 2016-09-14 23:23:28 -04:00
Tobias Schmidt e2c12dcdb5 Add missing error check in persistence test 2016-09-14 23:16:47 -04:00