prometheus

mirror of https://github.com/prometheus/prometheus.git synced 2024-11-14 01:24:04 -08:00

Author	SHA1	Message	Date
Julius Volz	8ebeed0b44	remote: Expose ClientConfig type (#3165) The Client type is already exposed, but can't be used without the config for it also being exposed. Using the remote.Client from other programs is useful to do full end-to-end tests of Prometheus's remote protocol against adapter implementations.	2017-09-14 15:25:09 +02:00
Tom Wilkie	f66f882d08	Merge pull request #3160 from bboreham/remote-keepalive Re-enable http keepalive on remote storage	2017-09-14 08:23:43 +01:00
Bryan Boreham	9d6b945e41	Default HTTP keep-alive ON for remote read/write	2017-09-11 09:48:30 +00:00
Ben Kochie	59aca4138b	Fix staticcheck issues.	2017-08-28 17:29:01 +02:00
Tom Wilkie	e1c77cdfd4	Merge pull request #2991 from tomwilkie/2990-remote-config Make queue manager configurable.	2017-08-03 10:26:29 +01:00
Tom Wilkie	5169f990f9	Review feedback: add yaml struct tags, don't embed queue config. Also, rename QueueManageConfig to QueueConfig, for consistency with tags.	2017-08-01 14:43:56 +01:00
Fabian Reinartz	bc2e9459d8	Merge pull request #2973 from tomwilkie/2969-negative-shards Prevent number of remote write shards from going negative.	2017-07-28 13:02:33 +02:00
Tom Wilkie	454b661145	Make queue manager configurable.	2017-07-25 13:47:34 +01:00
Conor Broderick	4b868113bb	Metric name validation (#2975)	2017-07-24 13:49:20 +01:00
Tom Wilkie	1d94eb8d95	Prevent number of remote write shards from going negative. This can happen in the situation where the system scales up the number of shards massively (to deal with some backlog), then scales it down again as the number of samples sent during the time period is less than the number received.	2017-07-19 16:27:19 +01:00
Matt Bostock	13c6e4a4bc	Remote queue manager: Fix typo Change 'send' to 'sent'.	2017-07-04 20:48:52 +01:00
Tom Wilkie	24a113bb09	Review feedback: limit number of bytes read under error.	2017-06-01 11:21:48 +01:00
Tom Wilkie	46abe8cbf2	Remote write: read first line of response and include it in the error.	2017-05-31 13:46:08 +01:00
Alexey Palazhchenko	b0e1ea7c6c	Simplify code, fix typos. (#2719)	2017-05-15 09:56:09 +01:00
Julius Volz	1c72524870	Fix HTTP error handling in remote.Client.Store() (#2708) Regression introduced in `e5d7bbfc3c`	2017-05-11 18:40:10 +02:00
Tom Wilkie	3141a6b36b	Compress remote storage requests and responses with unframed/raw snappy. (#2696) * Compress remote storage requests and responses with unframed/raw snappy, for compatibility with other languages. * Remove backwards compatibility code from remote_storage_adapter, update example_write_adapter * Add /documentation/examples/remote_storage/example_write_adapter/example_writer_adapter to .gitignore	2017-05-10 16:42:59 +02:00
Tom Wilkie	2195bb66f7	Ensure ewma int64s are always aligned. (#2675)	2017-05-03 14:32:50 -05:00
Tom Wilkie	e5d7bbfc3c	Remote writes: retry on recoverable errors. (#2552) * Remote writes: retry on recoverable errors. * Add comments * Review feedback * Comments * Review feedback * Final spelling misteak (I hope). Plus, record failed samples correctly.	2017-04-07 00:15:41 +02:00
Brian Brazil	c813c824d4	Separate out remote read responses. Fixes #2574	2017-04-06 15:49:47 +01:00
Julius Volz	5a896033e3	Add remote read external label handling (#2555) * Add remote read external label handling This implements rule 1 and 2 from https://docs.google.com/document/d/188YauRgfF0J4CYMigLsVNN34V_kUwKnApBs2dQMfBbs/edit * Use more descriptive example labels in read test * Add comment for querier.addExternalLabels() * Make argument naming in removeLabels() more generic	2017-04-02 17:48:15 +02:00
Julius Volz	3f23aa2cc7	Add headers to indicate remote read/write version Also add Content-Type header.	2017-03-24 17:39:51 +01:00
Julius Volz	94acd3f1d8	Add fanin tests and fix uncovered bugs	2017-03-21 00:08:17 +01:00
Julius Volz	9b33cfc457	Fix/unify context-based remote storage timeouts	2017-03-20 14:17:06 +01:00
Julius Volz	815762a4ad	Move retrieval.NewHTTPClient -> httputil.NewClientFromConfig	2017-03-20 14:17:04 +01:00
Julius Volz	eb14678a25	Make remote read/write use config.HTTPClientConfig	2017-03-20 13:37:50 +01:00
Julius Volz	406b65d0dc	Rename remote.Storage to remote.Writer	2017-03-20 13:15:28 +01:00
Julius Volz	02395a224d	[WIP] Remote Read	2017-03-20 13:13:44 +01:00
Tom Wilkie	75bb0f3253	Review feedback	2017-03-13 21:24:49 +00:00
Tom Wilkie	77cce900b8	Fix tests	2017-03-13 15:21:59 +00:00
Tom Wilkie	b48799a01e	Add license stanza	2017-03-13 14:50:15 +00:00
Tom Wilkie	9d22f030cf	Dynamically reshard the QueueManager based on observed load.	2017-03-13 14:41:16 +00:00
Tom Wilkie	1ab893c6ec	Limit 'discarding sample' logs to 1 every 10s (#2446) * Limit 'discarding sample' logs to 1 every 10s * Include the vendored library * Review feedback	2017-02-23 19:20:39 +01:00
Julius Volz	2f39dbc8b3	Rename StorageQueueManager -> QueueManager	2017-02-21 21:45:43 +01:00
Julius Volz	e9476b35d5	Re-add multiple remote writers Each remote write endpoint gets its own set of relabeling rules. This is based on the (yet-to-be-merged) https://github.com/prometheus/prometheus/pull/2419, which removes legacy remote write implementations.	2017-02-20 13:23:12 +01:00
Julius Volz	beb3c4b389	Remove legacy remote storage implementations This removes legacy support for specific remote storage systems in favor of only offering the generic remote write protocol. An example bridge application that translates from the generic protocol to each of those legacy backends is still provided at: documentation/examples/remote_storage/remote_storage_bridge See also https://github.com/prometheus/prometheus/issues/10 The next step in the plan is to re-add support for multiple remote storages.	2017-02-14 17:52:05 +01:00
Brian Brazil	1b8a474612	Don't clone the metric if there's no remote writes. The metric clone can't be further optimised, and is a non-trivial memory allocation cost so fast path it if there's no remote writes configured.	2016-12-21 11:34:48 +00:00
Julius Volz	c7932aa009	Remove gRPC leftovers in protobuf definitions	2016-10-05 17:31:04 +02:00
Brian Brazil	77605649a9	Add support for remote write relabelling. Switch back to a single remote writer, as we were only ever meant to have one and the relabel semantics are clearer that way.	2016-10-05 07:43:19 +01:00
Matthew Campbell	67d76e3a5d	timeseries: store varbit encoded data into cassandra	2016-09-21 17:56:55 +02:00
Tom Wilkie	4520e12440	Add HTTP Basic Auth & TLS support to the generic write path. (#1957) * Add config, HTTP Basic Auth and TLS support to the generic write path. - Move generic write path configuration to the config file - Factor out config.TLSConfig -> tls.Config translation - Support TLSConfig for generic remote storage - Rename Run to Start, and make it non-blocking. - Dedupe code in httputil for TLS config. - Make remote queue metrics global.	2016-09-19 22:47:51 +02:00
Tom Wilkie	d83879210c	Switch back to protos over HTTP, instead of GRPC. My aim is to support the new grpc generic write path in Frankenstein. On the surface this seems easy - however I've hit a number of problems that make me think it might be better to not use grpc just yet. The explanation of the problems requires a little background. At weave, traffic to frankenstein needs to go through a couple of services first, for SSL and to be authenticated. So traffic goes: internet -> frontend -> authfe -> frankenstein - The frontend is Nginx, and adds/removes SSL. It's done this way for legacy reasons, so the certs can be managed in one place, although eventually we imagine we'll merge it with authfe. All traffic from frontend is sent to authfe. - Authfe checks the auth tokens / cookie etc and then picks the service to forward the RPC to. - Frankenstein accepts the reads and does the right thing with them. First problem I hit was Nginx won't proxy http2 requests - it can accept them, but all calls downstream are http1 (see https://trac.nginx.org/nginx/ticket/923). This wasn't such a big deal, so it now looks like: internet --(grpc/http2)--> frontend --(grpc/http1)--> authfe --(grpc/http1)--> frankenstein Next problem was golang grpc server won't accept http1 requests (see https://groups.google.com/forum/#!topic/grpc-io/JnjCYGPMUms). It is possible to link a grpc server in with a normal go http mux, as long as the mux server is serving over SSL, as the golang http client & server won't do http2 over anything other than an SSL connection. This would require making all our service to service comms SSL. So I had a go at writing a grpc http1 server, and got pretty far. But it was a bit of a mess. So finally I thought I'd make a separate grpc frontend for this, running in parallel with the frontend/authfe combo on a different port - and first up I'd need a grpc reverse proxy.
Ideally we'd have some nice, generic reverse proxy that only knew about a map from service names -> downstream service, and didn't need to decode & re-encode every request as it went through. It seems like this can't be done with golang's grpc library - see https://github.com/mwitkow/grpc-proxy/issues/1. And then I was surprised to find you can't do grpc from browsers! See http://www.grpc.io/faq/ - not important to us, but I'm starting to question why we decided to use grpc in the first place? It would seem we could have most of the benefits of grpc with protos over HTTP, and this wouldn't preclude moving to grpc when it's a bit more mature? In fact, the grpc FAQ even admits as much: > Why is gRPC better than any binary blob over HTTP/2? > This is largely what gRPC is on the wire.	2016-09-15 23:21:54 +01:00
Tobias Schmidt	29ced0090f	Fix common english misspellings	2016-09-14 23:23:28 -04:00
Tobias Schmidt	8f3b62bfe4	Simplify struct initialization	2016-09-14 23:13:27 -04:00
Fabian Reinartz	7bd7e63f97	storage: fix struct alignment issue in test The uint64 `numCalls` ends up being not word-aligned on certain architectures, which makes atomic reads/writes panic.	2016-09-11 00:32:57 +02:00
Tom Wilkie	d41d91388f	Update for new generic remote storage.	2016-08-30 17:43:29 +02:00
Tom Wilkie	a6931b71e8	Rationalise retrieval metrics so we have the state (success/failed) on both samples and batches, in a consistent fashion. Also, report total queue capacity of all queues, i.e. capacity * shards.	2016-08-30 17:42:42 +02:00
Tom Wilkie	ece12bff93	Shard/parallelise samples by fingerprint in StorageQueueManager By splitting the single queue into multiple queues and flushing each individual queue serially (and all queues in parallel), we can guarantee to preserve the order of timestamps in samples sent to downstream systems.	2016-08-30 17:42:36 +02:00
Julius Volz	aa3f2b7216	Generic write cleanups and changes. - fold metric name into labels - return initialization errors back to main - add snappy compression - better context handling - pre-allocation of labels - remove generic naming - other cleanups	2016-08-30 17:24:48 +02:00
Brian Brazil	36d2c4bd0b	Add generic write path using grpc. This uses a new proto format, with scope for multiple samples per timeseries in future. This will allow users to pump samples out to whatever they like without having to change the core Prometheus code. There's also an example receiver to save users figuring out the boilerplate themselves.	2016-08-30 17:19:18 +02:00
Dan Milstein	764ceaa939	Add timeout to test, cap waiting at 1 second	2016-08-24 11:30:38 -04:00

91 commits