prometheus

mirror of https://github.com/prometheus/prometheus.git synced 2025-03-05 20:59:13 -08:00

Author	SHA1	Message	Date
Fabian Reinartz	84f74b9a84	Apply new scrape config on reload. This commit updates a target set's scrape configuration on reload. This will cause all running scrape loops to be stopped and started again with new parameters.	2016-03-01 13:50:51 +01:00
Fabian Reinartz	02f635dc24	Remove interval/timeout from Target internals	2016-03-01 13:50:51 +01:00
Fabian Reinartz	775316f8d2	Move appender construction from Target to scrapePool	2016-03-01 13:50:51 +01:00
Fabian Reinartz	fbe251c2df	Fix scrape interval length calculation	2016-03-01 13:48:36 +01:00
Fabian Reinartz	1a3253e8ed	Make scrape time unambiguous. This commit changes the scraper interface to accept a timestamp so the timestamp reported by the caller and the timestamp attached to samples do not differ.	2016-03-01 13:48:36 +01:00
Fabian Reinartz	2bb8ef99d1	Test scrape loop behavior.	2016-03-01 13:48:36 +01:00
Fabian Reinartz	c7bbe95597	Remove outdated target tests	2016-03-01 13:48:36 +01:00
Fabian Reinartz	05de8b7f8d	Extract target scraping into scrape loop. This commit factors out the scrape loop handling into its own data structure. For the transition it will be directly attached to the target.	2016-03-01 13:48:36 +01:00
Fabian Reinartz	cebba3efbb	Simplify and fix TargetManager reloading	2016-03-01 13:48:36 +01:00
Fabian Reinartz	da99366f85	Consolidate Target.Update into constructor. The Target.Update method is no longer needed.	2016-03-01 13:48:36 +01:00
Fabian Reinartz	d15adfc917	Preserve target state across reloads. This commit moves Scraper handling into a separate scrapePool type. TargetSets only manage TargetProvider lifecycles and sync the retrieved updates to the scrapePool. TargetProviders are now expected to send a full initial target set within 5 seconds. The scrapePools preserve target state across reloads and only drop targets after the initial set was synced.	2016-03-01 13:48:36 +01:00
Fabian Reinartz	5b30bdb610	Change TargetProvider interface. This commit changes the TargetProvider interface to use a context.Context and send lists of TargetGroups, rather than single ones.	2016-03-01 13:48:36 +01:00
Fabian Reinartz	bb6dc3ff78	Remove old tests	2016-03-01 13:48:36 +01:00
Fabian Reinartz	5bfa4cdd46	Simplify target update handling. We group providers by their scrape configuration. Each provider produces target groups with a unique identifier. On stopping a set of target providers we cancel the target providers, stop scraping the targets and wait for the scrapers to finish. On configuration reload all provider sets are stopped and new ones are created. This will make targets disappear briefly on configuration reload. Potentially scrapes are missed, but due to the consistent scrape intervals implemented recently, the impact is minor.	2016-03-01 13:48:36 +01:00
Jimmi Dyson	e59b7c15a3	Kubernetes SD: Fix node IP discovery	2016-03-01 12:24:52 +00:00
beorn7	33a50e69f7	Fix a deadlock. Double acquisition of the RLock usually doesn't blow up, but if the write lock is called for between the two RLock's, we are deadlocked. This deadlock does not exist in release-0.17, BTW.	2016-02-29 16:34:29 +01:00
beorn7	fd5108b038	Fix a targetmanager test	2016-02-22 16:43:48 +01:00
Fabian Reinartz	6df1f49c13	Remove fullLabels method and fix target updating. With recent changes to a Target's internal data representation, updating by fullLabels() assigns the additional default instance label. This breaks target identity comparison and causes identical targets from service discovery to be constantly swapped.	2016-02-22 13:06:30 +01:00
Fabian Reinartz	825831e98f	Use fingerprint for target identity comparison. So far we were using the InstanceIdentifier to compare equality of targets. This is not always accurate, for example for the blackbox exporter where the actual target is in the parameter.	2016-02-17 16:34:53 +01:00
Fabian Reinartz	66767121ab	Handle scrape timeout on request. For historic reasons we were enforcing a timeout directly via the TCP dialer. This has not been necessary for quite a while now. Switching to context.Context will allow us to properly terminate requests on shutdown as well.	2016-02-16 11:46:02 +01:00
Julius Volz	293486c7b1	Remove old superfluous calls to setLastScrape(). This is called from within the scrape()->report() flow now. See https://github.com/prometheus/prometheus/pull/1394/files#r52945817	2016-02-15 22:42:24 +01:00
Fabian Reinartz	a0078ec84c	Merge pull request #1394 from prometheus/scraperef2 Refactor and test appender modifications	2016-02-15 21:19:40 +01:00
Fabian Reinartz	463dd3ea06	Refactor target scrape reporting.	2016-02-15 18:06:15 +01:00
Fabian Reinartz	cd28b88b08	Fix wrong EOF error on successful target scraping	2016-02-15 17:23:04 +01:00
Fabian Reinartz	27d71b08d1	Factor out appender wrapping	2016-02-15 16:47:39 +01:00
Fabian Reinartz	fe7e91e2eb	Make scraping offset consistent. To evenly distribute scraping load we currently rely on random jittering. This commit hashes over the target's identity and calculates a consistent offset. This also ensures that scrape intervals are constantly spaced between config/target changes.	2016-02-15 16:46:29 +01:00
Fabian Reinartz	a06bc75519	Remove occurrences of 'base' labels	2016-02-15 10:36:57 +01:00
Fabian Reinartz	0d44248fb8	Cleanup cluttered test data	2016-02-13 10:13:38 +01:00
Fabian Reinartz	65eba080a0	Cleanup internal target data	2016-02-13 10:13:38 +01:00
Julius Volz	9b6d69610a	Fix various typos in comments. Helpfully reported by https://goreportcard.com/report/github.com/prometheus/prometheus :)	2016-02-10 03:47:00 +01:00
Julius Volz	3728b5872f	Fix target update error handling. Fixes https://github.com/prometheus/prometheus/issues/1378	2016-02-08 21:42:59 +01:00
Fabian Reinartz	1f877f3d2a	Fix deadlock, structure target logging	2016-02-03 10:39:34 +01:00
Fabian Reinartz	d0d2c38c68	Fix tests for append API changes	2016-02-03 10:17:08 +01:00
Fabian Reinartz	59f1e722df	Return error on sample appending	2016-02-02 14:01:44 +01:00
Björn Rabenstein	9ea3897ea7	Merge pull request #1354 from prometheus/beorn7/storage Rework the way to communicate backpressure (AKA suspended ingestion)	2016-02-01 15:10:13 +01:00
beorn7	ec08c9a391	Rework the way to communicate backpressure (AKA suspended ingestion). This gives up on the idea to communicate through the Append() call (by either not returning as it is now or returning an error as suggested/explored elsewhere). Here I have added a Throttled() call, which has the advantage that it can be called before a whole _batch_ of Append()'s. Scrapes will happen completely or not at all. Same for rule group evaluations. That's a highly desired behavior (as discussed elsewhere). The code is even simpler now as the whole ingestion buffer could be removed. Logging of throttled mode has been streamlined and will create at most one message per minute.	2016-02-01 14:45:44 +01:00
beorn7	a7408bfb47	Unify duration parsing. It's actually happening in several places (and for flags, we use the standard Go time.Duration...). This at least reduces all our home-grown parsing to one place (in model).	2016-01-29 15:41:50 +01:00
Jimmi Dyson	9faa7515c6	Kubernetes SD: Refactor to handle missing Kubernetes events	2016-01-19 20:49:58 +00:00
Brian Brazil	4a829e63a2	Merge pull request #1299 from PrFalken/master Support AirBnB's Smartstack Nerve client for SD	2016-01-18 13:31:04 +00:00
Julien Dehee	061fe2f364	Support AirBnB's Smartstack Nerve client for SD. nerve's registration format differs from serverset. With this commit there is now a dedicated treecache file in util, and two separate files for serverset and nerve. Reference: https://github.com/airbnb/nerve	2016-01-18 14:07:28 +01:00
Brian Brazil	7a5f019c40	Use up/down in UI for consistency with 'up' metric.	2016-01-12 12:09:20 +00:00
Brian Brazil	6b7629be27	Merge pull request #1242 from tommyulfsparre/watcher-fix Reduces watches in serverset	2015-12-10 10:43:57 +00:00
Jimmi Dyson	c12fb447b8	Kubernetes SD: Use first TCP service port as target port & clean up example config. Fixes #1256	2015-12-08 10:29:40 +00:00
Tommy Ulfsparre	83e09422bf	skip already watched child nodes.	2015-12-02 21:31:05 +01:00
Fabian Reinartz	29a69eecb8	Do not panic in Consul SD creation	2015-11-30 18:41:48 +01:00
Jimmi Dyson	2cca07381b	KubernetesSD: Create targets for services as well as service endpoints	2015-11-18 14:15:39 +00:00
Brian Brazil	427bf29db1	Add in default port after relabelling. For the SNMP and blackbox exporters where the ports tend to not be 80/443 and indeed there may not be a port this makes the relabelling a bit simpler as you don't have to figure out this logic exists and strip off the :80. This is a breaking change for the example configs of those exporters.	2015-11-08 11:42:18 +00:00
Brian Brazil	fd2bd81cd8	Allow all instance labels in target groups. With the blackbox exporter, the instance label will commonly be used for things other than hostnames so remove this restriction. https://example.com or https://example.com/probe/me are some examples. To prevent user error, check that URLs aren't provided as targets when there's no relabelling that could potentially fix them.	2015-11-07 14:35:20 +00:00
Fabian Reinartz	9cad147265	Merge pull request #1172 from federicobaldo/ec2_sd_improvements Minor improvements to ec2 service discovery	2015-11-04 13:02:51 +01:00
Federico Baldo	d14d2429ea	Minor improvements to ec2 sd: 1. static credentials replaced with defaults.DefaultChainCredentials. This change ensures that credentials are sourced from all possible providers available with the aws sdk, in the following order: env variables, shared awsconfig file in user folder, ec2 instance role. 2. Added a few labels: AvailabilityZone, PublicDns, VpcId (if available), SubnetId (if in Vpc)	2015-11-02 14:55:24 +01:00
Jimmi Dyson	87940ec213	Kubernetes SD: Rename `masters` to `api_servers` in config	2015-10-24 14:41:14 +01:00
Jimmi Dyson	7ff5cc66ea	Kubernetes SD authentication options cleanup	2015-10-23 16:47:52 +01:00
Jimmi Dyson	ea9a173008	Kubernetes SD: Use node name as instance label	2015-10-12 21:26:09 +01:00
Julius Volz	d88aea7e6f	Fix SD mechanism source prefix handling. The prefixed target provider changed a pointerized target group that was reused in the wrapped target provider, causing an ever-increasing chain of source prefixes in target groups from the Consul target provider. We now make this bug generally impossible by switching the target group channel from pointer to value type and thus ensuring that target groups are copied before being passed on to other parts of the system. I tried to not let the depointerization leak too far outside of the channel handling (both upstream and downstream) because I tried that initially and caused some nasty bugs, which I want to minimize. Fixes https://github.com/prometheus/prometheus/issues/1083	2015-10-09 14:08:22 +02:00
Julius Volz	dec9fc9c32	Merge pull request #1148 from prometheus/fix-serverset-multiple-paths Fix watching multiple Zookeeper paths in serverset SD.	2015-10-08 19:27:06 +02:00
Matt Jibson	dcb4856d72	Add SD for Amazon EC2 instances	2015-10-06 18:36:17 -04:00
Julius Volz	60cf4015a4	Fix watching multiple Zookeeper paths in serverset SD. Fix https://github.com/prometheus/prometheus/issues/1137	2015-10-06 15:54:54 +02:00
Fabian Reinartz	e3b6ec9784	Switch to common/log	2015-10-03 10:21:43 +02:00
Jimmi Dyson	0d61605526	Kubernetes SD example: separate out cluster level components & services	2015-09-29 11:22:18 +01:00
Julius Volz	99e8fff872	Fix target manager CPU busyloop caused by bad done-channel handling. Unfortunately this isn't nicely testable, as it's timing-dependent and one would have to detect a stray goroutine doing a CPU busyloop... Fixes https://github.com/prometheus/prometheus/issues/1114	2015-09-28 11:51:16 +02:00
Fabian Reinartz	097d810f37	Merge pull request #1120 from prometheus/flaky-test retrieval: Reduce flakiness of TestTargetRunScraperScrapes	2015-09-28 09:57:16 +02:00
Brian Brazil	ba6688bfce	retrieval: Reduce flakiness of TestTargetRunScraperScrapes	2015-09-28 08:34:54 +01:00
Brian Brazil	b03569267e	retrieval: Add URL parameters to fullLabels too. Move all the special cases into one map, rather than spreading the logic around.	2015-09-26 16:59:24 +01:00
Brian Brazil	50258929ac	Retrieval: Show error message for failed test scrape. This is flaky, and I suspect it was due to the I/O timeout that I've already fixed. In case that wasn't it, display the error should it happen again.	2015-09-23 09:24:50 +01:00
Brian Brazil	4bc39dc60e	retrieval: Reduce flakiness of TestTargetManagerChan. This will increase test time by a few hundred ms; this is the 2nd most common cause of flakiness.	2015-09-23 09:00:37 +01:00
Brian Brazil	93145b960a	retrieval: Reduce flakiness of target tests. Bump timeouts of tests where we don't want I/O timeouts. Adjust the full channel test to be much more reliable, by reducing the ingestion timeout from 1ms to 0.	2015-09-22 19:23:36 +01:00
Fabian Reinartz	cac6eea434	Merge pull request #1105 from prometheus/consulnil Fix nil panic on consul error	2015-09-22 14:55:31 +02:00
Fabian Reinartz	327152862c	Update expfmt.NewDecoder usage	2015-09-22 12:11:28 +02:00
Fabian Reinartz	1ce89a4a0b	Fix nil panic on consul error	2015-09-22 09:04:31 +02:00
Julius Volz	af513468eb	Fix some dead code, missing error checks, shadowings. I applied https://medium.com/@jgautheron/quality-pipeline-for-go-projects-497e34d6567 and was greeted with a deluge of warnings, most of which were not applicable or really fixable realistically. These are some of the first ones I decided to fix.	2015-09-14 12:21:34 +02:00
Jimmi Dyson	7ef9399920	Clean up kubernetes http response bodies	2015-09-11 11:44:28 +01:00
Anders Daljord Morken	9fb65a91af	Close HTTP connections on HTTP errors too. Move defer resp.Body.Close() up to make sure it's called even when the HTTP request returns something other than 200 or Decoder construction fails. This avoids leaking and eventually running out of file descriptors.	2015-09-10 22:41:05 +02:00
Fabian Reinartz	8456b7e12f	Use go1.5.1	2015-09-10 12:11:44 +02:00
Jimmi Dyson	a1574aa2b3	Move TLS options to scrape config. Fixes #1013, fixes #989	2015-09-09 09:52:21 +01:00
Julius Volz	b7b7b2e883	Merge pull request #1050 from fabric8io/kubernetes-discovery Kubernetes SD improvements	2015-09-04 14:58:11 +02:00
Jimmi Dyson	d7a7fd4589	Kubernetes SD improvements: * Support multiple masters with retries against each master as required. * Scrape masters' metrics. * Add role meta label for node/service/master to make it easier for relabeling.	2015-09-04 11:31:20 +01:00
Fabian Reinartz	cc1a2a2061	Remove attachment of global labels upon ingestion	2015-09-03 14:16:23 +02:00
Fabian Reinartz	ebf417a282	Fix map initialization	2015-09-01 18:06:22 +02:00
Julius Volz	f63a899744	Change config regexes to full-string matches. This anchors all regular expressions entered via the config to match a full string vs. a substring. THIS IS A BREAKING CHANGE! Fixes part of https://github.com/prometheus/prometheus/issues/996	2015-09-01 15:46:41 +02:00
Fabian Reinartz	542da6774e	Fix draining of file watcher events	2015-08-28 12:17:22 +02:00
Daniel Lundin	4abf54b747	serverset: extract shard number from serverset data	2015-08-27 16:26:00 +02:00
Julius Volz	29eaa8c7cf	Merge pull request #1030 from prometheus/fix-flakey-filesd Fix flakey FileSD test.	2015-08-26 13:25:00 +02:00
Julius Volz	3fd5826589	Fix flakey FileSD test. When the test ends, all files matching the watcher's glob are removed via defer. In that moment, the draining goroutine may still be running and then detect no files matching the configured glob just before the test exits. This is now solved by waiting for the draining goroutine to finish before leaving the test function and thus causing the deferred file removal.	2015-08-26 13:06:34 +02:00
Julius Volz	744d5d5a7a	Merge pull request #1029 from prometheus/vet-fixes Fix "go vet" errors.	2015-08-26 12:50:18 +02:00
Julius Volz	995d3b831d	Fix most golint warnings. This is with `golint -min_confidence=0.5`. I left several lint warnings untouched because they were either incorrect or I felt it was better not to change them at the moment.	2015-08-26 12:44:46 +02:00
Julius Volz	963ad82dcb	Fix "go vet" errors. I ignored all errors of the type "composite literal uses unkeyed fields". Most of them are wrong because of https://github.com/golang/go/issues/9171.	2015-08-26 02:05:04 +02:00
Fabian Reinartz	6664b77f36	Merge pull request #1021 from prometheus/appenders move metric modifications into SampleAppenders	2015-08-25 17:47:55 +02:00
Fabian Reinartz	01834fa528	Move metric modifications into SampleAppenders	2015-08-25 15:32:37 +02:00
Fabian Reinartz	d6d88f8950	Add missing license headers	2015-08-24 19:19:21 +02:00
Julius Volz	d36a7f4e6f	Fix busylooping in case of no target providers. merge() closes the channel that handleUpdates() reads from when there are zero configured target providers in the configuration. In that case, the for-select loop in handleUpdates() entered a busy loop. It should exit when the upstream channel is closed.	2015-08-24 16:42:28 +02:00
Fabian Reinartz	3a0145c09e	Reenable blocked appending tests	2015-08-22 09:47:57 +02:00
Fabian Reinartz	438e232c9b	Fix grouping of import blocks	2015-08-22 09:42:45 +02:00
Fabian Reinartz	6d0f58dcf3	sanitize scrape health recording code	2015-08-21 23:01:08 +02:00
Fabian Reinartz	25bf5fdaf5	Timeout sample appends	2015-08-21 18:04:35 +02:00
Fabian Reinartz	11a577fcd0	Switch to common/expfmt for extraction	2015-08-21 13:33:38 +02:00
Fabian Reinartz	306e8468a0	Switch from client_golang/model to common/model	2015-08-21 13:33:38 +02:00
Sharif Nassar	6cb519fe82	Add Consul ServiceID to the discovery meta labels.	2015-08-20 14:04:42 -07:00
Fabian Reinartz	0f5022c091	Add missing Kubernetes doc strings	2015-08-18 14:37:28 +02:00
Fabian Reinartz	f592740bac	Only exit static target provider on done	2015-08-18 11:51:53 +02:00
Julius Volz	b4adf2723d	Merge pull request #994 from robbiet480/consul-datacenter-name Pass through current agent Consul datacenter name	2015-08-18 01:09:24 +02:00

354 commits