prometheus

mirror of https://github.com/prometheus/prometheus.git synced 2024-11-10 15:44:05 -08:00

Author	SHA1	Message	Date
Fabian Reinartz	76076bfb47	discovery: simplify client initialization	2016-04-30 21:07:49 +02:00
Fabian Reinartz	b5bfb502df	discovery: properly check context on chan send	2016-04-30 11:57:20 +02:00
Fabian Reinartz	9f8feb9ff6	discovery: consolidate Marathon SD files	2016-04-30 11:56:11 +02:00
Fabian Reinartz	086f7caceb	discovery: extract Consul shouldWatch logic	2016-04-30 11:50:19 +02:00
Fabian Reinartz	e805e68c01	discovery: sanitize Consul service discovery This commit simplifies the SD's structure and ensures that all channel sends are checked against a canceled context.	2016-04-30 11:50:19 +02:00
Fabian Reinartz	5837e6a97f	discovery: move consul SD into own package	2016-04-25 16:56:27 +02:00
beorn7	d566808d40	Bring back logging of discarded samples But only on DEBUG level. Also, count and report the two cases of out-of-order timestamps on the one hand and same timestamp but different value on the other hand separately.	2016-04-25 16:43:52 +02:00
Fabian Reinartz	585ab6b163	Merge pull request #1494 from iamseth/master Add discovery capability for Microsoft Azure	2016-04-21 13:49:44 +02:00
Jonathan Boulle	38098f8c95	Add missing license headers Prometheus is Apache 2 licensed, and most source files have the appropriate copyright license header, but some were missing it without apparent reason. Correct that by adding it.	2016-04-13 16:08:22 +02:00
Seth Miller	0988e3b937	Add support for Azure discovery This change adds the ability to do target discovery with Microsoft's Azure platform.	2016-04-06 22:47:02 -05:00
Fabian Reinartz	769389e559	Fix potential race in ctx initialization	2016-04-05 20:27:31 +02:00
Tobias Schmidt	e82ef154ee	Remove unused code leftovers	2016-04-02 20:20:55 -04:00
stuart nelson	dbe5d18b6e	Instrument scrape pool `sync()` Instruments: - duration - count	2016-03-14 18:30:16 +01:00
stuart nelson	813f61e551	Merge pull request #1484 from prometheus/instrument-retrieval Instrument retrieval/scrape.go	2016-03-11 12:26:00 +01:00
stuart nelson	a1ee77601a	Instrument the duration of the `reload` function	2016-03-11 12:12:42 +01:00
Fabian Reinartz	895f2f092f	Fix flaky scrape test t	2016-03-09 16:00:33 +01:00
Fabian Reinartz	f2e359962c	Sort exported targets	2016-03-08 17:12:27 +01:00
Fabian Reinartz	56fc9bdff3	Handle closed target provider channel This fixes the case where a target provider closes the update channel and exits before the context is canceled. This should only be true for the static provider but it's safer to generally handle this case.	2016-03-08 15:49:03 +01:00
beorn7	d44b83690e	Fix flaky file-sd test	2016-03-07 15:39:18 +01:00
Fabian Reinartz	ddc74f712b	Add sortable target list	2016-03-02 09:10:20 +01:00
Fabian Reinartz	499f4af4aa	Test target URL	2016-03-01 14:49:57 +01:00
Fabian Reinartz	50c2f20756	Add targetScraper tests	2016-03-01 14:33:28 +01:00
Fabian Reinartz	1ede7b9d72	Consolidate TargetStatus into Target. This commit simplifies the TargetHealth type and moves the target status into the target itself. This also removes a race where error and last scrape time could have been out of sync.	2016-03-01 14:33:21 +01:00
Fabian Reinartz	2060a0a15b	Turn target group members into plain lists. As the scrape pool deduplicates targets now, it is no longer necessary to store a hash map for members of each group.	2016-03-01 14:33:12 +01:00
Fabian Reinartz	0d7105abee	Remove scrape config from Target. This commit removes the scrapeConfig entirely from Target. All identity defining parameters are thus immutable now and the mutex can be removed. Target identity is now correctly defined by the labels and the full URL. This in particular includes URL parameters that are not specified in the label set. Fingerprint is also removed from hash to remove an unnecessary tight coupling to the common/model package.	2016-03-01 14:32:57 +01:00
Fabian Reinartz	75681b691a	Extract HTTP client from Target. The HTTP client is the same across all targets with the same scrape configuration. Thus, this commit moves it into the scrape pool.	2016-03-01 14:31:57 +01:00
Fabian Reinartz	9bea27ae8a	Add scraping tests	2016-03-01 14:00:48 +01:00
Fabian Reinartz	76a8c6160d	Deduplicate targets in scrape pool. With this commit the scrape pool deduplicates incoming targets before scraping them. This way multiple target providers can produce the same target but it will be scraped only once.	2016-03-01 13:50:51 +01:00
Fabian Reinartz	84f74b9a84	Apply new scrape config on reload. This commit updates a target set's scrape configuration on reload. This will cause all running scrape loops to be stopped and started again with new parameters.	2016-03-01 13:50:51 +01:00
Fabian Reinartz	02f635dc24	Remove interval/timeout from Target internals	2016-03-01 13:50:51 +01:00
Fabian Reinartz	775316f8d2	Move appender construction from Target to scrapePool	2016-03-01 13:50:51 +01:00
Fabian Reinartz	fbe251c2df	Fix scrape interval length calculation	2016-03-01 13:48:36 +01:00
Fabian Reinartz	1a3253e8ed	Make scrape time unambiguous. This commit changes the scraper interface to accept a timestamp so the timestamp reported by the caller and the timestamp attached to samples do not differ.	2016-03-01 13:48:36 +01:00
Fabian Reinartz	2bb8ef99d1	Test scrape loop behavior.	2016-03-01 13:48:36 +01:00
Fabian Reinartz	c7bbe95597	Remove outdated target tests	2016-03-01 13:48:36 +01:00
Fabian Reinartz	05de8b7f8d	Extract target scraping into scrape loop. This commit factors out the scrape loop handling into its own data structure. For the transition it will be directly attached to the target.	2016-03-01 13:48:36 +01:00
Fabian Reinartz	cebba3efbb	Simplify and fix TargetManager reloading	2016-03-01 13:48:36 +01:00
Fabian Reinartz	da99366f85	Consolidate Target.Update into constructor. The Target.Update method is no longer needed.	2016-03-01 13:48:36 +01:00
Fabian Reinartz	d15adfc917	Preserve target state across reloads. This commit moves Scraper handling into a separate scrapePool type. TargetSets only manage TargetProvider lifecycles and sync the retrieved updates to the scrapePool. TargetProviders are now expected to send a full initial target set within 5 seconds. The scrapePools preserve target state across reloads and only drop targets after the initial set was synced.	2016-03-01 13:48:36 +01:00
Fabian Reinartz	5b30bdb610	Change TargetProvider interface. This commit changes the TargetProvider interface to use a context.Context and send lists of TargetGroups, rather than single ones.	2016-03-01 13:48:36 +01:00
Fabian Reinartz	bb6dc3ff78	Remove old tests	2016-03-01 13:48:36 +01:00
Fabian Reinartz	5bfa4cdd46	Simplify target update handling. We group providers by their scrape configuration. Each provider produces target groups with a unique identifier. On stopping a set of target providers we cancel the target providers, stop scraping the targets and wait for the scrapers to finish. On configuration reload all provider sets are stopped and new ones are created. This will make targets disappear briefly on configuration reload. Potentially scrapes are missed but due to the consistent scrape intervals implemented recently, the impact is minor.	2016-03-01 13:48:36 +01:00
Jimmi Dyson	e59b7c15a3	Kubernetes SD: Fix node IP discovery	2016-03-01 12:24:52 +00:00
beorn7	33a50e69f7	Fix a deadlock Double acquisition of the RLock usually doesn't blow up, but if the write lock is called for between the two RLock's, we are deadlocked. This deadlock does not exist in release-0.17, BTW.	2016-02-29 16:34:29 +01:00
beorn7	fd5108b038	Fix a targetmanager test	2016-02-22 16:43:48 +01:00
Fabian Reinartz	6df1f49c13	Remove fullLabels method and fix target updating With recent changes to a Target's internal data representation updating by fullLabels() assigns the additional default instance label. This breaks target identity comparison and causes identical targets from service discovery to be constantly swapped.	2016-02-22 13:06:30 +01:00
Fabian Reinartz	825831e98f	Use fingerprint for target identity comparison So far we were using the InstanceIdentifier to compare equality of targets. This is not always accurate, for example for the blackbox exporter where the actual target is in the parameter.	2016-02-17 16:34:53 +01:00
Fabian Reinartz	66767121ab	Handle scrape timeout on request. For historic reasons we were enforcing a timeout directly via the TCP dialer. This is no longer necessary for quite a while now. Switching to context.Context will allow us to properly terminate requests on shutdown as well.	2016-02-16 11:46:02 +01:00
Julius Volz	293486c7b1	Remove old superfluous calls to setLastScrape(). This is called from within the scrape()->report() flow now. See https://github.com/prometheus/prometheus/pull/1394/files#r52945817	2016-02-15 22:42:24 +01:00
Fabian Reinartz	a0078ec84c	Merge pull request #1394 from prometheus/scraperef2 Refactor and test appender modifications	2016-02-15 21:19:40 +01:00
Fabian Reinartz	463dd3ea06	Refactor target scrape reporting.	2016-02-15 18:06:15 +01:00
Fabian Reinartz	cd28b88b08	Fix wrong EOF error on successful target scraping	2016-02-15 17:23:04 +01:00
Fabian Reinartz	27d71b08d1	Factor out appender wrapping	2016-02-15 16:47:39 +01:00
Fabian Reinartz	fe7e91e2eb	Make scraping offset consistent. To evenly distribute scraping load we currently rely on random jittering. This commit hashes over the target's identity and calculates a consistent offset. This also ensures that scrape intervals are constantly spaced between config/target changes.	2016-02-15 16:46:29 +01:00
Fabian Reinartz	a06bc75519	Remove occurrences of 'base' labels	2016-02-15 10:36:57 +01:00
Fabian Reinartz	0d44248fb8	Cleanup cluttered test data	2016-02-13 10:13:38 +01:00
Fabian Reinartz	65eba080a0	Cleanup internal target data	2016-02-13 10:13:38 +01:00
Julius Volz	9b6d69610a	Fix various typos in comments. Helpfully reported by https://goreportcard.com/report/github.com/prometheus/prometheus :)	2016-02-10 03:47:00 +01:00
Julius Volz	3728b5872f	Fix target update error handling. Fixes https://github.com/prometheus/prometheus/issues/1378	2016-02-08 21:42:59 +01:00
Fabian Reinartz	1f877f3d2a	Fix deadlock, structure target logging	2016-02-03 10:39:34 +01:00
Fabian Reinartz	d0d2c38c68	Fix tests for append API changes	2016-02-03 10:17:08 +01:00
Fabian Reinartz	59f1e722df	Return error on sample appending	2016-02-02 14:01:44 +01:00
Björn Rabenstein	9ea3897ea7	Merge pull request #1354 from prometheus/beorn7/storage Rework the way to communicate backpressure (AKA suspended ingestion)	2016-02-01 15:10:13 +01:00
beorn7	ec08c9a391	Rework the way to communicate backpressure (AKA suspended ingestion) This gives up on the idea to communicate through the Append() call (by either not returning as it is now or returning an error as suggested/explored elsewhere). Here I have added a Throttled() call, which has the advantage that it can be called before a whole _batch_ of Append()'s. Scrapes will happen completely or not at all. Same for rule group evaluations. That's a highly desired behavior (as discussed elsewhere). The code is even simpler now as the whole ingestion buffer could be removed. Logging of throttled mode has been streamlined and will create at most one message per minute.	2016-02-01 14:45:44 +01:00
beorn7	a7408bfb47	Unify duration parsing It's actually happening in several places (and for flags, we use the standard Go time.Duration...). This at least reduces all our home-grown parsing to one place (in model).	2016-01-29 15:41:50 +01:00
Jimmi Dyson	9faa7515c6	Kubernetes SD: Refactor to handle missing Kubernetes events	2016-01-19 20:49:58 +00:00
Brian Brazil	4a829e63a2	Merge pull request #1299 from PrFalken/master Support AirBnB's Smartstack Nerve client for SD	2016-01-18 13:31:04 +00:00
Julien Dehee	061fe2f364	Support AirBnB's Smartstack Nerve client for SD nerve's registration format differs from serverset. With this commit there is now a dedicated treecache file in util, and two separate files for serverset and nerve. Reference: https://github.com/airbnb/nerve	2016-01-18 14:07:28 +01:00
Brian Brazil	7a5f019c40	Use up/down in UI for consistency with 'up' metric.	2016-01-12 12:09:20 +00:00
Brian Brazil	6b7629be27	Merge pull request #1242 from tommyulfsparre/watcher-fix Reduces watches in serverset	2015-12-10 10:43:57 +00:00
Jimmi Dyson	c12fb447b8	Kubernetes SD: Use first TCP service port as target port & clean up example config Fixes #1256	2015-12-08 10:29:40 +00:00
Tommy Ulfsparre	83e09422bf	skip already watched child nodes.	2015-12-02 21:31:05 +01:00
Fabian Reinartz	29a69eecb8	Do not panic in Consul SD creation	2015-11-30 18:41:48 +01:00
Jimmi Dyson	2cca07381b	KubernetesSD: Create targets for services as well as service endpoints	2015-11-18 14:15:39 +00:00
Brian Brazil	427bf29db1	Add in default port after relabelling. For the SNMP and blackbox exporters where the ports tends to not be 80/443 and indeed there may not be a port this makes the relabelling a bit simpler as you don't have to figure out this logic exists and strip off the :80. This is a breaking change for the example configs of those exporters.	2015-11-08 11:42:18 +00:00
Brian Brazil	fd2bd81cd8	Allow all instance labels in target groups With the blackbox exporter, the instance label will commonly be used for things other than hostnames so remove this restriction. https://example.com or https://example.com/probe/me are some examples. To prevent user error, check that urls aren't provided as targets when there's no relabelling that could potentially fix them.	2015-11-07 14:35:20 +00:00
Fabian Reinartz	9cad147265	Merge pull request #1172 from federicobaldo/ec2_sd_improvements Minor improvements to ec2 service discovery	2015-11-04 13:02:51 +01:00
Federico Baldo	d14d2429ea	Minor improvements to ec2 sd: 1. static credentials replaced with defaults.DefaultChainCredentials. This change ensures that credentials are sourced from all possible providers available with the aws sdk, in the following order: env variables, shared awsconfig file in user folder, ec2 instance role. 2. Added a few labels: AvailabilityZone, PublicDns, VpcId (if available), SubnetId (if in Vpc)	2015-11-02 14:55:24 +01:00
Jimmi Dyson	87940ec213	Kubernetes SD: Rename `masters` to `api_servers` in config	2015-10-24 14:41:14 +01:00
Jimmi Dyson	7ff5cc66ea	Kubernetes SD authentication options cleanup	2015-10-23 16:47:52 +01:00
Jimmi Dyson	ea9a173008	Kubernetes SD: Use node name as instance label	2015-10-12 21:26:09 +01:00
Julius Volz	d88aea7e6f	Fix SD mechanism source prefix handling. The prefixed target provider changed a pointerized target group that was reused in the wrapped target provider, causing an ever-increasing chain of source prefixes in target groups from the Consul target provider. We now make this bug generally impossible by switching the target group channel from pointer to value type and thus ensuring that target groups are copied before being passed on to other parts of the system. I tried to not let the depointerization leak too far outside of the channel handling (both upstream and downstream) because I tried that initially and caused some nasty bugs, which I want to minimize. Fixes https://github.com/prometheus/prometheus/issues/1083	2015-10-09 14:08:22 +02:00
Julius Volz	dec9fc9c32	Merge pull request #1148 from prometheus/fix-serverset-multiple-paths Fix watching multiple Zookeeper paths in serverset SD.	2015-10-08 19:27:06 +02:00
Matt Jibson	dcb4856d72	Add SD for Amazon EC2 instances	2015-10-06 18:36:17 -04:00
Julius Volz	60cf4015a4	Fix watching multiple Zookeeper paths in serverset SD. Fix https://github.com/prometheus/prometheus/issues/1137	2015-10-06 15:54:54 +02:00
Fabian Reinartz	e3b6ec9784	Switch to common/log	2015-10-03 10:21:43 +02:00
Jimmi Dyson	0d61605526	Kubernetes SD example: separate out cluster level components & services	2015-09-29 11:22:18 +01:00
Julius Volz	99e8fff872	Fix target manager CPU busyloop caused by bad done-channel handling. Unfortunately this isn't nicely testable, as it's timing-dependent and one would have to detect a stray goroutine doing a CPU busyloop... Fixes https://github.com/prometheus/prometheus/issues/1114	2015-09-28 11:51:16 +02:00
Fabian Reinartz	097d810f37	Merge pull request #1120 from prometheus/flaky-test retrieval: Reduce flakiness of TestTargetRunScraperScrapes	2015-09-28 09:57:16 +02:00
Brian Brazil	ba6688bfce	retrieval: Reduce flakiness of TestTargetRunScraperScrapes	2015-09-28 08:34:54 +01:00
Brian Brazil	b03569267e	retrieval: Add URL parameters to fullLabels too Move all the special cases into one map, rather than spreading the logic around.	2015-09-26 16:59:24 +01:00
Brian Brazil	50258929ac	Retrieval: Show error message for failed test scrape This is flaky, and I suspect it was due to the I/O timeout that I've already fixed. In case that wasn't it, display the error should it happen again.	2015-09-23 09:24:50 +01:00
Brian Brazil	4bc39dc60e	retrieval: Reduce flakiness of TestTargetManagerChan This will increase test time by a few hundred ms, this is the 2nd most common cause of flakiness.	2015-09-23 09:00:37 +01:00
Brian Brazil	93145b960a	retrieval: Reduce flakiness of target tests Bump timeouts of tests where we don't want I/O timeouts. Adjust the full channel test to be much more reliable, by reducing the ingestion timeout from 1ms to 0.	2015-09-22 19:23:36 +01:00
Fabian Reinartz	cac6eea434	Merge pull request #1105 from prometheus/consulnil Fix nil panic on consul error	2015-09-22 14:55:31 +02:00
Fabian Reinartz	327152862c	Update expfmt.NewDecoder usage	2015-09-22 12:11:28 +02:00
Fabian Reinartz	1ce89a4a0b	Fix nil panic on consul error	2015-09-22 09:04:31 +02:00
Julius Volz	af513468eb	Fix some dead code, missing error checks, shadowings. I applied https://medium.com/@jgautheron/quality-pipeline-for-go-projects-497e34d6567 and was greeted with a deluge of warnings, most of which were not applicable or really fixable realistically. These are some of the first ones I decided to fix.	2015-09-14 12:21:34 +02:00
Jimmi Dyson	7ef9399920	Clean up kubernetes http response bodies	2015-09-11 11:44:28 +01:00
Anders Daljord Morken	9fb65a91af	Close HTTP connections on HTTP errors too. Move defer resp.Body.Close() up to make sure it's called even when the HTTP request returns something other than 200 or Decoder construction fails. This avoids leaking and eventually running out of file descriptors.	2015-09-10 22:41:05 +02:00
Fabian Reinartz	8456b7e12f	Use go1.5.1	2015-09-10 12:11:44 +02:00
Jimmi Dyson	a1574aa2b3	Move TLS options to scrape config Fixes #1013, fixes #989	2015-09-09 09:52:21 +01:00
Julius Volz	b7b7b2e883	Merge pull request #1050 from fabric8io/kubernetes-discovery Kubernetes SD improvements	2015-09-04 14:58:11 +02:00
Jimmi Dyson	d7a7fd4589	Kubernetes SD improvements * Support multiple masters with retries against each master as required. * Scrape masters' metrics. * Add role meta label for node/service/master to make it easier for relabeling.	2015-09-04 11:31:20 +01:00
Fabian Reinartz	cc1a2a2061	Remove attachment of global labels upon ingestion	2015-09-03 14:16:23 +02:00
Fabian Reinartz	ebf417a282	Fix map initialization	2015-09-01 18:06:22 +02:00
Julius Volz	f63a899744	Change config regexes to full-string matches. This anchors all regular expressions entered via the config to match a full string vs. a substring. THIS IS A BREAKING CHANGE! Fixes part of https://github.com/prometheus/prometheus/issues/996	2015-09-01 15:46:41 +02:00
Fabian Reinartz	542da6774e	Fix draining of file watcher events	2015-08-28 12:17:22 +02:00
Daniel Lundin	4abf54b747	serverset: extract shard number from serverset data	2015-08-27 16:26:00 +02:00
Julius Volz	29eaa8c7cf	Merge pull request #1030 from prometheus/fix-flakey-filesd Fix flakey FileSD test.	2015-08-26 13:25:00 +02:00
Julius Volz	3fd5826589	Fix flakey FileSD test. When the test ends, all files matching the watcher's glob are removed via defer. In that moment, the draining goroutine may still be running and then detect no files matching the configured glob just before the test exits. This is now solved by waiting for the draining goroutine to finish before leaving the test function and thus causing the deferred file removal.	2015-08-26 13:06:34 +02:00
Julius Volz	744d5d5a7a	Merge pull request #1029 from prometheus/vet-fixes Fix "go vet" errors.	2015-08-26 12:50:18 +02:00
Julius Volz	995d3b831d	Fix most golint warnings. This is with `golint -min_confidence=0.5`. I left several lint warnings untouched because they were either incorrect or I felt it was better not to change them at the moment.	2015-08-26 12:44:46 +02:00
Julius Volz	963ad82dcb	Fix "go vet" errors. I ignored all errors of the type "composite literal uses unkeyed fields". Most of them are wrong because of https://github.com/golang/go/issues/9171.	2015-08-26 02:05:04 +02:00
Fabian Reinartz	6664b77f36	Merge pull request #1021 from prometheus/appenders move metric modifications into SampleAppenders	2015-08-25 17:47:55 +02:00
Fabian Reinartz	01834fa528	Move metric modifications into SampleAppenders	2015-08-25 15:32:37 +02:00
Fabian Reinartz	d6d88f8950	Add missing license headers	2015-08-24 19:19:21 +02:00
Julius Volz	d36a7f4e6f	Fix busylooping in case of no target providers. merge() closes the channel that handleUpdates() reads from when there are zero configured target providers in the configuration. In that case, the for-select loop in handleUpdates() entered a busy loop. It should exit when the upstream channel is closed.	2015-08-24 16:42:28 +02:00
Fabian Reinartz	3a0145c09e	Reenable blocked appending tests	2015-08-22 09:47:57 +02:00
Fabian Reinartz	438e232c9b	Fix grouping of import blocks	2015-08-22 09:42:45 +02:00
Fabian Reinartz	6d0f58dcf3	sanitize scrape health recording code	2015-08-21 23:01:08 +02:00
Fabian Reinartz	25bf5fdaf5	Timeout sample appends	2015-08-21 18:04:35 +02:00
Fabian Reinartz	11a577fcd0	Switch to common/expfmt for extraction	2015-08-21 13:33:38 +02:00
Fabian Reinartz	306e8468a0	Switch from client_golang/model to common/model	2015-08-21 13:33:38 +02:00
Sharif Nassar	6cb519fe82	Add Consul ServiceID to the discovery meta labels.	2015-08-20 14:04:42 -07:00
Fabian Reinartz	0f5022c091	Add missing Kubernetes doc strings	2015-08-18 14:37:28 +02:00
Fabian Reinartz	f592740bac	Only exit static target provider on done	2015-08-18 11:51:53 +02:00
Julius Volz	b4adf2723d	Merge pull request #994 from robbiet480/consul-datacenter-name Pass through current agent Consul datacenter name	2015-08-18 01:09:24 +02:00
Robbie Trencheny	48e461f7db	Pass through current agent Consul datacenter name Instead of only filling __meta_consul_dc when datacenter is set in consul_sd_config this change fills the label based on what the agent reports its current data center is, if datacenter isn't manually set, otherwise it uses whatever datacenter was set to.	2015-08-17 16:00:26 -07:00
Fabian Reinartz	d0a90964c1	Fix license header	2015-08-17 19:51:12 +02:00
Fabian Reinartz	eabbdc6603	Add missing license headers	2015-08-17 19:49:10 +02:00
Julius Volz	47a96bff1a	Update constant names in comments.	2015-08-17 15:05:06 +02:00
Brian Brazil	e1d5eb52f2	retrieval: Don't include unmatched source of regex in replacement. ReplaceAllString only replaces the matching part of the regex, the unmatched bits around it are left in place. This is not the expected or desired behaviour as the replacement string should be everything. This may break users dependent on this behaviour, but what they're doing is still possible.	2015-08-17 00:31:56 +01:00
Fabian Reinartz	3c6dd161d7	Scrape all services on empty services list.	2015-08-14 17:39:41 +02:00
Fabian Reinartz	9b9ff66212	Merge pull request #977 from prometheus/fabxc/target-dedup Improve target discovery pipeline	2015-08-14 16:38:16 +02:00
Fabian Reinartz	8fa9ec278b	Add application labels as meta labels Removes built-in conditional scraping based on application's 'prometheus' label.	2015-08-14 15:34:02 +02:00
Fabian Reinartz	f269943950	Adjust Kubernetes SD to pipeline changes	2015-08-14 13:30:27 +02:00
Fabian Reinartz	4e84b86510	Improve target discovery pipeline Replace the TargetProvider Stop method with done channels that ensure properly broadcasted shutdown of the whole pipeline.	2015-08-14 13:30:27 +02:00
Fabian Reinartz	15b4115a25	Merge pull request #986 from prometheus/fabxc/tpdoc Clarify docs of TargetProvider	2015-08-14 12:07:15 +02:00
Fabian Reinartz	625374ee36	Clarify docs of TargetProvider	2015-08-14 12:02:22 +02:00
Fabian Reinartz	f7e3722388	Rename __meta_dns_srv_name to __meta_dns_name This change potentially breaks relabeling rules.	2015-08-13 17:02:56 +02:00
Fabian Reinartz	b964da4b75	Merge pull request #905 from fabric8io/kubernetes-discovery Kubernetes discovery	2015-08-13 15:08:32 +02:00
Fabian Reinartz	24e91720ad	Merge pull request #980 from prometheus/map-labels Retrieval: Add relabel action to map labels names with a regex.	2015-08-13 14:36:59 +02:00
Brian Brazil	4e70a0a14e	Retrieval: Add relabel action to map label names with a regex. The intended use case is where a user has tags/labels coming from metadata in Kubernetes or EC2, and wants to make some subset of them into target labels.	2015-08-13 13:19:11 +01:00
Jimmi Dyson	923f8111d4	Initial Kubernetes discovery Fixes #904	2015-08-13 10:38:52 +01:00
Miek Gieben	caaa3de4ff	Make HashMod use MD5 instead of FNV MD5 will distribute the inputs more uniformly over the output space than FNV, leading to more evenly balanced load when using HashMod.	2015-08-13 09:42:07 +01:00
Fabian Reinartz	0138d37458	Improve unique target group sources. Include position of same SD mechanisms within the same scrape configuration. Move unique prefixing out of SD implementations and target manager into its own interface.	2015-08-10 11:29:09 +02:00
Fabian Reinartz	54202bc5a8	Merge pull request #902 from xperimental/feature/marathon-discovery retrieval/discovery: Service discovery using marathon API	2015-08-10 01:43:37 +02:00
Robert Jacob	4d0f974c42	Add service discovery using Marathon API.	2015-08-10 01:36:24 +02:00
Will Rouesnel	7810448dbe	Add proxy_url parameter to allow specifying per-job HTTP proxy servers Allow scrape_configs to have an optional proxy_url option which specifies a proxy to be used for all connections to hosts in that config. Internally this modifies the various client functions to take a *url.URL pointer which currently must point to an HTTP proxy (but has been left open-ended to allow the url format to be extended to support others, such as maybe SOCKS if needed).	2015-08-08 04:29:27 +10:00
Jimmi Dyson	da4c50a6cf	Make scheme relabelable via discovery	2015-08-06 12:00:33 +01:00
Jimmi Dyson	52cf6b3e6e	Configuration options for bearer tokens, client certs & CA certs Fixes #918, fixes #917	2015-08-04 17:18:46 +01:00
Florian Pfitzer	1fa0b0f253	fix consul port label	2015-07-31 16:20:17 +00:00
Brian Brazil	adf7f16d1a	Merge pull request #934 from prometheus/query-params Retrieval: Make it possible to relabel query params	2015-07-31 11:01:45 +01:00
Brian Brazil	d8875d17d8	Retrieval: Make it possible to relabel query params This only allows relabelling the first value for a given parameter, this should be sufficient in practice.	2015-07-31 10:09:28 +01:00
Johannes 'fish' Ziemke	6e7d743cd4	Merge pull request #946 from prometheus/add-sd-dns-a Add support for A record based DNS SD	2015-07-30 16:01:47 +02:00
Johannes 'fish' Ziemke	9ab340e95e	Add support for A record based DNS SD If using A records, the user needs to specify "port" and set "type" to "A".	2015-07-30 15:55:38 +02:00
beorn7	645f6772e5	Add Consul Address, ServicePort, and ServiceAddress to the meta labels. In setups where the ServiceAddress is the relevant address for scraping, users can relabel the `__address__` label to ServiceAddress + ":" + ServicePort. This needs to be documented, of course. Will do once this is LGTM'd.	2015-07-22 18:19:13 +02:00
Julius Volz	9d98910fca	Revert "Use Consul ServiceAddress instead of Address when set" This reverts commit `0ac7e7217e`. See discussion on https://github.com/prometheus/prometheus/pull/812 for reasoning. While fixing one use case, it breaks others, and we need a more generic way of handling this.	2015-07-22 13:04:29 +02:00
Fabian Reinartz	d53cc7935d	retrieval: avoid race conditions	2015-07-08 21:27:52 +02:00
Brian Brazil	3d268d681e	retrieval: Handle serverset node not existing. This stops configuration loading hanging if the Znode doesn't exist, and retries until the node does exist.	2015-07-01 13:56:31 +01:00
Fabian Reinartz	080e067601	Merge pull request #832 from prometheus/fabxc/target-test retrieval: double timeout in target scrape test.	2015-06-25 17:23:52 +02:00
Brian Brazil	52859b8033	Merge pull request #836 from prometheus/shard Add 'hashmod' relabel action.	2015-06-24 21:40:10 +01:00
Brian Brazil	682f949ab1	Add 'hashmod' relabel action. This takes the modulus of a hash of some labels. Combined with a keep relabel action, this allows for sharding of targets across multiple prometheus servers.	2015-06-24 21:14:53 +01:00
Fabian Reinartz	23862c92c4	retrieval/discovery: refresh services in Consul to recover from missing events.	2015-06-24 17:48:27 +02:00
Fabian Reinartz	c292979374	retrieval: double timeout in target scrape test.	2015-06-23 21:59:55 +02:00
Julius Volz	d868264bb8	Improve UI of /alerts page. Changes to the UI: - "Active Since" timestamps are now human-readable. - Alerting rules are now pretty-printed better. - Labels are no longer just strings, but alert bubbles (like we do on the status page for base labels). - Alert states and target health states are now capitalized in the presentation layer rather than at the source.	2015-06-23 18:48:45 +02:00
Fabian Reinartz	53b9d5917d	web: improve target URL handling and display.	2015-06-23 13:45:15 +02:00
Fabian Reinartz	dc7d27ab9a	retrieval: add honor label handling and parametrized querying. This commit adds the honor_labels and params arguments to the scrape config. This allows to specify query parameters used by the scrapers and handling scraped labels with precedence.	2015-06-23 13:45:14 +02:00
Fabian Reinartz	459d18cf18	Merge pull request #812 from Marmelatze/consul_services Use Consul ServiceAddress instead of Address when set	2015-06-17 20:10:52 +02:00
Florian Pfitzer	0ac7e7217e	Use Consul ServiceAddress instead of Address when set	2015-06-17 15:39:42 +02:00
Brian Brazil	4d895242f9	Add support for Zookeeper Serversets for SD. It can discover an entire tree of serversets, or just one.	2015-06-16 11:02:08 +01:00
Brian Brazil	0dbae36d36	Allow ingested metrics to be relabeled. The main purpose of this is to allow for blacklisting of expensive metrics as a tactical option. It could also find uses for renaming and removing labels from federation.	2015-06-13 15:18:27 +01:00
Brian Brazil	58ceae82bc	Revert "Allow ingested metrics to be relabeled." This reverts commit `f2f26ca08f`. Was accidentally pushed to master instead of a branch for PR.	2015-06-12 22:12:26 +01:00
Brian Brazil	f2f26ca08f	Allow ingested metrics to be relabeled. The main purpose of this is to allow for blacklisting of expensive metrics as a tactical option. It could also find uses for renaming and removing labels from federation.	2015-06-12 22:06:30 +01:00
Fabian Reinartz	b5fe2e9afe	Merge pull request #773 from prometheus/fabxc/simple-cfg config: simplify default config handling.	2015-06-08 16:22:06 +02:00
Brian Brazil	b8b1d3cbac	Web: Add pre-relabel labels to status page. Figuring out what's going on with the new service discovery and labels is difficult. Add a popover with the labels to the target table to make things simpler, and help discovery of potentially useful labels.	2015-06-08 12:19:01 +01:00
Fabian Reinartz	0af1cff8af	config: simplify default config handling.	2015-06-06 09:04:04 +02:00
Fabian Reinartz	8214b4ee78	retrieval/discovery: surround __meta_consul_tags value with tag separators.	2015-06-05 19:18:34 +02:00
Fabian Reinartz	280d11dca8	main: exit on invalid rule files on startup.	2015-06-02 18:44:41 +02:00
Fabian Reinartz	0de6edbdfc	Move pkg/ to util/	2015-06-01 21:12:32 +02:00
Fabian Reinartz	dfaf31a1da	Move web/httputils to pkg/httputil and add DeadlineClient to it	2015-06-01 21:12:31 +02:00
Fabian Reinartz	a4f179230a	Merge pull request #744 from prometheus/fabxc/fix-labels Fix discarding of labels in file target groups	2015-05-27 19:57:15 +02:00
Fabian Reinartz	e9b344abee	Fix discarding of labels in file target groups	2015-05-27 18:52:44 +02:00
Fabian Reinartz	8b7e5f9184	Stop holding TargetManager lock when stopping components. TargetProviders may flush some last changes to the target manager before actually stopping. To properly read those from the channel the target manager must not be locked while stopping a provider.	2015-05-27 12:41:37 +02:00
Brian Brazil	f34de493d5	Add increase() function, to replace delta(..., 1). This calculates how much a counter increases over a given period of time, which is the area under the curve of its rate. increase(x[5m]) is equivalent to rate(x[5m]) * 300.	2015-05-26 22:49:21 +01:00
Fabian Reinartz	efb39cfd4e	Fix file SD test	2015-05-23 21:20:39 +02:00
Julius Volz	267fd34156	Switch Prometheus to use github.com/prometheus/log. This change is conceptually very simple, although the diff is large. It switches logging from "github.com/golang/glog" to "github.com/prometheus/log", while not actually changing any log messages. V(1)-style logging has been changed to be log.Debug*().	2015-05-20 18:19:32 +02:00
Fabian Reinartz	7143dff02f	Add initial implementation for SD via Consul. This commit adds service discovery using Consul's HTTP API and watches (long polling) to retrieve target updates.	2015-05-20 11:46:24 +02:00
Fabian Reinartz	b0c181dc0d	Add Consul SD configuration.	2015-05-20 11:46:24 +02:00
Fabian Reinartz	ff832d2e03	Attach __meta_filepath label to file SD targets.	2015-05-19 15:49:38 +02:00
Fabian Reinartz	8de50619f1	Increase target test wait times On slow systems such as Travis CI occasionally the tests fail because the wait times are too short.	2015-05-19 12:06:52 +02:00
Fabian Reinartz	385919a65a	Avoid inter-component blocking if ingestion/scraping blocks. Appending to the storage can block for a long time. Timing out scrapes can also cause longer blocks. This commit avoids that those blocks affect other components than the target itself. Also the Target interface was removed.	2015-05-18 17:58:51 +02:00
Fabian Reinartz	1a2d57b45c	Move template functionality out of target. The target implementation and interface contain methods only serving a specific purpose of the templates. They were moved to the template as they operate on more fundamental target data.	2015-05-18 13:35:43 +02:00
Fabian Reinartz	dbc08d390e	Move target status data into its own object	2015-05-18 11:15:42 +02:00
Fabian Reinartz	9ca47869ed	Provide full SD configs to discovery constructors. Some SD configs may have many options. To be readable and consistent, make all discovery constructors receive the full config rather than the separate arguments.	2015-05-15 14:54:29 +02:00
Fabian Reinartz	93548a8882	Add initial file based service discovery. This commits adds file based service discovery which reads target groups from specified files. It detects changes based on file watches and regular refreshes.	2015-05-15 14:44:54 +02:00
Fabian Reinartz	d5aa012fd0	Make HTTP basic auth configurable for scrape targets.	2015-05-15 12:47:50 +02:00
Fabian Reinartz	bb540fd9fd	Implement config reloading on SIGHUP. With this commit, sending SIGHUP to the Prometheus process will reload and apply the configuration file. The different components attempt to handle failing changes gracefully.	2015-05-13 16:49:46 +02:00
Fabian Reinartz	86087120dd	Replace example config with new YAML format.	2015-05-11 18:14:07 +02:00
Fabian Reinartz	5fbde88919	Switch config to YAML format.	2015-05-07 16:52:14 +02:00
Fabian Reinartz	b5a8f7b8fa	Cleanup, test, and document config.	2015-04-30 21:17:19 +02:00
Fabian Reinartz	945c49a2dd	Add relabelling to target management. This commit adds a relabelling stage on the set of base labels from which a target is created. It allows to drop targets and rewrite any regular or internal label.	2015-04-30 18:46:33 +02:00
Fabian Reinartz	0b619b46d6	Change JobConfig to ScrapeConfig. This commit changes the configuration interface from job configs to scrape configs. This includes allowing multiple ways of target definition at once and moving DNS SD to its own config message. DNS SD can now contain multiple DNS names per configured discovery.	2015-04-28 23:18:55 +02:00
Fabian Reinartz	5015c2a0e8	Make target manager source based. This commit shifts responsibility for maintaining targets from providers and pools to the target manager. Target groups have a source name that identifies them for updates.	2015-04-24 15:49:35 +02:00
Fabian Reinartz	4f8673aa88	Simplify update sync for targets, format config fixtures.	2015-04-19 10:36:26 +02:00
Fabian Reinartz	36184f3530	Show correct error on wrong DNS response.	2015-04-11 16:14:38 +02:00
beorn7	fa1935a644	Remove /api/targets call and do not show job and instance labels on status. /api/targets was undocumented and never used and also broken. Showing instance and job labels on the status page (next to targets) does not make sense as those labels are set in an obvious way. Also add a doc comment to TargetStateToClass.	2015-03-18 18:53:43 +01:00
beorn7	be11cb2b07	Remove the sample ingestion channel. The one central sample ingestion channel has caused a variety of trouble. This commit removes it. Targets and rule evaluation call an Append method directly now. To incorporate multiple storage backends (like OpenTSDB), storage.Tee forks the Append into two different appenders. Note that the tsdb queue manager had its own queue anyway. It was a queue after a queue... Much queue, so overhead... Targets have their own little buffer (implemented as a channel) to avoid stalling during an http scrape. But a new scrape will only be started once the old one is fully ingested. The contraption of three pipelined ingesters was removed. A Target is an ingester itself now. Despite more logic in Target, things should be less confusing now. Also, remove lint and vet warnings in ast.go.	2015-03-15 14:08:22 +01:00
Julius Volz	140eede5e0	Rename UNREACHABLE to UNHEALTHY. The current wording suggests that a target is not reachable at all, although it might also get set when the target was reachable, but there was some other error during the scrape (invalid headers or invalid scrape content). UNHEALTHY is a more general wording that includes all these cases. For consistency, ALIVE is also renamed to HEALTHY.	2015-03-07 23:18:18 +01:00
Sergiusz 'q3k' Bazański	0d0bb3c030	Change instance identifiers to be host:port This changes the PublicURL function into InstanceIdentifier, which now returns a simple <host>:<port> string instead of a full URL.	2015-02-20 16:21:13 +01:00
Sergiusz 'q3k' Bazański	bb69a3d284	Hide HTTP auth parts from URL This introduces an extra function in the Target interface (PublicURL) which is used to populate the instance field in scraped metrics.	2015-02-19 18:58:47 +01:00
Julius Volz	af627bb2b9	Copy vendored deps manually instead of using Godeps. We were using Godep incorrectly (cloning repos from the internet during build time instead of including Godeps/_workspace in the GOPATH via "godep go"). However, to avoid even having to fetch "godeps" from the internet during build, this now just copies the vendored files into the GOPATH. Also, the protocol buffer library moved from Google Code to GitHub, which is reflected in these updates. This fixes https://github.com/prometheus/prometheus/issues/525	2015-02-17 02:08:56 +01:00
beorn7	11b3c2387c	Improvements after review. - Increase samplesQueueCapacity. - Improve docstring for the above. - Accept a short waiting period for the ingest channel to become ready. This should depend on the http timeout, but 100ms is probably good enough to cushion bursts bigger than samplesQueueCapacity, while it is unlikely that anybody ever will set an HTTP timeout similarly short.	2015-02-10 14:58:46 +01:00
beorn7	0f191629c6	Next try to deal with backed-up ingestion. This is now not even trying to throttle in a benign way, but creates a fully-fledged error. Advantage: It shows up very visible on the status page. Disadvantage: The server does not really adjust to a lower scraping rate. However, if your ingestion backs up, you are in a very irregular state, I'd say it _should_ be considered an error and not dealt with in a more graceful way. In different news: I'll work on optimizing ingestion so that we will not as easily run into that situation in the first place.	2015-02-09 17:32:47 +01:00
beorn7	16a1a6d324	Add another check for stopped scraper.	2015-02-06 18:30:33 +01:00
beorn7	5678a86924	Throttle scraping if a scrape took longer than the configured interval. The simple algorithm applied here will increase the actual interval incrementally, whenever and as long as the scrape itself takes longer than the configured interval. Once it takes shorter again, the actual interval will iteratively decrease again.	2015-02-06 16:44:56 +01:00
Bjoern Rabenstein	5859b74f1b	Clean up license issues. - Move CONTRIBUTORS.md to the more common AUTHORS. - Added the required NOTICE file. - Changed "Prometheus Team" to "The Prometheus Authors". - Reverted the erroneous changes to the Apache License.	2015-01-21 20:07:45 +01:00
Bjoern Rabenstein	b09453af1d	Adjust to new client_golang API.	2015-01-21 15:42:25 +01:00
Julius Volz	d6b9e97655	Remove extraction.Result type, simplify code.	2015-01-08 16:34:01 +01:00
juliusv	917acb6baf	Merge pull request #429 from brian-brazil/scrape-time Have scrape time as a pseudovariable, not a prometheus variable.	2015-01-02 13:22:04 +01:00
Brian Brazil	e56786b221	Have scrape time as a pseudovariable, not a prometheus variable. This ensures it has the right timestamp, and is easier to work with. Switch sd variable away from 'outcome', using total/failed instead.	2014-12-27 00:39:33 +00:00
Brian Brazil	89c43dd0d7	Sort targets on the status page. Change-Id: I6b59c97ab50093c50b608e29be2304475bc5d9f6	2014-12-26 13:14:19 +00:00
Johannes 'fish' Ziemke	ff95a52b0f	Rename Address to URL The "Address" is actually a URL which may contain username and password. Calling this Address is misleading so we rename it. Change-Id: I441c7ab9dfa2ceedc67cde7a47e6843a65f60511	2014-12-18 12:18:16 +01:00
Bjoern Rabenstein	b1e4956142	Apply a giant code cleanup. Essentially: - Remove unused code. - Make it 'go vet' clean. The only remaining warnings are in generated code. - Make it 'golint' clean. The only remaining warnings are in generated code. - Smoothed out same minor things. Change-Id: I3fe5c1fbead27b0e7a9c247fee2f5a45bc2d42c6	2014-12-10 16:16:49 +01:00
Bjoern Rabenstein	89bb376bce	Reduce lock-protected area during scrape. Change-Id: Iaa7faa7c916b1890b568d05bd8bfff6299b6767d	2014-12-05 19:40:41 +01:00
Bjoern Rabenstein	fee88a7a77	Remove the remaining races, new and old. Also, resolve a few other TODOs. Change-Id: Icb39b5a5e8ca22ebcb48771cd8951c5d9e112691	2014-12-03 18:07:23 +01:00
Bjoern Rabenstein	14bda4180c	Changes after pair code review. Change-Id: Ib72d40f8e9027818cfbbd32a7a7201eebda07455	2014-11-25 17:12:59 +01:00
Bjoern Rabenstein	a2feed343a	Convert another occurrence from chan bool to chan struct{}. Change-Id: I11ba127a934ee3aec0fcd139ad32a7751cff77a0	2014-11-25 17:10:39 +01:00
Bjoern Rabenstein	74c143c4c9	Improve scraper shutdown time. - Stop target pools in parallel. - Stop individual scrapers in goroutines, too. - Timing tweaks. Change-Id: I9dff1ee18616694f14b04408eaf1625d0f989696	2014-11-25 17:10:39 +01:00
Bjoern Rabenstein	92156ee89d	Drain the newBaseLabels channel upon shutdown. This should help cut down shutdown times. Change-Id: I6e70a598a9e49aa6eeeb2034105b1bc6e9014324	2014-11-25 17:10:39 +01:00
Bjoern Rabenstein	6b37e47f9e	Remove unused metrics. Change-Id: Icf03ba4ce92a5e38daf12930f9661daba79c83bb	2014-11-25 17:09:03 +01:00
Bjoern Rabenstein	4fc8ad6677	Fix retrieval unit tests. Change-Id: I299b71406b59539230e5182ccc37bc8a83af60b3	2014-11-25 17:08:45 +01:00
Bjoern Rabenstein	b3ed9aa7a2	Clean up start-up and shut-down. Change-Id: Idff4bbb0a15a9f879bfbb3da5b1025179cab5e2c	2014-11-25 17:08:45 +01:00
Bjoern Rabenstein	4447708c9f	Fix a race in target.go. Also, fix problems in shutdown. Starting serving and shutdown still has to be cleaned up properly. It's a mess. Change-Id: I51061db12064e434066446e6fceac32741c4f84c	2014-11-25 17:08:45 +01:00
Bjoern Rabenstein	38fc24d0ed	Fix targetpool_test.go and other tests. Change-Id: I91a4dd1d39e01f174e1aaae653ce1ed7aecaa624	2014-11-25 17:08:26 +01:00
Julius Volz	7f5d3c2c29	Fix and improve the fp locker. Benchmark: $ go test -bench 'Fingerprint' -test.run 'Fingerprint' -test.cpu=1,2,4 OLD BenchmarkFingerprintLockerParallel 500000 3618 ns/op BenchmarkFingerprintLockerParallel-2 100000 12257 ns/op BenchmarkFingerprintLockerParallel-4 500000 10164 ns/op BenchmarkFingerprintLockerSerial 10000000 283 ns/op BenchmarkFingerprintLockerSerial-2 10000000 284 ns/op BenchmarkFingerprintLockerSerial-4 10000000 288 ns/op NEW BenchmarkFingerprintLockerParallel 1000000 1018 ns/op BenchmarkFingerprintLockerParallel-2 1000000 1164 ns/op BenchmarkFingerprintLockerParallel-4 2000000 910 ns/op BenchmarkFingerprintLockerSerial 50000000 56.0 ns/op BenchmarkFingerprintLockerSerial-2 50000000 47.9 ns/op BenchmarkFingerprintLockerSerial-4 50000000 54.5 ns/op Change-Id: I3c65a43822840e7e64c3c3cfe759e1de51272581	2014-11-25 17:07:45 +01:00
Bjoern Rabenstein	e0a6cb281e	Fix the accept header. A '/' is a separator and has to be in a quoted string. Change-Id: If7a3a847f84f8f709074d05dc98b5b21e954030c	2014-11-25 17:02:00 +01:00
Brian Brazil	5edf689133	Stagger scrapes to spread out load. Change-Id: Ib141b271e4adfb817886871f86051c207b05cf35	2014-11-25 17:02:00 +01:00
Bjoern Rabenstein	1909686789	Make metrics exported by the Prometheus server itself more consistent. - Always spell out the time unit (e.g. milliseconds instead of ms). - Remove "_total" from the names of metrics that are not counters. - Make use of the "Namespace" and "Subsystem" fields in the options. - Removed the "capacity" facet from all metrics about channels/queues. These are all fixed via command line flags and will never change during the runtime of a process. Also, they should not be part of the same metric family. I have added separate metrics for the capacity of queues as convenience. (They will never change and are only set once.) - I left "metric_disk_latency_microseconds" unchanged, although that metric measures the latency of the storage device, even if it is not a spinning disk. "SSD" is read by many as "solid state disk", so it's not too far off. (It should be "solid state drive", of course, but "metric_drive_latency_microseconds" is probably confusing.) - Brian suggested to not mix "failure" and "success" outcome in the same metric family (distinguished by labels). For now, I left it as it is. We are touching some bigger issue here, especially as other parts in the Prometheus ecosystem are following the same principle. We still need to come to terms here and then change things consistently everywhere. Change-Id: If799458b450d18f78500f05990301c12525197d3	2014-11-25 17:02:00 +01:00
Brian Brazil	4a2b96f848	Remove backoff on scrape failure. Having metrics with variable timestamps inconsistently spaced when things fail will make it harder to write correct rules. Update status page, requires some refactoring to insert a function. Change-Id: Ie1c586cca53b8f3b318af8c21c418873063738a8	2014-11-25 17:02:00 +01:00
Julius Volz	1bb7074fec	Fix HTTP connection leak upon non-OK status. Change-Id: Ie7fbd7dcc089b8306b40631be3e3d736c23c1cd3	2014-11-25 17:02:00 +01:00
Bjoern Rabenstein	bacc31d5cc	Remove work-around that required copying all bytes of a scrape. Now that the subtle bug in matttproud/golang_protobuf_extensions is fixed, we do not need to copy the bytes of a scrape into a buffer first before starting to parse it. Change-Id: Ib73ecae16173ddd219cda56388a8f853332f8853	2014-11-25 17:01:59 +01:00
Bjoern Rabenstein	8956faeccb	Migrate to new client_golang. This change will only be submitted when the new client_golang has been moved to the new version. Change-Id: Ifceb59333072a08286a8ac910709a8ba2e3a1581	2014-11-25 17:01:59 +01:00
Bjoern Rabenstein	814e479723	Treat non-200 HTTP response as error. Change-Id: I2a9f3b47012b3c4839be53aa44c66d16dd41a24a	2014-11-25 17:01:59 +01:00
Bjoern Rabenstein	ca6a4fccef	Weed out our homegrown test.Tester. The Go stdlib has testing.TB now, which fulfills the exact same purpose. Change-Id: I0db9c73400e208ca376b932a02b7e3402234b87c	2014-05-21 19:27:24 +02:00
Brian Brazil	23255f1499	Fix negative Next Retrieval on status page. Change-Id: Ifa754034660a251fee71f166dbf057697ec4e872	2014-05-12 15:24:34 +01:00
Bjoern Rabenstein	64811caaec	Make Prometheus announce its new super-power: text format! Change-Id: Ia2ddfb28999c145e4d46c395381a9bf89d43148c	2014-04-22 18:44:52 +02:00
Julius Volz	84df022025	Cleanup server address handling, support IPv6. This fixes https://github.com/prometheus/prometheus/issues/377, as IPv6 server addresses are now handled correctly. Change-Id: Iebde7cfdadb0a52041472517e6fdcff4303a25ab	2014-03-09 23:31:30 +01:00
Julius Volz	b382e8b7bd	Remove overly verbose DNS-SD logging line. Change-Id: Ie4534437ab88b9a6b99f5cb6c2f32c9588c1fff6	2014-01-24 16:09:41 +01:00
Julius Volz	0378c2ca1f	Nonexistent labels in BY-clauses shouldn't propagate to result. This fixes bug 2. of https://github.com/prometheus/prometheus/issues/374 Change-Id: Ia4a13153616bafce5bf10597966b071434422d09	2014-01-24 16:05:30 +01:00
Stuart Nelson	48a6326d25	Added DNS-SD lookup counter for successful/unsuccessful lookups Change-Id: I0a71e994a989cecace280b5134a31ebc2ace7591	2013-12-16 08:48:56 -05:00
Julius Volz	fb44580110	Cleanup/fix program termination sequence. Change-Id: I2bc58a2583fb079c9ef383cfc7a5e0fbe613f1cd	2013-12-11 15:40:32 +01:00
Julius Volz	740d448983	Use custom timestamp type for sample timestamps and related code. So far we've been using Go's native time.Time for anything related to sample timestamps. Since the range of time.Time is much bigger than what we need, this has created two problems: - there could be time.Time values which were out of the range/precision of the time type that we persist to disk, therefore causing incorrectly ordered keys. One bug caused by this was: https://github.com/prometheus/prometheus/issues/367 It would be good to use a timestamp type that's more closely aligned with what the underlying storage supports. - sizeof(time.Time) is 192, while Prometheus should be ok with a single 64-bit Unix timestamp (possibly even a 32-bit one). Since we store samples in large numbers, this seriously affects memory usage. Furthermore, copying/working with the data will be faster if it's smaller. MEMORY USAGE RESULTS Initial memory usage comparisons for a running Prometheus with 1 timeseries and 100,000 samples show roughly a 13% decrease in total (VIRT) memory usage. In my tests, this advantage for some reason decreased a bit the more samples the timeseries had (to 5-7% for millions of samples). This I can't fully explain, but perhaps garbage collection issues were involved. WHEN TO USE THE NEW TIMESTAMP TYPE The new clientmodel.Timestamp type should be used whenever time calculations are either directly or indirectly related to sample timestamps. For example: - the timestamp of a sample itself - all kinds of watermarks - anything that may become or is compared to a sample timestamp (like the timestamp passed into Target.Scrape()). When to still use time.Time: - for measuring durations/times not related to sample timestamps, like duration telemetry exporting, timers that indicate how frequently to execute some action, etc. NOTE ON OPERATOR OPTIMIZATION TESTS We don't use operator optimization code anymore, but it still lives in the code as dead code. It still has tests, but I couldn't get all of them to pass with the new timestamp format. I commented out the failing cases for now, but we should probably remove the dead code soon. I just didn't want to do that in the same change as this. Change-Id: I821787414b0debe85c9fffaeb57abd453727af0f	2013-12-03 09:11:28 +01:00
Johannes 'fish' Ziemke	8c08a5031f	Add search domain support to SRV lookups This adds search domain support by trying to resolve a name by appending each search domain configured in /etc/resolv.conf until the query succeeds (NOERROR) and has at least one answer. Change-Id: Ibdc5138c5d8cc049e11fab90c3d5243d5a06852c	2013-10-29 17:19:49 +01:00
Julius Volz	274934bcd3	Revert "Revert "Merge pull request #317 from prometheus/fix/miekg-dns-for-srv"" This reverts commit `88099328d1`. Change-Id: I7bf74de5fda458e2e6f9eea2eacd0e256f95bdee	2013-09-10 17:48:05 +02:00
Johannes 'fish' Ziemke	88099328d1	Revert "Merge pull request #317 from prometheus/fix/miekg-dns-for-srv" This reverts commit `e3bc6fc9dc`, reversing changes made to `1cf9e5840a`. Conflicts: retrieval/target_provider.go Change-Id: Icb6e98fb30419e9e2fe9b686c243702ced372014	2013-08-30 16:32:51 +02:00
Julius Volz	788587426b	Make scrape timeouts configurable per job. Change-Id: I77a7514ad9e7969771f873d63d6353ec50082a62	2013-08-19 12:21:47 +02:00
Julius Volz	d69b85e6c9	Add global label support via Ingesters.	2013-08-13 16:54:15 +02:00
Julius Volz	0003027dce	Add needed trailing spaces in logs.	2013-08-12 18:22:48 +02:00
Julius Volz	aa5d251f8d	Use github.com/golang/glog for all logging.	2013-08-12 17:54:36 +02:00
Matt T. Proud	a5141e4d0a	Depointerize storage conf. and chain ingester. The storage builders need to work with the assumption that they have a copy of the underlying configuration data if any mutations are made.	2013-08-12 17:07:03 +02:00
Julius Volz	f8b20f30ac	Make retrieval work with client's new Ingester interface.	2013-08-12 15:15:41 +02:00
Julius Volz	3b970c5133	Add variable interpolation to notification messages. This includes required refactorings to enable replacing the http client (for testing) and moving the NotificationReq type definitions to the "notifications" package, so that this package doesn't need to depend on "rules" anymore and that it can instead use a representation of the required data which only includes the necessary fields.	2013-08-12 12:29:08 +02:00
Julius Volz	35ee2cd3cb	Add alertmanager notification support to Prometheus. Alert definitions now also have mandatory SUMMARY and DESCRIPTION fields that get sent along a firing alert to the alert manager.	2013-07-30 17:23:41 +02:00
Julius Volz	81f0b85013	Return [] instead of null for empty result vectors.	2013-07-25 12:16:32 +02:00
Julius Volz	331be19af6	Fix broken retrieval tests. These have been broken since `06b4a40661`	2013-07-25 12:15:00 +02:00
Matt T. Proud	f7704af4f8	Code Review: Formatting comments.	2013-07-15 15:12:01 +02:00
Matt T. Proud	06b4a40661	Represent targets in a tabular interface. This commit represents a target group's endpoints in a tabular fashion for better differentiation of their state in a concise manner.	2013-07-15 15:12:01 +02:00
Matt T. Proud	e20e6980e9	Completely extract response payload for decoding. This commit forces the extraction framework to read the entire response payload into a buffer before attempting to decode it, for the underlying Protocol Buffer message readers do not block on partial messages.	2013-07-14 23:04:08 +02:00
Julius Volz	9a48f57b66	Continue scraping old targets on SD fail. When we have trouble resolving the targets for a job via service discovery, we shouldn't just stop scraping the targets we currently have.	2013-07-12 22:38:42 +02:00
juliusv	24715f0ee5	Merge pull request #322 from prometheus/refactor/client/new-model Include Accept header for telemetry request.	2013-06-27 09:52:00 -07:00
Matt T. Proud	b8c7fd8c34	Include Accept header for telemetry request. This pull request introduces a HTTP Accept header to indicate a preference for Protocol Buffer-encoded messages.	2013-06-27 18:32:28 +02:00
Johannes 'fish' Ziemke	4bdf1adb6c	Use github.com/miekg/dns for resolving SRV records	2013-06-26 16:04:25 +02:00
Matt T. Proud	30b1cf80b5	WIP - Snapshot of Moving to Client Model.	2013-06-25 15:52:42 +02:00
Julius Volz	91cf1e9a26	Fix DNS-SD target refresh condition.	2013-06-13 16:10:39 +02:00
Julius Volz	d9b4f98b44	Integrate DNS-SD support for discovering job targets.	2013-06-12 18:11:48 +02:00
Julius Volz	1fe3d3b06b	Remove obsolete argument from target handling code.	2013-06-11 17:54:58 +02:00
Julius Volz	558281890b	Minor "go tool vet" cleanups	2013-06-07 15:34:41 +02:00
Matt T. Proud	d4db3cf00b	Code Review: Last replacement wins.	2013-06-05 16:29:05 +02:00
Matt T. Proud	9cde48754b	Fix race conditions in TargetPool. The race condition detector found a few anomalies whereby a TargetPool could be read during a mutation. This has been fixed.	2013-06-05 14:44:20 +02:00
Julius Volz	dcfd09c801	Prepend "exporter_" to labels that already exist in exported metrics. If the metrics exported by a process already contain any of a target's base labels (such as "job" or "instance", but also any manually assigned target-group label), don't overwrite that label, but instead add a new label consisting of the original label name prepended with "exporter_". This is to accommodate intermediate exporter jobs, which might indicate e.g. the jobs and instances for which they are exporting data.	2013-06-02 22:48:46 +02:00
Julius Volz	081191afb8	Remember and display last scrape errors in web UI.	2013-05-21 15:31:27 +02:00
Matt T. Proud	8f4c7ece92	Destroy naked returns in half of corpus. The use of naked return values is frowned upon. This is the first of two bulk updates to remove them.	2013-05-16 10:53:25 +03:00
Bernerd Schaefer	428d91c86f	Rename test helper files to helpers_test.go This ensures that these files are properly included only in testing.	2013-05-14 16:30:47 +02:00
Matt T. Proud	244a4a9cdb	Update to go1.1. This commit updates the documentation, Makefiles, formatting, and code semantics to support the 1.1. runtime, which includes ... 1. ``make advice``, 2. ``make format``, and 3. ``go fix`` on various targets.	2013-05-14 12:39:08 +02:00
Julius Volz	9cea5d9df8	Convert the Prometheus configuration to protocol buffers.	2013-04-30 22:26:00 +02:00
Julius Volz	d8110fcd9c	Send sample arrays instead of single samples over channels.	2013-04-29 17:24:17 +02:00
Matt T. Proud	6fac20c8af	Harden the tests against OOMs. This commit employs explicit memory freeing for the in-memory storage arenas. Secondarily, we take advantage of smaller channel buffer sizes in the test.	2013-04-29 11:46:01 +02:00
Bernerd Schaefer	b8bc91d6c0	Target test uses correct telemetry headers	2013-04-29 10:36:08 +02:00
Bernerd Schaefer	c98fc8a495	Merge pull request #196 from prometheus/fix/timeout-target-scrapes Target uses HTTP transport with deadlines	2013-04-29 01:30:54 -07:00
Bernerd Schaefer	b04cd28862	Merge pull request #192 from prometheus/feature/negotiate-telemetry-schema-through-mime-type Use Content-Type data for telemetry versioning	2013-04-29 01:30:37 -07:00
Bernerd Schaefer	3929582892	Target uses HTTP transport with deadlines Instead of externally handling timeouts when scraping a target, we set timeouts on the HTTP connection. This ensures that we don't leak goroutines on timeouts. [fixes #181]	2013-04-29 09:46:40 +02:00
Matt T. Proud	a48ab34dd0	Refresh Prometheus client API usage. The client API has been updated per https://github.com/prometheus/client_golang/pull/9.	2013-04-28 19:40:30 +02:00
Bernerd Schaefer	cf3e6ae084	Add LabelSet helper to fix go 1.0.3 build	2013-04-26 14:27:42 +02:00
Bernerd Schaefer	dfd5c9ce28	Refactor processor for 0.0.2 schema Primary changes: * Strictly typed unmarshalling of metric values * Schema types are contained by the processor (no "type entity002") Minor changes: * Added ProcessorFunc type for expressing processors as simple functions. * Added non-destructive `Merge` method to `model.LabelSet`	2013-04-26 11:52:26 +02:00
Bernerd Schaefer	7c3e04c546	Add version 0.0.2 processor	2013-04-25 17:37:16 +02:00
Bernerd Schaefer	76731c80c6	Use Content-Type data for telemetry versioning ProcessorForRequestHeader now looks first for a header like `Content-Type: application/json; schema="prometheus/telemetry"; version="0.0.1"` before falling back to checking `X-Prometheus-API-Version`.	2013-04-25 16:05:37 +02:00
Julius Volz	d4ff85db5a	Add instance label to health (up) timeseries.	2013-04-24 21:50:49 +02:00
Matt T. Proud	f9e99bd08a	Refresh SampleValue to 64-bit floating point. We always knew that this needed to be fixed.	2013-04-21 20:31:50 +02:00

582 commits