252 commits

Author SHA1 Message Date
Jimmi Dyson ea9a173008 Kubernetes SD: Use node name as instance label 2015-10-12 21:26:09 +01:00
Julius Volz d88aea7e6f Fix SD mechanism source prefix handling.
The prefixed target provider changed a pointerized target group that was
reused in the wrapped target provider, causing an ever-increasing chain
of source prefixes in target groups from the Consul target provider.

We now make this bug generally impossible by switching the target group
channel from pointer to value type and thus ensuring that target groups
are copied before being passed on to other parts of the system.

I tried to not let the depointerization leak too far outside of the
channel handling (both upstream and downstream) because I tried that
initially and caused some nasty bugs, which I want to minimize.

Fixes https://github.com/prometheus/prometheus/issues/1083
2015-10-09 14:08:22 +02:00
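
The pointer-versus-value distinction described in the entry above can be illustrated with a minimal sketch. `TargetGroup`, `prefixPointer`, and `prefixValue` are hypothetical, simplified names chosen for this example; they are not the project's actual types or functions.

```go
package main

import "fmt"

// TargetGroup is a hypothetical, simplified stand-in for a discovered
// group of targets. Only the source field matters for this illustration.
type TargetGroup struct {
	Source string
}

// prefixPointer mimics the problematic pattern: it mutates the group it
// receives, so a sender that reuses the same pointer sees prefixes pile up.
func prefixPointer(in <-chan *TargetGroup, prefix string) string {
	tg := <-in
	tg.Source = prefix + ":" + tg.Source
	return tg.Source
}

// prefixValue receives a copy of the struct, so the mutation stays local
// to the receiver and the sender's group is left untouched.
func prefixValue(in <-chan TargetGroup, prefix string) string {
	tg := <-in
	tg.Source = prefix + ":" + tg.Source
	return tg.Source
}

func main() {
	shared := &TargetGroup{Source: "web"}
	pch := make(chan *TargetGroup, 1)
	for i := 0; i < 3; i++ {
		pch <- shared
		fmt.Println("pointer:", prefixPointer(pch, "consul"))
	}

	orig := TargetGroup{Source: "web"}
	vch := make(chan TargetGroup, 1)
	for i := 0; i < 3; i++ {
		vch <- orig
		fmt.Println("value:  ", prefixValue(vch, "consul"))
	}
}
```

With the pointer channel the prefix chain grows on every pass; with the value channel it stays at a single prefix. Note that sending a struct by value copies it only shallowly, so any slices or maps inside a real group would still be shared.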
Julius Volz dec9fc9c32 Merge pull request #1148 from prometheus/fix-serverset-multiple-paths
Fix watching multiple Zookeeper paths in serverset SD.
2015-10-08 19:27:06 +02:00
Matt Jibson dcb4856d72 Add SD for Amazon EC2 instances 2015-10-06 18:36:17 -04:00
Julius Volz 60cf4015a4 Fix watching multiple Zookeeper paths in serverset SD.
Fix https://github.com/prometheus/prometheus/issues/1137
2015-10-06 15:54:54 +02:00
Fabian Reinartz e3b6ec9784 Switch to common/log 2015-10-03 10:21:43 +02:00
Jimmi Dyson 0d61605526 Kubernetes SD example: separate out cluster level components & services 2015-09-29 11:22:18 +01:00
Julius Volz 99e8fff872 Fix target manager CPU busyloop caused by bad done-channel handling.
Unfortunately this isn't nicely testable, as it's timing-dependent and
one would have to detect a stray goroutine doing a CPU busyloop...

Fixes https://github.com/prometheus/prometheus/issues/1114
2015-09-28 11:51:16 +02:00
Fabian Reinartz 097d810f37 Merge pull request #1120 from prometheus/flaky-test
retrieval: Reduce flakiness of TestTargetRunScraperScrapes
2015-09-28 09:57:16 +02:00
Brian Brazil ba6688bfce retrieval: Reduce flakiness of TestTargetRunScraperScrapes 2015-09-28 08:34:54 +01:00
Brian Brazil b03569267e retrieval: Add URL parameters to fullLabels too
Move all the special cases into one map, rather than
spreading the logic around.
2015-09-26 16:59:24 +01:00
Brian Brazil 50258929ac Retrieval: Show error message for failed test scrape
This is flaky, and I suspect it was due to the I/O timeout that I've
already fixed. In case that wasn't it, display the error should it
happen again.
2015-09-23 09:24:50 +01:00
Brian Brazil 4bc39dc60e retrieval: Reduce flakiness of TestTargetManagerChan
This will increase test time by a few hundred ms, but
this is the 2nd most common cause of flakiness.
2015-09-23 09:00:37 +01:00
Brian Brazil 93145b960a retrieval: Reduce flakiness of target tests
Bump timeouts of tests where we don't want I/O timeouts.

Adjust the full channel test to be much more reliable,
by reducing the ingestion timeout from 1ms to 0.
2015-09-22 19:23:36 +01:00
Fabian Reinartz cac6eea434 Merge pull request #1105 from prometheus/consulnil
Fix nil panic on consul error
2015-09-22 14:55:31 +02:00
Fabian Reinartz 327152862c Update expfmt.NewDecoder usage 2015-09-22 12:11:28 +02:00
Fabian Reinartz 1ce89a4a0b Fix nil panic on consul error 2015-09-22 09:04:31 +02:00
Julius Volz af513468eb Fix some dead code, missing error checks, shadowings.
I applied
https://medium.com/@jgautheron/quality-pipeline-for-go-projects-497e34d6567
and was greeted with a deluge of warnings, most of which were not
applicable or not realistically fixable. These are some of the first
ones I decided to fix.
2015-09-14 12:21:34 +02:00
Jimmi Dyson 7ef9399920 Clean up kubernetes http response bodies 2015-09-11 11:44:28 +01:00
Anders Daljord Morken 9fb65a91af Close HTTP connections on HTTP errors too.
Move defer resp.Body.Close() up to make sure it's called even when the
HTTP request returns something other than 200 or Decoder construction
fails. This avoids leaking and eventually running out of file descriptors.
2015-09-10 22:41:05 +02:00
Fabian Reinartz 8456b7e12f Use go1.5.1 2015-09-10 12:11:44 +02:00
Jimmi Dyson a1574aa2b3 Move TLS options to scrape config
Fixes #1013, fixes #989
2015-09-09 09:52:21 +01:00
Julius Volz b7b7b2e883 Merge pull request #1050 from fabric8io/kubernetes-discovery
Kubernetes SD improvements
2015-09-04 14:58:11 +02:00
Jimmi Dyson d7a7fd4589 Kubernetes SD improvements
* Support multiple masters with retries against each master as required.
* Scrape masters' metrics.
* Add role meta label for node/service/master to make it easier for relabeling.
2015-09-04 11:31:20 +01:00
Fabian Reinartz cc1a2a2061 Remove attachment of global labels upon ingestion 2015-09-03 14:16:23 +02:00
Fabian Reinartz ebf417a282 Fix map initialization 2015-09-01 18:06:22 +02:00
Julius Volz f63a899744 Change config regexes to full-string matches.
This anchors all regular expressions entered via the config to match a
full string vs. a substring.

THIS IS A BREAKING CHANGE!

Fixes part of https://github.com/prometheus/prometheus/issues/996
2015-09-01 15:46:41 +02:00
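
To illustrate the difference between substring and full-string matching, here is a minimal sketch. The `anchored` helper is hypothetical, and wrapping the pattern in `^(?:...)$` is one common way to anchor it, assumed here for illustration only.

```go
package main

import (
	"fmt"
	"regexp"
)

// anchored is a hypothetical helper that wraps a user-supplied pattern so
// that it must match the whole input. The non-capturing group keeps
// alternations such as "a|b" anchored on both sides.
func anchored(pattern string) (*regexp.Regexp, error) {
	return regexp.Compile("^(?:" + pattern + ")$")
}

func main() {
	sub := regexp.MustCompile("prod") // unanchored: substring match
	full, err := anchored("prod")     // anchored: full-string match
	if err != nil {
		panic(err)
	}

	fmt.Println(sub.MatchString("production"))  // true
	fmt.Println(full.MatchString("production")) // false
	fmt.Println(full.MatchString("prod"))       // true
}
```

Under full-string matching, a config that relied on `prod` matching `production` would need the pattern rewritten as `prod.*`, which is why the change is breaking.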
Fabian Reinartz 542da6774e Fix draining of file watcher events 2015-08-28 12:17:22 +02:00
Daniel Lundin 4abf54b747 serverset: extract shard number from serverset data 2015-08-27 16:26:00 +02:00
Julius Volz 29eaa8c7cf Merge pull request #1030 from prometheus/fix-flakey-filesd
Fix flakey FileSD test.
2015-08-26 13:25:00 +02:00
Julius Volz 3fd5826589 Fix flakey FileSD test.
When the test ends, all files matching the watcher's glob are removed
via defer. In that moment, the draining goroutine may still be running
and then detect no files matching the configured glob just before the
test exits.

This is now solved by waiting for the draining goroutine to finish
before leaving the test function and thus causing the deferred file
removal.
2015-08-26 13:06:34 +02:00
Julius Volz 744d5d5a7a Merge pull request #1029 from prometheus/vet-fixes
Fix "go vet" errors.
2015-08-26 12:50:18 +02:00
Julius Volz 995d3b831d Fix most golint warnings.
This is with `golint -min_confidence=0.5`.

I left several lint warnings untouched because they were either
incorrect or I felt it was better not to change them at the moment.
2015-08-26 12:44:46 +02:00
Julius Volz 963ad82dcb Fix "go vet" errors.
I ignored all errors of the type "composite literal uses unkeyed
fields". Most of them are wrong because of
https://github.com/golang/go/issues/9171.
2015-08-26 02:05:04 +02:00
Fabian Reinartz 6664b77f36 Merge pull request #1021 from prometheus/appenders
move metric modifications into SampleAppenders
2015-08-25 17:47:55 +02:00
Fabian Reinartz 01834fa528 Move metric modifications into SampleAppenders 2015-08-25 15:32:37 +02:00
Fabian Reinartz d6d88f8950 Add missing license headers 2015-08-24 19:19:21 +02:00
Julius Volz d36a7f4e6f Fix busylooping in case of no target providers.
merge() closes the channel that handleUpdates() reads from when there
are zero configured target providers in the configuration. In that case,
the for-select loop in handleUpdates() entered a busy loop. It should
exit when the upstream channel is closed.
2015-08-24 16:42:28 +02:00
Fabian Reinartz 3a0145c09e Reenable blocked appending tests 2015-08-22 09:47:57 +02:00
Fabian Reinartz 438e232c9b Fix grouping of import blocks 2015-08-22 09:42:45 +02:00
Fabian Reinartz 6d0f58dcf3 sanitize scrape health recording code 2015-08-21 23:01:08 +02:00
Fabian Reinartz 25bf5fdaf5 Timeout sample appends 2015-08-21 18:04:35 +02:00
Fabian Reinartz 11a577fcd0 Switch to common/expfmt for extraction 2015-08-21 13:33:38 +02:00
Fabian Reinartz 306e8468a0 Switch from client_golang/model to common/model 2015-08-21 13:33:38 +02:00
Sharif Nassar 6cb519fe82 Add Consul ServiceID to the discovery meta labels. 2015-08-20 14:04:42 -07:00
Fabian Reinartz 0f5022c091 Add missing Kubernetes doc strings 2015-08-18 14:37:28 +02:00
Fabian Reinartz f592740bac Only exit static target provider on done 2015-08-18 11:51:53 +02:00
Julius Volz b4adf2723d Merge pull request #994 from robbiet480/consul-datacenter-name
Pass through current agent Consul datacenter name
2015-08-18 01:09:24 +02:00
Robbie Trencheny 48e461f7db Pass through current agent Consul datacenter name
Instead of only filling __meta_consul_dc when datacenter is set in
consul_sd_config, this change fills the label with the datacenter the
agent reports as its current one if datacenter isn't manually set;
otherwise it uses whatever datacenter was set to.
2015-08-17 16:00:26 -07:00
Fabian Reinartz d0a90964c1 Fix license header 2015-08-17 19:51:12 +02:00