587 commits

Author SHA1 Message Date
Julien Pivotto 93e9c010f3
Add more Go leak tests (#7652)
* Implement go leak test for promql

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>

* Implement go leak test for Consul SD

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>

* Implement go leak test in discovery manager

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-07-24 10:10:20 +01:00
Julien Pivotto 89d2f5ec1d
Merge pull request #7635 from roidelapluie/sdtests2
Tests for digitalocean and Docker Swarm configs
2020-07-22 10:56:37 +02:00
Julien Pivotto 52cdcc2a3b
Add a check-list for new SD's (#7634)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-07-22 00:07:33 +02:00
Julien Pivotto a197508d09 Add docker swarm test
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-07-22 00:04:36 +02:00
Björn Rabenstein 79620c78db
Merge pull request #7604 from roidelapluie/swarmsocket
Docker swarm: enable unix socket
2020-07-20 13:11:06 +02:00
johncming 6da680c7e4 discovery/config: add swarmsd config validation.
Signed-off-by: johncming <johncming@yahoo.com>
2020-07-19 22:50:22 +02:00
Julien Pivotto 49f48d8f65 Fix comment
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-07-17 17:48:05 +02:00
Julien Pivotto 968c86d642 Fix comment
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-07-17 17:41:02 +02:00
Julien Pivotto 45644c82f6 Docker swarm: enable unix socket
Fixes #7603

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-07-17 17:38:49 +02:00
Julien Pivotto 93ecf0e14c
Refactor dockerswarm refresh for testing (#7541)
We were missing testing on the behaviour of the configuration
unmarshalling.

This PR adds a refresh command that can be used to test that we
use the correct refresh function.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-07-09 13:01:08 +02:00
Julien Pivotto 27867412a7
openstack tests: use new test.Cleanup function (#7514)
Since we depend on go1.14 now, we can use T.Cleanup
https://golang.org/pkg/testing/#T.Cleanup

This provides a nicer approach to shut down the test server.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-07-06 20:24:16 +02:00
John Bampton 98a69b77d1
Fix spelling (#7512)
Signed-off-by: John Bampton <jbampton@users.noreply.github.com>
2020-07-04 14:54:26 +02:00
Julien Pivotto e1f9816a33
Openstack: Reduce timeouts (#7507)
Set saner values for openstack timeouts

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-07-02 22:50:32 +02:00
Steffen Neubauer 9c9b872087
OpenStack SD: Add availability config option, to choose endpoint type (#7494)
* OpenStack SD: Add availability config option, to choose endpoint type

In some environments Prometheus must query OpenStack via an alternative
endpoint type (gophercloud calls this `availability`).

This commit implements this option.

Co-Authored-By: Dennis Kuhn <d.kuhn@syseleven.de>
Signed-off-by: Steffen Neubauer <s.neubauer@syseleven.de>
2020-07-02 15:17:56 +01:00
Julien Pivotto aa452d8ab4
digitalocean: use a safer pagination method (#7498)
this method is documented here: https://github.com/digitalocean/godo

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-07-02 00:13:39 +02:00
Hu Shuai a94b570dc6
Add a unit test for newAzureResourceFromID in discovery/azure/azure.go. (#7484)
This PR is about adding a unit test for newAzureResourceFromID in discovery/azure/azure.go.

Signed-off-by: Hu Shuai <hus.fnst@cn.fujitsu.com>
2020-06-30 11:11:57 +01:00
Julien Pivotto 59de58d380
Docker Swarm service discovery (#7420)
* Docker Swarm service discovery

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-06-26 12:25:58 +02:00
Pierre Souchay 1508678001
Use 10m timeouts for watches (#7423)
Using ?wait=10m returns results as fast as usual when data is changing,
but performs far fewer requests when services do not change.

On large infrastructures, this considerably reduces the number of
queries per second on Consul servers while keeping results just as
fresh.

Signed-off-by: Pierre Souchay <p.souchay@criteo.com>
2020-06-20 20:22:45 +01:00
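
The `?wait=10m` change in #7423 relies on Consul blocking queries: the client sends the last index it saw together with a maximum wait, and the server answers as soon as the data changes or when the wait expires. A minimal sketch of how such a request URL could be built is shown here. The function name `watchURL`, the constant `defaultWait`, and the base address are hypothetical, and the actual discovery code most likely goes through the Consul Go client rather than raw URLs.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
	"time"
)

// defaultWait mirrors the 10m value named in the commit message.
const defaultWait = 10 * time.Minute

// watchURL builds a blocking-query URL for one service in the Consul catalog.
// base, service and lastIndex are illustrative inputs, not Prometheus code.
func watchURL(base, service string, lastIndex uint64, wait time.Duration) string {
	q := url.Values{}
	// index tells Consul to hold the request until the data moves past it.
	q.Set("index", strconv.FormatUint(lastIndex, 10))
	// wait caps how long Consul may hold the request open.
	q.Set("wait", fmt.Sprintf("%ds", int(wait/time.Second)))
	return base + "/v1/catalog/service/" + url.PathEscape(service) + "?" + q.Encode()
}

func main() {
	fmt.Println(watchURL("http://localhost:8500", "web", 42, defaultWait))
}
```

With a long wait, an idle service costs roughly one request per wait period, while a changing service still returns immediately.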
Julien Pivotto fb9a1a872e
DigitalOcean: limit refresh timeout (#7425)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-06-20 09:45:28 +02:00
Julien Pivotto c61141ce51
Add DigitalOcean service discovery (#7407)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-06-18 17:04:41 +02:00
Frederic Branczyk f6c5a75661 discovery/kubernetes: Add Kubernetes EndpointSlice discovery
Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com>
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-06-14 21:55:27 +02:00
Martin Lee b5d61fb66c
Add AMI to labels scraped during service discovery. (#7386)
Signed-off-by: Martin Lee <martin@martinlee.org>
2020-06-11 18:25:58 +01:00
Frederic Branczyk 7b1c0d6b66
discovery/kubernetes: Fix incorrect premature break of reading results
Previously `max` results stopped reading from results in tests
prematurely, as it stopped when `max` number of items were received from
the channel instead of `max` number of unique target groups received.
This caused flaky tests where the same target group was received
multiple times, as Kubernetes informers may emit the same event multiple
times.

Before this patch, running this test repeatedly failed eventually. After
this patch I have run the test many thousand times without failure.

```bash
go test -run TestEndpointsDiscoveryNamespaces -count 1000 -test.v
```

Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com>
2020-06-11 16:08:28 +02:00
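
The fix described in the commit above amounts to counting unique target groups instead of received items. The following is a rough sketch of that idea, not the actual test helper: the `group` type, `readUnique`, and `readTimeout` are hypothetical names, and groups are assumed to be identified by a `Source` string.

```go
package main

import (
	"fmt"
	"time"
)

// readTimeout is an arbitrary upper bound for the sketch.
const readTimeout = time.Second

// group is a hypothetical stand-in for a discovered target group.
type group struct {
	Source string
}

// readUnique reads from ch until max distinct Sources have been seen,
// the channel is closed, or the timeout elapses. A repeated event for
// the same Source overwrites the earlier entry and is not counted twice.
func readUnique(ch <-chan []*group, max int, timeout time.Duration) map[string]*group {
	seen := make(map[string]*group)
	deadline := time.After(timeout)
	for len(seen) < max {
		select {
		case tgs, ok := <-ch:
			if !ok {
				return seen
			}
			for _, tg := range tgs {
				if tg != nil {
					seen[tg.Source] = tg
				}
			}
		case <-deadline:
			return seen
		}
	}
	return seen
}

func main() {
	ch := make(chan []*group, 3)
	ch <- []*group{{Source: "ns1"}}
	ch <- []*group{{Source: "ns1"}} // the same event emitted again
	ch <- []*group{{Source: "ns2"}}
	fmt.Println(len(readUnique(ch, 2, readTimeout)))
}
```

A loop that stopped after two received items would return before seeing `ns2`, which is the kind of flakiness the commit describes.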
Tariq Ibrahim 06a6621b6c
update kubernetes to v1.18.x and update ingress apiVersion
Signed-off-by: Tariq Ibrahim <tariq181290@gmail.com>
2020-06-01 08:26:50 -07:00
Jop Zinkweg 1f69c38ba4
Add discovery support for triton compute nodes (#7250)
Added optional configuration item role, defaults to 'container' (backwards-compatible).
Setting role to 'cn' will discover compute nodes instead.

Human-friendly compute node hostname discovery depends on cmon 1.7.0:
c1a2aeca36

Adjust testcases to use discovery config per case as two different types are now supported.

Updated documentation:
* new role setting
* clarify what the name 'container' covers as triton uses different names in different locations

Signed-off-by: jzinkweg <jzinkweg@gmail.com>
2020-05-22 16:19:21 +01:00
Guangming Wang 5b4006ac86
cleanup: remove unnecessary nil check before range (#7194)
Signed-off-by: Guangming Wang <guangming.wang@daocloud.io>
2020-05-02 07:25:44 +01:00
ZouYu 2b7437d60e
Fix some warnings: 'redundant type from array, slice, or map composite literal' (#7109)
Signed-off-by: ZouYu <zouy.fnst@cn.fujitsu.com>
2020-04-15 11:17:41 +01:00
Marek Slabicki 8224ddec23
Capitalizing first letter of all log lines (#7043)
Signed-off-by: Marek Slabicki <thaniri@gmail.com>
2020-04-11 09:22:18 +01:00
ZouYu f494426f73
fix warning redundant type from array, slice, or map composite literal (#7106)
Signed-off-by: ZouYu <zouy.fnst@cn.fujitsu.com>
2020-04-09 11:29:19 +01:00
Tariq Ibrahim 0730d6eb74
remove deprecated methods from the MetricProvider interface
Signed-off-by: Tariq Ibrahim <tariq181290@gmail.com>
2020-04-06 09:23:58 -07:00
Deepjyoti Mondal c38ca2ca95
Fix #6999 : Add architecture meta label for EC2 (#7000)
This PR adds architecture meta labels for EC2 instances

Signed-off-by: Deepjyoti Mondal <djmdeveloper060796@gmail.com>
2020-03-28 20:41:37 +00:00
coding3min 4dfbf328f2
[OpenStack SD] Add HypervisorID meta labels about id (#6962)
Add extra meta labels, which are useful when Prometheus
discovers hypervisors.

Signed-off-by: pzqu <pzqu@qq.com>

Co-authored-by: pzqu <pzqu@example.com>
2020-03-11 08:38:14 +00:00
Alex Gaganov df92a00838
Expose EC2 instance lifecycle as label (#6914)
Signed-off-by: Alex Gaganov <alex.gaganov@fiverr.com>
2020-03-03 08:03:16 +00:00
Julien Pivotto c67f81937c
discovery: updateGroup should not create targets[poolKey] in the loop (#6903)
We can assume that not all target groups are nil in normal scenarios,
so we can create targets[poolKey] outside the loop.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-03-02 07:35:02 +00:00
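
The change in #6903 can be pictured as hoisting the inner-map creation out of the loop. This is a simplified sketch with hypothetical types (`manager`, `poolKey`, `group`); locking and everything else the real manager does is omitted.

```go
package main

import "fmt"

// group and poolKey are hypothetical, reduced versions of the real types.
type group struct{ Source string }

type poolKey struct{ setName, provider string }

type manager struct {
	targets map[poolKey]map[string]*group
}

// updateGroup stores the non-nil groups under key. The inner map is
// created once, before the loop, rather than being checked per group.
func (m *manager) updateGroup(key poolKey, tgs []*group) {
	if _, ok := m.targets[key]; !ok {
		m.targets[key] = make(map[string]*group)
	}
	for _, tg := range tgs {
		if tg != nil { // nil groups are skipped
			m.targets[key][tg.Source] = tg
		}
	}
}

func main() {
	m := &manager{targets: make(map[poolKey]map[string]*group)}
	key := poolKey{setName: "job", provider: "static/0"}
	m.updateGroup(key, []*group{{Source: "a"}, nil, {Source: "b"}})
	fmt.Println(len(m.targets[key]))
}
```

One consequence of this shape is that the inner map exists even when every group in the update is nil.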
Mathilde Gilles 9b9c58aea8
[Consul] Add health label to metrics (#5313)
Label metrics with the target health using consul's /health endpoint.

Signed-off-by: Mathilde Gilles <m.gilles@criteo.com>
2020-02-25 13:32:30 +00:00
Frederic Branczyk d06f1034db discovery/kubernetes: Fix race in test setup
Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com>
2020-02-25 10:33:41 +01:00
李国忠 029b45aa30
add service type metadata to kubernetes_sd_config service role #6496 (#6684)
* [service discovery] add service type metadata to kubernetes_sd_config service role

Signed-off-by: fuling <fuling.lgz@alibaba-inc.com>

* [fix] ServiceType -> string

Signed-off-by: fuling <fuling.lgz@alibaba-inc.com>

* [fix] fix testcase

Signed-off-by: fuling <fuling.lgz@alibaba-inc.com>

* [style]

Signed-off-by: fuling <fuling.lgz@alibaba-inc.com>

* [doc] add service type

Signed-off-by: fuling <fuling.lgz@alibaba-inc.com>

* [doc] sort

Signed-off-by: fuling <fuling.lgz@alibaba-inc.com>
2020-02-25 09:22:14 +01:00
Simon Pasquier 06c1a07d5a discovery/kubernetes: remove extraneous parameters from send()
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-02-18 17:36:57 +01:00
Aleksandra Gacek 8e53c19f9c discovery/kubernetes: expose label_selector and field_selector
Close #6807

Co-authored-by @shuttie
Signed-off-by: Aleksandra Gacek <algacek@google.com>
2020-02-15 14:57:56 +01:00
Grebennikov Roman b4445ff03f discovery/kubernetes: expose label_selector and field_selector
Closes #6096

Signed-off-by: Grebennikov Roman <grv@dfdx.me>
2020-02-15 14:57:38 +01:00
Simon Pasquier fe76ccbfe3
discovery/consul: fix logging of tags (#6783)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-02-13 13:11:44 +01:00
Ben Ye 1a18594176
keep kubernetes metrics in global vars (#6765)
Signed-off-by: yeya24 <yb532204897@gmail.com>
2020-02-06 15:52:57 +00:00
Ben Ye 60527de355
keep consul service metrics in global variables (#6764)
Signed-off-by: yeya24 <yb532204897@gmail.com>
2020-02-06 05:48:58 +00:00
Julien Pivotto cf42888e4d Fix order of testutil.Equals (#6695)
Equals takes the expected value as first parameter, and the actual value
as second parameter.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-01-27 12:21:59 +00:00
johncming 17683d074c discovery: fix bug that use rlock for read. (#5928)
Signed-off-by: johncming <johncming@yahoo.com>
2020-01-22 09:57:37 +00:00
Julien Pivotto 2b2eb79e8b Add windows tests for query logger (#6653)
* Add windows tests
* Do not rely on time.Time in timer

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-01-20 13:17:11 +00:00
Josh Soref 91d76c8023 Spelling (#6517)
* spelling: alertmanager

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: attributes

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: autocomplete

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: bootstrap

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: caught

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: chunkenc

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: compaction

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: corrupted

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: deletable

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: expected

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: fine-grained

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: initialized

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: iteration

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: javascript

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: multiple

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: number

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: overlapping

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: possible

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: postings

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: procedure

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: programmatic

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: queuing

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: querier

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: repairing

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: received

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: reproducible

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: retention

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: sample

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: segements

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: semantic

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: software [LICENSE]

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: staging

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: timestamp

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: unfortunately

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: uvarint

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: subsequently

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: ressamples

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-01-02 15:54:09 +01:00
Simon Pasquier 75470f86b4 discovery/kubernetes: fix client metrics
The Kubernetes client records workqueue duration and latency metrics as
seconds so there's no need to convert the values from microseconds to
seconds anymore.

The cache metrics (prometheus_sd_kubernetes_cache_*) are removed because
they aren't used anymore by the client though still exposed by its API.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-11-29 12:34:36 +01:00
Callum Styan 7bf17b654c As per dev summit, SD moratorium has been lifted. (#6324)
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2019-11-21 00:22:15 +00:00
Jean-Baptiste Le Duigou 5973227434 adding additional unit tests for getDataCenter() in consul (#6192)
* adding additional unit tests for getDataCenter() in consul

Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com>

* Consul Tests : update comments to start with uppercase and end with point

Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com>

* Consul Test : using table-driven tests

Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com>

* Consul Test : cleaner syntax

Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com>

* Consul Test : even cleaner syntax

Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com>

* Consul Test : update comments

Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com>

* Fixing naming convention by removing underscore in function name

Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com>

* Removing duplicated test case for getDatacenter()

Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com>
2019-11-15 14:52:39 +01:00
Yao Zengzeng 1afa476b8a minor fix for making map (#6076)
Signed-off-by: YaoZengzeng <yaozengzeng@huawei.com>
2019-10-25 20:06:00 -06:00
Simon Pasquier 3acc3e856c
Adding unit test for target group (#6141)
* adding unit test for target group

Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com>

* Improve unit tests for target group

Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com>

* Fix imports

Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com>

* Improve test by asserting on whole Target Group object

Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com>
2019-10-18 16:24:26 +02:00
Simon Pasquier 19ce6b7f5f
discovery: fix more error logs on context cancelation (#6133)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-10-18 11:48:51 +02:00
Jean-Baptiste Le Duigou 0939d566f3 Improve test by asserting on whole Target Group object
Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com>
2019-10-17 21:25:38 +02:00
Jean-Baptiste Le Duigou 3309ffa482 Fix imports
Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com>
2019-10-17 21:12:50 +02:00
Jean-Baptiste Le Duigou 9372a224b5 Improve unit tests for target group
Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com>
2019-10-16 18:03:05 +02:00
Jean-Baptiste Le Duigou 1f9eb09e8e Improve unit tests for target group
Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com>
2019-10-15 21:43:31 +02:00
Jean-Baptiste Le Duigou 5146bb14ef adding unit test for target group (#6138)
Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com>
2019-10-14 23:48:08 +01:00
Jean-Baptiste Le Duigou 15de05d55e adding unit test for target group
Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com>
2019-10-14 21:05:11 +02:00
Simon Pasquier 8ec6f02854 discovery: don't log errors on context cancelation
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-10-09 11:51:38 +02:00
Nevill 7465f27ea5 Refactor on discovery/manager_test.go
- Use testutil.ToFloat64 to collect testing metrics
- Declare ServiceDiscoveryConfig directly instead of calling Unmarshal on a piece of YAML

Signed-off-by: Nevill <nevill.dutt@gmail.com>
2019-10-08 10:18:48 +08:00
陈谭军 c6928b5c6e fix-up typo unkown->unknown (#6055)
Signed-off-by: chentanjun <2799194073@qq.com>
2019-09-25 09:51:43 +02:00
Simon Pasquier 80bc8553be
discovery/file: fix flaky tests (#5948)
* discovery/file: fix flaky tests

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Fix typos

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Add copyFileTo() method

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-09-24 14:54:50 +02:00
Björn Rabenstein 52e0504f83
Merge pull request #5254 from nevill/fix-4890
Change prometheus_sd_configs_failed_total to Gauge
2019-09-24 12:10:40 +02:00
Nevill 55661ab004 Set failedConfigs only once right after registerProviders finished
Signed-off-by: Nevill <nevill.dutt@gmail.com>
2019-09-24 09:15:40 +08:00
johncming 31a8ac3219 discovery/dns: add test case for SDConfig.UnmarshalYAML. (#6035)
* discovery/dns: Add code coverage.

Signed-off-by: johncming <johncming@yahoo.com>

* discovery/dns: add test case for SDConfig.UnmarshalYAML.

Signed-off-by: johncming <johncming@yahoo.com>
2019-09-23 13:26:11 +02:00
Nevill 048f81218d Change prometheus_sd_configs_failed_total to Gauge
Signed-off-by: Nevill <nevill.dutt@gmail.com>
2019-09-16 10:38:43 +08:00
Harkishen Singh d98d4a9bf0 remove resetting of manager properties and init manager props under locking (#5979)
Signed-off-by: Harkishen-Singh <harkishensingh@hotmail.com>
2019-09-06 12:46:24 +02:00
Tariq Ibrahim f0a5f88b95 [prometheus_sd/kubernetes]add new node address types for discover (#5902)
Signed-off-by: Tariq Ibrahim <tariq181290@gmail.com>
2019-08-20 15:52:11 +01:00
Bartek Płotka 5cb32d67f9
Merge pull request #5893 from prometheus/unify-tsdbutil
Removed extra tsdb/testutil after merge.
2019-08-15 12:07:59 +01:00
Bartek Plotka f0863a604e Removed extra tsdb/testutil after merge.
Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
2019-08-14 10:12:32 +01:00
Julius Volz b5c833ca21
Update go.mod dependencies before release (#5883)
* Update go.mod dependencies before release

Signed-off-by: Julius Volz <julius.volz@gmail.com>

* Add issue for showing query warnings in promtool

Signed-off-by: Julius Volz <julius.volz@gmail.com>

* Revert json-iterator back to 1.1.6

It produced errors when marshaling Point values with special float
values.

Signed-off-by: Julius Volz <julius.volz@gmail.com>

* Fix expected step values in promtool tests after client_golang update

Signed-off-by: Julius Volz <julius.volz@gmail.com>

* Update generated protobuf code after proto dep updates

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2019-08-14 11:00:39 +02:00
Björn Rabenstein 70ce3df23c
Merge pull request #5860 from tariq1890/variadic
pass multiple args to Registerer.MustRegister method
2019-08-13 13:22:30 +02:00
Ganesh Vernekar 5ecef3542d
Cleanup after merging tsdb into prometheus
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2019-08-13 14:04:14 +05:30
tariqibrahim df99d943ba pass multiple args to Registerer.MustRegister method
Signed-off-by: tariqibrahim <tariq181290@gmail.com>
2019-08-12 22:56:10 -07:00
Chris Marchbanks 529ccff07b
Remove all usages of stretchr/testify
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
2019-08-08 19:49:27 -06:00
AllenZMC 41151ca8dc fix mis-spelling in consul_test.go (#5836)
Signed-off-by: czm <zhongming.chang@daocloud.io>
2019-08-06 06:11:41 +01:00
dzzg 938ca06057
fix misspellings in ingress.go
Signed-off-by: dzzg <zhengguang.zhu@daocloud.io>
2019-07-28 02:07:23 +08:00
Ye Ji 9229811c94 give each tree cache its unique channel to avoid multiple close on the same channel
Signed-off-by: Ye Ji <ye@hioscar.com>
2019-07-09 16:38:56 -04:00
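
The problem named in the commit above is that closing an already-closed Go channel panics, so two caches sharing one stop channel cannot both shut down cleanly. The sketch below illustrates that under stated assumptions: `treeCache`, `newTreeCache`, and `stopAll` are hypothetical names and do not reproduce the real tree cache.

```go
package main

import "fmt"

// treeCache is a hypothetical cache that owns a stop channel.
type treeCache struct {
	path string
	stop chan struct{}
}

// newTreeCache gives every cache its own channel.
func newTreeCache(path string) *treeCache {
	return &treeCache{path: path, stop: make(chan struct{})}
}

// Stop signals shutdown by closing the cache's channel.
func (c *treeCache) Stop() { close(c.stop) }

// stopAll stops every cache and converts a panic into an error,
// so the double-close case can be observed without crashing.
func stopAll(caches []*treeCache) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("stop failed: %v", r)
		}
	}()
	for _, c := range caches {
		c.Stop()
	}
	return nil
}

func main() {
	shared := make(chan struct{})
	bad := []*treeCache{{path: "/a", stop: shared}, {path: "/b", stop: shared}}
	fmt.Println("shared channel:", stopAll(bad))

	good := []*treeCache{newTreeCache("/a"), newTreeCache("/b")}
	fmt.Println("unique channels:", stopAll(good))
}
```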
beorn7 dd81912554 Add objectives to Summaries
With the next release of client_golang, Summaries will not have
objectives by default. To not lose the objectives we have right now,
explicitly state the current default objectives.

Signed-off-by: beorn7 <beorn@grafana.com>
2019-06-12 02:03:13 +02:00
sh0rez 8ba23fb336
fix(style): container_is_init to container_init
Removes the 'is' keyword to comply with the style guide

Signed-off-by: sh0rez <me@shorez.de>
2019-05-29 16:16:19 +02:00
sh0rez 88b79bae64
chore(style): Comply with style guide, order list
Signed-off-by: sh0rez <me@shorez.de>
2019-05-29 11:22:10 +02:00
sh0rez 6618f28fd7
test(discovery/kubernetes): TestPodDiscoveryInitContainer
Adds a test to check whether an InitContainer is included in the discovery

Signed-off-by: sh0rez <me@shorez.de>
2019-05-28 16:51:58 +02:00
sh0rez fbd5c6f310
test(discovery/kubernetes): add container_is_init label to tests
Adds the new container_is_init label to the current tests to make them pass again

Signed-off-by: sh0rez <me@shorez.de>
2019-05-27 19:16:03 +02:00
sh0rez cfa253ae06
feat(discovery/kubernetes): container_is_init label
Adds a label that shows whether the container is an init container or not

Signed-off-by: sh0rez <me@shorez.de>
2019-05-27 17:48:15 +02:00
sh0rez bea07fe866
feat(discovery/kubernetes): include InitContainers
Includes InitContainers in the ServiceDiscovery

Signed-off-by: sh0rez <me@shorez.de>
2019-05-26 22:53:14 +02:00
Bevisy b7cdd3e840 Exhaust request body before closing it (#5596)
Signed-off-by: bevisy <binbin36520@gmail.com>
2019-05-25 11:27:12 +01:00
Dmitry Shmulevich d81df5609d fix nil pointer dereference in azure discovery (#5587)
Signed-off-by: Dmitry Shmulevich <dmitry.shmulevich@sysdig.com>
2019-05-21 20:03:24 +01:00
Simon Pasquier 3441ecdea1 discovery/kubernetes: add node name and hostname to endpoints
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-05-16 10:49:13 +02:00
Simon Pasquier 45506841e6
*: enable all default linters (#5504)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-05-03 15:11:28 +02:00
Frederic Branczyk 3cffd81035
Merge pull request #5520 from YaoZengzeng/service
The workqueue of service should be `service` rather than `ingress`
2019-04-29 14:03:21 +02:00
YaoZengzeng 658b33808c The workqueue of service should be service rather than ingress
Signed-off-by: YaoZengzeng <yaozengzeng@zju.edu.cn>
2019-04-29 17:21:35 +08:00
Frederic Branczyk f874555a0d
Merge pull request #5486 from tariq1890/update_kubernetes
update client-go,api,api-machinery and klog dependencies
2019-04-29 09:34:27 +02:00
Björn Rabenstein 0be9388f8d
Merge pull request #5463 from prometheus/beorn7/templating
Follow-up on #5009
2019-04-24 16:42:23 +02:00
Tariq Ibrahim 00036cd1e5
update client-go,api,api-machinery and klog dependencies
Signed-off-by: Tariq Ibrahim <tariq181290@gmail.com>
2019-04-23 10:54:18 -07:00
Romain Baugue 95193fa027 Exhaust every request body before closing it (#5166) (#5479)
From the documentation:
> The default HTTP client's Transport may not
> reuse HTTP/1.x "keep-alive" TCP connections if the Body is
> not read to completion and closed.

This effectively enables keep-alive for the fixed requests.

Signed-off-by: Romain Baugue <romain.baugue@elwinar.com>
2019-04-18 09:50:37 +01:00
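
The pattern behind #5479 is to read whatever remains of a response body before closing it, so the underlying connection can be reused. A minimal sketch follows. The helper names `drainAndClose` and `newBody` are hypothetical, and `io.Discard` and `io.NopCloser` assume Go 1.16 or later (older code would use the `ioutil` equivalents).

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// drainAndClose reads body to EOF, discarding the bytes, then closes it.
// It returns the number of bytes drained and the first error encountered.
func drainAndClose(body io.ReadCloser) (int64, error) {
	n, copyErr := io.Copy(io.Discard, body)
	closeErr := body.Close() // always close, even if the copy failed
	if copyErr != nil {
		return n, copyErr
	}
	return n, closeErr
}

// newBody wraps a string as a body, standing in for an HTTP response body.
func newBody(s string) io.ReadCloser {
	return io.NopCloser(strings.NewReader(s))
}

func main() {
	n, err := drainAndClose(newBody("unread tail"))
	fmt.Println(n, err)
}
```

In real client code this would typically run in a `defer` right after the error check on the request, with `resp.Body` as the argument.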
EarthmanT 35be8c9e25 Add azure public ip label (#5475)
* Update Azure SD Config with Public IP label

Signed-off-by: earthmant <trammell@cloudify.co>
2019-04-17 16:05:44 +01:00
Bjoern Rabenstein a92ef68dd8 Fix staticcheck errors
Not sure why they only show up now.

Signed-off-by: Bjoern Rabenstein <bjoern@rabenste.in>
2019-04-17 01:40:10 +02:00
Simon Pasquier 559237cc4f discovery/kubernetes: fix missing label sanitization (#5462)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-04-15 19:04:50 +01:00
Brian Brazil 8ff6938fa4
Update dependencies. (#5449)
Including going to tsdb 0.7.0.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2019-04-10 18:47:25 +01:00
Simon Pasquier dafd1632a2 discovery/kubernetes: add present labels for labels/annotations (#5443)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-04-10 13:21:42 +01:00
Simon Pasquier 4f47806a7d
discovery/dns: fix slice with wrong length (#5432)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-04-04 17:05:35 +02:00
Kien Nguyen-Tuan 813b58367a [OpenStack SD] Add ProjectID and UserID meta labels (#5431)
Add extra meta labels, which are useful when Prometheus
discovers instances from all projects.

Signed-off-by: Kien Nguyen <kiennt2609@gmail.com>
2019-04-04 10:02:31 +01:00
Tariq Ibrahim 8fdfa8abea refine error handling in prometheus (#5388)
i) Uses the more idiomatic Wrap and Wrapf methods for creating nested errors.
ii) Fixes some incorrect usages of fmt.Errorf where the error messages don't have any formatting directives.
iii) Does away with the use of fmt package for errors in favour of pkg/errors

Signed-off-by: tariqibrahim <tariq181290@gmail.com>
2019-03-26 00:01:12 +01:00
Simon Pasquier 782d00059a
discovery: factorize for SD based on refresh (#5381)
* discovery: factorize for SD based on refresh

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* discovery: use common metrics for refresh

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-03-25 11:54:22 +01:00
Tariq Ibrahim 0d7104b7eb discovery/azure:optimize iteration logic for VMScalesets, VMScalesetVMs, and VMs (#5363)
Signed-off-by: tariqibrahim <tariq181290@gmail.com>
2019-03-20 09:03:47 +00:00
Tariq Ibrahim 5f933e99d0 discovery/azure: make local virtualMachine struct more generic by removing the go sdk field reference (#5350)
Signed-off-by: tariqibrahim <tariq181290@gmail.com>
2019-03-15 16:18:37 +00:00
Mario Trangoni 5354ffff99 Fix some spelling issues (#5361)
See,
$ codespell -S './vendor/*,./.git*,./web/ui/static/vendor*' --ignore-words-list="uint,dur,ue,iff,te,wan"

Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>
2019-03-14 14:38:54 +00:00
Simon Pasquier 67385f356f
discovery/openstack: pass context to the OpenStack client (#5231)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-03-12 13:53:03 +01:00
Callum Styan 83c46fd549 update Consul vendor code so that catalog.ServiceMultipleTags can be (#5151)
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2019-03-12 10:31:27 +00:00
Tariq Ibrahim 197e5ac597 docs: minor improvements to the service discovery README.md (#5296)
i) Increased the size of the Service Discovery Readme title
ii) Changed `TargetGroups` to "target groups" as it has been relocated and renamed to another package.

Signed-off-by: tariqibrahim <tariq181290@gmail.com>
2019-03-03 19:48:03 +01:00
JoeWrightss e4b88704a6 Fix misspell in manager_test.go (#5279)
Signed-off-by: zhoulin xie <zhoulin.xie@daocloud.io>
2019-02-27 11:22:31 +01:00
Simon Pasquier 1d2fc95b1c
discovery/marathon: pass context to the client (#5232)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-02-26 14:49:16 +01:00
Simon Pasquier e60d314f43
discovery/consul: pass current context to Consul queries (#5230)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-02-26 14:48:19 +01:00
Simon Pasquier 8f578d9c6b
discovery/ec2: pass context to the client (#5234)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-02-26 14:48:03 +01:00
Simon Pasquier 4997dcb4a1
discovery/gce: pass context to the client (#5233)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-02-26 14:47:43 +01:00
Simon Pasquier 9040dddd0c
discovery/azure: pass context to the client (#5255)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-02-26 14:47:26 +01:00
Simon Pasquier fe7a1bcfc6
discovery/triton: pass context to the client (#5235)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-02-26 14:47:04 +01:00
Björn Rabenstein ad29221a7b
Merge pull request #5020 from erikh/upgrade-miekg-dns
Upgrade miekg dns
2019-02-25 12:47:32 +01:00
Simon Pasquier e72c875e63
config: fix Kubernetes config with empty API server (#5256)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-02-22 15:51:47 +01:00
Nguyen Hai Truong aed9ea144a Remove duplicated words in comments
Although these are only spelling mistakes, they might affect
readability.

Co-Authored-By: Kim Bao Long longkb@vn.fujitsu.com
Signed-off-by: Nguyen Hai Truong <truongnh@vn.fujitsu.com>
2019-02-20 17:41:02 -08:00
Simon Pasquier c8a1a5a93c
discovery/kubernetes: fix support for password_file and bearer_token_file (#5211)
* discovery/kubernetes: fix support for password_file

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Create and pass custom RoundTripper to Kubernetes client

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Use inline HTTPClientConfig

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-02-20 11:22:34 +01:00
Erik Hollensbe be3c082539 discovery/dns/dns.go: fix handling of truncated dns records
https://github.com/miekg/dns/pull/815 goes into the detail, but more or
less the existing solution was no longer supported and needed to be
rewritten to support the new versions of the library. miekg additionally
claims this is more correct in the ticket.

Signed-off-by: Erik Hollensbe <github@hollensbe.org>
2019-02-20 00:36:41 +00:00
Simon Pasquier f9462d5d44 discovery/consul: pass current context to Consul queries
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-02-18 14:23:56 +01:00
JoeWrightss 4cb6c202ff Fix fmt.Errorf error message (#5199)
Signed-off-by: zhoulin xie <zhoulin.xie@daocloud.io>
2019-02-10 15:16:20 +05:30
tariqibrahim b173de0c26 fix ineffectual assignment in dns.go
Signed-off-by: tariqibrahim <tariq181290@gmail.com>
2019-01-28 17:15:43 -08:00
Jannick Fahlbusch 63f375e80a [FIX] Azure DS: Return error when request failed (#4719)
This fixes the issue that the error is swallowed when the request failed.

Signed-off-by: Jannick Fahlbusch <git@jf-projects.de>
2019-01-28 21:31:45 +00:00
Tariq Ibrahim f4275d2352 Use the latest versions of azure go sdk and go-autorest (#5015)
Signed-off-by: tariqibrahim <tariq181290@gmail.com>
2019-01-28 18:30:29 +00:00
Tariq Ibrahim bfcdba211f remove the prepended watch reactor from the fake k8s client (#5140)
Signed-off-by: tariqibrahim <tariq.ibrahim@microsoft.com>
2019-01-28 16:42:25 +01:00
Simon Pasquier 68e4c211f2
discovery/azure: more robust handling of go routines (#5106)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-01-18 09:55:47 +01:00
Matt Layher 302148fd69 *: apply gofmt -s
Signed-off-by: Matt Layher <mdlayher@gmail.com>
2019-01-16 17:28:14 -05:00
Simon Pasquier 22a1def98d
Merge pull request #5099 from prometheus/release-2.6
Merge release-2.6 to master
2019-01-16 09:26:00 +01:00
tommarute 9922c35a23 marathon-sd - use Tasks.Ports instead of PortDefinitions.Ports if RequirePorts is false (#5022) (#5026)
Signed-off-by: tommarute <tommarute@gmail.com>
2019-01-14 17:20:22 +00:00
Sylvain Rabot d9f4a8c95f sd: Fix stuck Azure service discovery (#5088)
Signed-off-by: Sylvain Rabot <s.rabot@lectra.com>
2019-01-14 15:09:27 +00:00
Kevin Bulebush 718344434c openstack_sd: Supporting application credential for authentication. (#4968)
* openstack_sd: Support application credentials for authentication.
Updated gophercloud

Signed-off-by: Kevin Bulebush <kmbulebu@gmail.com>
2019-01-09 15:18:58 +00:00
Frederic Branczyk e9ae0b5a1b
Merge pull request #4927 from tariq1890/update_k8s
update client-go to v10.0.0 and other k8s deps to v1.13.1
2019-01-07 10:54:34 +01:00
Fabian Reinartz ca93c8e19b
Merge pull request #4969 from prometheus/azuresubid
Add Azure tenant and subscription ID labels
2019-01-06 12:25:06 +01:00
Simon Pasquier f678e27eb6
*: use latest release of staticcheck (#5057)
* *: use latest release of staticcheck

It also fixes a couple of things in the code flagged by the additional
checks.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Use official release of staticcheck

Also run 'go list' before staticcheck to avoid failures when downloading packages.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-01-04 14:47:38 +01:00
tariqibrahim aa94efe4b5 Merge branch 'master' of https://github.com/prometheus/prometheus into update_k8s 2019-01-03 10:27:12 -08:00
Fabian Reinartz 7a41038695 Add Azure tenant and subscription ID labels
Signed-off-by: Fabian Reinartz <freinartz@google.com>
2019-01-03 13:09:13 +01:00
Lv Jiawei ad22389218 Add ingress in UnmarshalYAML and init (#5035)
Both UnmarshalYAML and init lack the role type ingress.

Signed-off-by: MIBc <lvjiawei@cmss.chinamobile.com>
2018-12-24 09:24:01 +00:00
tariqibrahim 122b47caa0 address review comment in client_metrics
Signed-off-by: tariqibrahim <tariq.ibrahim@microsoft.com>
2018-12-21 00:46:47 -08:00
tariqibrahim 1e4e4c46ba Merge branch 'master' of https://github.com/prometheus/prometheus into update_k8s
Signed-off-by: tariqibrahim <tariq.ibrahim@microsoft.com>
2018-12-20 11:36:54 -08:00
Ilya Gladyshev 922c17e119 added name label to all discovery metrics (#5002)
Signed-off-by: Ilya Gladyshev <ilya.v.gladyshev@gmail.com>
2018-12-20 14:47:29 +00:00
Erik Hollensbe b94eea482c discovery/gce: oauth2.NoContext is deprecated, replace with context.Background() (#5024)
* vendor update
* discovery/gce: oauth2.NoContext is deprecated, replace with context.Background()

Signed-off-by: Erik Hollensbe <github@hollensbe.org>
2018-12-20 14:45:18 +00:00
Marcel D. Juhnke c7d83b2b6a discovery: add support for Managed Identity authentication in Azure SD (#4590)
Signed-off-by: Marcel Juhnke <marrat@marrat.de>
2018-12-19 10:03:33 +00:00
tariqibrahim 0d4b6e4e66 address review comments
Signed-off-by: tariqibrahim <tariq.ibrahim@microsoft.com>
2018-12-18 08:14:30 -08:00
Tariq Ibrahim de6f3b6af7 expose kubernetes service cluster ip (#4940)
Signed-off-by: tariqibrahim <tariq.ibrahim@microsoft.com>
Signed-off-by: tariqibrahim <tariq181290@gmail.com>
2018-12-18 15:17:34 +00:00
JoeWrightss e8be31eed9 Fixs typo: 'possibliy' to 'possibly' (#4974)
Signed-off-by: JoeWrightss <zhoulin.xie@daocloud.io>
2018-12-18 11:52:40 +01:00
Samuel Alfageme 240321acee Add taggedAddress to the labels in ConsulSD (#5001)
Useful when multiple (tagged) addresses for a node are exposed on the catalog API
Ref. https://www.consul.io/api/catalog.html#taggedaddresses

Signed-off-by: Samuel Alfageme <samuel@alfage.me>
2018-12-18 11:51:05 +01:00
Tariq Ibrahim e3bdc463fa Revert "add logic to check if an azure VM is deallocated or not (#4908)" (#4980)
This reverts commit 61cf4365

Signed-off-by: tariqibrahim <tariq.ibrahim@microsoft.com>
2018-12-12 09:27:12 +01:00
tariqibrahim 1fd438ed2b rebase and resolve merge conflicts
Signed-off-by: tariqibrahim <tariq.ibrahim@microsoft.com>
2018-12-06 09:15:34 -08:00
tariqibrahim 412ca33226 update kubernetes deps to v1.13.0
Signed-off-by: tariqibrahim <tariq.ibrahim@microsoft.com>
2018-12-03 19:32:16 -08:00
Julius Volz d28246e337
Fix config loading panics on nil pointer slice elements (#4942)
Fixes https://github.com/prometheus/prometheus/issues/4902
Fixes https://github.com/prometheus/prometheus/issues/4889

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-12-03 18:09:02 +08:00
Simon Pasquier 8b91d39c43
discovery: send empty group on empty SD config (#4819)
* discovery: send empty group on blank SD config

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Update comments

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Add another comment

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-11-30 17:59:57 +01:00
Tariq Ibrahim 61cf4365d6 add logic to check if an azure VM is deallocated or not (#4908)
* add logic to check if an azure VM is deallocated or not
* update documentation with the new azure power state label

Signed-off-by: tariqibrahim <tariq.ibrahim@microsoft.com>
2018-11-30 11:32:40 +00:00
Serghei Anicheev 8e659a5109 Adding private_dns_name to the list of ec2 labels which can be used i… (#4693)
* Adding private_dns_name to the list of ec2 labels which can be used in node naming for dynamic environments

Signed-off-by: Serghei Anicheev <serghei@rentalcover.com>
2018-11-30 11:11:06 +00:00
mengnan a5d39361ab discovery/azure: Fail hard when Azure authentication parameters are missing (#4907)
* discovery/azure: fail hard when client_id/client_secret is empty

Signed-off-by: mengnan <supernan1994@gmail.com>

* discovery/azure: fail hard when authentication parameters are missing

Signed-off-by: mengnan <supernan1994@gmail.com>

* add unit test

Signed-off-by: mengnan <supernan1994@gmail.com>

* add unit test

Signed-off-by: mengnan <supernan1994@gmail.com>

* format code

Signed-off-by: mengnan <supernan1994@gmail.com>
2018-11-29 16:47:59 +01:00
Ben Kochie c6399296dc
Fix spelling/typos (#4921)
* Fix spelling/typos

Fix spelling/typos reported by codespell/misspell.
* UK -> US spelling changes.

Signed-off-by: Ben Kochie <superq@gmail.com>
2018-11-27 17:44:29 +01:00
Simon Pasquier 0bb810d126
discovery/marathon: fix leaked connections (#4915)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-11-27 14:58:27 +01:00
Timo Beckers bea302e061 marathon-sd - use 'hostPort' member of portMapping to construct target endpoints (#4887)
Fixes #4855 - ServicePort was wrongly used to construct an address to endpoints
defined in portMappings. This was changed to HostPort. Support for obtaining
auto-generated host ports was also added.

Signed-off-by: Timo Beckers <timo@incline.eu>
2018-11-26 13:39:35 +01:00
Daniele Sluijters f25a6baedb remote: Set User-Agent header in requests (#4891)
Currently Prometheus requests show up with a UA of Go-http-client/1.1
which isn't super helpful. Though the X-Prometheus-Remote-* headers
exist they need to be explicitly configured when logging the request in
order to be able to deduce this is a request originating from
Prometheus. By setting the header we remove this ambiguity and make
default server logs just a bit more useful.

This also updates a few other places to consistently capitalize the 'P'
in the user agent, as well as ensure we set a UA to begin with.
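
A minimal sketch of setting the header on an outgoing request with Go's
net/http. The `userAgent` value and the `newRequest` helper are hypothetical
and chosen only for illustration; the exact string Prometheus sends is not
shown in this log.

```go
package main

import (
	"fmt"
	"net/http"
)

// userAgent is an illustrative value, not the string Prometheus sends.
const userAgent = "Prometheus/example"

// newRequest builds a GET request that carries an explicit User-Agent,
// so servers do not log the default "Go-http-client/1.1".
func newRequest(url string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("User-Agent", userAgent)
	return req, nil
}

func main() {
	req, err := newRequest("http://localhost:9090/api/v1/read")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.UserAgent())
}
```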

Signed-off-by: Daniele Sluijters <daenney@users.noreply.github.com>
2018-11-23 22:49:49 +08:00
Sylvain Rabot 1fd3b33dcd Prevent Azure SD panic (fix #4779) (#4867)
Signed-off-by: Sylvain Rabot <s.rabot@lectra.com>
2018-11-19 12:23:12 +00:00
Bryan Boreham cf37e1feb4 Add __meta_kubernetes_pod_phase label in discovery (#4824)
This lets you add a relabel rule to drop scrapes for pods which are
not running.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2018-11-06 14:40:24 +00:00
Silvio Gissi 6100f160ad EC2 Platform meta label (#4663)
Set __meta_ec2_platform label with the instance platform string. Set to 'windows' on Windows servers and absent otherwise.


Signed-off-by: Silvio Gissi <silvio@gissilabs.com>
2018-11-06 14:39:48 +00:00
Goutham Veeramachaneni f988af7235 Revert #4586 (#4766)
This breaks people if they are depending on the contents of
__address__ label.

Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
2018-10-24 10:16:36 +02:00
Simon Pasquier a30348f1a4 discovery: add config label to discovered targets metric (#4753)
* discovery: add labels to discovered targets metric

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-10-18 16:46:59 +01:00
Simon Pasquier 5824d6902d
openstack: fix client when using env variables (#4734)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-10-17 16:04:07 +02:00
Kien Nguyen-Tuan 9c5370fdfe Support discover instances from all projects (#4682)
By default, OpenStack SD only queries for instances
from the specified project. To discover instances from other
projects, users have to add more openstack_sd_configs for
each project.

This patch adds an `all_tenants` <bool> option to
openstack_sd_configs. For example:

- job_name: 'openstack_all_instances'
  openstack_sd_configs:
    - role: instance
      region: RegionOne
      identity_endpoint: http://<identity_server>/identity/v3
      username: <username>
      password: <super_secret_password>
      domain_name: Default
      all_tenants: true

Co-authored-by: Kien Nguyen <kiennt2609@gmail.com>
Signed-off-by: dmatosl <danielmatos.lima@gmail.com>
2018-10-17 13:01:33 +01:00
Simon Pasquier c4a6acfb1e
*: move to go 1.11 (#4626)
* *: move to go 1.11

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Reduce number of places where we specify the Go version

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-10-16 09:41:45 +02:00
Goutham Veeramachaneni ffb7f829ec
Merge pull request #4730 from prometheus/release-2.4
Release 2.4
2018-10-12 14:15:42 -07:00
Simon Pasquier 3e6b9d43c3
Merge pull request #4720 from teresy/redundant-nil-check-slice
Remove redundant nil check
2018-10-11 10:24:55 +02:00
Rijnard van Tonder 9d102e3bff The nil check before the range loop is redundant
Signed-off-by: Rijnard van Tonder <hi.teresy@gmail.com>
2018-10-10 16:11:45 -04:00
Richard Kiene b537f6047a Add ability to filter triton_sd targets by pre-defined groups (#4701)
Additionally, add triton groups metadata to the discovery response
and correct a documentation error regarding the triton server id
metadata.

Signed-off-by: Richard Kiene <richard.kiene@joyent.com>
2018-10-10 10:03:34 +01:00
Simon Pasquier a2a78d0a09 discovery/openstack: discover all interfaces (#4649)
* discovery/openstack: discover all interfaces
* Add address pool label

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-10-09 16:17:08 +01:00
Simon Pasquier e1e2821cca
Merge pull request #4654 from simonpasquier/openstack-tls
discovery/openstack: support tls_config
2018-10-05 18:11:55 +02:00
Jannick Fahlbusch f78e59577b [FIX] EC2 DS: Check for existence of OwnerID (#4672)
Commit 1c89984 introduced the ability to expose the owner of the instance.
However, this breaks Prometheus if there is no OwnerID in the reservation (Eg. if you are using a private EC2-API introduced by #4333)

Signed-off-by: Jannick Fahlbusch <git@jf-projects.de>
2018-10-02 16:18:31 +05:30
Simon Pasquier 657199af22 Address Krasi comments
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-28 12:29:24 +02:00
Simon Pasquier 5df757fdd4 zookeeper: fix panic
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-28 11:39:40 +02:00
Simon Pasquier 365931ea83 discovery: add metrics + send updates from one goroutine only
The added metrics are:

* prometheus_sd_discovered_targets
* prometheus_sd_received_updates_total
* prometheus_sd_updates_delayed_total
* prometheus_sd_updates_total

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-27 15:59:42 +02:00
Simon Pasquier f2d43af820
Merge pull request #4582 from simonpasquier/add-discovery-tests
discovery: add more tests
2018-09-27 15:18:42 +02:00
Simon Pasquier ff08c40091 discovery/openstack: support tls_config
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-25 14:31:32 +02:00
Frederic Branczyk b75ec7e6ef
Merge pull request #4458 from FUSAKLA/k8s-sd-add-metrics
feat: added more k8s SD metrics
2018-09-21 13:10:48 +02:00
Timo Beckers 1c9fbd65c4 marathon-sd - change port gathering strategy, support for container networking (#4499)
* marathon-sd - change port gathering strategy, add support for container networking

- removed unnecessary error check on HTTPClientConfig.Validate()
- renamed PortDefinitions and PortMappings to PortDefinition and PortMapping respectively
- extended data model for extra parsed fields from Marathon json
- support container networking on Marathon 1.5+ (target Task.IPAddresses.x.Address)
- expanded test suite to cover all new cases
- test: cancel context when reading from doneCh before returning from function
- test: split test suite into Ports/PortMappings/PortDefinitions

Signed-off-by: Timo Beckers <timo@incline.eu>
2018-09-21 11:53:04 +01:00
Martin Chodur f2d037133e
feat: added more k8s SD metrics
Signed-off-by: Martin Chodur <m.chodur@seznam.cz>
2018-09-20 22:28:51 +02:00
Camille Janicki b035ea0ea9 Change discovery subpackages to not use testify in tests (#4612)
* Change discovery subpackages to not use testify in tests

Signed-off-by: Camille Janicki <camille.janicki@gmail.com>

* Remove testify suite from vendor dir

Signed-off-by: Camille Janicki <camille.janicki@gmail.com>
2018-09-18 17:35:22 +02:00
Simon Pasquier 128ff546b8 config: add test for OpenStack SD (#4594)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-13 21:44:27 +05:30
Tom Wilkie e3d36f4802 Don't import testing from non-test code. (#4595)
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-09-13 16:03:26 +05:30
Bryan Boreham 968f657eaa Stop removing the final dot from rooted DNS names (#4586)
Removing a final dot changes the meaning of the name and can cause
extra DNS lookups as the resolver traverses its search path.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2018-09-13 15:28:38 +05:30
Simon Pasquier e7cee1b5ba Remove tests redundant with TestTargetUpdatesOrder
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-12 17:56:53 +02:00
Simon Pasquier 7dc3f11306 WIP discovery: refactor TestTargetUpdatesOrder
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-12 16:15:03 +02:00
Simon Pasquier 8fd891bf3f Speed up tests that were still using the 5s timeout
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-12 16:13:15 +02:00
Simon Pasquier 8289501420 Address krasi's comments
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-12 16:13:15 +02:00
Simon Pasquier 1cee5b5b06 Don't multiply the interval value by 1ms in the mock
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-12 16:13:15 +02:00
Simon Pasquier 4900405d2f Refactor TestCoordinationWithReceiver() to work with any Discoverer
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-12 16:13:15 +02:00
Simon Pasquier 0798f14e02 Add TestCoordinationWithEmptyProvider
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-12 16:13:15 +02:00
Simon Pasquier 48989d8996 discovery: add more tests
Co-authored-by: Camille Janicki <camille.janicki@gmail.com>
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-12 16:13:15 +02:00
Krasi Georgiev ba7eb733e8 tidy up the discovery logs,updating loops and selects (#4556)
* tidy up the discovery logs,updating loops and selects

renamed a few objects

removed a very noisy debug log on the k8s discovery. It would be useful
to show some summary rather than every update, as this is impossible to
follow.

added most comments as debug logs so each block becomes self
explanatory.

when the discovery receiving channel is full, will retry again on the
next cycle.

Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>

* add noop logger for the SD manager tests.

Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>

* spelling nits

Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
2018-09-05 17:02:47 +05:30
Tariq Ibrahim f708fd5c99 Adding support for multiple azure environments (#4569)
Signed-off-by: Tariq Ibrahim <tariq.ibrahim@microsoft.com>
2018-09-04 17:55:40 +02:00
Simon Pasquier 674c76adb8 discovery: coalesce identical SD configurations (#3912)
* discovery: coalesce identical SD configurations

Instead of creating as many SD providers as declared in the
configuration, the discovery manager merges identical configurations
into the same provider and keeps track of the subscribers. When
the manager receives target updates from a SD provider, it will
broadcast the updates to all interested subscribers.
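
The merging described above can be sketched roughly as follows. This is an
illustration under assumptions, not the manager's actual code: the
`sdConfig` and `provider` types and the `register` helper are hypothetical
names, and configuration equality is approximated with `reflect.DeepEqual`.

```go
package main

import (
	"fmt"
	"reflect"
)

// sdConfig is a hypothetical stand-in for one SD configuration block.
type sdConfig struct {
	Kind   string
	Server string
}

// provider pairs one configuration with the jobs subscribed to its updates.
type provider struct {
	config sdConfig
	subs   []string
}

// register subscribes job to an existing provider with an identical
// configuration, or appends a new provider when none matches.
func register(providers []*provider, job string, cfg sdConfig) []*provider {
	for _, p := range providers {
		if reflect.DeepEqual(p.config, cfg) {
			p.subs = append(p.subs, job)
			return providers
		}
	}
	return append(providers, &provider{config: cfg, subs: []string{job}})
}

func main() {
	var providers []*provider
	consul := sdConfig{Kind: "consul", Server: "localhost:8500"}
	providers = register(providers, "job-a", consul)
	providers = register(providers, "job-b", consul)
	providers = register(providers, "job-c", sdConfig{Kind: "dns", Server: "example.org"})

	// Updates from one provider would be broadcast to every subscriber.
	for _, p := range providers {
		fmt.Println(p.config.Kind, p.subs)
	}
}
```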

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-01 08:51:31 +01:00
Krasi Georgiev 53691ae261 Simplify SD update throttling (#4523)
* simplified SD updates throttling

Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>

* add default to catch cases when we don't have new updates.

Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
2018-08-27 17:12:11 +02:00
Fabian Reinartz f571b69010
Merge pull request #4514 from jkohen/ec2-targets
Expose EC2 instance owner as a discovery label.
2018-08-20 08:43:44 +02:00
Javier Kohen 1c89984778 Expose EC2 instance owner as a discovery label.
This exposes the OwnerID field of the DescribeInstances response as .

Signed-off-by: Javier Kohen <jkohen@google.com>
2018-08-17 11:30:18 -04:00
Yecheng Fu d4eae8cc0c Wait until all internal discoveries are done before exiting. (#4508)
Signed-off-by: Yecheng Fu <cofyc.jackson@gmail.com>
2018-08-17 18:50:22 +05:30
Fabian Reinartz b04ab71268
Merge pull request #4488 from jkohen/patch-3
Populate __meta_gce_instance_id discovery label
2018-08-11 09:52:28 +02:00
Javier Kohen 403ac08ece Expose __meta_gce_instance_id as an integer (instead of raw bytes).
Signed-off-by: Javier Kohen <jkohen@google.com>
2018-08-10 16:21:46 -04:00
Javier Kohen 7e9549b398 Added __meta_gce_instance_id discovery label
Populated from instance.ID. I will follow up with a change to the documentation.

Signed-off-by: Javier Kohen <jkohen@google.com>
2018-08-10 11:57:55 -04:00
Simon Pasquier b7054f3a78
Merge pull request #4443 from simonpasquier/fix-consul-connections-leak
discovery/consul: close idle connections on stop
2018-08-10 17:43:39 +02:00
Benji Visser 46fb4078a6 handle nil pointer in ec2 discovery (#4469)
This handles a nil pointer that was being accessed in EC2 discovery.

Fixes: #4441

Signed-off-by: noqcks <benny@noqcks.io>
2018-08-07 08:35:22 +01:00
Johannes Scheuermann f978f5bba3 Fixes #4202, correctly parse VMs with empty tags (#4450)
Signed-off-by: Johannes M. Scheuermann <joh.scheuer@gmail.com>
2018-08-02 10:10:17 +01:00
jojohappy e060f7755f To keep comment of NodeLegacyHostIP for k8s node address
Signed-off-by: jojohappy <sarahdj0917@gmail.com>
2018-08-02 10:25:28 +08:00
jojohappy e81785d1a3 To keep deprecated k8s node NodeLegacyHostIP as local constant to keep compatibility for older k8s version
Signed-off-by: jojohappy <sarahdj0917@gmail.com>
2018-08-02 10:25:28 +08:00
jojohappy 21e50a3f9d Upgrade k8s client to kubernetes-1.11.0
Signed-off-by: jojohappy <sarahdj0917@gmail.com>
2018-08-02 10:25:27 +08:00
Simon Pasquier 1cd29f782c discovery/consul: close idle connections on stop
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-08-01 17:26:52 +02:00
Johannes Scheuermann 7608ee87d0 Initial support for Azure VMSS (#4202)
* Initial support for Azure VMSS

Signed-off-by: Johannes Scheuermann <johannes.scheuermann@inovex.de>

* Add documentation for the newly introduced label

Signed-off-by: Johannes M. Scheuermann <joh.scheuer@gmail.com>
2018-08-01 12:52:21 +01:00
José Martínez 791c13b142 discovery/ec2: Add primary_subnet_id label
Signed-off-by: José Martínez <xosemp@gmail.com>
2018-07-25 09:20:58 +01:00
José Martínez 5e4a33c890 discovery/ec2: Maintain order of subnet_id label
Signed-off-by: José Martínez <xosemp@gmail.com>
2018-07-25 09:20:58 +01:00
Jannick Fahlbusch 0be25f92e2 EC2 Discovery: Allow to set a custom endpoint (#4333)
Allowing to set a custom endpoint makes it easy to monitor targets on non AWS providers with EC2 compliant APIs.

Signed-off-by: Jannick Fahlbusch <git@jf-projects.de>
2018-07-18 10:48:14 +01:00
Ivan Voronchihin 59d214d277 Update autorest vendoring (#4147)
Signed-off-by: bege13mot <bege13mot@gmail.com>
2018-07-18 05:24:15 +01:00
Julius Volz 219e477272 Fix some (valid) lint errors (#4287)
Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-07-18 05:07:33 +01:00
Romain Baugue b41be4ef52 Discovery consul service meta (#4280)
* Upgrade Consul client
* Add ServiceMeta to the labels in ConsulSD

Signed-off-by: Romain Baugue <romain.baugue@elwinar.com>
2018-07-18 05:06:56 +01:00
Simon Pasquier f32acc0b7b discovery/openstack: remove unneeded assignment
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-07-15 12:37:57 +01:00
Julius Volz 05d6d6a2e5
k8s SD: Fix "schema" -> "scheme" typo (#4371)
Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-07-12 16:12:32 +02:00
Krasi Georgiev a155b6d29d fix the zookeeper race (#4355)
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
2018-07-06 08:39:38 +01:00
Dmitry Bashkatov 72327d98fb discovery/kubernetes/ingress: remove unnecessary check
Signed-off-by: Dmitry Bashkatov <dbashkatov@gmail.com>
2018-07-04 15:47:11 +03:00
Dmitry Bashkatov e2baf89eac discovery/kubernetes/ingress: fix scheme discovery (Closes #4327)
Signed-off-by: Dmitry Bashkatov <dbashkatov@gmail.com>
2018-07-04 13:28:44 +03:00
Dmitry Bashkatov 9cdca50bdd discovery/kubernetes/ingress: add more tests
Signed-off-by: Dmitry Bashkatov <dbashkatov@gmail.com>
2018-07-04 13:28:44 +03:00
Julius Volz 5cf0113762
Add "omitempty" to some SD config YAML field tags (#4338)
Especially for Kubernetes SD, this fixes a bug where the rendered
configuration says "api_server: null", which when read back is not
interpreted as an un-set API server (thus the default is not applied).
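
The effect of the tag can be illustrated with the standard library's
encoding/json, which treats `omitempty` on a nil pointer in an analogous way.
This is only a sketch by analogy: the commit concerns YAML field tags, and the
struct and field names below are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// withoutOmit renders an unset pointer field as an explicit null.
type withoutOmit struct {
	APIServer *string `json:"api_server"`
}

// withOmit drops the unset pointer field from the output entirely.
type withOmit struct {
	APIServer *string `json:"api_server,omitempty"`
}

// render marshals v and returns the encoded text.
func render(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(render(withoutOmit{})) // {"api_server":null}
	fmt.Println(render(withOmit{}))    // {}
}
```

With the field omitted, a reader of the rendered configuration sees no value
at all and can fall back to its default.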

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-07-03 13:43:41 +02:00
Simon Pasquier 6eab4bbca1 kubernetes_sd: fix namespace filtering (#4273)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-06-15 09:08:14 +01:00
Paul Gier d24d2acd11 config: set target group source index during unmarshalling (#4245)
* config: set target group source index during unmarshalling

Fixes issue #4214 where the scrape pool is unnecessarily reloaded for a
config reload where the config hasn't changed.  Previously, the discovery
manager changed the static config after loading which caused the in-memory
config to differ from a freshly reloaded config.

Signed-off-by: Paul Gier <pgier@redhat.com>

* [issue #4214] Test that static targets are not modified by discovery manager

Signed-off-by: Paul Gier <pgier@redhat.com>
2018-06-13 16:34:59 +01:00
Simon Pasquier 0e5e7f75cd discovery/file: fix logging (#4178)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-06-12 12:45:59 +01:00
Callum Styan 03578d5df8 add example usage of SD adapter for converting unsupported SD type to filesd (#3720)
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2018-05-30 13:14:34 +01:00
Adam Shannon a22e1736b9 discovery/marathon: include url in fetchApps error (#4171)
This was previously part of a larger PR, but that was closed.

https://github.com/prometheus/prometheus/issues/4048#issuecomment-389899997

This change could include auth information in the URL. That's been
fixed in upstream go, but not until Go 1.11. See: https://github.com/golang/go/issues/24572

Signed-off-by: Adam Shannon <adamkshannon@gmail.com>
2018-05-18 10:20:14 +01:00
Damien Lespiau e64037053d Expose controller kind and name to labelling rules
Relabelling rules can use this information to attach the name of the controller
that has created a pod.

In turn, this can be used to slice metrics by workload at query time, ie.
"Give me all metrics that have been created by the $name Deployment"

Signed-off-by: Damien Lespiau <damien@weave.works>
2018-05-09 11:51:37 +02:00
Nathan Graves 5b27996cb3 Include GCE labels during service discovery. Updated vendor files for Google API. (#4150)
Signed-off-by: Nathan Graves <nathan.graves@kofile.us>
2018-05-08 17:37:47 +01:00
beorn7 a4e4bec3fe Merge branch 'release-2.2' 2018-04-30 14:38:29 +02:00
Elif T. Kuş 57dcdfb15f Rewrote tests with testutil for several test files (#4086)
* promql: Rewrote tests with testutil for functions_test

Signed-off-by: Elif T. Kuş <elifkus@gmail.com>

* pkg/relabel: Rewrote tests with testutil for relabel_test

Signed-off-by: Elif T. Kuş <elifkus@gmail.com>

* discovery/consul: Rewrote tests with testutil for consul_test

Signed-off-by: Elif T. Kuş <elifkus@gmail.com>

* scrape: Rewrote tests with testutil for manager_test

Signed-off-by: Elif T. Kuş <elifkus@gmail.com>
2018-04-27 13:11:16 +01:00
Yecheng Fu 2be543e65a Simplify some code and comments.
Signed-off-by: Yecheng Fu <cofyc.jackson@gmail.com>
2018-04-25 19:29:34 +02:00
Yecheng Fu 46683dd67d Simplify code.
- Unified `send` function.
- Pass InformerSynced functions to `cache.WaitForCacheSync`.
- Use `Role\w+` constants instead of literal string.

Signed-off-by: Yecheng Fu <cofyc.jackson@gmail.com>
2018-04-25 19:29:21 +02:00
Yecheng Fu 3a253f796c Fix grammar in comments and add missing expectedMaxItems to let it
break fast.

Signed-off-by: Yecheng Fu <cofyc.jackson@gmail.com>
2018-04-25 19:29:03 +02:00
Yecheng Fu d73b0d3141 Move hasSynced interface and its implementations to *_test.go files.
Signed-off-by: Yecheng Fu <cofyc.jackson@gmail.com>
2018-04-25 19:28:49 +02:00
Yecheng Fu 8ceb8f2ae8 Refactor Kubernetes Discovery Part 2: Refactoring
- Doing the initial listing and syncing to the scrape manager first and only
  then registering event handlers may lose events that happen during listing
  and syncing (if it lasts a long time). We should register event handlers at
  the very beginning and, before processing, just wait until the informers
  have synced (the sync in an informer lists all objects and calls the
  OnUpdate event handler).
- Use a queue so that we don't block event callbacks and an object is
  processed only once if it is added multiple times before being processed.
- Fix bug in `serviceUpdate` in endpoints.go, we should build endpoints
  when `exists && err == nil`. Add `^TestEndpointsDiscoveryWithService`
  tests to test this feature.
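
The queue behaviour from the second bullet above (a key added several times
before processing is handled once) can be sketched like this. The
`dedupQueue` type is a hypothetical, single-goroutine illustration and is not
the work queue used by the discovery code; a real one would also need locking.

```go
package main

import "fmt"

// dedupQueue is a minimal FIFO queue that ignores keys already waiting.
// It is not safe for concurrent use.
type dedupQueue struct {
	order   []string
	pending map[string]bool
}

func newDedupQueue() *dedupQueue {
	return &dedupQueue{pending: map[string]bool{}}
}

// Add enqueues key unless it is already waiting to be processed.
func (q *dedupQueue) Add(key string) {
	if q.pending[key] {
		return
	}
	q.pending[key] = true
	q.order = append(q.order, key)
}

// Get removes and returns the oldest key; ok is false when the queue is empty.
func (q *dedupQueue) Get() (key string, ok bool) {
	if len(q.order) == 0 {
		return "", false
	}
	key = q.order[0]
	q.order = q.order[1:]
	delete(q.pending, key)
	return key, true
}

func main() {
	q := newDedupQueue()
	// Event callbacks only enqueue keys and return immediately.
	q.Add("default/pod-a")
	q.Add("default/pod-a")
	q.Add("default/pod-b")
	for {
		key, ok := q.Get()
		if !ok {
			break
		}
		fmt.Println("processing", key)
	}
}
```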

Testing:

- Use `k8s.io/client-go` testing framework and fake implementations which are
  more robust and reliable for testing.
- `Test\w+DiscoveryBeforeRun` are used to test objects created before
  discoverer runs
- `Test\w+DiscoveryAdd\w+` are used to test adding objects
- `Test\w+DiscoveryDelete\w+` are used to test deleting objects
- `Test\w+DiscoveryUpdate\w+` are used to test updating objects
- `TestEndpointsDiscoveryWithService\w+` are used to test endpoints
  events triggered by services
- `cache.DeletedFinalStateUnknown`-related code is removed, because
  we don't care about deleted objects in the store; we only need the
  object's name to send a special `targetgroup.Group` to the scrape manager

Signed-off-by: Yecheng Fu <cofyc.jackson@gmail.com>
2018-04-25 19:28:34 +02:00
Adam Shannon 809881d7f5 support reading basic_auth password_file for HTTP basic auth (#4077)
Issue: https://github.com/prometheus/prometheus/issues/4076

Signed-off-by: Adam Shannon <adamkshannon@gmail.com>
2018-04-25 18:19:06 +01:00
Rohit Gupta 30c3e02864 Fixes #4090. Marathon service discovery for 5XX http response (#4091)
Signed-off-by: rohit01 <hello@rohit.io>
2018-04-17 09:28:06 +01:00
sev3ryn cc917aee7f fix of endless loop while doing Consul service discovery. (#4044)
Reloading Prometheus configs doesn't make loop end.
It produced a goroutine leak
2018-04-05 10:41:09 +01:00
Philippe Laflamme 2aba238f31 Use common HTTPClientConfig for marathon_sd configuration (#4009)
This adds support for basic authentication which closes #3090

The support for specifying the client timeout was removed as discussed in https://github.com/prometheus/common/pull/123. Marathon was the only sd mechanism doing this and configuring the timeout is done through `Context`.

DC/OS uses a custom `Authorization` header for authenticating. This adds 2 new configuration properties to reflect this.

Existing configuration files that use the bearer token will no longer work. More work is required to make this backwards compatible.
2018-04-05 09:08:18 +01:00
Manos Fokas 25f929b772 Yaml UnmarshalStrict implementation. (#4033)
* Updated yaml vendor package.

* remove checkOverflow duplicate in rulefmt

* remove duplicated HTTPClientConfig.Validate()

* Added yaml static check.
2018-04-04 09:07:39 +01:00
albatross0 0245fd55bf Add a machine type label to GCE SD (#4032) 2018-03-31 09:20:19 +01:00
Kristiyan Nikolov be85ba3842 discovery/ec2: Support filtering instances in discovery (#4011) 2018-03-31 07:51:11 +01:00
Corentin Chary 60dafd425c consul: improve consul service discovery (#3814)
* consul: improve consul service discovery

Related to #3711

- Add the ability to filter by tag and node-meta in an efficient way (`/catalog/services`
  allows filtering by node-meta, and returns a `map[string]string` or `service`->`tags`).
  Tags and node-meta are also used in `/catalog/service` requests.
- Do not require a call to the catalog if services are specified by name. This is important
  because on large cluster `/catalog/services` changes all the time.
- Add `allow_stale` configuration option to do stale reads. Non-stale
  reads can be costly, even more when you are doing them to a remote
  datacenter with 10k+ targets over WAN (which is common for federation).
- Add `refresh_interval` to minimize the strain on the catalog and on the
  service endpoint. This is needed because of that kind of behavior from
  consul: https://github.com/hashicorp/consul/issues/3712 and because a catalog
  on a large cluster would basically change *all* the time. No need to discover
  targets in 1sec if we scrape them every minute.
- Added plenty of unit tests.

Benchmarks
----------

```yaml
scrape_configs:

- job_name: prometheus
  scrape_interval: 60s
  static_configs:
    - targets: ["127.0.0.1:9090"]

- job_name: "observability-by-tag"
  scrape_interval: "60s"
  metrics_path: "/metrics"
  consul_sd_configs:
    - server: consul.service.par.consul.prod.crto.in:8500
      tag: marathon-user-observability  # Used in After
      refresh_interval: 30s             # Used in After+delay
  relabel_configs:
    - source_labels: [__meta_consul_tags]
      regex: ^(.*,)?marathon-user-observability(,.*)?$
      action: keep

- job_name: "observability-by-name"
  scrape_interval: "60s"
  metrics_path: "/metrics"
  consul_sd_configs:
    - server: consul.service.par.consul.prod.crto.in:8500
      services:
        - observability-cerebro
        - observability-portal-web

- job_name: "fake-fake-fake"
  scrape_interval: "15s"
  metrics_path: "/metrics"
  consul_sd_configs:
    - server: consul.service.par.consul.prod.crto.in:8500
      services:
        - fake-fake-fake
```

Note: tested with ~1200 services, ~5000 nodes.

| Resource | Empty | Before | After | After + delay |
| -------- |:-----:|:------:|:-----:|:-------------:|
|/service-discovery size|5K|85MiB|27k|27k|
|`go_memstats_heap_objects`|100k|1M|120k|110k|
|`go_memstats_heap_alloc_bytes`|24MB|150MB|28MB|27MB|
|`rate(go_memstats_alloc_bytes_total[5m])`|0.2MB/s|28MB/s|2MB/s|0.3MB/s|
|`rate(process_cpu_seconds_total[5m])`|0.1%|15%|2%|0.01%|
|`process_open_fds`|16|*1236*|22|22|
|`rate(prometheus_sd_consul_rpc_duration_seconds_count{call="services"}[5m])`|~0|1|1|*0.03*|
|`rate(prometheus_sd_consul_rpc_duration_seconds_count{call="service"}[5m])`|0.1|*80*|0.5|0.5|
|`prometheus_target_sync_length_seconds{quantile="0.9",scrape_job="observability-by-tag"}`|N/A|200ms|0.2ms|0.2ms|
|Network bandwidth|~10kbps|~2.8Mbps|~1.6Mbps|~10kbps|

Filtering by tag using relabel_configs uses **100kiB and 23kiB/s per service per job** and quite a lot of CPU. It also sends an additional *1Mbps* of traffic to consul.
Being a little bit smarter about this reduces the overhead quite a lot.
Limiting the number of `/catalog/services` queries per second almost removes the overhead of service discovery.

* consul: tweak `refresh_interval` behavior

`refresh_interval` now does what is advertised in the documentation:
there won't be more than one update per `refresh_interval`. It now
defaults to 30s (which was also the current waitTime in the consul query).

This also makes sure we don't wait another 30s if we already waited 29s
in the blocking call, by subtracting the number of elapsed seconds.

Hopefully this will do what people expect it does and will be safer
for existing consul infrastructures.
2018-03-23 14:48:43 +00:00