Commit graph

385 commits

Author SHA1 Message Date
Kien Nguyen-Tuan 813b58367a [OpenStack SD] Add ProjectID and UserID meta labels (#5431)
Add extra meta labels, which are useful when
Prometheus discovers instances from all projects.

Signed-off-by: Kien Nguyen <kiennt2609@gmail.com>
2019-04-04 10:02:31 +01:00
Tariq Ibrahim 8fdfa8abea refine error handling in prometheus (#5388)
i) Uses the more idiomatic Wrap and Wrapf methods for creating nested errors.
ii) Fixes some incorrect usages of fmt.Errorf where the error messages don't have any formatting directives.
iii) Does away with the use of fmt package for errors in favour of pkg/errors

Signed-off-by: tariqibrahim <tariq181290@gmail.com>
2019-03-26 00:01:12 +01:00
Simon Pasquier 782d00059a
discovery: factorize for SD based on refresh (#5381)
* discovery: factorize for SD based on refresh

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* discovery: use common metrics for refresh

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-03-25 11:54:22 +01:00
Tariq Ibrahim 0d7104b7eb discovery/azure: optimize iteration logic for VMScalesets, VMScalesetVMs, and VMs (#5363)
Signed-off-by: tariqibrahim <tariq181290@gmail.com>
2019-03-20 09:03:47 +00:00
Tariq Ibrahim 5f933e99d0 discovery/azure: make local virtualMachine struct more generic by removing the go sdk field reference (#5350)
Signed-off-by: tariqibrahim <tariq181290@gmail.com>
2019-03-15 16:18:37 +00:00
Mario Trangoni 5354ffff99 Fix some spelling issues (#5361)
See,
$ codespell -S './vendor/*,./.git*,./web/ui/static/vendor*' --ignore-words-list="uint,dur,ue,iff,te,wan"

Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>
2019-03-14 14:38:54 +00:00
Simon Pasquier 67385f356f
discovery/openstack: pass context to the OpenStack client (#5231)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-03-12 13:53:03 +01:00
Callum Styan 83c46fd549 update Consul vendor code so that catalog.ServiceMultipleTags can be (#5151)
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2019-03-12 10:31:27 +00:00
Tariq Ibrahim 197e5ac597 docs: minor improvements to the service discovery README.md (#5296)
i) Increased the size of the Service Discovery Readme title
ii) Changed `TargetGroups` to "target groups" as it has been relocated and renamed to another package.

Signed-off-by: tariqibrahim <tariq181290@gmail.com>
2019-03-03 19:48:03 +01:00
JoeWrightss e4b88704a6 Fix misspell in manager_test.go (#5279)
Signed-off-by: zhoulin xie <zhoulin.xie@daocloud.io>
2019-02-27 11:22:31 +01:00
Simon Pasquier 1d2fc95b1c
discovery/marathon: pass context to the client (#5232)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-02-26 14:49:16 +01:00
Simon Pasquier e60d314f43
discovery/consul: pass current context to Consul queries (#5230)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-02-26 14:48:19 +01:00
Simon Pasquier 8f578d9c6b
discovery/ec2: pass context to the client (#5234)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-02-26 14:48:03 +01:00
Simon Pasquier 4997dcb4a1
discovery/gce: pass context to the client (#5233)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-02-26 14:47:43 +01:00
Simon Pasquier 9040dddd0c
discovery/azure: pass context to the client (#5255)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-02-26 14:47:26 +01:00
Simon Pasquier fe7a1bcfc6
discovery/triton: pass context to the client (#5235)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-02-26 14:47:04 +01:00
Björn Rabenstein ad29221a7b
Merge pull request #5020 from erikh/upgrade-miekg-dns
Upgrade miekg dns
2019-02-25 12:47:32 +01:00
Simon Pasquier e72c875e63
config: fix Kubernetes config with empty API server (#5256)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-02-22 15:51:47 +01:00
Nguyen Hai Truong aed9ea144a Remove duplicated words in comments
Although these are only spelling mistakes, they might
affect readability.

Co-Authored-By: Kim Bao Long longkb@vn.fujitsu.com
Signed-off-by: Nguyen Hai Truong <truongnh@vn.fujitsu.com>
2019-02-20 17:41:02 -08:00
Simon Pasquier c8a1a5a93c
discovery/kubernetes: fix support for password_file and bearer_token_file (#5211)
* discovery/kubernetes: fix support for password_file

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Create and pass custom RoundTripper to Kubernetes client

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Use inline HTTPClientConfig

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-02-20 11:22:34 +01:00
Erik Hollensbe be3c082539 discovery/dns/dns.go: fix handling of truncated dns records
https://github.com/miekg/dns/pull/815 goes into the detail, but more or
less the existing solution was no longer supported and needed to be
rewritten to support the new versions of the library. miekg additionally
claims this is more correct in the ticket.

Signed-off-by: Erik Hollensbe <github@hollensbe.org>
2019-02-20 00:36:41 +00:00
Simon Pasquier f9462d5d44 discovery/consul: pass current context to Consul queries
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-02-18 14:23:56 +01:00
JoeWrightss 4cb6c202ff Fix fmt.Errorf error message (#5199)
Signed-off-by: zhoulin xie <zhoulin.xie@daocloud.io>
2019-02-10 15:16:20 +05:30
tariqibrahim b173de0c26 fix ineffectual assignment in dns.go
Signed-off-by: tariqibrahim <tariq181290@gmail.com>
2019-01-28 17:15:43 -08:00
Jannick Fahlbusch 63f375e80a [FIX] Azure DS: Return error when request failed (#4719)
This fixes the issue that the error is swallowed when the request failed.

Signed-off-by: Jannick Fahlbusch <git@jf-projects.de>
2019-01-28 21:31:45 +00:00
Tariq Ibrahim f4275d2352 Use the latest versions of azure go sdk and go-autorest (#5015)
Signed-off-by: tariqibrahim <tariq181290@gmail.com>
2019-01-28 18:30:29 +00:00
Tariq Ibrahim bfcdba211f remove the prepended watch reactor from the fake k8s client (#5140)
Signed-off-by: tariqibrahim <tariq.ibrahim@microsoft.com>
2019-01-28 16:42:25 +01:00
Simon Pasquier 68e4c211f2
discovery/azure: more robust handling of go routines (#5106)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-01-18 09:55:47 +01:00
Matt Layher 302148fd69 *: apply gofmt -s
Signed-off-by: Matt Layher <mdlayher@gmail.com>
2019-01-16 17:28:14 -05:00
Simon Pasquier 22a1def98d
Merge pull request #5099 from prometheus/release-2.6
Merge release-2.6 to master
2019-01-16 09:26:00 +01:00
tommarute 9922c35a23 marathon-sd - use Tasks.Ports instead of PortDefinitions.Ports if RequirePorts is false (#5022) (#5026)
Signed-off-by: tommarute <tommarute@gmail.com>
2019-01-14 17:20:22 +00:00
Sylvain Rabot d9f4a8c95f sd: Fix stuck Azure service discovery (#5088)
Signed-off-by: Sylvain Rabot <s.rabot@lectra.com>
2019-01-14 15:09:27 +00:00
Kevin Bulebush 718344434c openstack_sd: Supporting application credential for authentication. (#4968)
* openstack_sd: Support application credentials for authentication.
Updated gophercloud

Signed-off-by: Kevin Bulebush <kmbulebu@gmail.com>
2019-01-09 15:18:58 +00:00
Frederic Branczyk e9ae0b5a1b
Merge pull request #4927 from tariq1890/update_k8s
update client-go to v10.0.0 and other k8s deps to v1.13.1
2019-01-07 10:54:34 +01:00
Fabian Reinartz ca93c8e19b
Merge pull request #4969 from prometheus/azuresubid
Add Azure tenant and subscription ID labels
2019-01-06 12:25:06 +01:00
Simon Pasquier f678e27eb6
*: use latest release of staticcheck (#5057)
* *: use latest release of staticcheck

It also fixes a couple of things in the code flagged by the additional
checks.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Use official release of staticcheck

Also run 'go list' before staticcheck to avoid failures when downloading packages.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-01-04 14:47:38 +01:00
tariqibrahim aa94efe4b5 Merge branch 'master' of https://github.com/prometheus/prometheus into update_k8s
2019-01-03 10:27:12 -08:00
Fabian Reinartz 7a41038695 Add Azure tenant and subscription ID labels
Signed-off-by: Fabian Reinartz <freinartz@google.com>
2019-01-03 13:09:13 +01:00
Lv Jiawei ad22389218 Add ingress in UnmarshalYAML and init (#5035)
Both UnmarshalYAML and init lack the role type ingress.

Signed-off-by: MIBc <lvjiawei@cmss.chinamobile.com>
2018-12-24 09:24:01 +00:00
tariqibrahim 122b47caa0 address review comment in client_metrics
Signed-off-by: tariqibrahim <tariq.ibrahim@microsoft.com>
2018-12-21 00:46:47 -08:00
tariqibrahim 1e4e4c46ba Merge branch 'master' of https://github.com/prometheus/prometheus into update_k8s
Signed-off-by: tariqibrahim <tariq.ibrahim@microsoft.com>
2018-12-20 11:36:54 -08:00
Ilya Gladyshev 922c17e119 added name label to all discovery metrics (#5002)
Signed-off-by: Ilya Gladyshev <ilya.v.gladyshev@gmail.com>
2018-12-20 14:47:29 +00:00
Erik Hollensbe b94eea482c discovery/gce: oauth2.NoContext is deprecated, replace with context.Background() (#5024)
* vendor update
* discovery/gce: oauth2.NoContext is deprecated, replace with context.Background()

Signed-off-by: Erik Hollensbe <github@hollensbe.org>
2018-12-20 14:45:18 +00:00
Marcel D. Juhnke c7d83b2b6a discovery: add support for Managed Identity authentication in Azure SD (#4590)
Signed-off-by: Marcel Juhnke <marrat@marrat.de>
2018-12-19 10:03:33 +00:00
tariqibrahim 0d4b6e4e66 address review comments
Signed-off-by: tariqibrahim <tariq.ibrahim@microsoft.com>
2018-12-18 08:14:30 -08:00
Tariq Ibrahim de6f3b6af7 expose kubernetes service cluster ip (#4940)
Signed-off-by: tariqibrahim <tariq.ibrahim@microsoft.com>
Signed-off-by: tariqibrahim <tariq181290@gmail.com>
2018-12-18 15:17:34 +00:00
JoeWrightss e8be31eed9 Fixes typo: 'possibliy' to 'possibly' (#4974)
Signed-off-by: JoeWrightss <zhoulin.xie@daocloud.io>
2018-12-18 11:52:40 +01:00
Samuel Alfageme 240321acee Add taggedAddress to the labels in ConsulSD (#5001)
Useful when multiple (tagged) addresses for a node are exposed on the catalog API
Ref. https://www.consul.io/api/catalog.html#taggedaddresses

Signed-off-by: Samuel Alfageme <samuel@alfage.me>
2018-12-18 11:51:05 +01:00
Tariq Ibrahim e3bdc463fa Revert "add logic to check if an azure VM is deallocated or not (#4908)" (#4980)
This reverts commit 61cf4365

Signed-off-by: tariqibrahim <tariq.ibrahim@microsoft.com>
2018-12-12 09:27:12 +01:00
tariqibrahim 1fd438ed2b rebase and resolve merge conflicts
Signed-off-by: tariqibrahim <tariq.ibrahim@microsoft.com>
2018-12-06 09:15:34 -08:00
tariqibrahim 412ca33226 update kubernetes deps to v1.13.0
Signed-off-by: tariqibrahim <tariq.ibrahim@microsoft.com>
2018-12-03 19:32:16 -08:00
Julius Volz d28246e337
Fix config loading panics on nil pointer slice elements (#4942)
Fixes https://github.com/prometheus/prometheus/issues/4902
Fixes https://github.com/prometheus/prometheus/issues/4889

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-12-03 18:09:02 +08:00
Simon Pasquier 8b91d39c43
discovery: send empty group on empty SD config (#4819)
* discovery: send empty group on blank SD config

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Update comments

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Add another comment

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-11-30 17:59:57 +01:00
Tariq Ibrahim 61cf4365d6 add logic to check if an azure VM is deallocated or not (#4908)
* add logic to check if an azure VM is deallocated or not
* update documentation  with the new azure power state label

Signed-off-by: tariqibrahim <tariq.ibrahim@microsoft.com>
2018-11-30 11:32:40 +00:00
Serghei Anicheev 8e659a5109 Adding private_dns_name to the list of ec2 labels which can be used i… (#4693)
* Adding private_dns_name to the list of ec2 labels which can be used in node naming for dynamic environments

Signed-off-by: Serghei Anicheev <serghei@rentalcover.com>
2018-11-30 11:11:06 +00:00
mengnan a5d39361ab discovery/azure: Fail hard when Azure authentication parameters are missing (#4907)
* discovery/azure: fail hard when client_id/client_secret is empty

Signed-off-by: mengnan <supernan1994@gmail.com>

* discovery/azure: fail hard when authentication parameters are missing

Signed-off-by: mengnan <supernan1994@gmail.com>

* add unit test

Signed-off-by: mengnan <supernan1994@gmail.com>

* add unit test

Signed-off-by: mengnan <supernan1994@gmail.com>

* format code

Signed-off-by: mengnan <supernan1994@gmail.com>
2018-11-29 16:47:59 +01:00
Ben Kochie c6399296dc
Fix spelling/typos (#4921)
* Fix spelling/typos

Fix spelling/typos reported by codespell/misspell.
* UK -> US spelling changes.

Signed-off-by: Ben Kochie <superq@gmail.com>
2018-11-27 17:44:29 +01:00
Simon Pasquier 0bb810d126
discovery/marathon: fix leaked connections (#4915)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-11-27 14:58:27 +01:00
Timo Beckers bea302e061 marathon-sd - use 'hostPort' member of portMapping to construct target endpoints (#4887)
Fixes #4855 - ServicePort was wrongly used to construct an address to endpoints
defined in portMappings. This was changed to HostPort. Support for obtaining
auto-generated host ports was also added.

Signed-off-by: Timo Beckers <timo@incline.eu>
2018-11-26 13:39:35 +01:00
Daniele Sluijters f25a6baedb remote: Set User-Agent header in requests (#4891)
Currently Prometheus requests show up with a UA of Go-http-client/1.1
which isn't super helpful. Though the X-Prometheus-Remote-* headers
exist they need to be explicitly configured when logging the request in
order to be able to deduce this is a request originating from
Prometheus. By setting the header we remove this ambiguity and make
default server logs just a bit more useful.

This also updates a few other places to consistently capitalize the 'P'
in the user agent, as well as ensure we set a UA to begin with.

Signed-off-by: Daniele Sluijters <daenney@users.noreply.github.com>
2018-11-23 22:49:49 +08:00
Sylvain Rabot 1fd3b33dcd Prevent Azure SD panic (fix #4779) (#4867)
Signed-off-by: Sylvain Rabot <s.rabot@lectra.com>
2018-11-19 12:23:12 +00:00
Bryan Boreham cf37e1feb4 Add __meta_kubernetes_pod_phase label in discovery (#4824)
This lets you add a relabel rule to drop scrapes for pods which are
not running.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2018-11-06 14:40:24 +00:00
Silvio Gissi 6100f160ad EC2 Platform meta label (#4663)
Set __meta_ec2_platform label with the instance platform string. Set to 'windows' on Windows servers and absent otherwise.


Signed-off-by: Silvio Gissi <silvio@gissilabs.com>
2018-11-06 14:39:48 +00:00
Goutham Veeramachaneni f988af7235 Revert #4586 (#4766)
This breaks people if they are depending on the contents of
__address__ label.

Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
2018-10-24 10:16:36 +02:00
Simon Pasquier a30348f1a4 discovery: add config label to discovered targets metric (#4753)
* discovery: add labels to discovered targets metric

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-10-18 16:46:59 +01:00
Simon Pasquier 5824d6902d
openstack: fix client when using env variables (#4734)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-10-17 16:04:07 +02:00
Kien Nguyen-Tuan 9c5370fdfe Support discovering instances from all projects (#4682)
By default, OpenStack SD only queries for instances
from the specified project. To discover instances from other
projects, users have to add another openstack_sd_configs entry
for each project.

This patch adds the `all_tenants` <bool> option to
openstack_sd_configs. For example:

- job_name: 'openstack_all_instances'
  openstack_sd_configs:
    - role: instance
      region: RegionOne
      identity_endpoint: http://<identity_server>/identity/v3
      username: <username>
      password: <super_secret_password>
      domain_name: Default
      all_tenants: true

Co-authored-by: Kien Nguyen <kiennt2609@gmail.com>
Signed-off-by: dmatosl <danielmatos.lima@gmail.com>
2018-10-17 13:01:33 +01:00
Simon Pasquier c4a6acfb1e
*: move to go 1.11 (#4626)
* *: move to go 1.11

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Reduce number of places where we specify the Go version

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-10-16 09:41:45 +02:00
Goutham Veeramachaneni ffb7f829ec
Merge pull request #4730 from prometheus/release-2.4
Release 2.4
2018-10-12 14:15:42 -07:00
Simon Pasquier 3e6b9d43c3
Merge pull request #4720 from teresy/redundant-nil-check-slice
Remove redundant nil check
2018-10-11 10:24:55 +02:00
Rijnard van Tonder 9d102e3bff The nil check before the range loop is redundant
Signed-off-by: Rijnard van Tonder <hi.teresy@gmail.com>
2018-10-10 16:11:45 -04:00
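
A minimal sketch of why such a check is redundant in Go: ranging over a nil slice simply executes zero iterations. The helper `countTargets` is hypothetical and is not taken from the changed code.

```go
package main

import "fmt"

// countTargets is a hypothetical helper. It ranges over the slice without
// a preceding nil check, because a range loop over a nil slice runs zero
// times and never panics.
func countTargets(targets []string) int {
	n := 0
	for range targets {
		n++
	}
	return n
}

func main() {
	var none []string // nil slice
	fmt.Println(countTargets(none))
	fmt.Println(countTargets([]string{"a", "b"}))
}
```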
Richard Kiene b537f6047a Add ability to filter triton_sd targets by pre-defined groups (#4701)
Additionally, add triton groups metadata to the discovery response
and correct a documentation error regarding the triton server id
metadata.

Signed-off-by: Richard Kiene <richard.kiene@joyent.com>
2018-10-10 10:03:34 +01:00
Simon Pasquier a2a78d0a09 discovery/openstack: discover all interfaces (#4649)
* discovery/openstack: discover all interfaces
* Add address pool label

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-10-09 16:17:08 +01:00
Simon Pasquier e1e2821cca
Merge pull request #4654 from simonpasquier/openstack-tls
discovery/openstack: support tls_config
2018-10-05 18:11:55 +02:00
Jannick Fahlbusch f78e59577b [FIX] EC2 DS: Check for existence of OwnerID (#4672)
Commit 1c89984 introduced the ability to expose the owner of the instance.
However, this breaks Prometheus if there is no OwnerID in the reservation (e.g. if you are using a private EC2 API, as introduced by #4333)

Signed-off-by: Jannick Fahlbusch <git@jf-projects.de>
2018-10-02 16:18:31 +05:30
Simon Pasquier 657199af22 Address Krasi comments
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-28 12:29:24 +02:00
Simon Pasquier 5df757fdd4 zookeeper: fix panic
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-28 11:39:40 +02:00
Simon Pasquier 365931ea83 discovery: add metrics + send updates from one goroutine only
The added metrics are:

* prometheus_sd_discovered_targets
* prometheus_sd_received_updates_total
* prometheus_sd_updates_delayed_total
* prometheus_sd_updates_total

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-27 15:59:42 +02:00
Simon Pasquier f2d43af820
Merge pull request #4582 from simonpasquier/add-discovery-tests
discovery: add more tests
2018-09-27 15:18:42 +02:00
Simon Pasquier ff08c40091 discovery/openstack: support tls_config
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-25 14:31:32 +02:00
Frederic Branczyk b75ec7e6ef
Merge pull request #4458 from FUSAKLA/k8s-sd-add-metrics
feat: added more k8s SD metrics
2018-09-21 13:10:48 +02:00
Timo Beckers 1c9fbd65c4 marathon-sd - change port gathering strategy, support for container networking (#4499)
* marathon-sd - change port gathering strategy, add support for container networking

- removed unnecessary error check on HTTPClientConfig.Validate()
- renamed PortDefinitions and PortMappings to PortDefinition and PortMapping respectively
- extended data model for extra parsed fields from Marathon json
- support container networking on Marathon 1.5+ (target Task.IPAddresses.x.Address)
- expanded test suite to cover all new cases
- test: cancel context when reading from doneCh before returning from function
- test: split test suite into Ports/PortMappings/PortDefinitions

Signed-off-by: Timo Beckers <timo@incline.eu>
2018-09-21 11:53:04 +01:00
Martin Chodur f2d037133e
feat: added more k8s SD metrics
Signed-off-by: Martin Chodur <m.chodur@seznam.cz>
2018-09-20 22:28:51 +02:00
Camille Janicki b035ea0ea9 Change discovery subpackages to not use testify in tests (#4612)
* Change discovery subpackages to not use testify in tests

Signed-off-by: Camille Janicki <camille.janicki@gmail.com>

* Remove testify suite from vendor dir

Signed-off-by: Camille Janicki <camille.janicki@gmail.com>
2018-09-18 17:35:22 +02:00
Simon Pasquier 128ff546b8 config: add test for OpenStack SD (#4594)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-13 21:44:27 +05:30
Tom Wilkie e3d36f4802 Don't import testing from non-test code. (#4595)
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-09-13 16:03:26 +05:30
Bryan Boreham 968f657eaa Stop removing the final dot from rooted DNS names (#4586)
Removing a final dot changes the meaning of the name and can cause
extra DNS lookups as the resolver traverses its search path.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2018-09-13 15:28:38 +05:30
Simon Pasquier e7cee1b5ba Remove tests redundant with TestTargetUpdatesOrder
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-12 17:56:53 +02:00
Simon Pasquier 7dc3f11306 WIP discovery: refactor TestTargetUpdatesOrder
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-12 16:15:03 +02:00
Simon Pasquier 8fd891bf3f Speed up tests that were still using the 5s timeout
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-12 16:13:15 +02:00
Simon Pasquier 8289501420 Address krasi's comments
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-12 16:13:15 +02:00
Simon Pasquier 1cee5b5b06 Don't multiply the interval value by 1ms in the mock
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-12 16:13:15 +02:00
Simon Pasquier 4900405d2f Refactor TestCoordinationWithReceiver() to work with any Discoverer
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-12 16:13:15 +02:00
Simon Pasquier 0798f14e02 Add TestCoordinationWithEmptyProvider
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-12 16:13:15 +02:00
Simon Pasquier 48989d8996 discovery: add more tests
Co-authored-by: Camille Janicki <camille.janicki@gmail.com>
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-12 16:13:15 +02:00
Krasi Georgiev ba7eb733e8 tidy up the discovery logs, updating loops and selects (#4556)
* tidy up the discovery logs, updating loops and selects

renamed a few objects

removed a very noisy debug log on the k8s discovery. It would be useful
to show some summary rather than every update as this is impossible to
follow.

added most comments as debug logs so each block becomes self
explanatory.

when the discovery receiving channel is full, the send is retried on the
next cycle.

Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>

* add noop logger for the SD manager tests.

Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>

* spelling nits

Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
2018-09-05 17:02:47 +05:30
Tariq Ibrahim f708fd5c99 Adding support for multiple azure environments (#4569)
Signed-off-by: Tariq Ibrahim <tariq.ibrahim@microsoft.com>
2018-09-04 17:55:40 +02:00
Simon Pasquier 674c76adb8 discovery: coalesce identical SD configurations (#3912)
* discovery: coalesce identical SD configurations

Instead of creating as many SD providers as declared in the
configuration, the discovery manager merges identical configurations
into the same provider and keeps track of the subscribers. When
the manager receives target updates from a SD provider, it will
broadcast the updates to all interested subscribers.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-01 08:51:31 +01:00
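
The coalescing described in the commit above can be sketched roughly as follows. This is an illustration under stated assumptions, not the discovery manager's actual code: the types `provider` and `jobConfig`, the function `coalesce`, and the string form of a configuration are all hypothetical.

```go
package main

import "fmt"

// provider is a hypothetical stand-in for one running SD provider.
type provider struct {
	config string   // canonical form of the SD configuration (assumed)
	subs   []string // scrape jobs subscribed to this provider's updates
}

// jobConfig pairs a scrape job name with its SD configuration.
type jobConfig struct{ job, config string }

// coalesce creates one provider per distinct configuration and records
// every job that uses it as a subscriber, preserving first-seen order.
func coalesce(jobs []jobConfig) []*provider {
	byConfig := make(map[string]*provider)
	var providers []*provider
	for _, j := range jobs {
		p, ok := byConfig[j.config]
		if !ok {
			p = &provider{config: j.config}
			byConfig[j.config] = p
			providers = append(providers, p)
		}
		p.subs = append(p.subs, j.job)
	}
	return providers
}

func main() {
	ps := coalesce([]jobConfig{
		{"job-a", "dns:example.org"},
		{"job-b", "dns:example.org"},
		{"job-c", "file:targets.json"},
	})
	for _, p := range ps {
		fmt.Println(p.config, p.subs)
	}
}
```

With this shape, an update received from one provider can be broadcast to every name in `subs` instead of running a duplicate provider per job.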
Krasi Georgiev 53691ae261 Simplify SD update throttling (#4523)
* simplified SD update throttling

Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>

* add default to catch cases when we don't have new updates.

Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
2018-08-27 17:12:11 +02:00
Fabian Reinartz f571b69010
Merge pull request #4514 from jkohen/ec2-targets
Expose EC2 instance owner as a discovery label.
2018-08-20 08:43:44 +02:00
Javier Kohen 1c89984778 Expose EC2 instance owner as a discovery label.
This exposes the OwnerID field of the DescribeInstances response as .

Signed-off-by: Javier Kohen <jkohen@google.com>
2018-08-17 11:30:18 -04:00
Yecheng Fu d4eae8cc0c Wait until all internal discoveries are done before exiting. (#4508)
Signed-off-by: Yecheng Fu <cofyc.jackson@gmail.com>
2018-08-17 18:50:22 +05:30
Fabian Reinartz b04ab71268
Merge pull request #4488 from jkohen/patch-3
Populate __meta_gce_instance_id discovery label
2018-08-11 09:52:28 +02:00
Javier Kohen 403ac08ece Expose __meta_gce_instance_id as an integer (instead of raw bytes).
Signed-off-by: Javier Kohen <jkohen@google.com>
2018-08-10 16:21:46 -04:00
Javier Kohen 7e9549b398 Added __meta_gce_instance_id discovery label
Populated from instance.ID. I will follow up with a change to the documentation.

Signed-off-by: Javier Kohen <jkohen@google.com>
2018-08-10 11:57:55 -04:00
Simon Pasquier b7054f3a78
Merge pull request #4443 from simonpasquier/fix-consul-connections-leak
discovery/consul: close idle connections on stop
2018-08-10 17:43:39 +02:00
Benji Visser 46fb4078a6 handle nil pointer in ec2 discovery (#4469)
This handles a nil pointer that was being accessed in EC2 discovery.

Fixes: #4441

Signed-off-by: noqcks <benny@noqcks.io>
2018-08-07 08:35:22 +01:00
Johannes Scheuermann f978f5bba3 Fixes #4202, correctly parse VMs with empty tags (#4450)
Signed-off-by: Johannes M. Scheuermann <joh.scheuer@gmail.com>
2018-08-02 10:10:17 +01:00
jojohappy e060f7755f Keep the comment on NodeLegacyHostIP for k8s node address
Signed-off-by: jojohappy <sarahdj0917@gmail.com>
2018-08-02 10:25:28 +08:00
jojohappy e81785d1a3 Keep the deprecated k8s node NodeLegacyHostIP as a local constant for compatibility with older k8s versions
Signed-off-by: jojohappy <sarahdj0917@gmail.com>
2018-08-02 10:25:28 +08:00
jojohappy 21e50a3f9d Upgrade k8s client to kubernetes-1.11.0
Signed-off-by: jojohappy <sarahdj0917@gmail.com>
2018-08-02 10:25:27 +08:00
Simon Pasquier 1cd29f782c discovery/consul: close idle connections on stop
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-08-01 17:26:52 +02:00
Johannes Scheuermann 7608ee87d0 Initial support for Azure VMSS (#4202)
* Initial support for Azure VMSS

Signed-off-by: Johannes Scheuermann <johannes.scheuermann@inovex.de>

* Add documentation for the newly introduced label

Signed-off-by: Johannes M. Scheuermann <joh.scheuer@gmail.com>
2018-08-01 12:52:21 +01:00
José Martínez 791c13b142 discovery/ec2: Add primary_subnet_id label
Signed-off-by: José Martínez <xosemp@gmail.com>
2018-07-25 09:20:58 +01:00
José Martínez 5e4a33c890 discovery/ec2: Maintain order of subnet_id label
Signed-off-by: José Martínez <xosemp@gmail.com>
2018-07-25 09:20:58 +01:00
Jannick Fahlbusch 0be25f92e2 EC2 Discovery: Allow to set a custom endpoint (#4333)
Allowing a custom endpoint to be set makes it easy to monitor targets on non-AWS providers with EC2-compliant APIs.

Signed-off-by: Jannick Fahlbusch <git@jf-projects.de>
2018-07-18 10:48:14 +01:00
Ivan Voronchihin 59d214d277 Update autorest vendoring (#4147)
Signed-off-by: bege13mot <bege13mot@gmail.com>
2018-07-18 05:24:15 +01:00
Julius Volz 219e477272 Fix some (valid) lint errors (#4287)
Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-07-18 05:07:33 +01:00
Romain Baugue b41be4ef52 Discovery consul service meta (#4280)
* Upgrade Consul client
* Add ServiceMeta to the labels in ConsulSD

Signed-off-by: Romain Baugue <romain.baugue@elwinar.com>
2018-07-18 05:06:56 +01:00
Simon Pasquier f32acc0b7b discovery/openstack: remove unneeded assignment
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-07-15 12:37:57 +01:00
Julius Volz 05d6d6a2e5
k8s SD: Fix "schema" -> "scheme" typo (#4371)
Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-07-12 16:12:32 +02:00
Krasi Georgiev a155b6d29d fix the zookeeper race (#4355)
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
2018-07-06 08:39:38 +01:00
Dmitry Bashkatov 72327d98fb discovery/kubernetes/ingress: remove unnecessary check
Signed-off-by: Dmitry Bashkatov <dbashkatov@gmail.com>
2018-07-04 15:47:11 +03:00
Dmitry Bashkatov e2baf89eac discovery/kubernetes/ingress: fix scheme discovery (Closes #4327)
Signed-off-by: Dmitry Bashkatov <dbashkatov@gmail.com>
2018-07-04 13:28:44 +03:00
Dmitry Bashkatov 9cdca50bdd discovery/kubernetes/ingress: add more tests
Signed-off-by: Dmitry Bashkatov <dbashkatov@gmail.com>
2018-07-04 13:28:44 +03:00
Julius Volz 5cf0113762
Add "omitempty" to some SD config YAML field tags (#4338)
Especially for Kubernetes SD, this fixes a bug where the rendered
configuration says "api_server: null", which when read back is not
interpreted as an un-set API server (thus the default is not applied).

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-07-03 13:43:41 +02:00
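
The effect of "omitempty" on an unset pointer field can be sketched as follows. The commit concerns YAML field tags; this sketch uses the standard library's encoding/json as an analog, since it treats nil pointers the same way for this purpose and needs no third-party package. The struct names and the `render` helper are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// withoutOmit renders an unset API server as an explicit null.
type withoutOmit struct {
	APIServer *string `json:"api_server"`
}

// withOmit drops the field entirely when the pointer is nil.
type withOmit struct {
	APIServer *string `json:"api_server,omitempty"`
}

// render is a hypothetical helper that marshals a value to a string.
func render(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	fmt.Println(render(withoutOmit{})) // {"api_server":null}
	fmt.Println(render(withOmit{}))    // {}
}
```

Omitting the field means a rendered configuration that is read back looks the same as one where the field was never set, so the default can apply.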
Simon Pasquier 6eab4bbca1 kubernetes_sd: fix namespace filtering (#4273)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-06-15 09:08:14 +01:00
Paul Gier d24d2acd11 config: set target group source index during unmarshalling (#4245)
* config: set target group source index during unmarshalling

Fixes issue #4214 where the scrape pool is unnecessarily reloaded for a
config reload where the config hasn't changed.  Previously, the discovery
manager changed the static config after loading which caused the in-memory
config to differ from a freshly reloaded config.

Signed-off-by: Paul Gier <pgier@redhat.com>

* [issue #4214] Test that static targets are not modified by discovery manager

Signed-off-by: Paul Gier <pgier@redhat.com>
2018-06-13 16:34:59 +01:00
Simon Pasquier 0e5e7f75cd discovery/file: fix logging (#4178)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-06-12 12:45:59 +01:00
Callum Styan 03578d5df8 add example usage of SD adapter for converting unsupported SD type to filesd (#3720)
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2018-05-30 13:14:34 +01:00
Adam Shannon a22e1736b9 discovery/marathon: include url in fetchApps error (#4171)
This was previously part of a larger PR, but that was closed.

https://github.com/prometheus/prometheus/issues/4048#issuecomment-389899997

This change could include auth information in the URL. That's been
fixed in upstream go, but not until Go 1.11. See: https://github.com/golang/go/issues/24572

Signed-off-by: Adam Shannon <adamkshannon@gmail.com>
2018-05-18 10:20:14 +01:00
Damien Lespiau e64037053d Expose controller kind and name to labelling rules
Relabelling rules can use this information to attach the name of the controller
that has created a pod.

In turn, this can be used to slice metrics by workload at query time, ie.
"Give me all metrics that have been created by the $name Deployment"

Signed-off-by: Damien Lespiau <damien@weave.works>
2018-05-09 11:51:37 +02:00
Nathan Graves 5b27996cb3 Include GCE labels during service discovery. Updated vendor files for Google API. (#4150)
Signed-off-by: Nathan Graves <nathan.graves@kofile.us>
2018-05-08 17:37:47 +01:00
beorn7 a4e4bec3fe Merge branch 'release-2.2'
2018-04-30 14:38:29 +02:00
Elif T. Kuş 57dcdfb15f Rewrote tests with testutil for several test files (#4086)
* promql: Rewrote tests with testutil for functions_test

Signed-off-by: Elif T. Kuş <elifkus@gmail.com>

* pkg/relabel: Rewrote tests with testutil for relabel_test

Signed-off-by: Elif T. Kuş <elifkus@gmail.com>

* discovery/consul: Rewrote tests with testutil for consul_test

Signed-off-by: Elif T. Kuş <elifkus@gmail.com>

* scrape: Rewrote tests with testutil for manager_test

Signed-off-by: Elif T. Kuş <elifkus@gmail.com>
2018-04-27 13:11:16 +01:00
Yecheng Fu 2be543e65a Simplify some code and comments.
Signed-off-by: Yecheng Fu <cofyc.jackson@gmail.com>
2018-04-25 19:29:34 +02:00
Yecheng Fu 46683dd67d Simplify code.
- Unified `send` function.
- Pass InformerSynced functions to `cache.WaitForCacheSync`.
- Use `Role\w+` constants instead of literal string.

Signed-off-by: Yecheng Fu <cofyc.jackson@gmail.com>
2018-04-25 19:29:21 +02:00
Yecheng Fu 3a253f796c Fix grammar in comments and add missing expectedMaxItems to let it
break fast.

Signed-off-by: Yecheng Fu <cofyc.jackson@gmail.com>
2018-04-25 19:29:03 +02:00
Yecheng Fu d73b0d3141 Move hasSynced interface and its implementations to *_test.go files.
Signed-off-by: Yecheng Fu <cofyc.jackson@gmail.com>
2018-04-25 19:28:49 +02:00
Yecheng Fu 8ceb8f2ae8 Refactor Kubernetes Discovery Part 2: Refactoring
- Doing the initial listing and syncing to the scrape manager first, and only
  then registering event handlers, may lose events that happen during listing
  and syncing (if it lasts a long time). We should register event handlers at
  the very beginning, then wait until the informers have synced before
  processing (the sync in an informer lists all objects and calls the OnUpdate
  event handler).
- Use a queue so that we don't block event callbacks, and so that an object is
  processed only once if it is added multiple times before being processed.
- Fix bug in `serviceUpdate` in endpoints.go, we should build endpoints
  when `exists && err == nil`. Add `^TestEndpointsDiscoveryWithService`
  tests to test this feature.
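
The queueing behaviour described above can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the code from this commit: `dedupQueue` and its methods are hypothetical names, the keys are made up, and the sketch is not safe for concurrent use (a real implementation would need locking or an existing work queue).

```go
package main

import "fmt"

// dedupQueue is a hypothetical, minimal FIFO of object keys that ignores a
// key which is already waiting to be processed.
type dedupQueue struct {
	pending map[string]bool
	order   []string
}

func newDedupQueue() *dedupQueue {
	return &dedupQueue{pending: make(map[string]bool)}
}

// Add enqueues key unless it is already pending.
func (q *dedupQueue) Add(key string) {
	if q.pending[key] {
		return
	}
	q.pending[key] = true
	q.order = append(q.order, key)
}

// Get pops the oldest pending key; ok is false when the queue is empty.
func (q *dedupQueue) Get() (key string, ok bool) {
	if len(q.order) == 0 {
		return "", false
	}
	key = q.order[0]
	q.order = q.order[1:]
	delete(q.pending, key)
	return key, true
}

func main() {
	q := newDedupQueue()
	// Event callbacks only enqueue a key, so they return immediately.
	q.Add("default/pod-a")
	q.Add("default/pod-a") // added again before processing: ignored
	q.Add("default/pod-b")
	for {
		key, ok := q.Get()
		if !ok {
			break
		}
		fmt.Println("processing", key)
	}
}
```

Because a key is removed from `pending` when it is popped, an object that changes again after being processed is simply enqueued again.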

Testing:

- Use `k8s.io/client-go` testing framework and fake implementations which are
  more robust and reliable for testing.
- `Test\w+DiscoveryBeforeRun` are used to test objects created before
  discoverer runs
- `Test\w+DiscoveryAdd\w+` are used to test adding objects
- `Test\w+DiscoveryDelete\w+` are used to test deleting objects
- `Test\w+DiscoveryUpdate\w+` are used to test updating objects
- `TestEndpointsDiscoveryWithService\w+` are used to test endpoints
  events triggered by services
- Code related to `cache.DeletedFinalStateUnknown` is removed, because
  we don't care about deleted objects in the store; we only need the name to
  send a special `targetgroup.Group` to the scrape manager

Signed-off-by: Yecheng Fu <cofyc.jackson@gmail.com>
2018-04-25 19:28:34 +02:00
Adam Shannon 809881d7f5 support reading basic_auth password_file for HTTP basic auth (#4077)
Issue: https://github.com/prometheus/prometheus/issues/4076

Signed-off-by: Adam Shannon <adamkshannon@gmail.com>
2018-04-25 18:19:06 +01:00
Rohit Gupta 30c3e02864 Fixes #4090. Marathon service discovery for 5XX http response (#4091)
Signed-off-by: rohit01 <hello@rohit.io>
2018-04-17 09:28:06 +01:00
sev3ryn cc917aee7f fix of endless loop while doing Consul service discovery. (#4044)
Reloading Prometheus configs doesn't make loop end.
It produced a goroutine leak
2018-04-05 10:41:09 +01:00
Philippe Laflamme 2aba238f31 Use common HTTPClientConfig for marathon_sd configuration (#4009)
This adds support for basic authentication which closes #3090

The support for specifying the client timeout was removed as discussed in https://github.com/prometheus/common/pull/123. Marathon was the only sd mechanism doing this and configuring the timeout is done through `Context`.

DC/OS uses a custom `Authorization` header for authenticating. This adds 2 new configuration properties to reflect this.

Existing configuration files that use the bearer token will no longer work. More work is required to make this backwards compatible.
2018-04-05 09:08:18 +01:00
Manos Fokas 25f929b772 Yaml UnmarshalStrict implementation. (#4033)
* Updated yaml vendor package.

* remove checkOverflow duplicate in rulefmt

* remove duplicated HTTPClientConfig.Validate()

* Added yaml static check.
2018-04-04 09:07:39 +01:00
albatross0 0245fd55bf Add a machine type label to GCE SD (#4032) 2018-03-31 09:20:19 +01:00
Kristiyan Nikolov be85ba3842 discovery/ec2: Support filtering instances in discovery (#4011) 2018-03-31 07:51:11 +01:00
Corentin Chary 60dafd425c consul: improve consul service discovery (#3814)
* consul: improve consul service discovery

Related to #3711

- Add the ability to filter by tag and node-meta in an efficient way (`/catalog/services`
  allows filtering by node-meta, and returns a `map[string]string` of `service`->`tags`).
  Tags and node-meta are also used in `/catalog/service` requests.
- Do not require a call to the catalog if services are specified by name. This is important
  because on large clusters `/catalog/services` changes all the time.
- Add `allow_stale` configuration option to do stale reads. Non-stale
  reads can be costly, even more when you are doing them to a remote
  datacenter with 10k+ targets over WAN (which is common for federation).
- Add `refresh_interval` to minimize the strain on the catalog and on the
  service endpoint. This is needed because of that kind of behavior from
  consul: https://github.com/hashicorp/consul/issues/3712 and because a catalog
  on a large cluster would basically change *all* the time. No need to discover
  targets in 1sec if we scrape them every minute.
- Added plenty of unit tests.
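
To illustrate the tag filtering idea from the list above, here is a small sketch that selects service names from a `service`->`tags` map before any per-service request is made. It is not the code from this commit: `servicesWithTag` is a hypothetical helper, the map is assumed to hold a list of tags per service, and the extra tags in the sample data are made up.

```go
package main

import (
	"fmt"
	"sort"
)

// servicesWithTag returns the sorted names of all services whose tag list
// contains tag. Services without the tag never need a per-service query.
func servicesWithTag(catalog map[string][]string, tag string) []string {
	var names []string
	for name, tags := range catalog {
		for _, t := range tags {
			if t == tag {
				names = append(names, name)
				break
			}
		}
	}
	sort.Strings(names)
	return names
}

func main() {
	// Sample data for illustration only.
	catalog := map[string][]string{
		"observability-cerebro":    {"marathon-user-observability", "http"},
		"observability-portal-web": {"marathon-user-observability"},
		"fake-fake-fake":           {"other"},
	}
	fmt.Println(servicesWithTag(catalog, "marathon-user-observability"))
}
```

Filtering at this stage, rather than in `relabel_configs`, means targets for unrelated services are never fetched in the first place.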

Benchmarks
----------

```yaml
scrape_configs:

- job_name: prometheus
  scrape_interval: 60s
  static_configs:
    - targets: ["127.0.0.1:9090"]

- job_name: "observability-by-tag"
  scrape_interval: "60s"
  metrics_path: "/metrics"
  consul_sd_configs:
    - server: consul.service.par.consul.prod.crto.in:8500
      tag: marathon-user-observability  # Used in After
      refresh_interval: 30s             # Used in After+delay
  relabel_configs:
    - source_labels: [__meta_consul_tags]
      regex: ^(.*,)?marathon-user-observability(,.*)?$
      action: keep

- job_name: "observability-by-name"
  scrape_interval: "60s"
  metrics_path: "/metrics"
  consul_sd_configs:
    - server: consul.service.par.consul.prod.crto.in:8500
      services:
        - observability-cerebro
        - observability-portal-web

- job_name: "fake-fake-fake"
  scrape_interval: "15s"
  metrics_path: "/metrics"
  consul_sd_configs:
    - server: consul.service.par.consul.prod.crto.in:8500
      services:
        - fake-fake-fake
```

Note: tested with ~1200 services, ~5000 nodes.

| Resource | Empty | Before | After | After + delay |
| -------- |:-----:|:------:|:-----:|:-------------:|
|/service-discovery size|5K|85MiB|27k|27k|
|`go_memstats_heap_objects`|100k|1M|120k|110k|
|`go_memstats_heap_alloc_bytes`|24MB|150MB|28MB|27MB|
|`rate(go_memstats_alloc_bytes_total[5m])`|0.2MB/s|28MB/s|2MB/s|0.3MB/s|
|`rate(process_cpu_seconds_total[5m])`|0.1%|15%|2%|0.01%|
|`process_open_fds`|16|*1236*|22|22|
|`rate(prometheus_sd_consul_rpc_duration_seconds_count{call="services"}[5m])`|~0|1|1|*0.03*|
|`rate(prometheus_sd_consul_rpc_duration_seconds_count{call="service"}[5m])`|0.1|*80*|0.5|0.5|
|`prometheus_target_sync_length_seconds{quantile="0.9",scrape_job="observability-by-tag"}`|N/A|200ms|0.2ms|0.2ms|
|Network bandwidth|~10kbps|~2.8Mbps|~1.6Mbps|~10kbps|

Filtering by tag using relabel_configs uses **100kiB and 23kiB/s per service per job** and quite a lot of CPU. It also sends an additional *1Mbps* of traffic to consul.
Being a little bit smarter about this reduces the overhead quite a lot.
Limiting the number of `/catalog/services` queries per second almost removes the overhead of service discovery.

* consul: tweak `refresh_interval` behavior

`refresh_interval` now does what is advertised in the documentation:
there won't be more than one update per `refresh_interval`. It now
defaults to 30s (which was also the current waitTime in the consul query).

This also makes sure we don't wait another 30s if we already waited 29s
in the blocking call, by subtracting the number of elapsed seconds.

Hopefully this will do what people expect it does and will be safer
for existing consul infrastructures.
2018-03-23 14:48:43 +00:00
Ben Kochie 0d9fe18f5e Fix nil context staticcheck error. 2018-03-22 07:59:39 +00:00
Aaron Kirkbride c47fbcb626 Fix moved fsnotify dependency (#3995) 2018-03-21 15:46:31 +00:00