prometheus

mirror of https://github.com/prometheus/prometheus.git synced 2024-11-15 10:04:07 -08:00

Author	SHA1	Message	Date
Bryan Boreham	b87b88ddc2	Merge branch 'main' into consul-catalog-filter-support Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2024-10-08 12:20:31 +01:00
TJ Hoplock	6ebfbd2d54	chore!: adopt log/slog, remove go-kit/log For: #14355 This commit updates Prometheus to adopt stdlib's log/slog package in favor of go-kit/log. As part of converting to use slog, several other related changes are required to get prometheus working, including: - removed unused logging util func `RateLimit()` - forward ported the util/logging/Deduper logging by implementing a small custom slog.Handler that does the deduping before chaining log calls to the underlying real slog.Logger - move some of the json file logging functionality to use prom/common package functionality - refactored some of the new json file logging for scraping - changes to promql.QueryLogger interface to swap out logging methods for relevant slog sugar wrappers - updated lots of tests that used/replicated custom logging functionality, attempting to keep the logical goal of the tests consistent after the transition - added a healthy amount of `if logger == nil { $makeLogger }` type conditional checks amongst various functions where none were provided -- old code that used the go-kit/log.Logger interface had several places where there were nil references when trying to use functions like `With()` to add keyvals on the new *slog.Logger type Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>	2024-10-07 15:58:50 -04:00
Matthieu MOREL	ab64966e9d	fix: use "ErrorContains" or "EqualError" instead of "Contains(t, err.Error()" and "Equal(t, err.Error()" (#15094 ) * fix: use "ErrorContains" or "EqualError" instead of "Contains(t, err.Error()" and "Equal(t, err.Error()" --------- Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com> Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com> Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>	2024-10-06 16:35:29 +00:00
Daniel Kimsey	aa3e58358b	consul: Add support for catalog list services filter This adds support for Consul's Catalog [List Services][^1] API's `filter` parameter added in 1.14.x. This parameter grants the operator more flexibility to do server-side filtering of the Catalog, before Prometheus subscribes for updates. Operators can use this to improve both the performance of Prometheus's Consul SD and reduce the impact of enumerating large catalogs. [^1]: https://developer.hashicorp.com/consul/api-docs/v1.14.x/catalog Signed-off-by: Daniel Kimsey <dekimsey@protonmail.com>	2024-03-17 20:32:54 -05:00
Bryan Boreham	b17f88b7fb	consul sd tests: don't call FailNow from a background goroutine This is not allowed by the Go test framework. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2024-02-01 15:15:48 +00:00
Bryan Boreham	46008fdecd	lint Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2024-02-01 14:57:11 +00:00
Paweł Szulik	7f24efccdb	Refactor discovery tests to use testify. Signed-off-by: Paweł Szulik <paul.szulik@gmail.com>	2024-01-31 16:42:11 +00:00
Paulin Todev	78411d5e8b	SD Managers taking over responsibility for registration of debug metrics (#13375 ) SD Managers take over responsibility for SD metrics registration --------- Signed-off-by: Paulin Todev <paulin.todev@gmail.com> Signed-off-by: Björn Rabenstein <github@rabenste.in> Co-authored-by: Björn Rabenstein <github@rabenste.in>	2024-01-23 16:53:55 +01:00
Paulin Todev	6de80d7fb0	Allow non-default registry to be used for metrics of SD components Signed-off-by: Paulin Todev <paulin.todev@gmail.com>	2023-12-11 11:14:26 +00:00
Matthieu MOREL	9c4782f1cc	golangci-lint: enable testifylint linter (#13254 ) Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2023-12-07 11:35:01 +00:00
Fish-pro	43d77f7c41	Use http constants instead of string Signed-off-by: Fish-pro <zechun.chen@daocloud.io>	2023-02-10 10:21:05 +08:00
Julien Pivotto	98039cddfa	Update Prometheus common (#10492 ) * Update Prometheus common - Oauth2 supports proxy URL - HTTP2 can be disabled Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2022-03-30 12:49:03 +02:00
Mateusz Gozdek	0bfef847b0	discovery/consul: fix leaking goroutine from test Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>	2021-11-10 09:40:43 +01:00
Mateusz Gozdek	1a6c2283a3	Format Go source files using 'gofumpt -w -s -extra' Part of #9557 Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>	2021-11-02 19:52:34 +01:00
Levi Harrison	faed8df31d	Enable reading consul token from file (#8926 ) * Adopted common http client Signed-off-by: Levi Harrison <git@leviharrison.dev>	2021-06-12 00:06:59 +02:00
Levi Harrison	b5f6f8fb36	Switched to go-kit/log Signed-off-by: Levi Harrison <git@leviharrison.dev>	2021-06-11 12:28:36 -04:00
Nick Triller	fddf4918c0	Send empty targetgroup if nothing discovered Signed-off-by: Nick Triller <nicktriller@gmail.com>	2021-04-29 09:06:52 +02:00
Julien Pivotto	6c56a1faaa	Testify: move to require (#8122 ) * Testify: move to require Moving testify to require to fail tests early in case of errors. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * More moves Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-10-29 09:43:23 +00:00
Julien Pivotto	1282d1b39c	Refactor test assertions (#8110 ) * Refactor test assertions This pull request gets rid of assert.True where possible to use fine-grained assertions. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-10-27 11:06:53 +01:00
Julien Pivotto	4e5b1722b3	Move away from testutil, refactor imports (#8087 ) Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-10-22 11:00:08 +02:00
Julien Pivotto	9da53391d1	Merge pull request #7739 from prometheus/release-2.20 Merge release-2.20 into the main branch after Consul fix	2020-08-04 20:15:43 +02:00
Julien Pivotto	3a7120bc07	Consul: Reduce WatchTimeout to 2m and set it as timeout for requests Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-08-03 00:42:55 +02:00
Julien Pivotto	93e9c010f3	Add more Go leak tests (#7652 ) * Implement go leak test for promql Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * Implement go leak test for Consul SD Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * Implement go leak test in discovery manager Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-07-24 10:10:20 +01:00
John Bampton	98a69b77d1	Fix spelling (#7512 ) Signed-off-by: John Bampton <jbampton@users.noreply.github.com>	2020-07-04 14:54:26 +02:00
Pierre Souchay	1508678001	Use 10m timeouts for watches (#7423 ) Using ?wait=10m gives results as fast as usual when data is changing, but performs far fewer requests when services do not change. On large infrastructures, this considerably reduces the number of queries per second on Consul servers while keeping results just as fresh. Signed-off-by: Pierre Souchay <p.souchay@criteo.com>	2020-06-20 20:22:45 +01:00
Mathilde Gilles	9b9c58aea8	[Consul] Add health label to metrics (#5313 ) Label metrics with the target health using consul's /health endpoint. Signed-off-by: Mathilde Gilles <m.gilles@criteo.com>	2020-02-25 13:32:30 +00:00
Josh Soref	91d76c8023	Spelling (#6517 ) * spelling: alertmanager Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: attributes Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: autocomplete Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: bootstrap Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: caught Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: chunkenc Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: compaction Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: corrupted Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: deletable Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: expected Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: fine-grained Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: initialized Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: iteration Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: javascript Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: multiple Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: number Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: overlapping Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: possible Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: postings Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: procedure Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: programmatic Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: queuing Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: querier Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: repairing Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: received Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: reproducible Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: retention Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: sample Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: segements Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: semantic Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: software [LICENSE] Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: staging Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: timestamp Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: unfortunately Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: uvarint Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: subsequently Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: ressamples Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>	2020-01-02 15:54:09 +01:00
Jean-Baptiste Le Duigou	5973227434	adding additional unit tests for getDataCenter() in consul (#6192 ) * adding additional unit tests for getDataCenter() in consul Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com> * Consul Tests : update comments to start with uppercase and end with point Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com> * Consul Test : using table-driven tests Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com> * Consul Test : cleaner syntax Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com> * Consul Test : even cleaner syntax Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com> * Consul Test : update comments Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com> * Fixing naming convention by removing underscore in function name Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com> * Removing duplicated test case for getDatacenter() Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com>	2019-11-15 14:52:39 +01:00
Ganesh Vernekar	5ecef3542d	Cleanup after merging tsdb into prometheus Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>	2019-08-13 14:04:14 +05:30
AllenZMC	41151ca8dc	fix mis-spelling in consul_test.go (#5836 ) Signed-off-by: czm <zhongming.chang@daocloud.io>	2019-08-06 06:11:41 +01:00
Mario Trangoni	5354ffff99	Fix some spelling issues (#5361 ) See, $ codespell -S './vendor/,./.git,./web/ui/static/vendor*' --ignore-words-list="uint,dur,ue,iff,te,wan" Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>	2019-03-14 14:38:54 +00:00
Callum Styan	83c46fd549	update Consul vendor code so that catalog.ServiceMultipleTags can be (#5151 ) Signed-off-by: Callum Styan <callumstyan@gmail.com>	2019-03-12 10:31:27 +00:00
Samuel Alfageme	240321acee	Add taggedAddress to the labels in ConsulSD (#5001 ) Useful when multiple (tagged) addresses for a node are exposed on the catalog API Ref. https://www.consul.io/api/catalog.html#taggedaddresses Signed-off-by: Samuel Alfageme <samuel@alfage.me>	2018-12-18 11:51:05 +01:00
Ben Kochie	c6399296dc	Fix spelling/typos (#4921 ) * Fix spelling/typos Fix spelling/typos reported by codespell/misspell. * UK -> US spelling changes. Signed-off-by: Ben Kochie <superq@gmail.com>	2018-11-27 17:44:29 +01:00
Romain Baugue	b41be4ef52	Discovery consul service meta (#4280 ) * Upgrade Consul client * Add ServiceMeta to the labels in ConsulSD Signed-off-by: Romain Baugue <romain.baugue@elwinar.com>	2018-07-18 05:06:56 +01:00
Elif T. Kuş	57dcdfb15f	Rewrote tests with testutil for several test files (#4086 ) * promql: Rewrote tests with testutil for functions_test Signed-off-by: Elif T. Kuş <elifkus@gmail.com> * pkg/relabel: Rewrote tests with testutil for relabel_test Signed-off-by: Elif T. Kuş <elifkus@gmail.com> * discovery/consul: Rewrote tests with testutil for consul_test Signed-off-by: Elif T. Kuş <elifkus@gmail.com> * scrape: Rewrote tests with testutil for manager_test Signed-off-by: Elif T. Kuş <elifkus@gmail.com>	2018-04-27 13:11:16 +01:00
Corentin Chary	60dafd425c	consul: improve consul service discovery (#3814 )

* consul: improve consul service discovery

Related to #3711

- Add the ability to filter by tag and node-meta in an efficient way (`/catalog/services` allows filtering by node-meta, and returns a `map[string]string` or `service`->`tags`). Tags and node-meta are also used in `/catalog/service` requests.
- Do not require a call to the catalog if services are specified by name. This is important because on large clusters `/catalog/services` changes all the time.
- Add `allow_stale` configuration option to do stale reads. Non-stale reads can be costly, even more so when you are doing them to a remote datacenter with 10k+ targets over WAN (which is common for federation).
- Add `refresh_interval` to minimize the strain on the catalog and on the service endpoint. This is needed because of this kind of behavior from consul: https://github.com/hashicorp/consul/issues/3712 and because a catalog on a large cluster would basically change all the time. No need to discover targets in 1sec if we scrape them every minute.
- Added plenty of unit tests.

Benchmarks
----------

```yaml
scrape_configs:
  - job_name: prometheus
    scrape_interval: 60s
    static_configs:
      - targets: ["127.0.0.1:9090"]

  - job_name: "observability-by-tag"
    scrape_interval: "60s"
    metrics_path: "/metrics"
    consul_sd_configs:
      - server: consul.service.par.consul.prod.crto.in:8500
        tag: marathon-user-observability  # Used in After
        refresh_interval: 30s  # Used in After+delay
    relabel_configs:
      - source_labels: [__meta_consul_tags]
        regex: ^(.,)?marathon-user-observability(,.)?$
        action: keep

  - job_name: "observability-by-name"
    scrape_interval: "60s"
    metrics_path: "/metrics"
    consul_sd_configs:
      - server: consul.service.par.consul.prod.crto.in:8500
        services:
          - observability-cerebro
          - observability-portal-web

  - job_name: "fake-fake-fake"
    scrape_interval: "15s"
    metrics_path: "/metrics"
    consul_sd_configs:
      - server: consul.service.par.consul.prod.crto.in:8500
        services:
          - fake-fake-fake
```

Note: tested with ~1200 services, ~5000 nodes.

| Resource | Empty | Before | After | After + delay |
| -------- |:-----:|:------:|:-----:|:-------------:|
| /service-discovery size | 5K | 85MiB | 27k | 27k |
| `go_memstats_heap_objects` | 100k | 1M | 120k | 110k |
| `go_memstats_heap_alloc_bytes` | 24MB | 150MB | 28MB | 27MB |
| `rate(go_memstats_alloc_bytes_total[5m])` | 0.2MB/s | 28MB/s | 2MB/s | 0.3MB/s |
| `rate(process_cpu_seconds_total[5m])` | 0.1% | 15% | 2% | 0.01% |
| `process_open_fds` | 16 | 1236 | 22 | 22 |
| `rate(prometheus_sd_consul_rpc_duration_seconds_count{call="services"}[5m])` | ~0 | 1 | 1 | 0.03 |
| `rate(prometheus_sd_consul_rpc_duration_seconds_count{call="service"}[5m])` | 0.1 | 80 | 0.5 | 0.5 |
| `prometheus_target_sync_length_seconds{quantile="0.9",scrape_job="observability-by-tag"}` | N/A | 200ms | 0.2ms | 0.2ms |
| Network bandwidth | ~10kbps | ~2.8Mbps | ~1.6Mbps | ~10kbps |

Filtering by tag using relabel_configs uses 100kiB and 23kiB/s per service per job and quite a lot of CPU. It also sends an additional 1Mbps of traffic to consul. Being a little bit smarter about this reduces the overhead quite a lot. Limiting the number of `/catalog/services` queries per second almost removes the overhead of service discovery.

* consul: tweak `refresh_interval` behavior

`refresh_interval` now does what is advertised in the documentation: there won't be more than one update per `refresh_interval`. It now defaults to 30s (which was also the current waitTime in the consul query). This also makes sure we don't wait another 30s if we already waited 29s in the blocking call, by subtracting the number of elapsed seconds. Hopefully this will do what people expect and will be safer for existing consul infrastructures.	2018-03-23 14:48:43 +00:00
Shubheksha Jalan	ec94df49d4	Refactor SD configuration to remove `config` dependency (#3629 ) * refactor: move targetGroup struct and CheckOverflow() to their own package * refactor: move auth and security related structs to a utility package, fix import error in utility package * refactor: Azure SD, remove SD struct from config * refactor: DNS SD, remove SD struct from config into dns package * refactor: ec2 SD, move SD struct from config into the ec2 package * refactor: file SD, move SD struct from config to file discovery package * refactor: gce, move SD struct from config to gce discovery package * refactor: move HTTPClientConfig and URL into util/config, fix import error in httputil * refactor: consul, move SD struct from config into consul discovery package * refactor: marathon, move SD struct from config into marathon discovery package * refactor: triton, move SD struct from config to triton discovery package, fix test * refactor: zookeeper, move SD structs from config to zookeeper discovery package * refactor: openstack, remove SD struct from config, move into openstack discovery package * refactor: kubernetes, move SD struct from config into kubernetes discovery package * refactor: notifier, use targetgroup package instead of config * refactor: tests for file, marathon, triton SD - use targetgroup package instead of config.TargetGroup * refactor: retrieval, use targetgroup package instead of config.TargetGroup * refactor: storage, use config util package * refactor: discovery manager, use targetgroup package instead of config.TargetGroup * refactor: use HTTPClient and TLS config from configUtil instead of config * refactor: tests, use targetgroup package instead of config.TargetGroup * refactor: fix targetgroup.Group pointers that were removed by mistake * refactor: openstack, kubernetes: drop prefixes * refactor: remove import aliases forced due to vscode bug * refactor: move main SD struct out of config into discovery/config * refactor: rename configUtil to config_util * refactor: rename yamlUtil to yaml_config * refactor: kubernetes, remove prefixes * refactor: move the TargetGroup package to discovery/ * refactor: fix order of imports	2017-12-29 21:01:34 +01:00
Fabian Reinartz	d21f149745	*: migrate to go-kit/log	2017-09-08 22:01:51 +05:30
Chris Goller	42de0ae013	Use log.Logger interface for all discovery services	2017-06-01 11:25:55 -05:00
Robert Neumayer	feb7670929	Add tests for consul service discovery (#2490 ) * Add tests for consul service discovery * Add license header * Address comments * inline variables * check for extra error * Fix error formatting	2017-03-15 09:33:53 +01:00

41 commits