prometheus

mirror of https://github.com/prometheus/prometheus.git synced 2024-11-14 09:34:05 -08:00

Author	SHA1	Message	Date
Simon Pasquier	128ff546b8	config: add test for OpenStack SD (#4594 ) Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2018-09-13 21:44:27 +05:30
Tariq Ibrahim	f708fd5c99	Adding support for multiple azure environments (#4569 ) Signed-off-by: Tariq Ibrahim <tariq.ibrahim@microsoft.com>	2018-09-04 17:55:40 +02:00
Daisy T	7d01ead689	change time.duration to model.duration for standardization (#4479 ) Signed-off-by: Daisy T <daisyts@gmx.com>	2018-08-24 16:55:21 +02:00
Goutham Veeramachaneni	c28cc5076c	Saner defaults and metrics for remote-write (#4279 ) * Rename queueCapacity to shardCapacity * Saner defaults for remote write * Reduce allocs on retries Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>	2018-07-18 05:15:16 +01:00
Paul Gier	d24d2acd11	config: set target group source index during unmarshalling (#4245 ) * config: set target group source index during unmarshalling Fixes issue #4214 where the scrape pool is unnecessarily reloaded for a config reload where the config hasn't changed. Previously, the discovery manager changed the static config after loading which caused the in-memory config to differ from a freshly reloaded config. Signed-off-by: Paul Gier <pgier@redhat.com> * [issue #4214] Test that static targets are not modified by discovery manager Signed-off-by: Paul Gier <pgier@redhat.com>	2018-06-13 16:34:59 +01:00
Philippe Laflamme	2aba238f31	Use common HTTPClientConfig for marathon_sd configuration (#4009 ) This adds support for basic authentication which closes #3090 The support for specifying the client timeout was removed as discussed in https://github.com/prometheus/common/pull/123. Marathon was the only sd mechanism doing this and configuring the timeout is done through `Context`. DC/OS uses a custom `Authorization` header for authenticating. This adds 2 new configuration properties to reflect this. Existing configuration files that use the bearer token will no longer work. More work is required to make this backwards compatible.	2018-04-05 09:08:18 +01:00
Manos Fokas	25f929b772	Yaml UnmarshalStrict implementation. (#4033 ) * Updated yaml vendor package. * remove checkOverflow duplicate in rulefmt * remove duplicated HTTPClientConfig.Validate() * Added yaml static check.	2018-04-04 09:07:39 +01:00
Kristiyan Nikolov	be85ba3842	discovery/ec2: Support filtering instances in discovery (#4011 )	2018-03-31 07:51:11 +01:00
Corentin Chary	60dafd425c	consul: improve consul service discovery (#3814 )

* consul: improve consul service discovery

Related to #3711

- Add the ability to filter by tag and node-meta in an efficient way (`/catalog/services` allows filtering by node-meta, and returns a `map[string]string` or `service`->`tags`). Tags and node-meta are also used in `/catalog/service` requests.
- Do not require a call to the catalog if services are specified by name. This is important because on a large cluster `/catalog/services` changes all the time.
- Add `allow_stale` configuration option to do stale reads. Non-stale reads can be costly, even more so when you are doing them to a remote datacenter with 10k+ targets over WAN (which is common for federation).
- Add `refresh_interval` to minimize the strain on the catalog and on the service endpoint. This is needed because of this kind of behavior from consul: https://github.com/hashicorp/consul/issues/3712 and because a catalog on a large cluster would basically change all the time. No need to discover targets in 1sec if we scrape them every minute.
- Added plenty of unit tests.

Benchmarks
----------

```yaml
scrape_configs:

- job_name: prometheus
  scrape_interval: 60s
  static_configs:
    - targets: ["127.0.0.1:9090"]

- job_name: "observability-by-tag"
  scrape_interval: "60s"
  metrics_path: "/metrics"
  consul_sd_configs:
    - server: consul.service.par.consul.prod.crto.in:8500
      tag: marathon-user-observability  # Used in After
      refresh_interval: 30s             # Used in After+delay
  relabel_configs:
    - source_labels: [__meta_consul_tags]
      regex: ^(.*,)?marathon-user-observability(,.*)?$
      action: keep

- job_name: "observability-by-name"
  scrape_interval: "60s"
  metrics_path: "/metrics"
  consul_sd_configs:
    - server: consul.service.par.consul.prod.crto.in:8500
      services:
        - observability-cerebro
        - observability-portal-web

- job_name: "fake-fake-fake"
  scrape_interval: "15s"
  metrics_path: "/metrics"
  consul_sd_configs:
    - server: consul.service.par.consul.prod.crto.in:8500
      services:
        - fake-fake-fake
```

Note: tested with ~1200 services, ~5000 nodes.

| Resource | Empty | Before | After | After + delay |
| -------- |:-----:|:------:|:-----:|:-------------:|
| /service-discovery size | 5K | 85MiB | 27k | 27k | 27k |
| `go_memstats_heap_objects` | 100k | 1M | 120k | 110k |
| `go_memstats_heap_alloc_bytes` | 24MB | 150MB | 28MB | 27MB |
| `rate(go_memstats_alloc_bytes_total[5m])` | 0.2MB/s | 28MB/s | 2MB/s | 0.3MB/s |
| `rate(process_cpu_seconds_total[5m])` | 0.1% | 15% | 2% | 0.01% |
| `process_open_fds` | 16 | 1236 | 22 | 22 |
| `rate(prometheus_sd_consul_rpc_duration_seconds_count{call="services"}[5m])` | ~0 | 1 | 1 | 0.03 |
| `rate(prometheus_sd_consul_rpc_duration_seconds_count{call="service"}[5m])` | 0.1 | 80 | 0.5 | 0.5 |
| `prometheus_target_sync_length_seconds{quantile="0.9",scrape_job="observability-by-tag"}` | N/A | 200ms | 0.2ms | 0.2ms |
| Network bandwidth | ~10kbps | ~2.8Mbps | ~1.6Mbps | ~10kbps |

Filtering by tag using relabel_configs uses 100kiB and 23kiB/s per service per job and quite a lot of CPU. It also sends an additional 1Mbps of traffic to consul. Being a little bit smarter about this reduces the overhead quite a lot. Limiting the number of `/catalog/services` queries per second almost removes the overhead of service discovery.

* consul: tweak `refresh_interval` behavior

`refresh_interval` now does what is advertised in the documentation: there won't be more than one update per `refresh_interval`. It now defaults to 30s (which was also the current waitTime in the consul query). This also makes sure we don't wait another 30s if we already waited 29s in the blocking call, by subtracting the number of elapsed seconds. Hopefully this will do what people expect it to do and will be safer for existing consul infrastructures.	2018-03-23 14:48:43 +00:00
pasquier-s	fc8cf08f42	Prevent invalid label names with labelmap (#3868 ) This change ensures that the relabeling configurations using labelmap can't generate invalid label names.	2018-02-21 10:02:22 +00:00
Shubheksha Jalan	0471e64ad1	Use shared types from the `common` repo (#3674 ) * refactor: use shared types from common repo, remove util/config * vendor: add common/config * fix nit	2018-01-11 16:10:25 +01:00
Shubheksha Jalan	ec94df49d4	Refactor SD configuration to remove `config` dependency (#3629 ) * refactor: move targetGroup struct and CheckOverflow() to their own package * refactor: move auth and security related structs to a utility package, fix import error in utility package * refactor: Azure SD, remove SD struct from config * refactor: DNS SD, remove SD struct from config into dns package * refactor: ec2 SD, move SD struct from config into the ec2 package * refactor: file SD, move SD struct from config to file discovery package * refactor: gce, move SD struct from config to gce discovery package * refactor: move HTTPClientConfig and URL into util/config, fix import error in httputil * refactor: consul, move SD struct from config into consul discovery package * refactor: marathon, move SD struct from config into marathon discovery package * refactor: triton, move SD struct from config to triton discovery package, fix test * refactor: zookeeper, move SD structs from config to zookeeper discovery package * refactor: openstack, remove SD struct from config, move into openstack discovery package * refactor: kubernetes, move SD struct from config into kubernetes discovery package * refactor: notifier, use targetgroup package instead of config * refactor: tests for file, marathon, triton SD - use targetgroup package instead of config.TargetGroup * refactor: retrieval, use targetgroup package instead of config.TargetGroup * refactor: storage, use config util package * refactor: discovery manager, use targetgroup package instead of config.TargetGroup * refactor: use HTTPClient and TLS config from configUtil instead of config * refactor: tests, use targetgroup package instead of config.TargetGroup * refactor: fix targetgroup.Group pointers that were removed by mistake * refactor: openstack, kubernetes: drop prefixes * refactor: remove import aliases forced due to vscode bug * refactor: move main SD struct out of config into discovery/config * refactor: rename configUtil to config_util * refactor: rename yamlUtil to yaml_config * refactor: kubernetes, remove prefixes * refactor: move the TargetGroup package to discovery/ * refactor: fix order of imports	2017-12-29 21:01:34 +01:00
Brian Brazil	fba80da635	Fix default of read_recent to be false. (#3617 ) This is what is documented in the migration guide, and the default settings should make sense for a true long term storage. Document the setting.	2017-12-23 17:21:38 +00:00
Krasi Georgiev	e405e2f1ea	refactored discovery	2017-12-18 17:22:49 +00:00
Alberto Cortés	29da2fb9cd	testutil: update to go1.9 testing.Helper	2017-12-08 19:06:53 +01:00
Alberto Cortés	8f6a9f7833	config: simplify tests by using testutil.NotOk (#3289 ) Also include filename in all LoadFile errors. Also add message to testutil.NotOk so we can identify failing tests when using table-driven tests.	2017-12-08 16:52:25 +00:00
Tobias Schmidt	7098c56474	Add remote read filter option For special remote read endpoints which have only data for specific queries, it is desired to limit the number of queries sent to the configured remote read endpoint to reduce latency and performance overhead.	2017-11-13 23:30:01 +01:00
Krasi Georgiev	e86d82ad2d	Fix regression of alert rules state loss on config reload. (#3382 ) * incorrect map name for the group prevented copying state from existing alert rules on config reload * applyConfig test * few nits * nits 2	2017-11-01 12:58:00 +01:00
Thibault Chataigner	bf4a279a91	Remote storage reads based on oldest timestamp in primary storage (#3129 ) Currently all read queries are simply pushed to remote read clients. This is fine, except for remote storage, for which it is inefficient and makes queries slower even if remote read is unnecessary. Instead, we need to compare the oldest timestamp in primary/local storage with the query range lower boundary. If the oldest timestamp is older than the mint parameter, then there is no need for remote read. This is an optional behavior per remote read client. Signed-off-by: Thibault Chataigner <t.chataigner@criteo.com>	2017-10-18 12:08:14 +01:00
Alberto Cortés	6c67296423	config: fix error message for unexpected result of yaml marshal	2017-10-12 19:50:07 +02:00
Alberto Cortés	0f3d8ea075	config: use testutil package	2017-10-12 19:50:07 +02:00
Fabian Reinartz	2d0b8e8b94	Merge branch 'master' into dev-2.0	2017-10-05 13:09:18 +02:00
Alberto Cortés	bb3dad9cba	config: simplify some returns	2017-09-26 16:57:56 +02:00
Bryan Boreham	9d6b945e41	Default HTTP keep-alive ON for remote read/write	2017-09-11 09:48:30 +00:00
Bryan Boreham	e0a4d18301	Allow http keep-alive setting to be overridden in config	2017-09-11 09:07:14 +00:00
Fabian Reinartz	e746282772	Merge branch 'master' into dev-2.0	2017-09-11 10:55:19 +02:00
Jamie Moore	7a135e0a1b	Add the ability to assume a role for ec2 discovery	2017-09-10 00:36:43 +10:00
Fabian Reinartz	87918f3097	Merge branch 'master' into dev-2.0	2017-09-04 14:09:21 +02:00
Johannes 'fish' Ziemke	70f3d1e9f9	k8s: Support discovery of ingresses (#3111 ) * k8s: Support discovery of ingresses * Move additional labels below allocation This makes it more obvious why the additional elements are allocated. Also fix allocation for node where we only set a single label. * k8s: Remove port from ingress discovery * k8s: Add comment to ingress discovery example	2017-09-04 13:10:44 +02:00
CuiHaozhi	b1c18bf29b	discovery openstack: support discovery hosts, add rule option. Signed-off-by: CuiHaozhi <cuihz@wise2c.com>	2017-08-29 10:14:00 -04:00
Max Leonard Inden	1c96fbb992	Expose current Prometheus config via /status/config This PR adds the `/status/config` endpoint which exposes the currently loaded Prometheus config. This is the same config that is displayed on `/config` in the UI in YAML format. The response payload looks like such: ``` { "status": "success", "data": { "yaml": <CONFIG> } } ```	2017-08-13 22:21:18 +02:00
Fabian Reinartz	25f3e1c424	Merge branch 'master' into mergemaster	2017-08-10 17:04:25 +02:00
Yuki Ito	1bf3b91ae0	Make sure that url for remote_read/write is not nil (#3024 )	2017-08-07 08:49:45 +01:00
Tom Wilkie	5169f990f9	Review feedback: add yaml struct tags, don't embed queue config. Also, rename QueueManageConfig to QueueConfig, for consistency with tags.	2017-08-01 14:43:56 +01:00
Tom Wilkie	454b661145	Make queue manager configurable.	2017-07-25 13:47:34 +01:00
Fabian Reinartz	dba7586671	Merge branch 'master' into dev-2.0	2017-07-11 17:22:14 +02:00
Fuente, Pablo Andres	9eb8c6e1d2	Renaming the config_notwin test to config_default	2017-07-10 11:08:16 -03:00
Fuente, Pablo Andres	fe73de9452	Renaming config test file to fix build tags Renaming the name of a file of the config tests, in order to properly use the Go build tags feature.	2017-07-10 00:02:08 -03:00
Fuente, Pablo Andres	193dc47230	Fixing code style to adhere to gofmt	2017-07-09 02:43:33 -03:00
Fuente, Pablo Andres	902fafb8e7	Fixing tests for Windows. Fixing the config/config_test, the discovery/file/file_test and the promql/promql_test tests for Windows. For most of the tests, the fix involved correct handling of path separators. In the case of the promql tests, the issue was related to the removal of the temporary directories used by the storage. The RemoveAll() call returns an error when it tries to remove a directory which is not empty, which seems to be the case because some kind of process is still running after closing the storage. To fix it I added some retries to the removal of the temporary directories. Adding tags file from Universal Ctags to .gitignore	2017-07-09 01:59:30 -03:00
Goutham Veeramachaneni	98d20d5880	Make sure rendering config produces valid config Fixes #2899 Signed-off-by: Goutham Veeramachaneni <goutham@boomerangcommerce.com>	2017-07-05 16:09:29 +02:00
Fabian Reinartz	65b087bcc1	config: resolve file SD paths relative to config	2017-07-04 11:40:26 +02:00
Fabian Reinartz	9ba61df45a	Merge pull request #2789 from mtanda/aws_default_region config: set default region for EC2 SD	2017-06-12 16:15:53 +02:00
Mitsuhiro Tanda	64cef5cd05	handle NewSession() error	2017-06-06 22:02:50 +09:00
Christian Groschupp	8f781e411c	Openstack Service Discovery (#2701 ) * Add openstack service discovery. * Add gophercloud code for openstack service discovery. * first changes for juliusv comments. * add gophercloud code for floatingip. * Add tests to openstack sd. * Add testify suite vendor files. * add copyright and make changes for code climate. * Fixed typos in provider openstack. * Renamed tenant to project in openstack sd. * Change type of password to Secret in openstack sd.	2017-06-01 23:49:02 +02:00
Roman Vynar	dbe2eb2afc	Hide consul token on UI. (#2797 )	2017-06-01 22:14:23 +01:00
Mitsuhiro Tanda	a1ddab463e	config: set default region for EC2 SD	2017-06-02 01:40:24 +09:00
Tobias Schmidt	287ec6e6cc	Fix outdated target_group naming in error message The target_groups config has been renamed to static_configs, the error message for overflow attributes should reflect that.	2017-05-31 11:01:13 +02:00
Julius Volz	240bb671e2	config: Fix overflow checking in global config (#2783 )	2017-05-30 20:58:06 +02:00
Conor Broderick	6766123f93	Replace regex with Secret type and remarshal config to hide secrets (#2775 )	2017-05-29 12:46:23 +01:00
Brian Akins	27d66628a1	Allow limiting Kubernetes service discovery to certain namespaces. Allow namespace discovery to be more easily extended in the future by using a struct rather than just a list. Rename fields for kubernetes namespace discovery	2017-04-27 07:41:36 -04:00
Julius Volz	eb14678a25	Make remote read/write use config.HTTPClientConfig	2017-03-20 13:37:50 +01:00
Julius Volz	02395a224d	[WIP] Remote Read	2017-03-20 13:13:44 +01:00
Julius Volz	525da88c35	Merge pull request #2479 from YKlausz/consul-tls Adding consul capability to connect via tls	2017-03-20 11:40:18 +01:00
Goutham Veeramachaneni	5c89cec65c	Stricter Relabel Config Checking for Labeldrop/keep (#2510 ) * Minor code cleanup * Labeldrop/Labelkeep Now Only Support Regex Ref prometheus/prometheus#2368	2017-03-18 22:32:08 +01:00
yklausz	75880b594f	Adding consul capability to connect via tls	2017-03-17 22:37:18 +01:00
Michael Kraus	04eadf6e20	Allow Marathon SD without bearer_token and bearer_token_file	2017-03-02 13:17:19 +01:00
Michael Kraus	47bdcf0f67	Allow the use of bearer_token or bearer_token_file for MarathonSD	2017-03-02 09:44:20 +01:00
Julius Volz	e9476b35d5	Re-add multiple remote writers Each remote write endpoint gets its own set of relabeling rules. This is based on the (yet-to-be-merged) https://github.com/prometheus/prometheus/pull/2419, which removes legacy remote write implementations.	2017-02-20 13:23:12 +01:00
Alex Somesan	b22eb65d0f	Cleaner separation between ServiceAccount and custom authentication in K8S SD (#2348 ) * Canonical usage of cluster service-account in K8S SD * Early validation for opt-in custom auth in K8S SD * Fix typo in condition	2017-01-19 10:52:52 +01:00
Fabian Reinartz	7eb849e6a8	Merge pull request #2307 from joyent/triton_discovery Add Joyent Triton discovery	2017-01-18 05:08:11 +01:00
Richard Kiene	f3d9692d09	Add Joyent Triton discovery	2017-01-17 20:34:32 +00:00
Björn Rabenstein	ad40d0abbc	Merge pull request #2288 from prometheus/limit-scrape Add ability to limit scrape samples, and related metrics	2017-01-08 01:34:06 +01:00
Brian Brazil	30448286c7	Add sample_limit to scrape config. This imposes a hard limit on the number of samples ingested from the target. This is counted after metric relabelling, to allow dropping of problematic metrics. This is intended as a very blunt tool to prevent overload due to misbehaving targets that suddenly jump in sample count (e.g. adding a label containing email addresses). Add metric to track how often this happens. Fixes #2137	2016-12-16 15:10:09 +00:00
Tristan Colgate-McFarlane	4d9134e6d8	Add labeldrop and labelkeep actions. (#2279 ) Introduce two new relabel actions. labeldrop, and labelkeep. These can be used to filter the set of labels by matching regex - labeldrop: drops all labels that match the regex - labelkeep: drops all labels that do not match the regex	2016-12-14 10:17:42 +00:00
Fabian Reinartz	cc35104504	config: fix naming and typo	2016-11-25 11:04:33 +01:00
Fabian Reinartz	3fb4d1191b	config: rename AlertingConfig, resolve file paths	2016-11-24 15:19:37 +01:00
Fabian Reinartz	183c5749b9	config: add Alertmanager configuration	2016-11-23 18:23:37 +01:00
Fabian Reinartz	200bbe1bad	config: extract SD and HTTPClient configurations	2016-11-23 18:23:37 +01:00
Fabian Reinartz	ec66082749	Merge branch 'ec2_sd_profile_support' of https://github.com/Ticketmaster/prometheus into Ticketmaster-ec2_sd_profile_support	2016-11-21 11:49:23 +01:00
Kraig Amador	bec6870ed4	ec2_sd_configs: Support profiles for configuring the ec2 service	2016-11-03 08:38:02 -07:00
beorn7	b2f28a9e82	Merge branch 'release-1.2'	2016-11-03 14:42:15 +01:00
Brian Brazil	d1ece12c70	Handle null Regex in the config as the empty regex. (#2150 )	2016-11-03 13:34:15 +00:00
bekbulatov	2bc12fa2fb	Set timeout for marathon_sd	2016-10-24 11:27:08 +01:00
bekbulatov	c689b35858	Merge branch 'master' into marathon_tls	2016-10-24 10:37:32 +01:00
Matti Savolainen	aabf4a419b	use LabelName.IsValid() instead of LabelNameRE and MatchString instead of Match	2016-10-19 16:30:52 +03:00
Matti Savolainen	ec6524ce74	test the labelTarget regex to make sure it properly validates pre-interpolated label names.	2016-10-19 13:32:42 +03:00
Matti Savolainen	f867c1fd58	formatting and text fixes, adjust regexp	2016-10-19 13:31:55 +03:00
Matti Savolainen	56907ba6e3	Add interpolation to good test config. Fix regex	2016-10-19 01:19:19 +03:00
Matti Savolainen	7a36af1c85	add comment about interpolation	2016-10-19 00:42:49 +03:00
Matti Savolainen	3b8e7c1277	Merge branch 'master' of https://github.com/prometheus/prometheus into bug/target_label_unmarshal	2016-10-19 00:33:52 +03:00
Matti Savolainen	5a1e909b5d	Make TargetLabel in RelabelConfig a string	2016-10-19 00:33:22 +03:00
Björn Rabenstein	d93f73874f	Merge pull request #2093 from dominikschulz/spelling Trivial spelling corrections	2016-10-18 22:46:03 +02:00
Dominik Schulz	182e17958a	Trivial spelling corrections and a small comment.	2016-10-18 20:14:38 +02:00
bekbulatov	ac702f66eb	Resolve merge conflicts	2016-10-18 14:14:24 +01:00
Fabian Reinartz	1b6dfa32a9	config: rename role 'endpoint' to 'endpoints'	2016-10-17 11:53:49 +02:00
Frederic Branczyk	2e18c81a00	config: adapt unit tests	2016-10-17 10:32:10 +02:00
Fabian Reinartz	b24602f713	kubernetes: merge back into single configuration	2016-10-17 10:32:10 +02:00
Fabian Reinartz	2331701b50	kubernetes: Add K8S v2 pod discovery This adds plumbing for a parallel version of the new K8S SD and adds pod discovery as the first role.	2016-10-17 10:32:10 +02:00
Dominik Schulz	72cbf8af6f	Fix small copy and paste error	2016-10-08 08:49:00 +02:00
bekbulatov	01b53c1180	Add tls support	2016-10-07 13:40:22 +01:00
Brian Brazil	77605649a9	Add support for remote write relabelling. Switch back to a single remote writer, as we were only ever meant to have one and the relabel semantics are clearer that way.	2016-10-05 07:43:19 +01:00
Tom Wilkie	4520e12440	Add HTTP Basic Auth & TLS support to the generic write path. (#1957 ) * Add config, HTTP Basic Auth and TLS support to the generic write path. - Move generic write path configuration to the config file - Factor out config.TLSConfig -> tls.Config translation - Support TLSConfig for generic remote storage - Rename Run to Start, and make it non-blocking. - Dedupe code in httputil for TLS config. - Make remote queue metrics global.	2016-09-19 22:47:51 +02:00
Tobias Schmidt	874cb44bb6	Merge pull request #1996 from ton31337/Fix/allow_numbers_as_first_letter Allow number to be the first letter as well for `job_name`	2016-09-16 11:08:52 -04:00
Donatas Abraitis	1aa8898b66	Allow number to be the first letter as well for `job_name`	2016-09-16 14:06:47 +03:00
Ingo Gottwald	3b546d061f	Add support for GCE discovery	2016-09-16 08:55:33 +02:00
Alexey Miroshkin	e29d9394e5	Forbid invalid relabel configurations. This fix adds a check that the target_label value is set when the action is replace or hashmod. Issue [#1900]	2016-08-29 16:56:06 +02:00
Fabian Reinartz	be596f82b4	Merge pull request #1783 from knyar/json Allow URLs in targets defined via a JSON file	2016-08-10 09:42:17 +02:00
Frederic Branczyk	b655aa002f	introduce top level alerting config node	2016-08-09 14:19:25 +02:00
Frederic Branczyk	679d225c8d	allow relabeling of alerts; in case of dropping, don't even enqueue them	2016-08-09 14:18:31 +02:00


275 commits