Commit graph

246 commits

Author SHA1 Message Date
Max Leonard Inden 41c22effbe
config&notifier: Add option to use Alertmanager API v2
With v0.16.0 Alertmanager introduced a new API (v2). This patch adds a
configuration option for Prometheus to send alerts to the v2 endpoint
instead of the defautl v1 endpoint.

Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
2019-06-21 16:33:53 +02:00
Callum Styan e9129abeff Remove max_retries from queue_config since it's not used in remote write
anymore.

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2019-06-10 12:43:08 -07:00
Tariq Ibrahim 8fdfa8abea refine error handling in prometheus (#5388)
i) Uses the more idiomatic Wrap and Wrapf methods for creating nested errors.
ii) Fixes some incorrect usages of fmt.Errorf where the error messages don't have any formatting directives.
iii) Does away with the use of fmt package for errors in favour of pkg/errors

Signed-off-by: tariqibrahim <tariq181290@gmail.com>
2019-03-26 00:01:12 +01:00
Callum Styan 5603b857a9 Check if label value is valid when unmarhsaling external labels from
YAML, add a test to config_tests for valid/invalid external label
value.

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2019-03-18 20:31:12 +00:00
Tom Wilkie c7b3535997 Use pkg/relabelling in remote write.
- Unmarshall external_labels config as labels.Labels, add tests.
- Convert some more uses of model.LabelSet to labels.Labels.
- Remove old relabel pkg (fixes #3647).
- Validate external label names.

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2019-03-18 20:31:12 +00:00
Julien Pivotto 4397916cb2 Add honor_timestamps (#5304)
Fixes #5302

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2019-03-15 10:04:15 +00:00
Simon Pasquier 027d2ece14 config: resolve more file paths (#5284)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-03-12 10:24:15 +00:00
Simon Pasquier c8a1a5a93c
discovery/kubernetes: fix support for password_file and bearer_token_file (#5211)
* discovery/kubernetes: fix support for password_file

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Create and pass custom RoundTripper to Kubernetes client

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Use inline HTTPClientConfig

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-02-20 11:22:34 +01:00
Callum Styan 6f69e31398 Tail the TSDB WAL for remote_write
This change switches the remote_write API to use the TSDB WAL.  This should reduce memory usage and prevent sample loss when the remote end point is down.

We use the new LiveReader from TSDB to tail WAL segments.  Logic for finding the tracking segment is included in this PR.  The WAL is tailed once for each remote_write endpoint specified. Reading from the segment is based on a ticker rather than relying on fsnotify write events, which were found to be complicated and unreliable in early prototypes.

Enqueuing a sample for sending via remote_write can now block, to provide back pressure.  Queues are still required to acheive parallelism and batching.  We have updated the queue config based on new defaults for queue capacity and pending samples values - much smaller values are now possible.  The remote_write resharding code has been updated to prevent deadlocks, and extra tests have been added for these cases.

As part of this change, we attempt to guarantee that samples are not lost; however this initial version doesn't guarantee this across Prometheus restarts or non-retryable errors from the remote end (eg 400s).

This changes also includes the following optimisations:
- only marshal the proto request once, not once per retry
- maintain a single copy of the labels for given series to reduce GC pressure

Other minor tweaks:
- only reshard if we've also successfully sent recently
- add pending samples, latest sent timestamp, WAL events processed metrics

Co-authored-by: Chris Marchbanks <csmarchbanks.com> (initial prototype)
Co-authored-by: Tom Wilkie <tom.wilkie@gmail.com> (sharding changes)
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2019-02-12 11:39:13 +00:00
Bartek Płotka 62c8337e77 Moved configuration into relabel package. (#4955)
Adapted top dir relabel to use pkg relabel structs.

Removal of this in a separate tracked here: https://github.com/prometheus/prometheus/issues/3647

Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
2018-12-18 11:26:36 +00:00
Ryota Arai 135d580ab2 Introduce min_shards for remote write to set minimum number of shards. (#4924)
Signed-off-by: Ryota Arai <ryota.arai@gmail.com>
2018-12-04 17:32:14 +00:00
Julius Volz d28246e337
Fix config loading panics on nil pointer slice elements (#4942)
Fixes https://github.com/prometheus/prometheus/issues/4902
Fixes https://github.com/prometheus/prometheus/issues/4889

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-12-03 18:09:02 +08:00
Ben Kochie c6399296dc
Fix spelling/typos (#4921)
* Fix spelling/typos

Fix spelling/typos reported by codespell/misspell.
* UK -> US spelling changes.

Signed-off-by: Ben Kochie <superq@gmail.com>
2018-11-27 17:44:29 +01:00
Daisy T 7d01ead689 change time.duration to model.duration for standardization (#4479)
Signed-off-by: Daisy T <daisyts@gmx.com>
2018-08-24 16:55:21 +02:00
Goutham Veeramachaneni c28cc5076c Saner defaults and metrics for remote-write (#4279)
* Rename queueCapacity to shardCapacity
* Saner defaults for remote write
* Reduce allocs on retries

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2018-07-18 05:15:16 +01:00
Paul Gier d24d2acd11 config: set target group source index during unmarshalling (#4245)
* config: set target group source index during unmarshalling

Fixes issue #4214 where the scrape pool is unnecessarily reloaded for a
config reload where the config hasn't changed.  Previously, the discovery
manager changed the static config after loading which caused the in-memory
config to differ from a freshly reloaded config.

Signed-off-by: Paul Gier <pgier@redhat.com>

* [issue #4214] Test that static targets are not modified by discovery manager

Signed-off-by: Paul Gier <pgier@redhat.com>
2018-06-13 16:34:59 +01:00
Philippe Laflamme 2aba238f31 Use common HTTPClientConfig for marathon_sd configuration (#4009)
This adds support for basic authentication which closes #3090

The support for specifying the client timeout was removed as discussed in https://github.com/prometheus/common/pull/123. Marathon was the only sd mechanism doing this and configuring the timeout is done through `Context`.

DC/OS uses a custom `Authorization` header for authenticating. This adds 2 new configuration properties to reflect this.

Existing configuration files that use the bearer token will no longer work. More work is required to make this backwards compatible.
2018-04-05 09:08:18 +01:00
Manos Fokas 25f929b772 Yaml UnmarshalStrict implementation. (#4033)
* Updated yaml vendor package.

* remove checkOverflow duplicate in rulefmt

* remove duplicated HTTPClientConfig.Validate()

* Added yaml static check.
2018-04-04 09:07:39 +01:00
pasquier-s fc8cf08f42 Prevent invalid label names with labelmap (#3868)
This change ensures that the relabeling configurations using labelmap
can't generate invalid label names.
2018-02-21 10:02:22 +00:00
Shubheksha Jalan 0471e64ad1 Use shared types from the common repo (#3674)
* refactor: use shared types from common repo, remove util/config

* vendor: add common/config

* fix nit
2018-01-11 16:10:25 +01:00
Shubheksha Jalan ec94df49d4 Refactor SD configuration to remove config dependency (#3629)
* refactor: move targetGroup struct and CheckOverflow() to their own package

* refactor: move auth and security related structs to a utility package, fix import error in utility package

* refactor: Azure SD, remove SD struct from config

* refactor: DNS SD, remove SD struct from config into dns package

* refactor: ec2 SD, move SD struct from config into the ec2 package

* refactor: file SD, move SD struct from config to file discovery package

* refactor: gce, move SD struct from config to gce discovery package

* refactor: move HTTPClientConfig and URL into util/config, fix import error in httputil

* refactor: consul, move SD struct from config into consul discovery package

* refactor: marathon, move SD struct from config into marathon discovery package

* refactor: triton, move SD struct from config to triton discovery package, fix test

* refactor: zookeeper, move SD structs from config to zookeeper discovery package

* refactor: openstack, remove SD struct from config, move into openstack discovery package

* refactor: kubernetes, move SD struct from config into kubernetes discovery package

* refactor: notifier, use targetgroup package instead of config

* refactor: tests for file, marathon, triton SD - use targetgroup package instead of config.TargetGroup

* refactor: retrieval, use targetgroup package instead of config.TargetGroup

* refactor: storage, use config util package

* refactor: discovery manager, use targetgroup package instead of config.TargetGroup

* refactor: use HTTPClient and TLS config from configUtil instead of config

* refactor: tests, use targetgroup package instead of config.TargetGroup

* refactor: fix tagetgroup.Group pointers that were removed by mistake

* refactor: openstack, kubernetes: drop prefixes

* refactor: remove import aliases forced due to vscode bug

* refactor: move main SD struct out of config into discovery/config

* refactor: rename configUtil to config_util

* refactor: rename yamlUtil to yaml_config

* refactor: kubernetes, remove prefixes

* refactor: move the TargetGroup package to discovery/

* refactor: fix order of imports
2017-12-29 21:01:34 +01:00
Brian Brazil fba80da635
Fix default of read_recent to be false. (#3617)
This is what is documented in the migration guide, and the default settings
should make sense for a true long term storage.

Document the setting.
2017-12-23 17:21:38 +00:00
Krasi Georgiev e405e2f1ea refactored discovery 2017-12-18 17:22:49 +00:00
Alberto Cortés 8f6a9f7833 config: simplify tests by using testutil.NotOk (#3289)
Also include filename in all LoadFile errors

Also add mesage to testuitl.NotOk so we can identify failing tests when
using table driven tests.
2017-12-08 16:52:25 +00:00
Tobias Schmidt 7098c56474 Add remote read filter option
For special remote read endpoints which have only data for specific
queries, it is desired to limit the number of queries sent to the
configured remote read endpoint to reduce latency and performance
overhead.
2017-11-13 23:30:01 +01:00
Thibault Chataigner bf4a279a91 Remote storage reads based on oldest timestamp in primary storage (#3129)
Currently all read queries are simply pushed to remote read clients.
This is fine, except for remote storage for wich it unefficient and
make query slower even if remote read is unnecessary.
So we need instead to compare the oldest timestamp in primary/local
storage with the query range lower boundary. If the oldest timestamp
is older than the mint parameter, then there is no need for remote read.
This is an optionnal behavior per remote read client.

Signed-off-by: Thibault Chataigner <t.chataigner@criteo.com>
2017-10-18 12:08:14 +01:00
Fabian Reinartz 2d0b8e8b94 Merge branch 'master' into dev-2.0 2017-10-05 13:09:18 +02:00
Alberto Cortés bb3dad9cba config: simplify some returns 2017-09-26 16:57:56 +02:00
Bryan Boreham 9d6b945e41 Default HTTP keep-alive ON for remote read/write 2017-09-11 09:48:30 +00:00
Bryan Boreham e0a4d18301 Allow http keep-alive setting to be overridden in config 2017-09-11 09:07:14 +00:00
Fabian Reinartz e746282772 Merge branch 'master' into dev-2.0 2017-09-11 10:55:19 +02:00
Jamie Moore 7a135e0a1b Add the ability to assume a role for ec2 discovery 2017-09-10 00:36:43 +10:00
Fabian Reinartz 87918f3097 Merge branch 'master' into dev-2.0 2017-09-04 14:09:21 +02:00
Johannes 'fish' Ziemke 70f3d1e9f9 k8s: Support discovery of ingresses (#3111)
* k8s: Support discovery of ingresses

* Move additional labels below allocation

This makes it more obvious why the additional elements are allocated.
Also fix allocation for node where we only set a single label.

* k8s: Remove port from ingress discovery

* k8s: Add comment to ingress discovery example
2017-09-04 13:10:44 +02:00
CuiHaozhi b1c18bf29b discovery openstack: support discovery hosts, add rule option.
Signed-off-by: CuiHaozhi <cuihz@wise2c.com>
2017-08-29 10:14:00 -04:00
Fabian Reinartz 25f3e1c424 Merge branch 'master' into mergemaster 2017-08-10 17:04:25 +02:00
Yuki Ito 1bf3b91ae0 Make sure that url for remote_read/write is not nil (#3024) 2017-08-07 08:49:45 +01:00
Tom Wilkie 5169f990f9 Review feedback: add yaml struct tags, don't embed queue config.
Also, rename QueueManageConfig to QueueConfig, for consistency with tags.
2017-08-01 14:43:56 +01:00
Tom Wilkie 454b661145 Make queue manager configurable. 2017-07-25 13:47:34 +01:00
Fabian Reinartz dba7586671 Merge branch 'master' into dev-2.0 2017-07-11 17:22:14 +02:00
Goutham Veeramachaneni 98d20d5880 Make sure rendering config produces valid config
Fixes #2899

Signed-off-by: Goutham Veeramachaneni <goutham@boomerangcommerce.com>
2017-07-05 16:09:29 +02:00
Fabian Reinartz 65b087bcc1 config: resolve file SD paths relative to config 2017-07-04 11:40:26 +02:00
Fabian Reinartz 9ba61df45a Merge pull request #2789 from mtanda/aws_default_region
config: set default region for EC2 SD
2017-06-12 16:15:53 +02:00
Mitsuhiro Tanda 64cef5cd05 handle NewSession() error 2017-06-06 22:02:50 +09:00
Christian Groschupp 8f781e411c Openstack Service Discovery (#2701)
* Add openstack service discovery.

* Add gophercloud code for openstack service discovery.

* first changes for juliusv comments.

* add gophercloud code for floatingip.

* Add tests to openstack sd.

* Add testify suite vendor files.

* add copyright and make changes for code climate.

* Fixed typos in provider openstack.

* Renamed tenant to project in openstack sd.

* Change type of password to Secret in openstack sd.
2017-06-01 23:49:02 +02:00
Roman Vynar dbe2eb2afc Hide consul token on UI. (#2797) 2017-06-01 22:14:23 +01:00
Mitsuhiro Tanda a1ddab463e config: set default region for EC2 SD 2017-06-02 01:40:24 +09:00
Tobias Schmidt 287ec6e6cc Fix outdated target_group naming in error message
The target_groups config has been renamed to static_configs, the error
message for overflow attributes should reflect that.
2017-05-31 11:01:13 +02:00
Julius Volz 240bb671e2 config: Fix overflow checking in global config (#2783) 2017-05-30 20:58:06 +02:00
Conor Broderick 6766123f93 Replace regex with Secret type and remarshal config to hide secrets (#2775) 2017-05-29 12:46:23 +01:00
Brian Akins 27d66628a1 Allow limiting Kubernetes service discover to certain namespaces
Allow namespace discovery to be more easily extended in the future by using a struct rather than just a list.

Rename fields for kubernetes namespace discovery
2017-04-27 07:41:36 -04:00
Julius Volz eb14678a25 Make remote read/write use config.HTTPClientConfig 2017-03-20 13:37:50 +01:00
Julius Volz 02395a224d [WIP] Remote Read 2017-03-20 13:13:44 +01:00
Julius Volz 525da88c35 Merge pull request #2479 from YKlausz/consul-tls
Adding consul capability to connect via tls
2017-03-20 11:40:18 +01:00
Goutham Veeramachaneni 5c89cec65c Stricter Relabel Config Checking for Labeldrop/keep (#2510)
* Minor code cleanup

* Labeldrop/Labelkeep Now *Only* Support Regex

Ref promtheus/prometheus#2368
2017-03-18 22:32:08 +01:00
yklausz 75880b594f Adding consul capability to connect via tls 2017-03-17 22:37:18 +01:00
Michael Kraus 04eadf6e20 Allow Marathon SD without bearer_token and bearer_token_file 2017-03-02 13:17:19 +01:00
Michael Kraus 47bdcf0f67 Allow the use of bearer_token or bearer_token_file for MarathonSD 2017-03-02 09:44:20 +01:00
Julius Volz e9476b35d5 Re-add multiple remote writers
Each remote write endpoint gets its own set of relabeling rules.

This is based on the (yet-to-be-merged)
https://github.com/prometheus/prometheus/pull/2419, which removes legacy
remote write implementations.
2017-02-20 13:23:12 +01:00
Alex Somesan b22eb65d0f Cleaner separation between ServiceAccount and custom authentication in K8S SD (#2348)
* Canonical usage of cluster service-account in K8S SD

* Early validation for opt-in custom auth in K8S SD

* Fix typo in condition
2017-01-19 10:52:52 +01:00
Fabian Reinartz 7eb849e6a8 Merge pull request #2307 from joyent/triton_discovery
Add Joyent Triton discovery
2017-01-18 05:08:11 +01:00
Richard Kiene f3d9692d09 Add Joyent Triton discovery 2017-01-17 20:34:32 +00:00
Björn Rabenstein ad40d0abbc Merge pull request #2288 from prometheus/limit-scrape
Add ability to limit scrape samples, and related metrics
2017-01-08 01:34:06 +01:00
Brian Brazil 30448286c7 Add sample_limit to scrape config.
This imposes a hard limit on the number of samples ingested from the
target. This is counted after metric relabelling, to allow dropping of
problemtic metrics.

This is intended as a very blunt tool to prevent overload due to
misbehaving targets that suddenly jump in sample count (e.g. adding
a label containing email addresses).

Add metric to track how often this happens.

Fixes #2137
2016-12-16 15:10:09 +00:00
Tristan Colgate-McFarlane 4d9134e6d8 Add labeldrop and labelkeep actions. (#2279)
Introduce two new relabel actions. labeldrop, and labelkeep.
These can be used to filter the set of labels by matching regex

- labeldrop: drops all labels that match the regex
- labelkeep: drops all labels that do not match the regex
2016-12-14 10:17:42 +00:00
Fabian Reinartz cc35104504 config: fix naming and typo 2016-11-25 11:04:33 +01:00
Fabian Reinartz 3fb4d1191b config: rename AlertingConfig, resolve file paths 2016-11-24 15:19:37 +01:00
Fabian Reinartz 183c5749b9 config: add Alertmanager configuration 2016-11-23 18:23:37 +01:00
Fabian Reinartz 200bbe1bad config: extract SD and HTTPClient configurations 2016-11-23 18:23:37 +01:00
Fabian Reinartz ec66082749 Merge branch 'ec2_sd_profile_support' of https://github.com/Ticketmaster/prometheus into Ticketmaster-ec2_sd_profile_support 2016-11-21 11:49:23 +01:00
Kraig Amador bec6870ed4 ec2_sd_configs: Support profiles for configuring the ec2 service 2016-11-03 08:38:02 -07:00
beorn7 b2f28a9e82 Merge branch 'release-1.2' 2016-11-03 14:42:15 +01:00
Brian Brazil d1ece12c70 Handle null Regex in the config as the empty regex. (#2150) 2016-11-03 13:34:15 +00:00
bekbulatov 2bc12fa2fb Set timeout for marathon_sd 2016-10-24 11:27:08 +01:00
bekbulatov c689b35858 Merge branch 'master' into marathon_tls 2016-10-24 10:37:32 +01:00
Matti Savolainen aabf4a419b use LabelNam.IsValid() instead of LabelNameRE and MatchString instead of Match 2016-10-19 16:30:52 +03:00
Matti Savolainen f867c1fd58 formating and text fixes, adjust regexp 2016-10-19 13:31:55 +03:00
Matti Savolainen 56907ba6e3 Add interpolation to good test config. Fix regex 2016-10-19 01:19:19 +03:00
Matti Savolainen 7a36af1c85 add comment about interpolation 2016-10-19 00:42:49 +03:00
Matti Savolainen 3b8e7c1277 Merge branch 'master' of https://github.com/prometheus/prometheus into bug/target_label_unmarshal 2016-10-19 00:33:52 +03:00
Matti Savolainen 5a1e909b5d Make TargetLabel in RelabelConfig a string 2016-10-19 00:33:22 +03:00
Björn Rabenstein d93f73874f Merge pull request #2093 from dominikschulz/spelling
Trivial spelling corrections
2016-10-18 22:46:03 +02:00
Dominik Schulz 182e17958a Trivial spelling corrections and a small comment. 2016-10-18 20:14:38 +02:00
bekbulatov ac702f66eb Resolve merge conflicts 2016-10-18 14:14:24 +01:00
Fabian Reinartz 1b6dfa32a9 config: rename role 'endpoint' to 'endpoints' 2016-10-17 11:53:49 +02:00
Fabian Reinartz b24602f713 kubernetes: merge back into single configuration 2016-10-17 10:32:10 +02:00
Fabian Reinartz 2331701b50 kubernetes: Add K8S v2 pod discovery
This adds plumbing for a parallel version of the new K8S SD
and adds pod discovery as the first role.
2016-10-17 10:32:10 +02:00
Dominik Schulz 72cbf8af6f Fix small copy and paste error 2016-10-08 08:49:00 +02:00
bekbulatov 01b53c1180 Add tls support 2016-10-07 13:40:22 +01:00
Brian Brazil 77605649a9 Add support for remote write relabelling.
Switch back to a single remote writer, as we were only ever meant to
have one and the relabel semantics are clearer that way.
2016-10-05 07:43:19 +01:00
Tom Wilkie 4520e12440 Add HTTP Basic Auth & TLS support to the generic write path. (#1957)
* Add config, HTTP Basic Auth and TLS support to the generic write path.

- Move generic write path configuration to the config file
- Factor out config.TLSConfig -> tlf.Config translation
- Support TLSConfig for generic remote storage
- Rename Run to Start, and make it non-blocking.
- Dedupe code in httputil for TLS config.
- Make remote queue metrics global.
2016-09-19 22:47:51 +02:00
Tobias Schmidt 874cb44bb6 Merge pull request #1996 from ton31337/Fix/allow_numbers_as_first_letter
Allow number to be the first letter as well for `job_name`
2016-09-16 11:08:52 -04:00
Donatas Abraitis 1aa8898b66 Allow number to be the first letter as well for job_name 2016-09-16 14:06:47 +03:00
Ingo Gottwald 3b546d061f Add support for GCE discovery 2016-09-16 08:55:33 +02:00
Alexey Miroshkin e29d9394e5 Forbid invalid relabel configurations
This fix adds check if target_label value is set in case if action is replace or
hashmod
Issue [#1900]
2016-08-29 16:56:06 +02:00
Fabian Reinartz be596f82b4 Merge pull request #1783 from knyar/json
Allow URLs in targets defined via a JSON file
2016-08-10 09:42:17 +02:00
Frederic Branczyk b655aa002f introduce top level alerting config node 2016-08-09 14:19:25 +02:00
Frederic Branczyk 679d225c8d allow relabeling of alerts
in case of dropping don't even enqueue them
2016-08-09 14:18:31 +02:00
Fabian Reinartz 7a0b3af0b7 config: validate Kubernetes role correctly. 2016-07-18 22:24:41 +09:00
Fabian Reinartz 919558f601 config: remove deprecated target_groups configuration 2016-07-14 09:55:00 +09:00