* Track remote write queues via a map so we don't care about index.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Support a job name for remote write/read so we can differentiate between
them using the name.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Remote write/read has Name to not confuse the meaning of the field with
scrape job names.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Split queue/client label into remote_name and url labels.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Don't allow for duplicate remote write/read configs.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Ensure we restart remote write queues if the hash of their config has
not changed, but the remote name has changed.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Include name in remote read/write config hashes, simplify duplicates
check, update test accordingly.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
This adds support for basic authentication which closes#3090
The support for specifying the client timeout was removed as discussed in https://github.com/prometheus/common/pull/123. Marathon was the only sd mechanism doing this and configuring the timeout is done through `Context`.
DC/OS uses a custom `Authorization` header for authenticating. This adds 2 new configuration properties to reflect this.
Existing configuration files that use the bearer token will no longer work. More work is required to make this backwards compatible.
* consul: improve consul service discovery
Related to #3711
- Add the ability to filter by tag and node-meta in an efficient way (`/catalog/services`
allow filtering by node-meta, and returns a `map[string]string` or `service`->`tags`).
Tags and nore-meta are also used in `/catalog/service` requests.
- Do not require a call to the catalog if services are specified by name. This is important
because on large cluster `/catalog/services` changes all the time.
- Add `allow_stale` configuration option to do stale reads. Non-stale
reads can be costly, even more when you are doing them to a remote
datacenter with 10k+ targets over WAN (which is common for federation).
- Add `refresh_interval` to minimize the strain on the catalog and on the
service endpoint. This is needed because of that kind of behavior from
consul: https://github.com/hashicorp/consul/issues/3712 and because a catalog
on a large cluster would basically change *all* the time. No need to discover
targets in 1sec if we scrape them every minute.
- Added plenty of unit tests.
Benchmarks
----------
```yaml
scrape_configs:
- job_name: prometheus
scrape_interval: 60s
static_configs:
- targets: ["127.0.0.1:9090"]
- job_name: "observability-by-tag"
scrape_interval: "60s"
metrics_path: "/metrics"
consul_sd_configs:
- server: consul.service.par.consul.prod.crto.in:8500
tag: marathon-user-observability # Used in After
refresh_interval: 30s # Used in After+delay
relabel_configs:
- source_labels: [__meta_consul_tags]
regex: ^(.*,)?marathon-user-observability(,.*)?$
action: keep
- job_name: "observability-by-name"
scrape_interval: "60s"
metrics_path: "/metrics"
consul_sd_configs:
- server: consul.service.par.consul.prod.crto.in:8500
services:
- observability-cerebro
- observability-portal-web
- job_name: "fake-fake-fake"
scrape_interval: "15s"
metrics_path: "/metrics"
consul_sd_configs:
- server: consul.service.par.consul.prod.crto.in:8500
services:
- fake-fake-fake
```
Note: tested with ~1200 services, ~5000 nodes.
| Resource | Empty | Before | After | After + delay |
| -------- |:-----:|:------:|:-----:|:-------------:|
|/service-discovery size|5K|85MiB|27k|27k|27k|
|`go_memstats_heap_objects`|100k|1M|120k|110k|
|`go_memstats_heap_alloc_bytes`|24MB|150MB|28MB|27MB|
|`rate(go_memstats_alloc_bytes_total[5m])`|0.2MB/s|28MB/s|2MB/s|0.3MB/s|
|`rate(process_cpu_seconds_total[5m])`|0.1%|15%|2%|0.01%|
|`process_open_fds`|16|*1236*|22|22|
|`rate(prometheus_sd_consul_rpc_duration_seconds_count{call="services"}[5m])`|~0|1|1|*0.03*|
|`rate(prometheus_sd_consul_rpc_duration_seconds_count{call="service"}[5m])`|0.1|*80*|0.5|0.5|
|`prometheus_target_sync_length_seconds{quantile="0.9",scrape_job="observability-by-tag"}`|N/A|200ms|0.2ms|0.2ms|
|Network bandwidth|~10kbps|~2.8Mbps|~1.6Mbps|~10kbps|
Filtering by tag using relabel_configs uses **100kiB and 23kiB/s per service per job** and quite a lot of CPU. Also sends and additional *1Mbps* of traffic to consul.
Being a little bit smarter about this reduces the overhead quite a lot.
Limiting the number of `/catalog/services` queries per second almost removes the overhead of service discovery.
* consul: tweak `refresh_interval` behavior
`refresh_interval` now does what is advertised in the documentation,
there won't be more that one update per `refresh_interval`. It now
defaults to 30s (which was also the current waitTime in the consul query).
This also make sure we don't wait another 30s if we already waited 29s
in the blocking call by substracting the number of elapsed seconds.
Hopefully this will do what people expect it does and will be safer
for existing consul infrastructures.
For special remote read endpoints which have only data for specific
queries, it is desired to limit the number of queries sent to the
configured remote read endpoint to reduce latency and performance
overhead.
Currently all read queries are simply pushed to remote read clients.
This is fine, except for remote storage for wich it unefficient and
make query slower even if remote read is unnecessary.
So we need instead to compare the oldest timestamp in primary/local
storage with the query range lower boundary. If the oldest timestamp
is older than the mint parameter, then there is no need for remote read.
This is an optionnal behavior per remote read client.
Signed-off-by: Thibault Chataigner <t.chataigner@criteo.com>
Fixing the config/config_test, the discovery/file/file_test and the
promql/promql_test tests for Windows. For most of the tests, the fix involved
correct handling of path separators. In the case of the promql tests, the
issue was related to the removal of the temporal directories used by the
storage. The issue is that the RemoveAll() call returns an error when it
tries to remove a directory which is not empty, which seems to be true due to
some kind of process that is still running after closing the storage. To fix
it I added some retries to the remove of the temporal directories.
Adding tags file from Universal Ctags to .gitignore
Allow namespace discovery to be more easily extended in the future by using a struct rather than just a list.
Rename fields for kubernetes namespace discovery
Each remote write endpoint gets its own set of relabeling rules.
This is based on the (yet-to-be-merged)
https://github.com/prometheus/prometheus/pull/2419, which removes legacy
remote write implementations.
This imposes a hard limit on the number of samples ingested from the
target. This is counted after metric relabelling, to allow dropping of
problemtic metrics.
This is intended as a very blunt tool to prevent overload due to
misbehaving targets that suddenly jump in sample count (e.g. adding
a label containing email addresses).
Add metric to track how often this happens.
Fixes#2137
Introduce two new relabel actions. labeldrop, and labelkeep.
These can be used to filter the set of labels by matching regex
- labeldrop: drops all labels that match the regex
- labelkeep: drops all labels that do not match the regex
This adds `role` field to the Kubernetes SD config, which indicates
which type of Kubernetes SD should be run.
This no longer allows discovering pods and nodes with the same SD
configuration for example.
This change deprecates the `target_groups` option in favor
of `static_configs`. The old configuration is still accepted
but prints a warning.
Configuration loading errors if both options are set.