This PR adds the `/status/config` endpoint, which exposes the currently
loaded Prometheus config. This is the same config that is displayed in
YAML format on `/config` in the UI. The response payload looks like
this:
```
{
  "status": "success",
  "data": {
    "yaml": <CONFIG>
  }
}
```
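As a rough client-side illustration, the payload above could be unpacked as follows. This is a minimal sketch: it assumes `<CONFIG>` is delivered as a JSON string holding the YAML text, and the helper name `extract_config_yaml` and the sample body are hypothetical.

```python
import json


def extract_config_yaml(body: str) -> str:
    """Return the YAML config text from a /status/config response body.

    Assumes the "yaml" field is a JSON string; the helper name is
    hypothetical and not part of Prometheus.
    """
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError("unexpected status: %r" % payload.get("status"))
    return payload["data"]["yaml"]


# Hypothetical response body, shaped like the payload shown above.
sample_body = json.dumps({
    "status": "success",
    "data": {"yaml": "global:\n  scrape_interval: 15s\n"},
})
config_text = extract_config_yaml(sample_body)
```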
Squaring the timestamp runs into the limits of the 53-bit mantissa of a
64-bit float.
Subtracting the timestamp of one of the samples (which is how the
intercept is used) avoids this issue in practice, as it is unlikely
to be applied over a very long time range.
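The effect can be sketched with a small least-squares fit. This is an illustration in Python, not the code from the patch: the function name `linear_regression`, the parameter `intercept_time`, and the sample data are assumptions made for the example.

```python
def linear_regression(samples, intercept_time):
    """Least-squares fit over (timestamp, value) pairs.

    Timestamps are shifted by intercept_time before being squared, so the
    squared terms stay far below 2**53 for realistic time ranges.
    Returns (slope, value of the fitted line at intercept_time).
    """
    n = float(len(samples))
    sum_x = sum_y = sum_xy = sum_x2 = 0.0
    for t, v in samples:
        x = t - intercept_time  # small offset instead of a raw Unix timestamp
        sum_x += x
        sum_y += v
        sum_xy += x * v
        sum_x2 += x * x
    cov_xy = sum_xy - sum_x * sum_y / n
    var_x = sum_x2 - sum_x * sum_x / n
    slope = cov_xy / var_x
    intercept = sum_y / n - slope * sum_x / n
    return slope, intercept


# Above 2**53 a float64 can no longer represent every integer.
precision_lost = float(2**53) + 1.0 == float(2**53)

# A Unix timestamp in milliseconds, squared, is far beyond that limit.
squared_raw_timestamp = 1500000000000.0 ** 2

# Hypothetical series: one sample every 15s, growing by 2 per second.
T0 = 1500000000.0
samples = [(T0 + 15.0 * i, 30.0 * i + 5.0) for i in range(10)]
slope, intercept = linear_regression(samples, T0)
```

With the shift in place, the largest squared term in this example is `135.0 ** 2`, which a 64-bit float represents exactly.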
Fixes #2674
We would overscan when hitting a value directly, interspersed with
samples in between timestamps. Apparently, that happens rarely enough
that it was only noticed recently.
Kubernetes 1.7+ no longer exposes cAdvisor metrics on the Kubelet
metrics endpoint. Update the example configuration to scrape cAdvisor
in addition to Kubelet. The provided configuration works for 1.7.3+
and commented notes are given for 1.7.2 and earlier versions.
Also remove the comment about node (Kubelet) CA not matching the master
CA. Since the example no longer connects directly to the nodes, it
doesn't matter what CA they're using.
References:
- https://github.com/kubernetes/kubernetes/issues/48483
- https://github.com/kubernetes/kubernetes/pull/49079
This can happen in the situation where the system scales up the number of shards massively (to deal with some backlog), then scales it down again as the number of samples sent during the time period is less than the number received.
Fix the config/config_test, discovery/file/file_test and
promql/promql_test tests on Windows. For most of the tests, the fix involved
handling path separators correctly. In the case of the promql tests, the
issue was related to the removal of the temporary directories used by the
storage. The RemoveAll() call returns an error when it tries to remove a
directory that is not empty, which seems to happen because some process is
still running after the storage is closed. To fix it, I added retries to
the removal of the temporary directories.
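The retry idea can be sketched as below. This is a Python illustration of the general pattern, not the Go code in the patch: the helper name `remove_all_with_retries`, the attempt count, and the delay are assumptions.

```python
import os
import shutil
import tempfile
import time


def remove_all_with_retries(path, attempts=5, delay=0.1):
    """Remove a directory tree, retrying if the removal fails.

    A failure can occur when another process still holds files in the
    directory. The attempt count and delay are arbitrary example values.
    """
    for attempt in range(attempts):
        try:
            shutil.rmtree(path)
            return
        except FileNotFoundError:
            return  # already gone, nothing left to do
        except OSError:
            if attempt == attempts - 1:
                raise  # give up after the last attempt
            time.sleep(delay)


# Usage: create a scratch directory with one file, then remove it.
scratch = tempfile.mkdtemp()
with open(os.path.join(scratch, "chunk"), "w") as f:
    f.write("data")
remove_all_with_retries(scratch)
```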
Adding tags file from Universal Ctags to .gitignore
The changes [1][] to Marathon service discovery to support multiple
ports mean that Prometheus now attempts to scrape all ports belonging to
a Marathon service.
You can use port definition or port mapping labels to filter out which
ports to scrape but that requires service owners to update their
Marathon configuration.
To allow for a smoother migration path, add a
`__meta_marathon_port_index` label, whose value is set to the port's
sequential index integer. For example, PORT0 has the value `0`, PORT1
has the value `1`, and so on.
This allows you to scrape both the first available port (the
previous behaviour) and ports with a `metrics` label.
For example, here's the relabel configuration we might use with
this patch:
    - action: keep
      source_labels: ['__meta_marathon_port_definition_label_metrics', '__meta_marathon_port_mapping_label_metrics', '__meta_marathon_port_index']
      # Keep if port mapping or definition has a 'metrics' label with any
      # non-empty value, or if no 'metrics' port label exists but this is the
      # service's first available port
      regex: ([^;]+;;[^;]+|;[^;]+;[^;]+|;;0)
This assumes that the Marathon API returns the ports in sorted order
(matching PORT0, PORT1, etc), which it appears to do.
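To check how the `keep` regex above behaves, the matching can be simulated outside Prometheus. This sketch relies on the relabelling behaviour that source label values are joined with `;` (the default separator) and that the regex must match the whole joined string; the function name `keeps_target` is hypothetical.

```python
import re

# The regex from the relabel configuration; fullmatch() mimics the
# anchoring that relabelling applies.
KEEP_REGEX = re.compile(r"([^;]+;;[^;]+|;[^;]+;[^;]+|;;0)")


def keeps_target(definition_metrics, mapping_metrics, port_index):
    """Return True if a target with these label values would be kept.

    The three arguments stand for the values of the three source labels,
    in the order they are listed in the configuration. A missing label is
    passed as an empty string.
    """
    joined = ";".join([definition_metrics, mapping_metrics, port_index])
    return KEEP_REGEX.fullmatch(joined) is not None


# Port definition carries a 'metrics' label: kept regardless of index.
kept_by_definition = keeps_target("true", "", "1")
# Port mapping carries a 'metrics' label: kept regardless of index.
kept_by_mapping = keeps_target("", "true", "2")
# No 'metrics' label, first port: kept (the previous behaviour).
kept_first_port = keeps_target("", "", "0")
# No 'metrics' label, second port: dropped.
kept_second_port = keeps_target("", "", "1")
```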
[1]: https://github.com/prometheus/prometheus/pull/2506