Picks up the fix for https://github.com/go-yaml/yaml/issues/665 -- we
picked up the important fix for CVE-2022-28948 already.
This only affects go-yaml *v3*; the only user of v3 in Prometheus itself
is rulefmt so the impact seems limited.
Signed-off-by: David Leadbeater <dgl@dgl.cx>
The minimum go version was bumped to 1.17 in
29b58448e1, but the main README still
referenced go 1.16 as the minimum version required. This updates that.
I took a quick look through the other docs in the repo (ie, I did some
naive grepping), and this is the only reference I spotted.
Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
* Rename walDir parameter to dir
Signed-off-by: Matej Gera <matejgera@gmail.com>
* Improve NewQueueManager comment
Signed-off-by: Matej Gera <matejgera@gmail.com>
* Check syntax of example configurations
Fix a mistake in the hetzner and vultr configs.
Also it's easier not to fight the build system, and this will lint
example code, so ignore a lint issue in custom-sd.
Signed-off-by: David Leadbeater <dgl@dgl.cx>
* No need to import Makefile.common, it just complicates things
Signed-off-by: David Leadbeater <dgl@dgl.cx>
This commit fixes a typo when reporting an error that the the symbols
table size has been exceeded.
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
Update the release documentation about dependencies.
* We now have dependabot to auto-update things.
* Add note about some manual dependency work.
Signed-off-by: SuperQ <superq@gmail.com>
This moves prometheus_ready to the web package and links it with the ready variable that decides if HTTP requests should return 200 or 503.
This is a follow up change from #10682
Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
* Set Go minimum version to 1.17.
* Update go.mod format for 1.17.
* Remove unecessary exclude block for k8s.io/client-go.
* Remove unecessary retract section.
Signed-off-by: SuperQ <superq@gmail.com>
When Prometheus starts it can take a long time before WAL is replayed and it can do anything useful. While it's starting it exposes metrics and other Prometheus servers can scrape it.
We do have alerts that fire if any Prometheus server is not ingesting samples and so far we've been dealing with instances that are starting for a long time by adding a check on Prometheus process uptime. Relying on uptime isn't ideal because the time needed to start depends on the number of metrics scraped, and so on the amount of data in WAL.
To help write better alerts it would be great if Prometheus exposed a metric that tells us it's fully started, that way any alert that suppose to notify us about any runtime issue can filter out starting instances.
Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
* promtool: support matchers when querying label values
Signed-off-by: Ben Ye <ben.ye@bytedance.com>
* address review comment
Signed-off-by: Ben Ye <ben.ye@bytedance.com>
During shutdown TSDB is stopped before rule manager is stopped. Since TSDB shutdown can take a long time (minutes or 10s of minutes) it keeps rule manager running while parts of Prometheus are already stopped (most notebly scrape manager). This can cause false positive alerts to fire, mostly those that rely on absent() calls since new sample appends will stop while alert queries are still evaluated.
Stop rules before stopping TSDB and scrape manager to avoid this problem.
Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
Since Prometheus documentation is versioned, do not write down that a
specific function was added in Prom 2.0, for consistency.
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
We know the max size of our map so we can create it with that information and avoid extra allocations
Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
* Provide a callout about the vector matching keywords and the group
modifiers.
Relates prometheus/docs#2106
Signed-off-by: Harold Dost <h.dost@criteo.com>
This commit introduces a new metric to count the number of failed
requests to Linode's API when using Linode SD. Resolves#10672, inspired
by #10476.
_Note_: this doens't count failures when polling the `/account/events`
endpoint, as a `401` there is how we determine if the supplied token has
the needed API scopes to do event polling vs full refreshes each
interval.
Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
On macOS, the TestTombstoneCleanRetentionLimitsRace performs very
poorly. It takes more than a second to write out one block, and as it
writes 400 of them, we run into the 10-minute test timeout frequently.
While this doesn't fix the actual performance issue, breaking each
iteration into a subtest makes the test pass reliably (because each
iteration comfortably finishes in under a minute).
Related report: https://groups.google.com/g/prometheus-developers/c/jxQ6Ayg6VJ4/m/03H_DS9PDAAJ
Signed-off-by: Matthias Rampke <matthias@prometheus.io>