* adding information about the health and errors for Rules
adding Health() and LastError() to the Rule interface. This will allow
us to easily surface information about rules.
Signed-off-by: noqcks <benny@noqcks.io>
* updating rules.html with fields for Rule errors and health state
Signed-off-by: noqcks <benny@noqcks.io>
* fix code comment grammar & access Rule health/error info using a mutex
Signed-off-by: noqcks <benny@noqcks.io>
* s/Errors/Error/ in rules.html to remain consistent with targets.html
Signed-off-by: noqcks <benny@noqcks.io>
* adding periods to code comments in reporting/alerting
Signed-off-by: noqcks <benny@noqcks.io>
* putting health/error below mutex in struct field
Signed-off-by: noqcks <benny@noqcks.io>
It was added 5 years ago by Matt and I'm not sure anyone ever used
it after public release (since we have /debug/pprof/heap as well).
It also lacked error checking and allows people to write to disk over HTTP.
Signed-off-by: Julius Volz <julius.volz@gmail.com>
Displaying all the dropped targets in the service-discovery page hurts
the Prometheus server as well as the browser when thousands of dropped
targets exist. This change limits this number to 1,000 and display the
number of active/total targets per scrape configuration.
Add warning when more than 100 targets are dropped
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Fix race by properly locking access to scrape pools. Use separate mutex for information needed by UI so that UI isn't blocked when targets are being updated.
* web: replace deprecated InstrumentHandler()
This change replaces the deprecated InstrumentHandler function by the
equivalent functions from the promhttp package.
The following metrics are removed:
* http_request_duration_microseconds (Summary).
* http_request_size_bytes (Summary).
* http_requests_total (Counter).
And the following metrics are added instead:
* prometheus_http_request_duration_seconds (Histogram).
* prometheus_http_response_size_bytes (Histogram).
* promhttp_metric_handler_requests_in_flight (Gauge).
* promhttp_metric_handler_requests_total (Counter).
* Update github.com/prometheus/common/route package
* web: refactor using the new prometheus/common/route package
This adds a parameter to the storage selection interface which allows
query engine(s) to pass information about the operations surrounding a
data selection.
This can for example be used by remote storage backends to infer the
correct downsampling aggregates that need to be provided.
net.Listener converts 0.0.0.0 to :: which fails for hosts where IPv6 is
disabled. This change uses the original listen address parameter instead
of grpcl.Addr().String().
Whenever a route prefix is applied, the router prepends the prefix to
the URL path on the request. For most handlers, this is not an issue
because the request's path is only used for routing and is not actually
needed by the handler itself. However, Prometheus delegates the handling
of the /debug/* endpoints to the http.DefaultServeMux which has it's own
routing logic that depends on the url.Path. As a result, whenever a
prefix is applied, the prefixed URL is passed to the DefaultServeMux
which has no awareness of the prefix and returns a 404.
This change fixes the issue by creating a new serveDebug handler which
routes requests /debug/* requests to appropriate net/http/pprof handler
and removing the net/http/pprof import in cmd/prometheus since it is no
longer necessary.
Fixes#2183.
This PR adds the `/status/config` endpoint which exposes the currently
loaded Prometheus config. This is the same config that is displayed on
`/config` in the UI in YAML format. The response payload looks like
such:
```
{
"status": "success",
"data": {
"yaml": <CONFIG>
}
}
```
* Use request.Context() instead of a global map of contexts.
* Add some basic opentracing instrumentation on the query path.
* Remove tracehandler endpoint.
retreival.Target contains a mutex. It was copied in the Targets()
call. This potentially can wreak a lot of havoc.
It might even have caused the issues reported as #2266 and #2262 .
Right now the /alerts page of Prometheus sorts alerts by severity
(firing, pending, inactive). Once multiple alerts have the same
severity, their order seems to correlate to how they are placed in the
configuration files, but not always. Looking at the code, we make use of
sort.Sort(), which is documented not to provide a stable sort. The
Less() function also only takes the alert state into account.
This change extends the Less() function to provide a lexicographic order
on both the alert state and the name. This means I can finally find the
alerts I'm looking for without using my browser's search feature.
This is based on https://github.com/prometheus/prometheus/pull/1997.
This adds contexts to the relevant Storage methods and already passes
PromQL's new per-query context into the storage's query methods.
The immediate motivation supporting multi-tenancy in Frankenstein, but
this could also be used by Prometheus's normal local storage to support
cancellations and timeouts at some point.
For Weaveworks' Frankenstein, we need to support multitenancy. In
Frankenstein, we initially solved this without modifying the promql
package at all: we constructed a new promql.Engine for every
query and injected a storage implementation into that engine which would
be primed to only collect data for a given user.
This is problematic to upstream, however. Prometheus assumes that there
is only one engine: the query concurrency gate is part of the engine,
and the engine contains one central cancellable context to shut down all
queries. Also, creating a new engine for every query seems like overkill.
Thus, we want to be able to pass per-query contexts into a single engine.
This change gets rid of the promql.Engine's built-in base context and
allows passing in a per-query context instead. Central cancellation of
all queries is still possible by deriving all passed-in contexts from
one central one, but this is now the responsibility of the caller. The
central query context is now created in main() and passed into the
relevant components (web handler / API, rule manager).
In a next step, the per-query context would have to be passed to the
storage implementation, so that the storage can implement multi-tenancy
or other features based on the contextual information.
This will avoid duplicate MetricFamilies, thereby shrinking the size
of the federation payload and also creating legal text format.
Also, add unit tests for federation. They were also needed for the
previous state of the code, but were missing.
I got feedback from different sources about rules and targets being
too heavy in the status tab if their are lots of them.
This change also allows for more fine-granular locking.
This is with `golint -min_confidence=0.5`.
I left several lint warnings untouched because they were either
incorrect or I felt it was better not to change them at the moment.
This got broken in
78047326b4
since it stopped using the DefaultServeMux.
This approach will defer pprof requests to the DefaultServeMux, which
may or may not have pprof enabled (in Prometheus, it gets it included in
main.go). An alternative approach would be to duplicate the four lines in
https://golang.org/src/net/http/pprof/pprof.go#L62. When choosing that
approach though, we would not automatically gain any new endpoints added
by net/http/pprof or other /debug endpoints in the future.
Besides fixing https://github.com/prometheus/prometheus/issues/805 by
making the entire externally reachable server URL configurable, this
adds tests for the "globalURL" template function and makes it easier to
test other such functions in the future.
This breaks the `web.Hostname` flag (and introduces `web.external-url`).
This flag is likely only used by few users, so I hope that's
justifiable.
Fixes https://github.com/prometheus/prometheus/issues/805
This commit adds a federation handler on /federate. It accepts `match[]`
query parameters containing vector selectors. Their intersection determines
the in-memory metrics that are returned in the same way as the
/metrics endpoint does (modulo sorting).
Version information is determined at build-time and thus there is
no need to pass it down from main. In its own package it can
be used from various other packages.
Main changes:
- Switched to using `go-bindata` in place of `scripts/embed-static.sh`.
- Support for building Prometheus without a `Makefile`.
- Minor typo fix to make Prometheus build on Windows (without Makefiles).
Please note that this does not mean that prometheus will work on Windows.
There are still failing tests!
Previously we redirected any non-existent path to the root (or path
prefix).
The new behavior:
With no path prefix:
- "" -> "/"
- "/biz" -> 404
With path prefix of "/foo/bar":
- "" -> "/foo/bar/"
- "/" -> "/foo/bar/"
- "/foo/bar" -> "/foo/bar/"
- "/biz" -> /foo/bar/biz"
(anything not starting with the path prefix gets the prefix prepended)
- "/foo/bar/biz" -> 404
This change is conceptually very simple, although the diff is large. It
switches logging from "github.com/golang/glog" to
"github.com/prometheus/log", while not actually changing any log
messages. V(1)-style logging has been changed to be log.Debug*().
Appending to the storage can block for a long time. Timing out
scrapes can also cause longer blocks. This commit avoids that those
blocks affect other compnents than the target itself.
Also the Target interface was removed.
The target implementation and interface contain methods only serving a
specific purpose of the templates. They were moved to the template
as they operate on more fundamental target data.
This adds the population standard deviation and
variance as aggregation functions, useful for
spotting how many standard deviations some samples
are from the mean.