* Use request.Context() instead of a global map of contexts.
* Add some basic opentracing instrumentation on the query path.
* Remove tracehandler endpoint.
retreival.Target contains a mutex. It was copied in the Targets()
call. This potentially can wreak a lot of havoc.
It might even have caused the issues reported as #2266 and #2262 .
Right now the /alerts page of Prometheus sorts alerts by severity
(firing, pending, inactive). Once multiple alerts have the same
severity, their order seems to correlate to how they are placed in the
configuration files, but not always. Looking at the code, we make use of
sort.Sort(), which is documented not to provide a stable sort. The
Less() function also only takes the alert state into account.
This change extends the Less() function to provide a lexicographic order
on both the alert state and the name. This means I can finally find the
alerts I'm looking for without using my browser's search feature.
This is based on https://github.com/prometheus/prometheus/pull/1997.
This adds contexts to the relevant Storage methods and already passes
PromQL's new per-query context into the storage's query methods.
The immediate motivation supporting multi-tenancy in Frankenstein, but
this could also be used by Prometheus's normal local storage to support
cancellations and timeouts at some point.
For Weaveworks' Frankenstein, we need to support multitenancy. In
Frankenstein, we initially solved this without modifying the promql
package at all: we constructed a new promql.Engine for every
query and injected a storage implementation into that engine which would
be primed to only collect data for a given user.
This is problematic to upstream, however. Prometheus assumes that there
is only one engine: the query concurrency gate is part of the engine,
and the engine contains one central cancellable context to shut down all
queries. Also, creating a new engine for every query seems like overkill.
Thus, we want to be able to pass per-query contexts into a single engine.
This change gets rid of the promql.Engine's built-in base context and
allows passing in a per-query context instead. Central cancellation of
all queries is still possible by deriving all passed-in contexts from
one central one, but this is now the responsibility of the caller. The
central query context is now created in main() and passed into the
relevant components (web handler / API, rule manager).
In a next step, the per-query context would have to be passed to the
storage implementation, so that the storage can implement multi-tenancy
or other features based on the contextual information.
This will avoid duplicate MetricFamilies, thereby shrinking the size
of the federation payload and also creating legal text format.
Also, add unit tests for federation. They were also needed for the
previous state of the code, but were missing.
I got feedback from different sources about rules and targets being
too heavy in the status tab if their are lots of them.
This change also allows for more fine-granular locking.
This is with `golint -min_confidence=0.5`.
I left several lint warnings untouched because they were either
incorrect or I felt it was better not to change them at the moment.
This got broken in
78047326b4
since it stopped using the DefaultServeMux.
This approach will defer pprof requests to the DefaultServeMux, which
may or may not have pprof enabled (in Prometheus, it gets it included in
main.go). An alternative approach would be to duplicate the four lines in
https://golang.org/src/net/http/pprof/pprof.go#L62. When choosing that
approach though, we would not automatically gain any new endpoints added
by net/http/pprof or other /debug endpoints in the future.
Besides fixing https://github.com/prometheus/prometheus/issues/805 by
making the entire externally reachable server URL configurable, this
adds tests for the "globalURL" template function and makes it easier to
test other such functions in the future.
This breaks the `web.Hostname` flag (and introduces `web.external-url`).
This flag is likely only used by few users, so I hope that's
justifiable.
Fixes https://github.com/prometheus/prometheus/issues/805
This commit adds a federation handler on /federate. It accepts `match[]`
query parameters containing vector selectors. Their intersection determines
the in-memory metrics that are returned in the same way as the
/metrics endpoint does (modulo sorting).
Version information is determined at build-time and thus there is
no need to pass it down from main. In its own package it can
be used from various other packages.
Main changes:
- Switched to using `go-bindata` in place of `scripts/embed-static.sh`.
- Support for building Prometheus without a `Makefile`.
- Minor typo fix to make Prometheus build on Windows (without Makefiles).
Please note that this does not mean that prometheus will work on Windows.
There are still failing tests!
Previously we redirected any non-existent path to the root (or path
prefix).
The new behavior:
With no path prefix:
- "" -> "/"
- "/biz" -> 404
With path prefix of "/foo/bar":
- "" -> "/foo/bar/"
- "/" -> "/foo/bar/"
- "/foo/bar" -> "/foo/bar/"
- "/biz" -> /foo/bar/biz"
(anything not starting with the path prefix gets the prefix prepended)
- "/foo/bar/biz" -> 404
This change is conceptually very simple, although the diff is large. It
switches logging from "github.com/golang/glog" to
"github.com/prometheus/log", while not actually changing any log
messages. V(1)-style logging has been changed to be log.Debug*().
Appending to the storage can block for a long time. Timing out
scrapes can also cause longer blocks. This commit avoids that those
blocks affect other compnents than the target itself.
Also the Target interface was removed.
The target implementation and interface contain methods only serving a
specific purpose of the templates. They were moved to the template
as they operate on more fundamental target data.
This adds the population standard deviation and
variance as aggregation functions, useful for
spotting how many standard deviations some samples
are from the mean.
The one central sample ingestion channel has caused a variety of
trouble. This commit removes it. Targets and rule evaluation call an
Append method directly now. To incorporate multiple storage backends
(like OpenTSDB), storage.Tee forks the Append into two different
appenders.
Note that the tsdb queue manager had its own queue anyway. It was a
queue after a queue... Much queue, so overhead...
Targets have their own little buffer (implemented as a channel) to
avoid stalling during an http scrape. But a new scrape will only be
started once the old one is fully ingested.
The contraption of three pipelined ingesters was removed. A Target is
an ingester itself now. Despite more logic in Target, things should be
less confusing now.
Also, remove lint and vet warnings in ast.go.
- Move CONTRIBUTORS.md to the more common AUTHORS.
- Added the required NOTICE file.
- Changed "Prometheus Team" to "The Prometheus Authors".
- Reverted the erroneous changes to the Apache License.
Essentially:
- Remove unused code.
- Make it 'go vet' clean. The only remaining warnings are in generated code.
- Make it 'golint' clean. The only remaining warnings are in gerenated code.
- Smoothed out same minor things.
Change-Id: I3fe5c1fbead27b0e7a9c247fee2f5a45bc2d42c6
Having metrics with variable timestamps inconsistently
spaced when things fail will make it harder to write correct rules.
Update status page, requires some refactoring to insert a function.
Change-Id: Ie1c586cca53b8f3b318af8c21c418873063738a8
Due to the lack of a </a>, this makes the entire header render badly.
Accordingly it's safe to assume noone is using it, so remove it.
With the new console template support, we'll need to something a bit
more nuanced later.
Change-Id: I3424bed6aea18cbd4c63ad48f98808098dadc3ad
Move rulemanager to it's own package to break cicrular dependency.
Make NewTestTieredStorage available to tests, remove duplication.
Change-Id: I33b321245a44aa727bfc3614a7c9ae5005b34e03
The closing of Prometheus now using a sync.Once wrapper to prevent
any accidental multiple invocations of it, which could trigger
corruption or a race condition. The shutdown process is made more
verbose through logging.
A not-enabled by default web handler has been provided to trigger a
remote shutdown if requested for debugging purposes.
Change-Id: If4fee75196bbff1fb1e4a4ef7e1cfa53fef88f2e
Due to on going issues, we've decided to remove gorest. It started with gorest
not being thread-safe (it does introspection to create a new handler which is
an easy process to mess up with multiple threads of execution):
https://code.google.com/p/gorest/issues/detail?id=15
While the issue has been marked fixed, it looks like the patch has introduced
more problems than the original issue and simply doesn't work properly.
I'm not sure the behaviour was thought through properly. If a new instance is
needed every request then a handler-factory is needed or the library needs to
set expectations about how the new objects should interact with their
constructor state.
While it was tempting to try out another routing library, I think for now
it's better to use dumb vanilla Go routing. At least until we decide which
URL format we intend to standardize on.
Change-Id: Ica3da135d05f8ab8fc206f51eeca4f684f8efa0e
This commit fixes a critique of the old storage API design, whereby
the input parameters were always as raw bytes and never Protocol
Buffer messages that encapsulated the data, meaning every place a
read or mutation was conducted needed to manually perform said
translations on its own. This is taxing.
Change-Id: I4786938d0d207cefb7782bd2bd96a517eead186f
An design question was open for me in the beginning was whether to
serialize other types to disk, but Protocol Buffers quickly won out,
which allows us to drop support for other types. This is a good
start to cleaning up a lot of cruft in the storage stack and
can let us eventually decouple the various moving parts into
separate subsystems for easier reasoning.
This commit is not strictly required, but it is a start to making
the rest a lot more enjoyable to interact with.
This commit simplifies the way that compactions across a database's
keyspace occur due to reading the LevelDB internals. Secondarily it
introduces the database size estimation mechanisms.
Include database health and help interfaces.
Add database statistics; remove status goroutines.
This commit kills the use of Go routines to expose status throughout
the web components of Prometheus. It also dumps raw LevelDB status
on a separate /databases endpoint.
To achieve that, this PR
- converts static/index.html ("console") and graph to templates
- moved the handlebars template to separated file to avoid escaping issues
Route changes:
/status -> /
/static -> /console
/static/graph.html -> /graph
- utility/embed-static.sh, get called in Makefile to create go map from files
- web/blob/blob.go implements http Handle for serving the files from the map
- web/status.go uses blog.GetFile() to get the template file
The assets are gzipped and decompressed on demand.
This roughly comprises the following changes:
- index target pools by job instead of scrape interval
- make targets within a pool exchangable while preserving existing
health state for targets
- allow exchanging targets via HTTP API (PUT)
- show target lists in /status (experimental, for own debug use)