One design question that was open for me in the beginning was whether to
serialize other types to disk, but Protocol Buffers quickly won out,
which allows us to drop support for other types. This is a good
start to cleaning up a lot of cruft in the storage stack and
can let us eventually decouple the various moving parts into
separate subsystems for easier reasoning.
This commit is not strictly required, but it is a start to making
the rest a lot more enjoyable to interact with.
This adds timers around several query-relevant code blocks. For now, the
query timer stats are only logged for queries initiated through the UI.
In other cases (rule evaluations), the stats are simply thrown away.
My hope is that this helps us understand where queries spend time,
especially in cases where they sometimes hang for unusual amounts of
time.
In order to help corroborate whether a Prometheus instance has
flapped until meta-monitoring is in place, we ought to provide the
instance's start time in the console to aid in diagnostics.
This commit simplifies the way that compactions across a database's
keyspace occur, based on a reading of the LevelDB internals. Secondarily,
it introduces the database size estimation mechanisms.
Include database health and help interfaces.
Add database statistics; remove status goroutines.
This commit kills the use of goroutines to expose status throughout
the web components of Prometheus. It also dumps raw LevelDB status
on a separate /databases endpoint.
This commit introduces three background compactors, which compact
sparse samples together.
1. Samples older than five minutes are grouped together into chunks of 50
every 30 minutes.
2. Samples older than 60 minutes are grouped together into chunks of 250
every 50 minutes.
3. Samples older than one day are grouped together into chunks of 5000
every 70 minutes.
Unfortunately, ``cp`` on Darwin regards some flags as positional and
requires them to be in a specific place. As a result, the new Protocol
Buffer descriptor bundling fails on Mac OS.
The Protocol Buffer compiler supports generating a machine-readable
descriptor file encoded as a provided Protocol Buffer message type,
which can be used to decode messages that have been encoded with it
after-the-fact. The generated descriptor also bundles in dependent
message types.
We can use this to perform forensics on old Prometheus clients, if
necessary.
Go's time.Time represents time as UTC in its fundamental data type.
That said, ``time.Unix(...)`` sets the zone for the time
representation to the local zone. Unfortunately, for diagnosis and in
our tests, it is a PITA to jump between various zones, even though
the serialized version remains the same.
To keep things easy, all places where times are generated or read
are converted into UTC. These conversions are cheap, for
``Time.In`` merely changes a pointer reference in the struct,
nothing more. This enables me to diagnose test failures with fixture
data very easily.
By setting Access-Control headers, the Prometheus metrics API can be
accessed by cross-origin JavaScript applications (e.g., an external
dashboard pulling Prometheus metrics).
To achieve that, this PR:
- converts static/index.html ("console") and graph to templates
- moves the Handlebars template to a separate file to avoid escaping issues
Route changes:
/status -> /
/static -> /console
/static/graph.html -> /graph
The curator doesn't do anything yet; rather, this is the type
definition including the ancillary testing scaffold.
Improve Makefile and Git developer experience.
The top-level Makefile was a bit overloaded in terms of generation of
assets and their management. This has been offloaded into separate
Makefiles.
The Git developer experience sucked due to lack of .gitignore
policies.
Also: Fix faulty skiplist naming from old merge.
- utility/embed-static.sh gets called in the Makefile to create a Go map from files
- web/blob/blob.go implements an HTTP handler for serving the files from the map
- web/status.go uses blob.GetFile() to get the template file
The assets are gzipped and decompressed on demand.
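A minimal sketch of the idea, assuming the generated map goes from file name to gzipped bytes. The files variable, mustCompress, and the getFile signature are hypothetical; the real blob.GetFile() may differ.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// files stands in for the map generated at build time.
// Hypothetical name and layout: file name -> gzipped contents.
var files = map[string][]byte{}

// mustCompress gzips data; in the real build this happens in the script.
func mustCompress(data []byte) []byte {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		panic(err)
	}
	if err := zw.Close(); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

// getFile looks up an embedded asset and decompresses it on demand.
func getFile(name string) ([]byte, error) {
	blob, ok := files[name]
	if !ok {
		return nil, fmt.Errorf("no such file: %q", name)
	}
	zr, err := gzip.NewReader(bytes.NewReader(blob))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

func main() {
	files["index.html"] = mustCompress([]byte("<html>status</html>"))

	body, err := getFile("index.html")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```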
This roughly comprises the following changes:
- index target pools by job instead of scrape interval
- make targets within a pool exchangeable while preserving existing
health state for targets
- allow exchanging targets via HTTP API (PUT)
- show target lists in /status (experimental, for own debug use)
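The first two points can be sketched as a map of pools keyed by job name, where replacing a pool's targets reuses the existing entry for any address that is still present. All type, field, and method names here are hypothetical, and the HTTP side of the change is left out.

```go
package main

import "fmt"

// target is a scrape target with its last known health.
// Hypothetical type for this sketch.
type target struct {
	address string
	healthy bool
}

// poolManager indexes target pools by job name: job -> address -> target.
type poolManager struct {
	pools map[string]map[string]*target
}

func newPoolManager() *poolManager {
	return &poolManager{pools: make(map[string]map[string]*target)}
}

// replaceTargets swaps the target set of a job. Targets whose address is
// already in the pool are kept as-is, so their health state survives;
// new addresses start with a zero-valued health state.
func (m *poolManager) replaceTargets(job string, addresses []string) {
	old := m.pools[job] // a nil map is fine to read from
	next := make(map[string]*target, len(addresses))
	for _, addr := range addresses {
		if t, ok := old[addr]; ok {
			next[addr] = t
		} else {
			next[addr] = &target{address: addr}
		}
	}
	m.pools[job] = next
}

func main() {
	m := newPoolManager()
	m.replaceTargets("api", []string{"a:9090", "b:9090"})
	m.pools["api"]["a:9090"].healthy = true

	// b is dropped, c is added, a keeps its health state.
	m.replaceTargets("api", []string{"a:9090", "c:9090"})
	for addr, t := range m.pools["api"] {
		fmt.Println(addr, t.healthy)
	}
}
```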