This was initially motivated by wanting to distribute the rule checker
tool under `tools/rule_checker`. However, this was not possible without
also distributing the LevelDB dynamic libraries because the tool
transitively depended on Levigo:
rule checker -> query layer -> tiered storage layer -> leveldb
This change separates external storage interfaces from the
implementation (tiered storage, leveldb storage, memory storage) by
putting them into separate packages:
- storage/metric: public, implementation-agnostic interfaces
- storage/metric/tiered: tiered storage implementation, including memory
and LevelDB storage.
I initially also considered splitting up the implementation into
separate packages for tiered storage, memory storage, and LevelDB
storage, but these are currently so intertwined that it would be another
major project in itself.
The query layers and most other parts of Prometheus now have notion of
the storage implementation anymore and just use whatever implementation
they get passed in via interfaces.
The rule_checker is now a static binary :)
Change-Id: I793bbf631a8648ca31790e7e772ecf9c2b92f7a0
The closing of Prometheus now using a sync.Once wrapper to prevent
any accidental multiple invocations of it, which could trigger
corruption or a race condition. The shutdown process is made more
verbose through logging.
A not-enabled by default web handler has been provided to trigger a
remote shutdown if requested for debugging purposes.
Change-Id: If4fee75196bbff1fb1e4a4ef7e1cfa53fef88f2e
This also fixes the compaction test, which before worked only because
the input sample sorting was accidentally equal to the resulting on-disk
sample sorting.
Change-Id: I2a21c4b46ba562424b27058fc02eba84fa6a6006
So far we've been using Go's native time.Time for anything related to sample
timestamps. Since the range of time.Time is much bigger than what we need, this
has created two problems:
- there could be time.Time values which were out of the range/precision of the
time type that we persist to disk, therefore causing incorrectly ordered keys.
One bug caused by this was:
https://github.com/prometheus/prometheus/issues/367
It would be good to use a timestamp type that's more closely aligned with
what the underlying storage supports.
- sizeof(time.Time) is 192, while Prometheus should be ok with a single 64-bit
Unix timestamp (possibly even a 32-bit one). Since we store samples in large
numbers, this seriously affects memory usage. Furthermore, copying/working
with the data will be faster if it's smaller.
*MEMORY USAGE RESULTS*
Initial memory usage comparisons for a running Prometheus with 1 timeseries and
100,000 samples show roughly a 13% decrease in total (VIRT) memory usage. In my
tests, this advantage for some reason decreased a bit the more samples the
timeseries had (to 5-7% for millions of samples). This I can't fully explain,
but perhaps garbage collection issues were involved.
*WHEN TO USE THE NEW TIMESTAMP TYPE*
The new clientmodel.Timestamp type should be used whenever time
calculations are either directly or indirectly related to sample
timestamps.
For example:
- the timestamp of a sample itself
- all kinds of watermarks
- anything that may become or is compared to a sample timestamp (like the timestamp
passed into Target.Scrape()).
When to still use time.Time:
- for measuring durations/times not related to sample timestamps, like duration
telemetry exporting, timers that indicate how frequently to execute some
action, etc.
*NOTE ON OPERATOR OPTIMIZATION TESTS*
We don't use operator optimization code anymore, but it still lives in
the code as dead code. It still has tests, but I couldn't get all of them to
pass with the new timestamp format. I commented out the failing cases for now,
but we should probably remove the dead code soon. I just didn't want to do that
in the same change as this.
Change-Id: I821787414b0debe85c9fffaeb57abd453727af0f
Due to on going issues, we've decided to remove gorest. It started with gorest
not being thread-safe (it does introspection to create a new handler which is
an easy process to mess up with multiple threads of execution):
https://code.google.com/p/gorest/issues/detail?id=15
While the issue has been marked fixed, it looks like the patch has introduced
more problems than the original issue and simply doesn't work properly.
I'm not sure the behaviour was thought through properly. If a new instance is
needed every request then a handler-factory is needed or the library needs to
set expectations about how the new objects should interact with their
constructor state.
While it was tempting to try out another routing library, I think for now
it's better to use dumb vanilla Go routing. At least until we decide which
URL format we intend to standardize on.
Change-Id: Ica3da135d05f8ab8fc206f51eeca4f684f8efa0e
This commit fixes a critique of the old storage API design, whereby
the input parameters were always as raw bytes and never Protocol
Buffer messages that encapsulated the data, meaning every place a
read or mutation was conducted needed to manually perform said
translations on its own. This is taxing.
Change-Id: I4786938d0d207cefb7782bd2bd96a517eead186f
An design question was open for me in the beginning was whether to
serialize other types to disk, but Protocol Buffers quickly won out,
which allows us to drop support for other types. This is a good
start to cleaning up a lot of cruft in the storage stack and
can let us eventually decouple the various moving parts into
separate subsystems for easier reasoning.
This commit is not strictly required, but it is a start to making
the rest a lot more enjoyable to interact with.
This adds timers around several query-relevant code blocks. For now, the
query timer stats are only logged for queries initiated through the UI.
In other cases (rule evaluations), the stats are simply thrown away.
My hope is that this helps us understand where queries spend time,
especially in cases where they sometimes hang for unusual amounts of
time.
In order to help corroborate whether a Prometheus instance has
flapped until meta-monitoring is in-place, we ought to provide the
instance's start time in the console to aid in diagnostics.
This commit simplifies the way that compactions across a database's
keyspace occur due to reading the LevelDB internals. Secondarily it
introduces the database size estimation mechanisms.
Include database health and help interfaces.
Add database statistics; remove status goroutines.
This commit kills the use of Go routines to expose status throughout
the web components of Prometheus. It also dumps raw LevelDB status
on a separate /databases endpoint.
This commit introduces three background compactors, which compact
sparse samples together.
1. Older than five minutes is grouped together into chunks of 50 every 30
minutes.
2. Older than 60 minutes is grouped together into chunks of 250 every 50
minutes.
3. Older than one day is grouped together into chunks of 5000 every 70
minutes.
Unfortunately ``cp`` on Darwin regards some flags as positional and
requires them to be in a specific place. The new Protocol Buffer
descriptor bundling fails on Mac OS.
The Protocol Buffer compiler supports generating a machine-readable
descriptor file encoded as a provided Protocol Buffer message type,
which can be used to decode messages that have been encoded with it
after-the-fact. The generated descriptor also bundles in dependent
message types.
We can use this to perform forensics on old Prometheus clients, if
necessary.
Go's time.Time represents time as UTC in its fundamental data type.
That said, when using ``time.Unix(...)``, it sets the zone for the
time representation to the local. Unfortunately with diagnosis and
our tests, it is a PITA to jump between various zones, even though
the serialized version remains the same.
To keep things easy, all places where times are generated or read
are converted into UTC. These conversions are cheap, for
``Time.In`` merely changes a pointer reference in the struct,
nothing more. This enables me to diagnose test failures with fixture
data very easily.
By setting Access-Control headers, the Prometheus metrics API can be
accessed by cross-origin javascript applications (e.g., an external
dashboard pulling Prometheus metrics).
To achieve that, this PR
- converts static/index.html ("console") and graph to templates
- moved the handlebars template to separated file to avoid escaping issues
Route changes:
/status -> /
/static -> /console
/static/graph.html -> /graph
The curator doesn't do anything yet; rather, this is the type
definition including the anciliary testing scaffold.
Improve Makefile and Git developer experience.
The top-level Makefile was a bit overloaded in terms of generation of
assets and their management. This has been offloaded into separate
Makefiles.
The Git developer experience sucked due to lack of .gitignore
policies.
Also: Fix faulty skiplist naming from old merge.
- utility/embed-static.sh, get called in Makefile to create go map from files
- web/blob/blob.go implements http Handle for serving the files from the map
- web/status.go uses blog.GetFile() to get the template file
The assets are gzipped and decompressed on demand.
This roughly comprises the following changes:
- index target pools by job instead of scrape interval
- make targets within a pool exchangable while preserving existing
health state for targets
- allow exchanging targets via HTTP API (PUT)
- show target lists in /status (experimental, for own debug use)