And I swear I'll never use 'rebase' to 'clean something up' ever agin,
even if Julius tells me to do so...
Change-Id: Ifeabab20445279bf693c95f062da769b60fe195f
A common problem in Prometheus alerting is to detect when no timeseries
exist for a given metric name and label combination. Unfortunately,
Prometheus alert expressions need to be of vector type, and
"count(nonexistent_metric)" results in an empty vector, yielding no
output vector elements to base an alert on. The newly introduced
absent() function solves this issue:
ALERT FooAbsent IF absent(foo{job="myjob"}) [...]
absent() has the following behavior:
- if the vector passed to it has any elements, it returns an empty
vector.
- if the vector passed to it has no elements, it returns a 1-element
vector with the value 1.
In the second case, absent() tries to be smart about deriving labels of
the 1-element output vector from the input vector:
absent(nonexistent{job="myjob"}) => {job="myjob"}
absent(nonexistent{job="myjob",instance=~".*"}) => {job="myjob"}
absent(sum(nonexistent{job="myjob"})) => {}
That is, if the passed vector is a literal vector selector, it takes all
"=" label matchers as the basis for the output labels, but ignores all
non-equals or regex matchers. Also, if the passed vector results from a
non-selector expression, no labels can be derived.
Change-Id: I948505a1488d50265ab5692a3286bd7c8c70cd78
After many transformations, it doesn't make sense to keep the metric
names, since the result of the transformation is no longer that metric.
This drops the metric name after such transformations and makes the web
UI deal well with missing metric names.
This depends on the current branch on the following things:
- prometheus/client_golang needs to be at
e237cf15c6
in branch "julius/int-fingerprints" (to be merged with new storage)
- prometheus/promdash needs to be at
dd7691c9c2
Change-Id: Ib3c8cad8d647d9854e8c653c424b8c235ccc231d
This removes the dependancy on C leveldb and snappy.
It also takes care of fewer dependencies as they would
anyway not work on any non-Debian, non-Brew system.
Change-Id: Ia70dce1ba8a816a003587927e0b3a3f8ad2fd28c
Now only purge if there is something to purge.
Also, set savedFirstTime and archived time range appropriately.
(Which is needed for the optimization.)
Change-Id: Idcd33319a84def3ce0318d886f10c6800369e7f9
Fix the behavior if preload for non-existent series is requested.
Instead of returning an error (which triggers a panic further up),
simply count those incidents. They can happen regularly, we just want
to know if they happen too frequently because that would mean the
indexing is behind or broken.
Change-Id: I4b2d1b93c4146eeea897d188063cb9574a270f8b
The root cause was that after chunkDesc eviction, the offset between
memory representation of chunk layout (via chunkDescs in memory) was
shiftet against chunks as layed out on disk. Keeping the offset up to
date is by no means trivial, so this commit is pretty involved.
Also, found a race that for some reason didn't bite us so far:
Persisting chunks was completel unlocked, so if chunks were purged on
disk at the same time, disaster would strike. However, locking the
persisting of chunk revealed interesting dead locks. Basically, never
queue under the fp lock.
Change-Id: I1ea9e4e71024cabbc1f9601b28e74db0c5c55db8
Checkpointing interval is now a command line flag.
Along the way, several things were refactored.
- Restructure the way the storage is started and stopped..
- Number of series in checkpoint is now a uint64, not a varint.
(Breaks old checkpoints, needs wipe!)
- More consistent naming and order of methods.
Change-Id: I883d9170c9a608ee716bb0ab3d0ded8ca03760d9
Add gauge for chunks and chunkdescs in memory (backed by a global
variable to be used later not only for instrumentation but also for
memory management).
Refactored instrumentation code once more (instrumentation.go is back :).
Change-Id: Ife39947e22a48cac4982db7369c231947f446e17
In addition to the existing by-clause syntax:
sum(<expression>) by (<labels>) [keeping_extra]
...this allows the following new syntax:
sum by (<labels>) [keeping_extra] (<expression>)
Both orderings may be used in a single expression. It is up to the users
to establish guidelines around their usage.
Change-Id: Iba10c9cc5fb6ac62edfcf246d281473e82467992
Gracefully handle decimal values, by truncating them.
Limit amount of steps, to avoid accidentally pulling too much data.
This limit returns up to ~500kB per timeseries, and allows
for 60s granularity for a week and 1h granularity for a year.
Change-Id: Ie549fc24deb2eecbc6c5d1b6088a548a6b02e849
This allows the following expression syntaxes for selecting timeseries:
foo (already valid before)
foo{} (already valid before)
{job="prometheus"} (new, select all timeseries for job "prometheus")
Omitting both the metric name *and* any label matchers ("" or "{}") will
still yield a syntax error.
To get all timeseries, you could do:
{__name__=~".*"}
or, without relying on knowledge about __metric__:
{job=~".*"}
Change-Id: Ifee000b9ac0184ef6ced18411069c7f2699a2dda
- Staleness delta is no a proper function parameter and not replicated
from package ast.
- Named type 'chunks' replaced by explicit '[]chunk' to avoid confusion.
- For the same reason, replaced 'chunkDescs' by '[]*chunkDescs'.
- Verified that math.Modf is not a speed enhancement over conversion
(actually 5x slower).
- Renamed firstTimeField, lastTimeField into chunkFirstTime and
chunkLastTime.
- Verified unpin() is sufficiently goroutine-safe.
- Decided not to update archivedFingerprintToTimeRange upon series
truncation and added a rationale why.
Change-Id: I863b8d785e5ad9f71eb63e229845eacf1bed8534
- Head chunk persisting only happens in evictOlderThan, so do it
there. (With the previous code, it would never happen.)
- Raw accesses to chunkDesc.chunk are now done via isEvicted (with
locking).
Change-Id: I48b07b56dfea4899b50df159b4ea566954396fcd