The first sort in groupByFingerprint already ensures that all resulting sample
lists contain only one fingerprint. We also already assume that all
samples passed into AppendSamples (and thus groupByFingerprint) are
chronologically sorted within each fingerprint.
The extra chronological sort is thus superfluous. Furthermore, this
second sort didn't only sort chronologically, but also compared all
metric fingerprints again (although we already know that we're only
sorting within samples for the same fingerprint). This caused a huge
memory and runtime overhead.
In a heavily loaded real Prometheus, this brought down disk flush times
from ~9 minutes to ~1 minute.
OLD:
BenchmarkLevelDBAppendRepeatingValues 5 331391808 ns/op 44542953 B/op 597788 allocs/op
BenchmarkLevelDBAppendsRepeatingValues 5 329893512 ns/op 46968288 B/op 3104373 allocs/op
NEW:
BenchmarkLevelDBAppendRepeatingValues 5 299298635 ns/op 43329497 B/op 567616 allocs/op
BenchmarkLevelDBAppendsRepeatingValues 20 92204601 ns/op 1779454 B/op 70975 allocs/op
Change-Id: Ie2d8db3569b0102a18010f9e106e391fda7f7883
This was initially motivated by wanting to distribute the rule checker
tool under `tools/rule_checker`. However, this was not possible without
also distributing the LevelDB dynamic libraries because the tool
transitively depended on Levigo:
rule checker -> query layer -> tiered storage layer -> leveldb
This change separates external storage interfaces from the
implementation (tiered storage, leveldb storage, memory storage) by
putting them into separate packages:
- storage/metric: public, implementation-agnostic interfaces
- storage/metric/tiered: tiered storage implementation, including memory
and LevelDB storage.
I initially also considered splitting up the implementation into
separate packages for tiered storage, memory storage, and LevelDB
storage, but these are currently so intertwined that it would be another
major project in itself.
The query layers and most other parts of Prometheus now have notion of
the storage implementation anymore and just use whatever implementation
they get passed in via interfaces.
The rule_checker is now a static binary :)
Change-Id: I793bbf631a8648ca31790e7e772ecf9c2b92f7a0