Speed up disk flushes by removing unnecessary sort.

The first sort in groupByFingerprint already ensures that all resulting sample
lists contain only one fingerprint. We also already assume that all
samples passed into AppendSamples (and thus groupByFingerprint) are
chronologically sorted within each fingerprint.

The extra chronological sort is thus superfluous. Furthermore, this
second sort didn't only sort chronologically, but also compared all
metric fingerprints again (although we already know that we're only
sorting within samples for the same fingerprint). This caused a huge
memory and runtime overhead.

In a heavily loaded real Prometheus, this brought down disk flush times
from ~9 minutes to ~1 minute.

OLD:
BenchmarkLevelDBAppendRepeatingValues   5  331391808 ns/op  44542953 B/op   597788 allocs/op
BenchmarkLevelDBAppendsRepeatingValues  5  329893512 ns/op  46968288 B/op  3104373 allocs/op

NEW:
BenchmarkLevelDBAppendRepeatingValues   5  299298635 ns/op  43329497 B/op   567616 allocs/op
BenchmarkLevelDBAppendsRepeatingValues 20   92204601 ns/op   1779454 B/op    70975 allocs/op

Change-Id: Ie2d8db3569b0102a18010f9e106e391fda7f7883
This commit is contained in:
Julius Volz 2014-06-30 14:39:00 +02:00
parent 9410f15a73
commit 6579dc3e40
2 changed files with 6 additions and 22 deletions

View file

@ -28,7 +28,10 @@ type Persistence interface {
// closed when finished.
Close()
// Record a group of new samples in the storage layer.
// Record a group of new samples in the storage layer. Multiple samples for
// the same fingerprint need to be submitted in chronological order, from
// oldest to newest (both in the same call to AppendSamples and across
// multiple calls).
AppendSamples(clientmodel.Samples) error
// Get all of the metric fingerprints that are associated with the

View file

@ -16,7 +16,6 @@ package tiered
import (
"flag"
"fmt"
"sort"
"sync"
"time"
@ -258,9 +257,8 @@ func (l *LevelDBPersistence) AppendSample(sample *clientmodel.Sample) (err error
return
}
// groupByFingerprint collects all of the provided samples, groups them
// together by their respective metric fingerprint, and finally sorts
// them chronologically.
// groupByFingerprint collects all of the provided samples and groups them
// together by their respective metric fingerprint.
func groupByFingerprint(samples clientmodel.Samples) map[clientmodel.Fingerprint]clientmodel.Samples {
fingerprintToSamples := map[clientmodel.Fingerprint]clientmodel.Samples{}
@ -272,23 +270,6 @@ func groupByFingerprint(samples clientmodel.Samples) map[clientmodel.Fingerprint
fingerprintToSamples[*fingerprint] = samples
}
sortingSemaphore := make(chan bool, sortConcurrency)
doneSorting := sync.WaitGroup{}
for _, samples := range fingerprintToSamples {
doneSorting.Add(1)
sortingSemaphore <- true
go func(samples clientmodel.Samples) {
sort.Sort(samples)
<-sortingSemaphore
doneSorting.Done()
}(samples)
}
doneSorting.Wait()
return fingerprintToSamples
}