prometheus/cmd/promtool/backfill.go
Atibhi Agrawal b317b6ab9c
Backfill from OpenMetrics format (#8084)
* get parser working

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* import file created

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* Find min and max ts

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* make two passes over file and write to tsdb

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* print error messages

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* Fix Max and Min initializer

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* Start with unit tests

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* reset file read

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* align blocks to two hour range

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* Add cleanup test

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* remove .ds_store

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* add license to import_test

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* Fix Circle CI error

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* Refactor code
Move backfill from tsdb to promtool directory

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* fix gitignore

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* Remove panic
Rename ContenType

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* adjust mint

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* fix return statement

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* fix go modules

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* Added unit test for backfill

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* fix CI error

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* Fix file handling

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* Close DB

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* Close directory

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* Error Handling

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* inline err

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* Fix command line flags

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* add spaces before func
fix pointers

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* Add defer'd calls

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* move openmetrics.go content to backfill

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* changed args to flags

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* add tests for wrong OM files

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* Added additional tests

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* Add comment to warn of func reuse

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* Make input required in main.go

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* defer blockwriter close

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* fix defer

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* defer

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* Remove contentType

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* remove defer from backfilltest

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* Fix defer remove in backfill_test

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* changes to fix CI errors

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* fix go.mod

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* change package name

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* assert->require

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* remove todo

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* fix format

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* fix todo

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* fix createblock

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* fix tests

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* fix defer

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* fix return

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* check err for anon func

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* change comments

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* update comment

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* Fix for the Flush Bug

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* fix formatting, comments, names

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* Print Blocks

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* cleanup

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* refactor test to take care of multiple samples

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* refactor tests

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* remove om

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* I dont know what I fixed

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* Fix tests

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* Fix tests, add test description, print blocks

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* commit after 5000 samples

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* reviews part 1

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* Series Count

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* fix CI

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* remove extra func

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* make timestamp into sec

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* Reviews 2

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* Add Todo

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* Fixes

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* fixes reviews

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* =0

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* remove backfill.om

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* add global err var, remove stuff

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* change var name

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* sampleLimit pass as parameter

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* Add test when number of samples greater than batch size

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* Change name of batchsize

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* revert export

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* nits

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* remove

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* add comment, remove newline,consistent err

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* Print Blocks

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* Modify comments

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* db.Querier

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* add sanity check , get maxt and mint

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* ci error

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* fix

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* comment change

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* nits

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* NoError

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* fix

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

* fix

Signed-off-by: aSquare14 <atibhi.a@gmail.com>
2020-11-26 10:37:06 +05:30

205 lines
5 KiB
Go

// Copyright 2020 The Prometheus Authors
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package main
import (
"bufio"
"context"
"io"
"math"
"os"
"github.com/go-kit/kit/log"
"github.com/pkg/errors"
"github.com/prometheus/common/expfmt"
"github.com/prometheus/prometheus/pkg/labels"
"github.com/prometheus/prometheus/pkg/textparse"
"github.com/prometheus/prometheus/tsdb"
tsdb_errors "github.com/prometheus/prometheus/tsdb/errors"
)
// OpenMetricsParser implements textparse.Parser.
type OpenMetricsParser struct {
textparse.Parser
s *bufio.Scanner
}
// NewOpenMetricsParser returns an OpenMetricsParser reading from the provided reader.
func NewOpenMetricsParser(r io.Reader) *OpenMetricsParser {
return &OpenMetricsParser{s: bufio.NewScanner(r)}
}
// Next advances the parser to the next sample. It returns io.EOF if no
// more samples were read.
func (p *OpenMetricsParser) Next() (textparse.Entry, error) {
for p.s.Scan() {
line := p.s.Bytes()
line = append(line, '\n')
p.Parser = textparse.New(line, string(expfmt.FmtOpenMetrics))
if et, err := p.Parser.Next(); err != io.EOF {
return et, err
}
}
if err := p.s.Err(); err != nil {
return 0, err
}
return 0, io.EOF
}
func getMinAndMaxTimestamps(p textparse.Parser) (int64, int64, error) {
var maxt, mint int64 = math.MinInt64, math.MaxInt64
for {
entry, err := p.Next()
if err == io.EOF {
break
}
if err != nil {
return 0, 0, errors.Wrap(err, "next")
}
if entry != textparse.EntrySeries {
continue
}
_, ts, _ := p.Series()
if ts == nil {
return 0, 0, errors.Errorf("expected timestamp for series got none")
}
if *ts > maxt {
maxt = *ts
}
if *ts < mint {
mint = *ts
}
}
if maxt == math.MinInt64 {
maxt = 0
}
if mint == math.MaxInt64 {
mint = 0
}
return maxt, mint, nil
}
func createBlocks(input *os.File, mint, maxt int64, maxSamplesInAppender int, outputDir string) (returnErr error) {
blockDuration := tsdb.DefaultBlockDuration
mint = blockDuration * (mint / blockDuration)
db, err := tsdb.OpenDBReadOnly(outputDir, nil)
if err != nil {
return err
}
defer func() {
returnErr = tsdb_errors.NewMulti(returnErr, db.Close()).Err()
}()
for t := mint; t <= maxt; t = t + blockDuration {
err := func() error {
w, err := tsdb.NewBlockWriter(log.NewNopLogger(), outputDir, blockDuration)
if err != nil {
return errors.Wrap(err, "block writer")
}
defer func() {
err = tsdb_errors.NewMulti(err, w.Close()).Err()
}()
if _, err := input.Seek(0, 0); err != nil {
return errors.Wrap(err, "seek file")
}
ctx := context.Background()
app := w.Appender(ctx)
p := NewOpenMetricsParser(input)
tsUpper := t + blockDuration
samplesCount := 0
for {
e, err := p.Next()
if err == io.EOF {
break
}
if err != nil {
return errors.Wrap(err, "parse")
}
if e != textparse.EntrySeries {
continue
}
l := labels.Labels{}
p.Metric(&l)
_, ts, v := p.Series()
if ts == nil {
return errors.Errorf("expected timestamp for series %v, got none", l)
}
if *ts < t || *ts >= tsUpper {
continue
}
if _, err := app.Add(l, *ts, v); err != nil {
return errors.Wrap(err, "add sample")
}
samplesCount++
if samplesCount < maxSamplesInAppender {
continue
}
// If we arrive here, the samples count is greater than the maxSamplesInAppender.
// Therefore the old appender is committed and a new one is created.
// This prevents keeping too many samples lined up in an appender and thus in RAM.
if err := app.Commit(); err != nil {
return errors.Wrap(err, "commit")
}
app = w.Appender(ctx)
samplesCount = 0
}
if err := app.Commit(); err != nil {
return errors.Wrap(err, "commit")
}
if _, err := w.Flush(ctx); err != nil && err != tsdb.ErrNoSeriesAppended {
return errors.Wrap(err, "flush")
}
return nil
}()
if err != nil {
return errors.Wrap(err, "process blocks")
}
blocks, err := db.Blocks()
if err != nil {
return errors.Wrap(err, "get blocks")
}
if len(blocks) <= 0 {
continue
}
printBlocks(blocks[len(blocks)-1:], true)
}
return nil
}
func backfill(maxSamplesInAppender int, input *os.File, outputDir string) (err error) {
p := NewOpenMetricsParser(input)
maxt, mint, err := getMinAndMaxTimestamps(p)
if err != nil {
return errors.Wrap(err, "getting min and max timestamp")
}
return errors.Wrap(createBlocks(input, mint, maxt, maxSamplesInAppender, outputDir), "block creation")
}