Commit graph

92 commits

Author SHA1 Message Date
Mauro Stettler 04114aa2e6
add old chunk disk mapper back
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
2022-04-19 19:21:22 +00:00
Jesus Vazquez 48aa5cd096 Merge remote-tracking branch 'upstream/main' into jvp/merge-prometheus-main 2022-04-12 16:40:00 +02:00
Howie 1291ec7185
deleting *.tmp WAL files on startup (#10317)
* fix issue #10245

Signed-off-by: lihaowei <haoweili35@gmail.com>

* minor changes

Signed-off-by: lihaowei <haoweili35@gmail.com>

* review changes

Signed-off-by: lihaowei <haoweili35@gmail.com>

* minor changes

Signed-off-by: lihaowei <haoweili35@gmail.com>
2022-03-24 16:14:14 +05:30
cui fliter c9b56d1a49
all: fix some typos (#10389)
Signed-off-by: cuishuang <imcusg@gmail.com>
2022-03-03 12:03:07 +00:00
Mauro Stettler 0df3489275
Write chunks via queue, predicting the refs (#10051)
* Write chunks via queue, predicting the refs

Our load tests have shown that there is a latency spike in the
remote write handler whenever the head chunks need to be written,
because chunkDiskMapper.WriteChunk() blocks until the chunks are written
to disk.

This adds a queue to the chunk disk mapper which makes the WriteChunk()
method non-blocking unless the queue is full. Reads can still be served
from the queue.

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* address PR feeddback

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* initialize metrics without .Add(0)

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* change isRunningMtx to normal lock

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* do not re-initialize chunkrefmap

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* update metric outside of lock scope

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* add benchmark for adding job to chunk write queue

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* remove unnecessary "success" var

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* gofumpt -extra

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* avoid WithLabelValues call in addJob

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* format comments

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* addressing PR feedback

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* rename cutExpectRef to cutAndExpectRef

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* use head.Init() instead of .initTime()

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* address PR feedback

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* PR feedback

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* update test according to PR feedback

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* replace callbackWg -> awaitCb

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* better test of truncation with empty files

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* replace callbackWg -> awaitCb

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
2022-01-10 13:36:45 +00:00
Mauro Stettler cc9ddf50c9
Write m-map chunks in async via a queue while predicting the m-map chunk refs (#57)
* write chunks via queue, predicting chunkrefs

* fixes

* linter

* cleanup

* better test coverage

* deduplicate operations when cutting file

* better comments

* fix race

* improvements based on PR feedback

* comments

* concurrent correctness test of write queue

* linter

* use isStarted flag in chunk write queue

* separate mutex for isStarted

* fixing tests

* fix race in test

* PR feedback

* add license headers

* PR feedback

* pr feedback

* dont recreate workerCtrl channel

* address PR feedback in high concurrency read and write test

* refactor the high concurrency write queue test

* pr feedback

* use require.Eventually

* typo

* Fixing comment

Co-authored-by: Peter Štibraný <peter.stibrany@grafana.com>

* use buffered chan for chunk write queue

* fix races

* address feedback on PR

* better comment

* less type casts

* dont reeimplement varint

* move mtx above property it protects

* improve tests by using wg instead of inspecting queue

* add comment

* add comment

* writeChunkF comment

* rename isStarted -> isRunning

* prevent panic when double stopping

* fix variable name in comment

* delete outdated comment

* naming and comment fixes in TestChunkWriteQueue_WrappingAroundSizeLimit

* use require.False instead of require.Equal(t, false

* always enable chunk write queue

* dont call callback with lock held

* undo rename of curFileSequence to curFileSeq

* fix accidental change

* fix comment

* Better syntax

Co-authored-by: Peter Štibraný <peter.stibrany@grafana.com>

* Better wording in comment

Co-authored-by: Peter Štibraný <peter.stibrany@grafana.com>

* better comments to explain the use of coordinator chans in test

* fix test which broke to the async writing of chunks

* undo previous renaming of shadowing variable

* rename var threadID to workerID

* rename variables based on PR feedback

* rename pos -> ts

* better implementation of whileNotCanceled

* update querySeriesRef before using it

* rename labels -> lbls

* initialize the chunk write queue operations metric with all possible labels

* dont return error from CutNewFile

* recreate chunk ref map regularly to free

* limit how often we recreate the chunk ref map

* fix typo in comment

* better comment

Co-authored-by: Peter Štibraný <peter.stibrany@grafana.com>
2021-12-08 13:14:26 +05:30
Peter Štibraný f93d38aca7 Merge remote-tracking branch 'upstream/main' into merge-upstream 2021-11-19 12:11:26 +01:00
Darshan Chaudhary 9dcf8b2208
Add the ability to disable tsdb isolation (#9270)
* Disable isolation in isolation struct

Signed-off-by: darshanime <deathbullet@gmail.com>

* Run tsdb tests with isolation disabled

Signed-off-by: darshanime <deathbullet@gmail.com>

* Check for isolation disabled in isoState.Close()

Signed-off-by: darshanime <deathbullet@gmail.com>

* use t.Skip to skip isolation tests when disabled

Signed-off-by: darshanime <deathbullet@gmail.com>

* address review comments

Signed-off-by: darshanime <deathbullet@gmail.com>

* fix test for defaultIsolationState

Signed-off-by: darshanime <deathbullet@gmail.com>

* Change flag name. Set flag in DB. Do not init txRing. Close isoState.

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Test disabled isolation in CircleCI test_go

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Skip isolation related tests in db_test.go

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-11-19 15:41:32 +05:30
Peter Štibraný 294c155bb6 Merge remote-tracking branch 'upstream/main' into merge-upstream 2021-11-18 15:48:40 +01:00
Dieter Plaetinck 0fac9bb859
Add basic initial developer docs for TSDB (#9451)
* Add basic initial developer docs for TSDB

There's a decent amount of content already out there (blog posts,
conference talks, etc), but:
* when they get stale, they don't tend to get updated
* they still leave me with questions that I'ld like to answer
  for developers (like me) who want to use, or work with, TSDB

What I propose is developer docs inside the prometheus
repository.  Easy to find and harness the power of the community
to expand it and keep it up to date.

* perfect is the enemy of good.  Let's have a base and incrementally improve
* Markdown docs should be broad but not too deep.  Source code comments
  can complement them, and are the ideal place for implementation details.

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* use example code that works out of the box

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* Apply suggestions from code review

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* PR feedback

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* more docs

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* PR feedback

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* Apply suggestions from code review

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Apply suggestions from code review

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

* feedback

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* Update tsdb/docs/usage.md

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

* final tweaks

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* workaround docs versioning issue

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* Move example code to real executable, testable example.

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* cleanup example test and make sure it always reproduces

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* obtain temp dir in a way that works with older Go versions

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* Fix Ganesh's comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-11-17 15:51:27 +05:30
Marco Pracucci 6525385b30
Merge pull request #29 from grafana/add-jitter-to-chunk-end
Add jitter to head chunks flushing
2021-11-16 11:05:07 +01:00
Robert Fratto 72a9f7fee9
Share TSDB locker code with agent (#9623)
* share tsdb db locker code with agent

Closes #9616

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* add flag to disable lockfile for agent

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* use agentOnlySetting instead of PreAction

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb: address review feedback

1. Rename Locker to DirLocker
2. Move DirLocker to tsdb/tsdbutil
3. Name metric using fmt.Sprintf
4. Refine error checking in DirLocker test

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb: create test utilities to assert expected DirLocker behavior

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb/tsdbutil: fix lint errors

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb/agent: fix windows test failure

Use new DB variable instead of overriding the old one.

Signed-off-by: Robert Fratto <robertfratto@gmail.com>
2021-11-11 11:45:25 -05:00
beorn7 c954cd9d1d Move packages out of deprecated pkg directory
This creates a new `model` directory and moves all data-model related
packages over there:
  exemplar labels relabel rulefmt textparse timestamp value

All the others are more or less utilities and have been moved to `util`:
  gate logging modetimevfs pool runtime

Signed-off-by: beorn7 <beorn@grafana.com>
2021-11-09 08:03:10 +01:00
Dieter Plaetinck cda025b5b5
TSDB: demistify SeriesRefs and ChunkRefs (#9536)
* TSDB: demistify seriesRefs and ChunkRefs

The TSDB package contains many types of series and chunk references,
all shrouded in uint types.  Often the same uint value may
actually mean one of different types, in non-obvious ways.

This PR aims to clarify the code and help navigating to relevant docs,
usage, etc much quicker.

Concretely:

* Use appropriately named types and document their semantics and
  relations.
* Make multiplexing and demuxing of types explicit
  (on the boundaries between concrete implementations and generic
  interfaces).
* Casting between different types should be free.  None of the changes
  should have any impact on how the code runs.

TODO: Implement BlockSeriesRef where appropriate (for a future PR)

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* feedback

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* agent: demistify seriesRefs and ChunkRefs

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
2021-11-06 15:40:04 +05:30
Marco Pracucci 229575de0c
Upgrade upstream Prometheus
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2021-11-03 10:25:35 +01:00
Mateusz Gozdek 1a6c2283a3 Format Go source files using 'gofumpt -w -s -extra'
Part of #9557

Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
2021-11-02 19:52:34 +01:00
Ganesh Vernekar 84f8307aa1
Merge remote-tracking branch 'upstream/main' into codesome/syncprom
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-09-17 15:58:45 +05:30
Darshan Chaudhary 1f688657bf
Call delete on head if interval overlaps (#9151)
* Call delete on head if interval overlaps

Signed-off-by: darshanime <deathbullet@gmail.com>

* Garbage collect tombstones during head gc

Signed-off-by: darshanime <deathbullet@gmail.com>

* Truncate tombstones before min time during head gc

Signed-off-by: darshanime <deathbullet@gmail.com>

* Lock less by deleting all keys in a single pass

Signed-off-by: darshanime <deathbullet@gmail.com>

* Pass map to DeleteTombstones

Signed-off-by: darshanime <deathbullet@gmail.com>

* Create new slice to replace old one

Signed-off-by: darshanime <deathbullet@gmail.com>
2021-09-16 12:20:03 +05:30
Marco Pracucci 9537ca8b84
Merged upstream
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2021-08-25 16:30:05 +02:00
Marco Pracucci fbe211d3a8
Do not break exported functions signatures (#6)
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2021-08-20 18:37:47 +02:00
Marco Pracucci 481299f4a5
Added series hash cache support to TSDB (#5)
* Added series hash cache support to TSDB

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed imports grouping

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2021-08-17 13:31:08 +00:00
Ganesh Vernekar ee7e0071d1
Snapshot in-memory chunks on shutdown for faster restarts (#7229)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-08-06 17:51:01 +01:00
Ganesh Vernekar 59d02b5ef0
tsdb: Block Head GC till pending readers are done reading (#9081)
* tsdb: Block Head GC till pending readers are done reading

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments 2

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix the exclusiveness of maxt in WaitForPendingReadersInTimeRange

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-07-20 14:17:20 +05:30
Martin Disibio 1bcd13d6b5
Exemplar resize (#8974)
* Create experimental circular buffer resize method, benchmarks

Signed-off-by: Martin Disibio <mdisibio@gmail.com>

* Optimize exemplar resize to only replay as many exemplars as needed

Signed-off-by: Martin Disibio <mdisibio@gmail.com>

* More comments, benchmark AddExemplar

Signed-off-by: Martin Disibio <mdisibio@gmail.com>

* optimizations

Signed-off-by: Martin Disibio <mdisibio@gmail.com>

* comment

Signed-off-by: Martin Disibio <mdisibio@gmail.com>

* Slight refactor of resize benchmark + make use of resize via runtime
reloadable storage config.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Some more config related changes.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Address some review comments.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Address more review comments.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Refactor to remove usage of noopExemplarStorage and avoid race condition
when resizing from Head code.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Fix or add comments to clarify some of the new behaviour.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* fix potential panics related to negative exemplar buffer lengths

Signed-off-by: Callum Styan <callumstyan@gmail.com>

Co-authored-by: Callum Styan <callumstyan@gmail.com>
2021-07-20 10:22:57 +05:30
Julien Duchesne 8855c2e626
Add prometheus_tsdb_clean_start metric (#8824)
Add cleanup of the lockfile when the db is cleanly closed

The metric describes the status of the lockfile on startup
0: Already existed
1: Did not exist
-1: Disabled

Therefore, if the min value over time of this metric is 0, that means that executions have exited uncleanly
We can then use that metric to have a much lower threshold on the crashlooping alert:

If the metric exists and it has been zero, two restarts is enough to trigger the alarm
If it does not exist (old prom version for example), the current five restarts threshold remains

Signed-off-by: Julien Duchesne <julien.duchesne@grafana.com>

* Change metric name + set unset value to -1

Signed-off-by: Julien Duchesne <julien.duchesne@grafana.com>

* Only check the last value of the clean start alert

Signed-off-by: Julien Duchesne <julien.duchesne@grafana.com>

* Fix test + nit

Signed-off-by: Julien Duchesne <julien.duchesne@grafana.com>
2021-06-16 15:03:02 +05:30
Levi Harrison b5f6f8fb36 Switched to go-kit/log
Signed-off-by: Levi Harrison <git@leviharrison.dev>
2021-06-11 12:28:36 -04:00
Levi Harrison 7bc11dcb06
React UI: Add Starting Screen (#8662)
* Added walreplay API endpoint

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Added starting page to react-ui

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Documented the new endpoint

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Fixed typos

Signed-off-by: Levi Harrison <git@leviharrison.dev>

Co-authored-by: Julius Volz <julius.volz@gmail.com>

* Removed logo

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Changed isResponding to isUnexpected

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Changed width of progress bar

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Changed width of progress bar

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Added DB stats object

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Updated starting page to work with new fields

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Passing nil

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Passing nil (pt. 2)

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Passing nil (pt. 3)

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Passing nil (and also implementing a method this time) (pt. 4)

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Passing nil (and also implementing a method this time) (pt. 5)

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Changed const to let

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Passing nil (pt. 6)

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Remove SetStats method

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Added comma

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Changed api

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Changed to triple equals

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Fixed data response types

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Don't return pointer

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Changed version

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Fixed interface issue

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Fixed pointer

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Fixed copying lock value error

Signed-off-by: Levi Harrison <git@leviharrison.dev>

Co-authored-by: Julius Volz <julius.volz@gmail.com>
2021-06-05 15:29:32 +01:00
Ben Ye 0a8912433a
allow compact series merger to be configurable (#8836)
Signed-off-by: yeya24 <yb532204897@gmail.com>
2021-05-18 18:38:37 +02:00
nberkley f9e2dd0697
Add support for smaller block chunk segment allocations (#8478)
* Add support for --storage.tsdb.max-chunk-size to suport small chunks for space limited prometheus instances.

Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com>

* Update tsdb/compact.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com>

* Update tsdb/db.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com>

* Update cmd/prometheus/main.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com>

* Change naming scheme to

Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com>

* Add a lower bound to --storage.tsdb.max-block-chunk-segment-size

Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com>

* Update storage.md to explain what a chunk segment is

Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com>

* Apply suggestions from code review

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com>

* Force tests

Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com>

* Fix code style

Signed-off-by: Nathan Berkley <nberkley@tripadvisor.com>

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
2021-04-15 14:25:01 +05:30
Bryan Boreham c7a62b95ce
GetRef() now returns the label set (#8641)
The purpose of GetRef() is to allow Append() to be called without
the caller needing to copy the labels. To avoid a race where a series
is removed from TSDB between the calls to GetRef() and Append(), we
return TSDB's copy of the labels.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2021-03-24 15:24:58 +00:00
Bryan Boreham d614ae9ecf
[RFC] Add method to get reference number for TSDB Appender (#8600)
* Add method to get reference number for TSDB Appender

In situations where we need to copy labels before calling Add(),
GetRef() allows to check first, then call AddFast() in the case that the series
is already known.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Add explicit interface for GetRef() method

Suggested in code review by @bwplotka

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Rename OptionalGetRef to GetRef

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Simplify return value of GetRef()

0 can be relied on to mean 'no reference'

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2021-03-19 19:28:55 +00:00
Callum Styan 289ba11b79
Add circular in-memory exemplars storage (#6635)
* Add circular in-memory exemplars storage

Signed-off-by: Callum Styan <callumstyan@gmail.com>
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Signed-off-by: Martin Disibio <mdisibio@gmail.com>

Co-authored-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Co-authored-by: Tom Wilkie <tom.wilkie@gmail.com>
Co-authored-by: Martin Disibio <mdisibio@gmail.com>

* Fix some comments, clean up exemplar metrics struct and exemplar tests.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Fix exemplar query api null vs empty array issue.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

Co-authored-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Co-authored-by: Tom Wilkie <tom.wilkie@gmail.com>
Co-authored-by: Martin Disibio <mdisibio@gmail.com>
2021-03-16 15:17:45 +05:30
Arthur Silva Sens 6a3d55db0a
Rolling tombstones clean up (#8007)
* CleanupTombstones refactored, now reloading blocks after every compaction.

The goal is to remove deletable blocks after every compaction and, thus, decrease disk space used when cleaning tombstones.

Signed-off-by: arthursens <arthursens2005@gmail.com>

* Protect DB against parallel reloads

Signed-off-by: ArthurSens <arthursens2005@gmail.com>

* Fix typos

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

Co-authored-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2021-02-17 11:02:43 +05:30
Dustin Hooten b9f0baf6ff
Combine NewHead() args into a HeadOptions struct (#8452)
* Combine NewHead() args into a HeadOptions struct

Signed-off-by: Dustin Hooten <dustinhooten@gmail.com>

* remove overrides params

Signed-off-by: Dustin Hooten <dustinhooten@gmail.com>

* address pr feedback

Signed-off-by: Dustin Hooten <dustinhooten@gmail.com>
2021-02-09 19:42:48 +05:30
Nguyen Le Vu Long fbe960f2c1
fix: remove pre-2.21 tmp blocks on start (#8353)
* fix: remove pre-2.21 tmp blocks on start

Signed-off-by: Nguyen Le Vu Long <vulongvn98@gmail.com>

* fix: commenting

Signed-off-by: Nguyen Le Vu Long <vulongvn98@gmail.com>
2021-01-09 10:02:26 +01:00
Arthur Silva Sens 7e932637e3
Reload tsdb blocks every minute (#8340)
* Reload tsdb blocks every minute

Signed-off-by: ArthurSens <arthursens2005@gmail.com>

* Proteced tsdb with mutex locks

Signed-off-by: ArthurSens <arthursens2005@gmail.com>
2021-01-07 13:00:08 +05:30
arthursens 8493704b9b Change seconds()*1000 to milliseconds()
Signed-off-by: arthursens <arthursens2005@gmail.com>
2020-12-25 10:45:23 -03:00
Ganesh Vernekar faa1554aa1
Don't call runtime.GC() after compaction (#8276)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-12-22 14:44:17 +00:00
Arthur Silva Sens 64a106c5dd
Logging added for when compaction takes more than the block time range (#8151)
* Logging added for when compaction takes more than the block time range

Signed-off-by: arthursens <arthursens2005@gmail.com>

* Log only if no errors were already logged

Signed-off-by: arthursens <arthursens2005@gmail.com>

* Log duration as human readable string

Signed-off-by: arthursens <arthursens2005@gmail.com>

* Move logging from compactHead() to Compact()

Signed-off-by: arthursens <arthursens2005@gmail.com>

* Compute duration of all head compactions plus wal truncation

Signed-off-by: arthursens <arthursens2005@gmail.com>

* Remove named return added os first commits

Signed-off-by: arthursens <arthursens2005@gmail.com>

* Address nits

Signed-off-by: arthursens <arthursens2005@gmail.com>

* Change miliseconds to seconds to make fuzzit tests happy

Signed-off-by: ArthurSens <arthursens2005@gmail.com>
2020-12-07 21:29:43 +00:00
Marco Pracucci db19e05d93
Add option to customise head chunks write buffer size (#8201)
* Add option to customise head chunks write buffer size

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed tests

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2020-11-19 18:30:47 +05:30
Bartlomiej Plotka 3d8826a3d4
MultiError: Refactored MultiError for more concise and safe usage. (#8066)
* MultiError: Refactored MultiError for more concise and safe usage.

* Less lines
* Goland IDE was marking every usage of old MultiError "potential nil" error
* It was easy to forgot using Err() when error was returned, now it's safely assured on compile time.

NOTE: Potentially I would rename package to merrors. (: In different PR.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed review comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fix after rebase.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-10-28 15:24:58 +00:00
Ganesh Vernekar 3245b3267b
Don't use returned DB to close resources on TSDB startup error (#8113)
* Don't use returned DB to close resources on TSDB startup error

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Add unit test and fix another panic

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix review comment

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-10-28 15:39:03 +05:30
Julien Pivotto 4e5b1722b3
Move away from testutil, refactor imports (#8087)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-10-22 11:00:08 +02:00
Arthur Silva Sens c5a832b394
Close resources after failing to startup TSDB (#8031)
* Close resources after failing to startup TSDB

Signed-off-by: arthursens <arthursens2005@gmail.com>

* Return close error instead of logging

Signed-off-by: arthursens <arthursens2005@gmail.com>

* Change named return's name

Signed-off-by: arthursens <arthursens2005@gmail.com>
2020-10-21 20:38:28 +05:30
Brian Brazil fdf1c29224
Fix metric description of prometheus_tsdb_symbol_table_size_bytes. (#8080)
This is how much memory we use to load in the on-disk
symbol tables, not the size of the tables themselves.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2020-10-21 14:35:40 +01:00
Bartlomiej Plotka 2fe1e9fa93
Create a checkpoint only at the end of Compact call (#8067)
* Create a checkpoint only at the end of Compact call

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix review comments

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix Bartek's offline reviews

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Introduce TruncateInMemory and TruncateWAL

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Small enhancements and test fixing attempts

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix tests

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Add TestOneCheckpointPerCompactCall

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Don't truncate WAL on block compaction

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Simplified the algo.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Better protection around calling truncateWAL, truncate WAL on Head compaction error

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

Co-authored-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-10-19 20:57:08 +05:30
Julien Pivotto 59733b1a26
TSDB: use blocks instead of db.blocks in condition (#8068)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-10-19 16:51:54 +05:30
Gayathri Venkatesh 73e2ce1bd6
Do not ignore reload errors in compactHead (#8051)
* Modified unknownRefs to unknownRefs.Load()

Signed-off-by: GayathriVenkatesh <gayaa2010@gmail.com>

* Modified db.go

Signed-off-by: GayathriVenkatesh <gayaa2010@gmail.com>

* Revert "Modified unknownRefs to unknownRefs.Load()"

This reverts commit 79caf595fa9b9193878dc0dd9dec17d58851ae42.

Signed-off-by: GayathriVenkatesh <gayaa2010@gmail.com>

* Made changes to reload error in db.go

Signed-off-by: GayathriVenkatesh <gayaa2010@gmail.com>
2020-10-14 15:05:24 +05:30
Arthur Silva Sens 4f45e201cc
Promtool tsdb list now prints block sizes (#7993)
* promtool tsdb list now prints blocks' size

Signed-off-by: arthursens <arthursens2005@gmail.com>
2020-10-12 23:15:40 +02:00
johncming b521612042
tsdb: simplify code. (#7792)
Signed-off-by: johncming <johncming@yahoo.com>
2020-08-14 15:15:08 +05:30