Commit graph

544 commits

Author SHA1 Message Date
Shihao Xia 3696d7dedb fix potential goroutine leaks
Signed-off-by: Shihao Xia <charlesxsh@hotmail.com>
2021-12-17 18:35:30 -05:00
Julien Pivotto 27343277fa
Merge release-2.32 forward into main (#10032)
* storage: expose bug in iterators #10027

Signed-off-by: beorn7 <beorn@grafana.com>

* storage: fix bug #10027 in iterators' Seek method

Signed-off-by: beorn7 <beorn@grafana.com>

* Append reporting metrics without limit

If reporting metrics fails due to reaching the limit, this makes the
target appear as UP in the UI, but the metrics are missing.

This commit bypasses that limit for report metrics.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>

* Remove check against cfg so interval/ timeout are always set (#10023) (#10031)

Signed-off-by: Nicholas Blott <blottn@tcd.ie>

Co-authored-by: Nicholas Blott <blottn@tcd.ie>

* Cut v2.32.1

Signed-off-by: Julius Volz <julius.volz@gmail.com>

* Apply suggestions from code review

Signed-off-by: Julius Volz <julius.volz@gmail.com>

Co-authored-by: Levi Harrison <git@leviharrison.dev>

Co-authored-by: Julien Pivotto <roidelapluie@inuits.eu>
Co-authored-by: Nicholas Blott <blottn@tcd.ie>
Co-authored-by: Julius Volz <julius.volz@gmail.com>
Co-authored-by: Levi Harrison <git@leviharrison.dev>
2021-12-17 23:18:38 +01:00
beorn7 0ede6ae321 storage: fix bug #10027 in iterators' Seek method
Signed-off-by: beorn7 <beorn@grafana.com>
2021-12-16 12:07:35 +01:00
beorn7 6f33ab2b35 Merge branch 'main' into sparsehistogram 2021-12-15 13:49:33 +01:00
Ganesh Vernekar 227d155e3f
Fix queries after a failed snapshot replay (#9980)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-12-09 15:15:27 +05:30
Ganesh Vernekar 05d4d97bcd
Fix queries after a failed snapshot replay (#9980)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-12-08 15:32:14 +00:00
Nick Pillitteri 084bd70708
Convert atomic Int64 to native type when logging value (#9938)
The atomic Int64 type wasn't able to be represented when logging
via go-kit log and ended up as `"unsupported value type"`.

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
2021-12-06 22:25:22 +01:00
beorn7 e8e9155a11 Merge branch 'main' into sparsehistogram 2021-11-30 18:22:37 +01:00
beorn7 e4e24453fa Merge branch 'main' into beorn7/merge2 2021-11-30 17:19:06 +01:00
Robert Fratto 4cbddb41eb
tsdb/agent: Synchronize appender code with grafana/agent main (#9664)
* tsdb/agent: synchronize appender code with grafana/agent main

This commit synchronize the appender code with grafana/agent main. This
includes adding support for appending exemplars.

Closes #9610

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb/agent: fix build error

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb/agent: introduce some exemplar tests, refactor tests a little

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb/agent: address review feedback

- Re-use hash when creating a new series
- Fix typo in exemplar append error

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb/agent: remove unused AddFast method

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb/agent: close wal reader after test

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb/agent: add out-of-order tracking, change series TS in commit

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* address review feedback

Signed-off-by: Robert Fratto <robertfratto@gmail.com>
2021-11-30 21:14:40 +05:30
Björn Rabenstein b866db009b
storage: Fix and improve the Seek method of various iterators (#9878)
There was a subtle and nasty bug in listSeriesIterator.Seek.

In addition, the Seek call is defined to be a no-op if the current
position of the iterator is already pointing to a suitable
sample. This commit adds fast paths for this case to several
potentially expensive Seek calls.

Another bug was in concreteSeriesIterator.Seek. It always searched the
whole series and not from the current position of the iterator.

Signed-off-by: beorn7 <beorn@grafana.com>
2021-11-29 15:17:56 +05:30
Björn Rabenstein 7e42acd3b1
tsdb: Rework iterators (#9877)
- Pick At... method via return value of Next/Seek.
- Do not clobber returned buckets.
- Add partial FloatHistogram suppert.

Note that the promql package is now _only_ dealing with
FloatHistograms, following the idea that PromQL only knows float
values.

As a byproduct, I have removed the histogramSeries metric. In my
understanding, series can have both float and histogram samples, so
that metric doesn't make sense anymore.

As another byproduct, I have converged the sampleBuf and the
histogramSampleBuf in memSeries into one. The sample type stored in
the sampleBuf has been extended to also contain histograms even before
this commit.

Signed-off-by: beorn7 <beorn@grafana.com>
2021-11-29 13:24:23 +05:30
Ganesh Vernekar 26c0a433f5
Support appending different sample types to the same series (#9705)
* Support appending different sample types to the same series

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix build

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-11-26 17:43:27 +05:30
Bryan Boreham 1b74a3812e
Fix panic, out of order chunks, and race warning during WAL replay (#9856)
* Fix panic on WAL replay

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Refactor: introduce walSubsetProcessor

walSubsetProcessor packages up the `processWALSamples()` function and
its input and output channels, helping to clarify how these things
relate.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Refactor: extract more methods onto walSubsetProcessor

This makes the main logic easier to follow.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Fix race warning by locking processWALSamples

Although we have waited for the processor to finish, we still get a
warning from the race detector because it doesn't know how the different
parts relate.

Add a lock round each batch of samples, so the race detector can see
that we never access series owned by the processor outside of a lock.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Added test to reproduce issue 9859

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Remove redundant unit test

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix out of order chunks during WAL replay

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix nits

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>
2021-11-25 13:36:14 +05:30
Oleg Zaytsev 5e746e4e88
Check postings bytes length when decoding (#9766)
Added validation to expected postings length compared to the bytes slice
length. With 32bit postings, we expect to have 4 bytes per each posting.
If the number doesn't add up, we know that the input data is not
compatible with our code (maybe it's cut, or padded with trash, or even
written in a different coded).

This is needed in downstream projects to correctly identify cached
postings written with an unknown codec, but it's also a good idea to
validate it here.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2021-11-24 15:26:37 +05:30
Darshan Chaudhary 9dcf8b2208
Add the ability to disable tsdb isolation (#9270)
* Disable isolation in isolation struct

Signed-off-by: darshanime <deathbullet@gmail.com>

* Run tsdb tests with isolation disabled

Signed-off-by: darshanime <deathbullet@gmail.com>

* Check for isolation disabled in isoState.Close()

Signed-off-by: darshanime <deathbullet@gmail.com>

* use t.Skip to skip isolation tests when disabled

Signed-off-by: darshanime <deathbullet@gmail.com>

* address review comments

Signed-off-by: darshanime <deathbullet@gmail.com>

* fix test for defaultIsolationState

Signed-off-by: darshanime <deathbullet@gmail.com>

* Change flag name. Set flag in DB. Do not init txRing. Close isoState.

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Test disabled isolation in CircleCI test_go

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Skip isolation related tests in db_test.go

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-11-19 15:41:32 +05:30
beorn7 5d4db805ac Merge branch 'main' into sparsehistogram 2021-11-17 19:57:31 +01:00
Dieter Plaetinck 067efc3725
clarify HeadChunkID type and usage (#9726)
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
2021-11-17 18:35:10 +05:30
Sunil Thaha a484a83d4a
fix: panic when checkpoint directory is empty (#9687)
Calling `wal.NewSegmentBufReader()` without any segments would cause a
`panic` resulting in prometheus crashing. This patch fixes the panic by
making segmentBufReader return a EOF if there are not segments.

This also means an empty checkpoint directory which should never be the
case unless it has been tampered with (or has issues due to the
underlying filesystem e.g. NFS) would be ignored by Prometheus and would
continue to run instead of the current behaviour which is to panic.

Fixes: https://github.com/prometheus/prometheus/issues/9605

Signed-off-by: Sunil Thaha <sthaha@redhat.com>
2021-11-17 16:39:04 +05:30
Dieter Plaetinck 0fac9bb859
Add basic initial developer docs for TSDB (#9451)
* Add basic initial developer docs for TSDB

There's a decent amount of content already out there (blog posts,
conference talks, etc), but:
* when they get stale, they don't tend to get updated
* they still leave me with questions that I'ld like to answer
  for developers (like me) who want to use, or work with, TSDB

What I propose is developer docs inside the prometheus
repository.  Easy to find and harness the power of the community
to expand it and keep it up to date.

* perfect is the enemy of good.  Let's have a base and incrementally improve
* Markdown docs should be broad but not too deep.  Source code comments
  can complement them, and are the ideal place for implementation details.

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* use example code that works out of the box

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* Apply suggestions from code review

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* PR feedback

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* more docs

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* PR feedback

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* Apply suggestions from code review

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Apply suggestions from code review

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

* feedback

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* Update tsdb/docs/usage.md

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

* final tweaks

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* workaround docs versioning issue

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* Move example code to real executable, testable example.

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* cleanup example test and make sure it always reproduces

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* obtain temp dir in a way that works with older Go versions

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* Fix Ganesh's comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-11-17 15:51:27 +05:30
beorn7 4c28d9fac7 Move to histogram.Histogram pointers
This is to avoid copying the many fields of a histogram.Histogram all
the time.

This also fixes a bunch of formerly broken tests.

Signed-off-by: beorn7 <beorn@grafana.com>
2021-11-12 23:17:35 +01:00
Mauro Stettler 8a4f659126 fix error message
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
2021-11-12 21:45:46 +01:00
Robert Fratto 72a9f7fee9
Share TSDB locker code with agent (#9623)
* share tsdb db locker code with agent

Closes #9616

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* add flag to disable lockfile for agent

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* use agentOnlySetting instead of PreAction

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb: address review feedback

1. Rename Locker to DirLocker
2. Move DirLocker to tsdb/tsdbutil
3. Name metric using fmt.Sprintf
4. Refine error checking in DirLocker test

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb: create test utilities to assert expected DirLocker behavior

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb/tsdbutil: fix lint errors

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb/agent: fix windows test failure

Use new DB variable instead of overriding the old one.

Signed-off-by: Robert Fratto <robertfratto@gmail.com>
2021-11-11 11:45:25 -05:00
beorn7 f1065e44a4 model: String method for histogram.Histogram
This includes a regular bucket iterator and a string method for
histogram.Bucket.

Signed-off-by: beorn7 <beorn@grafana.com>
2021-11-11 17:29:22 +01:00
Peter Štibraný 422e7839d4
Add more size checks when writing individual sections in the index. (#9710)
* Add more size checks when writing individual sections in the index.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Use uint and add comment about it.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
2021-11-11 15:44:28 +05:30
Mateusz Gozdek 83086aee00 tsdb/agent: use unique registry per tests
So tests can run in parallel.

Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
2021-11-11 01:37:24 +01:00
Mateusz Gozdek 2f312ff4c5 tsdb: mark TestTombstoneCleanRetentionLimitsRace test as slow
It takes over 100 seconds to execute this test, so I'd consider it as
slow.

Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
2021-11-11 01:37:24 +01:00
Levi Harrison 7400e07fa9
Close DB in Agent tests (#9630)
* Close agent db in tests

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Close first DB before opening second

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Use seperate variables for different DBs?

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Close remote storage

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>

* Fix closing of stuff

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Remove the build flags after a rebase

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix closing of stuff 2

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

Co-authored-by: Julien Pivotto <roidelapluie@inuits.eu>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-11-10 20:05:54 +05:30
Mateusz Gozdek f4650c27e7 tsdb/wal: fix flaky TestReaderFuzz* tests
It seems sometimes you can get error like:

                Error:          Not equal:
                                expected: []byte(nil)
                                actual  : []byte{}

                                Diff:
                                --- Expected
                                +++ Actual
                                @@ -1,2 +1,3 @@
                                -([]uint8) <nil>
                                +([]uint8) {
                                +}

This commit does what bytes.Equal does to silence those differences. I'm
not sure if this is a correct solution or just covering up the actual bug.

Closes #9574

Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
2021-11-09 14:32:20 +01:00
Björn Rabenstein 4c56a193c5
Merge pull request #9478 from prometheus/beorn7/pkg-deprecation
Move packages out of deprecated pkg directory
2021-11-09 11:09:16 +01:00
曹明 a0d31c28fc tsdb: Add windows arm64 support.
Signed-off-by: 曹明 <caoming1@kingsoft.com>
2021-11-09 11:07:27 +01:00
Mateusz Gozdek b319b14431
tsdb/chunks: preallocate at least some space on non-Windows systems (#9581)
To avoid potential chunk corruption read, which I am not sure why is
happening.

Closes #9561.

Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
2021-11-09 13:47:00 +05:30
beorn7 c954cd9d1d Move packages out of deprecated pkg directory
This creates a new `model` directory and moves all data-model related
packages over there:
  exemplar labels relabel rulefmt textparse timestamp value

All the others are more or less utilities and have been moved to `util`:
  gate logging modetimevfs pool runtime

Signed-off-by: beorn7 <beorn@grafana.com>
2021-11-09 08:03:10 +01:00
beorn7 a1e595edac Fix two trivial lint warnings
Not sure why those show up for me locally but not if run by the CI.

Signed-off-by: beorn7 <beorn@grafana.com>
2021-11-08 22:32:13 +01:00
beorn7 8f92c90897 Add TODOs and some minor tweaks
Signed-off-by: beorn7 <beorn@grafana.com>
2021-11-07 17:12:04 +01:00
Dieter Plaetinck cda025b5b5
TSDB: demistify SeriesRefs and ChunkRefs (#9536)
* TSDB: demistify seriesRefs and ChunkRefs

The TSDB package contains many types of series and chunk references,
all shrouded in uint types.  Often the same uint value may
actually mean one of different types, in non-obvious ways.

This PR aims to clarify the code and help navigating to relevant docs,
usage, etc much quicker.

Concretely:

* Use appropriately named types and document their semantics and
  relations.
* Make multiplexing and demuxing of types explicit
  (on the boundaries between concrete implementations and generic
  interfaces).
* Casting between different types should be free.  None of the changes
  should have any impact on how the code runs.

TODO: Implement BlockSeriesRef where appropriate (for a future PR)

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* feedback

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* agent: demistify seriesRefs and ChunkRefs

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
2021-11-06 15:40:04 +05:30
johncming b882d2b7c7
tsdb/wal: Avoid writing closed channel. (#9566)
Signed-off-by: johncming <johncming@yahoo.com>
2021-11-06 15:11:06 +05:30
chenlujjj d18e42c650
refine comments of Checkpoint function (#9655)
Signed-off-by: chenlujjj <953546398@qq.com>
2021-11-06 15:09:16 +05:30
Marco Pracucci 309b094b92
Optimized MemPostings.EnsureOrder() (#9673)
* Optimizes MemPostings.EnsureOrder()

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Ignore linter warning

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2021-11-05 10:01:23 +00:00
Ganesh Vernekar c8b267efd6
Get histograms from TSDB to the rate() function implementation
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-11-03 19:04:18 +05:30
Marco Pracucci 9f5ff5b269
Allow to disable trimming when querying TSDB (#9647)
* Allow to disable trimming when querying TSDB

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Addressed review comments

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added unit test

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Renamed TrimDisabled to DisableTrimming

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2021-11-03 15:38:34 +05:30
Marco Pracucci edd05d7010
Add Head.AppendableMinValidTime() (#9643)
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2021-11-03 13:09:54 +05:30
Mateusz Gozdek b7bdf6fab2 Fix imports formatting
According to
2829908806 (r58457095).

Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
2021-11-02 19:52:34 +01:00
Mateusz Gozdek 1a6c2283a3 Format Go source files using 'gofumpt -w -s -extra'
Part of #9557

Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
2021-11-02 19:52:34 +01:00
Julien Pivotto 6e1d6edb33
Exclude agent from windows tests (#9645)
We are aware of the issue, but while we are working on it,
having main tests broken is an annoyance.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2021-11-02 13:58:51 +01:00
chenlujjj 660329d5b3
add tombstoneFormatVersionSize & tombstonesCRCSize constants (#9625)
Signed-off-by: chenlujjj <953546398@qq.com>
2021-11-01 16:05:19 +05:30
Praveen Ghuge 64d9b41998
Use testing.T.TempDir() instead of ioutil.TempDir() in tsdb/wal unit tests (#9602)
Signed-off-by: Praveen Ghuge <praveen.ghuge@outlook.com>
2021-11-01 12:28:18 +05:30
Robert Fratto bc72a718c4
Initial draft of prometheus-agent (#8785)
* Initial draft of prometheus-agent

This commit introduces a new binary, prometheus-agent, based on the
Grafana Agent code. It runs a WAL-only version of prometheus without the
TSDB, alerting, or rule evaluations. It is intended to be used to
remote_write to Prometheus or another remote_write receiver.

By default, prometheus-agent will listen on port 9095 to not collide
with the prometheus default of 9090.

Truncation of the WAL cooperates on a best-effort case with Remote
Write. Every time the WAL is truncated, the minimum timestamp of data to
truncate is determined by the lowest sent timestamp of all samples
across all remote_write endpoints. This gives loose guarantees that data
from the WAL will not try to be removed until the maximum sample
lifetime passes or remote_write starts functionining.

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* add tests for Prometheus agent (#22)

* add tests for Prometheus agent

* add tests for Prometheus agent

* rearranged tests as per the review comments

* update tests for Agent

* changes as per code review comments

Signed-off-by: SriKrishna Paparaju <paparaju@gmail.com>

* incremental changes to prometheus agent

Signed-off-by: SriKrishna Paparaju <paparaju@gmail.com>

* changes as per code review comments

Signed-off-by: SriKrishna Paparaju <paparaju@gmail.com>

* Commit feedback from code review

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* Port over some comments from grafana/agent

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* Rename agent.Storage to agent.DB for tsdb consistency

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* Consolidate agentMode ifs in cmd/prometheus/main.go

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* Document PreAction usage requirements better for agent mode flags

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* remove unnecessary defaultListenAddr

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* `go fmt ./tsdb/agent` and fix lint errors

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

Co-authored-by: SriKrishna Paparaju <paparaju@gmail.com>
2021-10-29 16:25:05 +01:00
Xiaochao Dong c2d1c85857
close tsdb.head in test case (#9580)
Signed-off-by: Xiaochao Dong (@damnever) <the.xcdong@gmail.com>
2021-10-26 11:36:25 +05:30
Furkan Türkal 0c07663b70
fix: possible race on shared variables in test (#9470)
Fixes #9433

Signed-off-by: Furkan <furkan.turkal@trendyol.com>
2021-10-25 18:44:40 +05:30
Dieter Plaetinck d5bfbe3114
improve bstream comments and doc (#9560)
* improve bstream comments and doc

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* feedback

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
2021-10-25 18:44:15 +05:30
Julien Pivotto 73255e15f6 Address golint failures from revive
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2021-10-23 00:53:11 +02:00
Serge Catudal 8c3eca84db
Fix remote write receiver endpoint for exemplars (#9414)
Signed-off-by: Serge Catudal <serge.catudal@gmail.com>
2021-10-21 22:58:40 +02:00
beorn7 a9008f5423 Merge branch 'main' into sparsehistogram 2021-10-19 17:14:23 +02:00
beorn7 4998b9750f chunkenc: Bugfix and naming tweaks
Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-19 15:38:32 +02:00
beorn7 78ef9c6359 chunkenc: make xor reading more DRY
Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-19 15:28:33 +02:00
beorn7 4a1b84f8b2 chunkenc: make xor writing more DRY
Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-19 15:28:33 +02:00
Björn Rabenstein 3704c6c20a
Merge pull request #9533 from prometheus/beorn7/sparsehistogram
tsdb: Complete chunk format documentation
2021-10-19 13:51:46 +02:00
beorn7 1a4e54cfbb tsdb: Complete chunk format documentation
This also tweaks and fixes a few things done previously.

Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-19 13:51:30 +02:00
beorn7 0876d57aea chunkenc: Add test for chunk layout encoding
And fix a bug exposed by it...

Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-18 19:37:24 +02:00
beorn7 ad9b4c2b68 Fix typos
Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-18 15:44:13 +02:00
beorn7 fe50d6fc14 Update chunk layout documentation
Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-15 23:18:41 +02:00
beorn7 ed33aea392 Avoid redundant varint decoding in chunk appender construction
Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-15 20:33:14 +02:00
beorn7 d31bb75dc4 Use VarbitUint rather than VarbitInt to encode len(spans)
Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-15 15:27:32 +02:00
beorn7 3179215a59 Encode zero threshold first
This guaranees that the zero threshold is byte-aligned. Not sure if
that helps in any way, but at least it won't harm.

Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-14 14:55:21 +02:00
beorn7 c5522677bf Improve encoding of zero threshold
Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-14 14:47:26 +02:00
beorn7 7093b089f2 Use more varbit in histogram chunks
This adds bit buckets for larger numbers to varbit encoding and also
an unsigned version of varbit encoding.

Then, varbit encoding is used for all the histogram chunk data instead
of varint.

Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-13 20:03:35 +02:00
Björn Rabenstein 7309c20e7e
Merge pull request #9500 from codesome/resettests
Add unit test for counter reset header
2021-10-13 18:19:21 +02:00
Ganesh Vernekar dcaf568279
Metadata -> Layout renaming
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-13 20:27:48 +05:30
Ganesh Vernekar 4e206c7c77
Fix reviews
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-13 20:23:31 +05:30
Dieter Plaetinck d5afe0a577
TSDB: Use a dedicated head chunk reference type (#9501)
* Use dedicated Ref type

Throughout the code base, there are reference types masked as
regular integers.  Let's use dedicated types.  They are
equivalent, but clearer semantically.
This also makes it trivial to find where they are used,
and from uses, find the centralized docs.

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* postpone some work until after possible return

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* clarify

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* rename feedback

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* skip header is up to caller

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
2021-10-13 17:44:32 +05:30
Ganesh Vernekar 85e6686f84
Add unit test for counter reset header
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-13 15:57:42 +05:30
Björn Rabenstein 311673d62e
Save on slice allocations in histogramIterator (#9494)
Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-13 14:43:49 +05:30
Björn Rabenstein c450c01eb9
Remove obsolete TODOs about metadata (#9490)
Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-12 17:00:30 +05:30
beorn7 7a8bb8222c Style cleanup of all the changes in sparsehistogram so far
A lot of this code was hacked together, literally during a
hackathon. This commit intends not to change the code substantially,
but just make the code obey the usual style practices.

A (possibly incomplete) list of areas:

* Generally address linter warnings.

* The `pgk` directory is deprecated as per dev-summit. No new packages should
  be added to it. I moved the new `pkg/histogram` package to `model`
  anticipating what's proposed in #9478.

* Make the naming of the Sparse Histogram more consistent. Including
  abbreviations, there were just too many names for it: SparseHistogram,
  Histogram, Histo, hist, his, shs, h. The idea is to call it "Histogram" in
  general. Only add "Sparse" if it is needed to avoid confusion with
  conventional Histograms (which is rare because the TSDB really has no notion
  of conventional Histograms). Use abbreviations only in local scope, and then
  really abbreviate (not just removing three out of seven letters like in
  "Histo"). This is in the spirit of
  https://github.com/golang/go/wiki/CodeReviewComments#variable-names

* Several other minor name changes.

* A lot of formatting of doc comments. For one, following
  https://github.com/golang/go/wiki/CodeReviewComments#comment-sentences
  , but also layout question, anticipating how things will look like
  when rendered by `godoc` (even where `godoc` doesn't render them
  right now because they are for unexported types or not a doc comment
  at all but just a normal code comment - consistency is queen!).

* Re-enabled `TestQueryLog` and `TestEndopints` (they pass now,
  leaving them disabled was presumably an oversight).

* Bucket iterator for histogram.Histogram is now created with a
  method.

* HistogramChunk.iterator now allows iterator recycling. (I think
  @dieterbe only commented it out because he was confused by the
  question in the comment.)

* HistogramAppender.Append panics now because we decided to treat
  staleness marker differently.

Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-11 13:02:03 +02:00
beorn7 fd5ea4e0b5 Merge branch 'main' into sparsehistogram 2021-10-07 23:16:42 +02:00
Ganesh Vernekar 5d4dc7e413
Convert the header into an enum
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-07 19:53:24 +05:30
Ganesh Vernekar 175ef4ebcf
Add a NotCounterReset flag
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-06 16:06:34 +05:30
Ganesh Vernekar a280b6c2da
Fix review comments
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-06 15:28:10 +05:30
Ganesh Vernekar 10d4cb6dc0
Merge remote-tracking branch 'upstream/main' into release-2.30-merge 2021-10-05 20:35:14 +05:30
Ganesh Vernekar 9d81b2d610
Fix tests
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-05 14:21:24 +05:30
Ganesh Vernekar 59d77ff16d
Fix build
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-05 12:57:49 +05:30
Ganesh Vernekar 10bc6e80ee
Fix panic on failed snapshot replay and don't hard fail replay on disabled exemplars (#9438)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-05 10:51:25 +05:30
Ganesh Vernekar eb9931e961
Add info about counter resets in chunk meta
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-04 18:44:12 +05:30
Ganesh Vernekar b30db03f35
Cut v2.30.2 (#9426)
* Don't error on overlapping m-mapped chunks during WAL replay (#9381)

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Reduce log level during WAL replay on overlapping m-map chunks (#9425)

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Cut v2.30.2

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-01 17:00:22 +05:30
Ganesh Vernekar a7d499e19a
Reduce log level during WAL replay on overlapping m-map chunks (#9425)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-01 15:33:29 +05:30
Ganesh Vernekar 8c597e5166
Don't error on overlapping m-mapped chunks during WAL replay (#9381)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-01 14:34:12 +05:30
Ganesh Vernekar 1dd22ed655
Support stale samples for sparse histograms (#9352)
* Support stale samples for sparse histograms

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Don't cut a new chunk for every stale sample

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Update comments for HistoAppender.Appendable

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-01 13:41:51 +05:30
Bryan Boreham 1fb3c1b598
Replace calls to strings.Compare (#9397)
< is clearer and faster. As the documentation says,
"Basically no one should use strings.Compare."

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2021-09-27 17:33:53 +05:30
Ganesh Vernekar 2bcd9f2f69
Link 2 more TSDB blog posts in tsdb/README.md (#9371)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-09-21 21:35:33 +05:30
Darshan Chaudhary 1f688657bf
Call delete on head if interval overlaps (#9151)
* Call delete on head if interval overlaps

Signed-off-by: darshanime <deathbullet@gmail.com>

* Garbage collect tombstones during head gc

Signed-off-by: darshanime <deathbullet@gmail.com>

* Truncate tombstones before min time during head gc

Signed-off-by: darshanime <deathbullet@gmail.com>

* Lock less by deleting all keys in a single pass

Signed-off-by: darshanime <deathbullet@gmail.com>

* Pass map to DeleteTombstones

Signed-off-by: darshanime <deathbullet@gmail.com>

* Create new slice to replace old one

Signed-off-by: darshanime <deathbullet@gmail.com>
2021-09-16 12:20:03 +05:30
Ganesh Vernekar 30534e99d9
Take snapshot only after closing the WAL (#9328)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-09-13 18:30:41 +05:30
Ganesh Vernekar 8944520ccc
Fix deletion of old snapshots (#9314)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-09-08 19:53:44 +05:30
Bryan Boreham 2327236bb5
Decrement active_appenders metric when no samples added (#9230)
* Decrement active_appenders metric when no samples added

Also add a test that the metric is incremented and decremented as
expected with and without samples.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Fix comment

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-09-08 14:49:58 +05:30
Bryan Boreham 87d909df4a
Remove symbols map from TSDB head (#9301)
This saves memory, effort and locking.

Since every symbol is also added to postings, `Symbols()` can be
implemented there instead. This now has to build a map for
deduplication, but `Symbols()` is only called for compaction, and `gc()`
used to rebuild the symbols map after every compaction so not an
additional cost.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2021-09-08 14:48:48 +05:30
Callum Styan 93886d8417
Fix div by 0 panic is resize function. (#9286)
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2021-09-02 11:08:05 -07:00
Ganesh Vernekar 1315d8ecb6
Remove query hacks in the API and fix metrics (#9275)
* Remove query hacks in the API and fix metrics

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Tests for the metrics

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Better way to count series on restart

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-08-31 17:01:19 +05:30
Ganesh Vernekar 35b1a82594
Exemplars in snapshot (#9255)
* Exemplars in snapshot

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix lint

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Add docs

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix lint

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-08-30 19:34:38 +05:30
Ganesh Vernekar eeace6bcab
Add couple of metrics to track sparse histograms in TSDB (#9271)
* Add couple of metrics to track sparse histograms in TSDB

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix Beorn's comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-08-30 19:08:44 +05:30
Levi Harrison 06afe6162c
Also ignore func1
Signed-off-by: Levi Harrison <git@leviharrison.dev>
2021-08-28 22:42:22 -04:00