Commit graph

532 commits

Author SHA1 Message Date
Ganesh Vernekar 26c0a433f5
Support appending different sample types to the same series (#9705)
* Support appending different sample types to the same series

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix build

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-11-26 17:43:27 +05:30
Bryan Boreham 1b74a3812e
Fix panic, out of order chunks, and race warning during WAL replay (#9856)
* Fix panic on WAL replay

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Refactor: introduce walSubsetProcessor

walSubsetProcessor packages up the `processWALSamples()` function and
its input and output channels, helping to clarify how these things
relate.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Refactor: extract more methods onto walSubsetProcessor

This makes the main logic easier to follow.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Fix race warning by locking processWALSamples

Although we have waited for the processor to finish, we still get a
warning from the race detector because it doesn't know how the different
parts relate.

Add a lock round each batch of samples, so the race detector can see
that we never access series owned by the processor outside of a lock.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Added test to reproduce issue 9859

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Remove redundant unit test

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix out of order chunks during WAL replay

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix nits

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>
2021-11-25 13:36:14 +05:30
Oleg Zaytsev 5e746e4e88
Check postings bytes length when decoding (#9766)
Added validation to expected postings length compared to the bytes slice
length. With 32bit postings, we expect to have 4 bytes per each posting.
If the number doesn't add up, we know that the input data is not
compatible with our code (maybe it's cut, or padded with trash, or even
written in a different coded).

This is needed in downstream projects to correctly identify cached
postings written with an unknown codec, but it's also a good idea to
validate it here.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2021-11-24 15:26:37 +05:30
Darshan Chaudhary 9dcf8b2208
Add the ability to disable tsdb isolation (#9270)
* Disable isolation in isolation struct

Signed-off-by: darshanime <deathbullet@gmail.com>

* Run tsdb tests with isolation disabled

Signed-off-by: darshanime <deathbullet@gmail.com>

* Check for isolation disabled in isoState.Close()

Signed-off-by: darshanime <deathbullet@gmail.com>

* use t.Skip to skip isolation tests when disabled

Signed-off-by: darshanime <deathbullet@gmail.com>

* address review comments

Signed-off-by: darshanime <deathbullet@gmail.com>

* fix test for defaultIsolationState

Signed-off-by: darshanime <deathbullet@gmail.com>

* Change flag name. Set flag in DB. Do not init txRing. Close isoState.

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Test disabled isolation in CircleCI test_go

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Skip isolation related tests in db_test.go

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-11-19 15:41:32 +05:30
beorn7 5d4db805ac Merge branch 'main' into sparsehistogram 2021-11-17 19:57:31 +01:00
Dieter Plaetinck 067efc3725
clarify HeadChunkID type and usage (#9726)
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
2021-11-17 18:35:10 +05:30
Sunil Thaha a484a83d4a
fix: panic when checkpoint directory is empty (#9687)
Calling `wal.NewSegmentBufReader()` without any segments would cause a
`panic` resulting in prometheus crashing. This patch fixes the panic by
making segmentBufReader return a EOF if there are not segments.

This also means an empty checkpoint directory which should never be the
case unless it has been tampered with (or has issues due to the
underlying filesystem e.g. NFS) would be ignored by Prometheus and would
continue to run instead of the current behaviour which is to panic.

Fixes: https://github.com/prometheus/prometheus/issues/9605

Signed-off-by: Sunil Thaha <sthaha@redhat.com>
2021-11-17 16:39:04 +05:30
Dieter Plaetinck 0fac9bb859
Add basic initial developer docs for TSDB (#9451)
* Add basic initial developer docs for TSDB

There's a decent amount of content already out there (blog posts,
conference talks, etc), but:
* when they get stale, they don't tend to get updated
* they still leave me with questions that I'ld like to answer
  for developers (like me) who want to use, or work with, TSDB

What I propose is developer docs inside the prometheus
repository.  Easy to find and harness the power of the community
to expand it and keep it up to date.

* perfect is the enemy of good.  Let's have a base and incrementally improve
* Markdown docs should be broad but not too deep.  Source code comments
  can complement them, and are the ideal place for implementation details.

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* use example code that works out of the box

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* Apply suggestions from code review

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* PR feedback

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* more docs

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* PR feedback

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* Apply suggestions from code review

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Apply suggestions from code review

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

* feedback

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* Update tsdb/docs/usage.md

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

* final tweaks

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* workaround docs versioning issue

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* Move example code to real executable, testable example.

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* cleanup example test and make sure it always reproduces

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* obtain temp dir in a way that works with older Go versions

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* Fix Ganesh's comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-11-17 15:51:27 +05:30
beorn7 4c28d9fac7 Move to histogram.Histogram pointers
This is to avoid copying the many fields of a histogram.Histogram all
the time.

This also fixes a bunch of formerly broken tests.

Signed-off-by: beorn7 <beorn@grafana.com>
2021-11-12 23:17:35 +01:00
Mauro Stettler 8a4f659126 fix error message
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
2021-11-12 21:45:46 +01:00
Robert Fratto 72a9f7fee9
Share TSDB locker code with agent (#9623)
* share tsdb db locker code with agent

Closes #9616

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* add flag to disable lockfile for agent

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* use agentOnlySetting instead of PreAction

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb: address review feedback

1. Rename Locker to DirLocker
2. Move DirLocker to tsdb/tsdbutil
3. Name metric using fmt.Sprintf
4. Refine error checking in DirLocker test

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb: create test utilities to assert expected DirLocker behavior

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb/tsdbutil: fix lint errors

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb/agent: fix windows test failure

Use new DB variable instead of overriding the old one.

Signed-off-by: Robert Fratto <robertfratto@gmail.com>
2021-11-11 11:45:25 -05:00
beorn7 f1065e44a4 model: String method for histogram.Histogram
This includes a regular bucket iterator and a string method for
histogram.Bucket.

Signed-off-by: beorn7 <beorn@grafana.com>
2021-11-11 17:29:22 +01:00
Peter Štibraný 422e7839d4
Add more size checks when writing individual sections in the index. (#9710)
* Add more size checks when writing individual sections in the index.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Use uint and add comment about it.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
2021-11-11 15:44:28 +05:30
Mateusz Gozdek 83086aee00 tsdb/agent: use unique registry per tests
So tests can run in parallel.

Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
2021-11-11 01:37:24 +01:00
Mateusz Gozdek 2f312ff4c5 tsdb: mark TestTombstoneCleanRetentionLimitsRace test as slow
It takes over 100 seconds to execute this test, so I'd consider it as
slow.

Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
2021-11-11 01:37:24 +01:00
Levi Harrison 7400e07fa9
Close DB in Agent tests (#9630)
* Close agent db in tests

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Close first DB before opening second

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Use seperate variables for different DBs?

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Close remote storage

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>

* Fix closing of stuff

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Remove the build flags after a rebase

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix closing of stuff 2

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

Co-authored-by: Julien Pivotto <roidelapluie@inuits.eu>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-11-10 20:05:54 +05:30
Mateusz Gozdek f4650c27e7 tsdb/wal: fix flaky TestReaderFuzz* tests
It seems sometimes you can get error like:

                Error:          Not equal:
                                expected: []byte(nil)
                                actual  : []byte{}

                                Diff:
                                --- Expected
                                +++ Actual
                                @@ -1,2 +1,3 @@
                                -([]uint8) <nil>
                                +([]uint8) {
                                +}

This commit does what bytes.Equal does to silence those differences. I'm
not sure if this is a correct solution or just covering up the actual bug.

Closes #9574

Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
2021-11-09 14:32:20 +01:00
Björn Rabenstein 4c56a193c5
Merge pull request #9478 from prometheus/beorn7/pkg-deprecation
Move packages out of deprecated pkg directory
2021-11-09 11:09:16 +01:00
曹明 a0d31c28fc tsdb: Add windows arm64 support.
Signed-off-by: 曹明 <caoming1@kingsoft.com>
2021-11-09 11:07:27 +01:00
Mateusz Gozdek b319b14431
tsdb/chunks: preallocate at least some space on non-Windows systems (#9581)
To avoid potential chunk corruption read, which I am not sure why is
happening.

Closes #9561.

Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
2021-11-09 13:47:00 +05:30
beorn7 c954cd9d1d Move packages out of deprecated pkg directory
This creates a new `model` directory and moves all data-model related
packages over there:
  exemplar labels relabel rulefmt textparse timestamp value

All the others are more or less utilities and have been moved to `util`:
  gate logging modetimevfs pool runtime

Signed-off-by: beorn7 <beorn@grafana.com>
2021-11-09 08:03:10 +01:00
beorn7 a1e595edac Fix two trivial lint warnings
Not sure why those show up for me locally but not if run by the CI.

Signed-off-by: beorn7 <beorn@grafana.com>
2021-11-08 22:32:13 +01:00
beorn7 8f92c90897 Add TODOs and some minor tweaks
Signed-off-by: beorn7 <beorn@grafana.com>
2021-11-07 17:12:04 +01:00
Dieter Plaetinck cda025b5b5
TSDB: demistify SeriesRefs and ChunkRefs (#9536)
* TSDB: demistify seriesRefs and ChunkRefs

The TSDB package contains many types of series and chunk references,
all shrouded in uint types.  Often the same uint value may
actually mean one of different types, in non-obvious ways.

This PR aims to clarify the code and help navigating to relevant docs,
usage, etc much quicker.

Concretely:

* Use appropriately named types and document their semantics and
  relations.
* Make multiplexing and demuxing of types explicit
  (on the boundaries between concrete implementations and generic
  interfaces).
* Casting between different types should be free.  None of the changes
  should have any impact on how the code runs.

TODO: Implement BlockSeriesRef where appropriate (for a future PR)

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* feedback

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* agent: demistify seriesRefs and ChunkRefs

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
2021-11-06 15:40:04 +05:30
johncming b882d2b7c7
tsdb/wal: Avoid writing closed channel. (#9566)
Signed-off-by: johncming <johncming@yahoo.com>
2021-11-06 15:11:06 +05:30
chenlujjj d18e42c650
refine comments of Checkpoint function (#9655)
Signed-off-by: chenlujjj <953546398@qq.com>
2021-11-06 15:09:16 +05:30
Marco Pracucci 309b094b92
Optimized MemPostings.EnsureOrder() (#9673)
* Optimizes MemPostings.EnsureOrder()

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Ignore linter warning

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2021-11-05 10:01:23 +00:00
Ganesh Vernekar c8b267efd6
Get histograms from TSDB to the rate() function implementation
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-11-03 19:04:18 +05:30
Marco Pracucci 9f5ff5b269
Allow to disable trimming when querying TSDB (#9647)
* Allow to disable trimming when querying TSDB

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Addressed review comments

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added unit test

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Renamed TrimDisabled to DisableTrimming

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2021-11-03 15:38:34 +05:30
Marco Pracucci edd05d7010
Add Head.AppendableMinValidTime() (#9643)
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2021-11-03 13:09:54 +05:30
Mateusz Gozdek b7bdf6fab2 Fix imports formatting
According to
2829908806 (r58457095).

Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
2021-11-02 19:52:34 +01:00
Mateusz Gozdek 1a6c2283a3 Format Go source files using 'gofumpt -w -s -extra'
Part of #9557

Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
2021-11-02 19:52:34 +01:00
Julien Pivotto 6e1d6edb33
Exclude agent from windows tests (#9645)
We are aware of the issue, but while we are working on it,
having main tests broken is an annoyance.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2021-11-02 13:58:51 +01:00
chenlujjj 660329d5b3
add tombstoneFormatVersionSize & tombstonesCRCSize constants (#9625)
Signed-off-by: chenlujjj <953546398@qq.com>
2021-11-01 16:05:19 +05:30
Praveen Ghuge 64d9b41998
Use testing.T.TempDir() instead of ioutil.TempDir() in tsdb/wal unit tests (#9602)
Signed-off-by: Praveen Ghuge <praveen.ghuge@outlook.com>
2021-11-01 12:28:18 +05:30
Robert Fratto bc72a718c4
Initial draft of prometheus-agent (#8785)
* Initial draft of prometheus-agent

This commit introduces a new binary, prometheus-agent, based on the
Grafana Agent code. It runs a WAL-only version of prometheus without the
TSDB, alerting, or rule evaluations. It is intended to be used to
remote_write to Prometheus or another remote_write receiver.

By default, prometheus-agent will listen on port 9095 to not collide
with the prometheus default of 9090.

Truncation of the WAL cooperates on a best-effort case with Remote
Write. Every time the WAL is truncated, the minimum timestamp of data to
truncate is determined by the lowest sent timestamp of all samples
across all remote_write endpoints. This gives loose guarantees that data
from the WAL will not try to be removed until the maximum sample
lifetime passes or remote_write starts functionining.

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* add tests for Prometheus agent (#22)

* add tests for Prometheus agent

* add tests for Prometheus agent

* rearranged tests as per the review comments

* update tests for Agent

* changes as per code review comments

Signed-off-by: SriKrishna Paparaju <paparaju@gmail.com>

* incremental changes to prometheus agent

Signed-off-by: SriKrishna Paparaju <paparaju@gmail.com>

* changes as per code review comments

Signed-off-by: SriKrishna Paparaju <paparaju@gmail.com>

* Commit feedback from code review

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* Port over some comments from grafana/agent

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* Rename agent.Storage to agent.DB for tsdb consistency

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* Consolidate agentMode ifs in cmd/prometheus/main.go

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* Document PreAction usage requirements better for agent mode flags

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* remove unnecessary defaultListenAddr

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* `go fmt ./tsdb/agent` and fix lint errors

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

Co-authored-by: SriKrishna Paparaju <paparaju@gmail.com>
2021-10-29 16:25:05 +01:00
Xiaochao Dong c2d1c85857
close tsdb.head in test case (#9580)
Signed-off-by: Xiaochao Dong (@damnever) <the.xcdong@gmail.com>
2021-10-26 11:36:25 +05:30
Furkan Türkal 0c07663b70
fix: possible race on shared variables in test (#9470)
Fixes #9433

Signed-off-by: Furkan <furkan.turkal@trendyol.com>
2021-10-25 18:44:40 +05:30
Dieter Plaetinck d5bfbe3114
improve bstream comments and doc (#9560)
* improve bstream comments and doc

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* feedback

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
2021-10-25 18:44:15 +05:30
Julien Pivotto 73255e15f6 Address golint failures from revive
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2021-10-23 00:53:11 +02:00
Serge Catudal 8c3eca84db
Fix remote write receiver endpoint for exemplars (#9414)
Signed-off-by: Serge Catudal <serge.catudal@gmail.com>
2021-10-21 22:58:40 +02:00
beorn7 a9008f5423 Merge branch 'main' into sparsehistogram 2021-10-19 17:14:23 +02:00
beorn7 4998b9750f chunkenc: Bugfix and naming tweaks
Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-19 15:38:32 +02:00
beorn7 78ef9c6359 chunkenc: make xor reading more DRY
Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-19 15:28:33 +02:00
beorn7 4a1b84f8b2 chunkenc: make xor writing more DRY
Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-19 15:28:33 +02:00
Björn Rabenstein 3704c6c20a
Merge pull request #9533 from prometheus/beorn7/sparsehistogram
tsdb: Complete chunk format documentation
2021-10-19 13:51:46 +02:00
beorn7 1a4e54cfbb tsdb: Complete chunk format documentation
This also tweaks and fixes a few things done previously.

Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-19 13:51:30 +02:00
beorn7 0876d57aea chunkenc: Add test for chunk layout encoding
And fix a bug exposed by it...

Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-18 19:37:24 +02:00
beorn7 ad9b4c2b68 Fix typos
Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-18 15:44:13 +02:00
beorn7 fe50d6fc14 Update chunk layout documentation
Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-15 23:18:41 +02:00
beorn7 ed33aea392 Avoid redundant varint decoding in chunk appender construction
Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-15 20:33:14 +02:00
beorn7 d31bb75dc4 Use VarbitUint rather than VarbitInt to encode len(spans)
Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-15 15:27:32 +02:00
beorn7 3179215a59 Encode zero threshold first
This guaranees that the zero threshold is byte-aligned. Not sure if
that helps in any way, but at least it won't harm.

Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-14 14:55:21 +02:00
beorn7 c5522677bf Improve encoding of zero threshold
Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-14 14:47:26 +02:00
beorn7 7093b089f2 Use more varbit in histogram chunks
This adds bit buckets for larger numbers to varbit encoding and also
an unsigned version of varbit encoding.

Then, varbit encoding is used for all the histogram chunk data instead
of varint.

Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-13 20:03:35 +02:00
Björn Rabenstein 7309c20e7e
Merge pull request #9500 from codesome/resettests
Add unit test for counter reset header
2021-10-13 18:19:21 +02:00
Ganesh Vernekar dcaf568279
Metadata -> Layout renaming
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-13 20:27:48 +05:30
Ganesh Vernekar 4e206c7c77
Fix reviews
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-13 20:23:31 +05:30
Dieter Plaetinck d5afe0a577
TSDB: Use a dedicated head chunk reference type (#9501)
* Use dedicated Ref type

Throughout the code base, there are reference types masked as
regular integers.  Let's use dedicated types.  They are
equivalent, but clearer semantically.
This also makes it trivial to find where they are used,
and from uses, find the centralized docs.

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* postpone some work until after possible return

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* clarify

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* rename feedback

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* skip header is up to caller

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
2021-10-13 17:44:32 +05:30
Ganesh Vernekar 85e6686f84
Add unit test for counter reset header
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-13 15:57:42 +05:30
Björn Rabenstein 311673d62e
Save on slice allocations in histogramIterator (#9494)
Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-13 14:43:49 +05:30
Björn Rabenstein c450c01eb9
Remove obsolete TODOs about metadata (#9490)
Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-12 17:00:30 +05:30
beorn7 7a8bb8222c Style cleanup of all the changes in sparsehistogram so far
A lot of this code was hacked together, literally during a
hackathon. This commit intends not to change the code substantially,
but just make the code obey the usual style practices.

A (possibly incomplete) list of areas:

* Generally address linter warnings.

* The `pgk` directory is deprecated as per dev-summit. No new packages should
  be added to it. I moved the new `pkg/histogram` package to `model`
  anticipating what's proposed in #9478.

* Make the naming of the Sparse Histogram more consistent. Including
  abbreviations, there were just too many names for it: SparseHistogram,
  Histogram, Histo, hist, his, shs, h. The idea is to call it "Histogram" in
  general. Only add "Sparse" if it is needed to avoid confusion with
  conventional Histograms (which is rare because the TSDB really has no notion
  of conventional Histograms). Use abbreviations only in local scope, and then
  really abbreviate (not just removing three out of seven letters like in
  "Histo"). This is in the spirit of
  https://github.com/golang/go/wiki/CodeReviewComments#variable-names

* Several other minor name changes.

* A lot of formatting of doc comments. For one, following
  https://github.com/golang/go/wiki/CodeReviewComments#comment-sentences
  , but also layout question, anticipating how things will look like
  when rendered by `godoc` (even where `godoc` doesn't render them
  right now because they are for unexported types or not a doc comment
  at all but just a normal code comment - consistency is queen!).

* Re-enabled `TestQueryLog` and `TestEndopints` (they pass now,
  leaving them disabled was presumably an oversight).

* Bucket iterator for histogram.Histogram is now created with a
  method.

* HistogramChunk.iterator now allows iterator recycling. (I think
  @dieterbe only commented it out because he was confused by the
  question in the comment.)

* HistogramAppender.Append panics now because we decided to treat
  staleness marker differently.

Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-11 13:02:03 +02:00
beorn7 fd5ea4e0b5 Merge branch 'main' into sparsehistogram 2021-10-07 23:16:42 +02:00
Ganesh Vernekar 5d4dc7e413
Convert the header into an enum
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-07 19:53:24 +05:30
Ganesh Vernekar 175ef4ebcf
Add a NotCounterReset flag
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-06 16:06:34 +05:30
Ganesh Vernekar a280b6c2da
Fix review comments
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-06 15:28:10 +05:30
Ganesh Vernekar 10d4cb6dc0
Merge remote-tracking branch 'upstream/main' into release-2.30-merge 2021-10-05 20:35:14 +05:30
Ganesh Vernekar 9d81b2d610
Fix tests
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-05 14:21:24 +05:30
Ganesh Vernekar 59d77ff16d
Fix build
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-05 12:57:49 +05:30
Ganesh Vernekar 10bc6e80ee
Fix panic on failed snapshot replay and don't hard fail replay on disabled exemplars (#9438)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-05 10:51:25 +05:30
Ganesh Vernekar eb9931e961
Add info about counter resets in chunk meta
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-04 18:44:12 +05:30
Ganesh Vernekar b30db03f35
Cut v2.30.2 (#9426)
* Don't error on overlapping m-mapped chunks during WAL replay (#9381)

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Reduce log level during WAL replay on overlapping m-map chunks (#9425)

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Cut v2.30.2

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-01 17:00:22 +05:30
Ganesh Vernekar a7d499e19a
Reduce log level during WAL replay on overlapping m-map chunks (#9425)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-01 15:33:29 +05:30
Ganesh Vernekar 8c597e5166
Don't error on overlapping m-mapped chunks during WAL replay (#9381)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-01 14:34:12 +05:30
Ganesh Vernekar 1dd22ed655
Support stale samples for sparse histograms (#9352)
* Support stale samples for sparse histograms

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Don't cut a new chunk for every stale sample

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Update comments for HistoAppender.Appendable

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-01 13:41:51 +05:30
Bryan Boreham 1fb3c1b598
Replace calls to strings.Compare (#9397)
< is clearer and faster. As the documentation says,
"Basically no one should use strings.Compare."

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2021-09-27 17:33:53 +05:30
Ganesh Vernekar 2bcd9f2f69
Link 2 more TSDB blog posts in tsdb/README.md (#9371)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-09-21 21:35:33 +05:30
Darshan Chaudhary 1f688657bf
Call delete on head if interval overlaps (#9151)
* Call delete on head if interval overlaps

Signed-off-by: darshanime <deathbullet@gmail.com>

* Garbage collect tombstones during head gc

Signed-off-by: darshanime <deathbullet@gmail.com>

* Truncate tombstones before min time during head gc

Signed-off-by: darshanime <deathbullet@gmail.com>

* Lock less by deleting all keys in a single pass

Signed-off-by: darshanime <deathbullet@gmail.com>

* Pass map to DeleteTombstones

Signed-off-by: darshanime <deathbullet@gmail.com>

* Create new slice to replace old one

Signed-off-by: darshanime <deathbullet@gmail.com>
2021-09-16 12:20:03 +05:30
Ganesh Vernekar 30534e99d9
Take snapshot only after closing the WAL (#9328)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-09-13 18:30:41 +05:30
Ganesh Vernekar 8944520ccc
Fix deletion of old snapshots (#9314)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-09-08 19:53:44 +05:30
Bryan Boreham 2327236bb5
Decrement active_appenders metric when no samples added (#9230)
* Decrement active_appenders metric when no samples added

Also add a test that the metric is incremented and decremented as
expected with and without samples.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Fix comment

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-09-08 14:49:58 +05:30
Bryan Boreham 87d909df4a
Remove symbols map from TSDB head (#9301)
This saves memory, effort and locking.

Since every symbol is also added to postings, `Symbols()` can be
implemented there instead. This now has to build a map for
deduplication, but `Symbols()` is only called for compaction, and `gc()`
used to rebuild the symbols map after every compaction so not an
additional cost.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2021-09-08 14:48:48 +05:30
Callum Styan 93886d8417
Fix div by 0 panic is resize function. (#9286)
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2021-09-02 11:08:05 -07:00
Ganesh Vernekar 1315d8ecb6
Remove query hacks in the API and fix metrics (#9275)
* Remove query hacks in the API and fix metrics

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Tests for the metrics

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Better way to count series on restart

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-08-31 17:01:19 +05:30
Ganesh Vernekar 35b1a82594
Exemplars in snapshot (#9255)
* Exemplars in snapshot

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix lint

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Add docs

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix lint

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-08-30 19:34:38 +05:30
Ganesh Vernekar eeace6bcab
Add couple of metrics to track sparse histograms in TSDB (#9271)
* Add couple of metrics to track sparse histograms in TSDB

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix Beorn's comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-08-30 19:08:44 +05:30
Levi Harrison 06afe6162c
Also ignore func1
Signed-off-by: Levi Harrison <git@leviharrison.dev>
2021-08-28 22:42:22 -04:00
Julien Pivotto d5676fb9e0
Merge pull request #9254 from prometheus/superq/go1.17
Build with Go 1.17 / npm 7 / node 16
2021-08-28 18:36:42 +02:00
SuperQ e167a45c65
Add new Go build tags.
Add new go:build comments based on 1.17 formatting[0].

[0]: https://golang.org/doc/go1.17#gofmt

Signed-off-by: SuperQ <superq@gmail.com>
2021-08-27 10:24:14 +02:00
Callum Styan cc55e57c1b
Fix a data race in the loadWAL function caused by reusing the same error var in multiple goroutines (#9259)
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2021-08-27 11:49:34 +05:30
Ganesh Vernekar 8a5d8c15e3
Do not replay checkpoint if it is covered by snapshot (#9226)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-08-25 21:48:55 +05:30
Ganesh Vernekar c373200b75
Cut a new chunk on counter resets for any bucket
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-08-24 20:02:08 +05:30
Bryan Boreham 9dfdc3eb36
Speed up BenchmarkPostings_Stats (#9213)
The previous code re-used series IDs 1-1000 many times over, so a lot
of time was spent ensuring the lists of series were in ascending order.
The intended use of `MemPostings.Add()` is that all series IDs are
unique, and changing the benchmark to do this lets it finish ten times
faster.

(It doesn't affect the benchmark result, just the setup code)

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2021-08-18 10:27:21 +01:00
Ganesh Vernekar 328a74ca36
Fix bugs and add enhancements to the chunk snapshot (#9185)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-08-17 18:08:16 +01:00
Ganesh Vernekar eedb86783e
Fix queries on blocks for sparse histograms and add unit test (#9209)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-08-16 18:52:29 +05:30
Ganesh Vernekar 42f576aa18
Add test for sparse histogram compaction (#9208)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-08-16 16:32:23 +05:30
Jupiter 84ab705318
32 should better be replaced by "symbolFactor" (#9203)
Signed-off-by: tanghengjian <1040104807@qq.com>
2021-08-13 16:38:53 +05:30
Marco Pracucci 84e786ebc1
Fixed Decoder.Series() error checking (#9201)
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2021-08-13 16:11:41 +05:30
Julien Pivotto cab96a06ef
Merge release 2.29 in main (#9196)
* PromQL: Fix start and end keywords masking label and metric names

This commit fixes an issue with the "at modifier" that introduced two
new keywords: `start` and `end`. In grouping options and in metric
names, these keywords took precedence over metric or label names, so
that those metrics and labels could no longer be referenced.

Signed-off-by: Clayton Peters <clayton.peters@man.com>

* Add in additional tests for metrics and/or labels called start/end.

Signed-off-by: Clayton Peters <clayton.peters@man.com>

* *: Cut 2.29.0-rc.0

Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com>

* VERSION: bump to 2.29.0-rc.0

Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com>

* Remove experimental wording on size-based retention

Followup of #9004

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>

* Fix PR reference in changelog

Signed-off-by: George Brighton <george@gebn.co.uk>

* Describe EC2 availability zone IDs at most once per refresh (#9142)

Signed-off-by: George Brighton <george@gebn.co.uk>

* Describe EC2 availability zones at most once per SD load

Closes #9142.

Signed-off-by: George Brighton <george@gebn.co.uk>

* Incorporate feedback

Signed-off-by: George Brighton <george@gebn.co.uk>

* Integrate feedback

Signed-off-by: George Brighton <george@gebn.co.uk>

* Add a compatibility note for macOS users.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>

* *: Cut v2.29.0-rc.1

Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com>

* Fix `kuma_sd` targetgroup reporting (#9157)

* Bundle all xDS targets into a single group

Signed-off-by: austin ce <austin.cawley@gmail.com>

* *: cut v2.29.0-rc.2

Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com>

* Rename links

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* bump codemirror-promql to 0.17.0

Signed-off-by: Augustin Husson <husson.augustin@gmail.com>

* *: cut v2.29.0

Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com>

* tsdb: align atomically accessed int64 (#9192)

This prevents a panic in 32-bit archs:
https://pkg.go.dev/sync/atomic#pkg-note-BUG

Fixed #9190

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>

* Release 2.29.1 (#9193)

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>

Co-authored-by: Clayton Peters <clayton.peters@man.com>
Co-authored-by: Frederic Branczyk <fbranczyk@gmail.com>
Co-authored-by: George Brighton <george@gebn.co.uk>
Co-authored-by: Austin Cawley-Edwards <austin.cawley@gmail.com>
Co-authored-by: Levi Harrison <git@leviharrison.dev>
Co-authored-by: Augustin Husson <husson.augustin@gmail.com>
2021-08-12 18:38:06 +02:00