Commit graph

478 commits

Author SHA1 Message Date
beorn7 5d4db805ac Merge branch 'main' into sparsehistogram 2021-11-17 19:57:31 +01:00
Dieter Plaetinck 067efc3725
clarify HeadChunkID type and usage (#9726)
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
2021-11-17 18:35:10 +05:30
Sunil Thaha a484a83d4a
fix: panic when checkpoint directory is empty (#9687)
Calling `wal.NewSegmentBufReader()` without any segments would cause a
`panic`, crashing Prometheus. This patch fixes the panic by making
segmentBufReader return an EOF if there are no segments.

This also means that an empty checkpoint directory, which should never
occur unless it has been tampered with (or the underlying filesystem,
e.g. NFS, has issues), is now ignored by Prometheus, which continues to
run instead of panicking as before.

Fixes: https://github.com/prometheus/prometheus/issues/9605

Signed-off-by: Sunil Thaha <sthaha@redhat.com>
2021-11-17 16:39:04 +05:30
Dieter Plaetinck 0fac9bb859
Add basic initial developer docs for TSDB (#9451)
* Add basic initial developer docs for TSDB

There's a decent amount of content already out there (blog posts,
conference talks, etc), but:
* when they get stale, they don't tend to get updated
* they still leave me with questions that I'd like to answer
  for developers (like me) who want to use, or work with, TSDB

What I propose is developer docs inside the prometheus
repository.  Easy to find and harness the power of the community
to expand it and keep it up to date.

* perfect is the enemy of good.  Let's have a base and incrementally improve
* Markdown docs should be broad but not too deep.  Source code comments
  can complement them, and are the ideal place for implementation details.

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* use example code that works out of the box

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* Apply suggestions from code review

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* PR feedback

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* more docs

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* PR feedback

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* Apply suggestions from code review

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Apply suggestions from code review

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

* feedback

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* Update tsdb/docs/usage.md

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

* final tweaks

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* workaround docs versioning issue

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* Move example code to real executable, testable example.

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* cleanup example test and make sure it always reproduces

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* obtain temp dir in a way that works with older Go versions

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* Fix Ganesh's comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-11-17 15:51:27 +05:30
beorn7 4c28d9fac7 Move to histogram.Histogram pointers
This is to avoid copying the many fields of a histogram.Histogram all
the time.
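
As a rough illustration of the motivation, the sketch below contrasts passing a multi-field struct by value and by pointer. The `histogram` type here is a simplified, hypothetical struct, not the real `histogram.Histogram` definition.

```go
package main

import "fmt"

// histogram is a hypothetical, simplified struct with several fields;
// it is not the real histogram.Histogram definition.
type histogram struct {
	Count         uint64
	Sum           float64
	Schema        int32
	ZeroThreshold float64
	ZeroCount     uint64
	Buckets       []int64
}

// addByValue receives a copy of every field, so the caller's Count
// is left unchanged.
func addByValue(h histogram, n uint64) {
	h.Count += n
}

// addByPointer receives only a pointer: no fields are copied and the
// update is visible to the caller.
func addByPointer(h *histogram, n uint64) {
	h.Count += n
}

func main() {
	h := histogram{Count: 1}
	addByValue(h, 5)
	fmt.Println(h.Count) // still 1: only the copy was modified
	addByPointer(&h, 5)
	fmt.Println(h.Count) // 6: the original was modified
}
```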

This also fixes a bunch of formerly broken tests.

Signed-off-by: beorn7 <beorn@grafana.com>
2021-11-12 23:17:35 +01:00
Mauro Stettler 8a4f659126 fix error message
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
2021-11-12 21:45:46 +01:00
Robert Fratto 72a9f7fee9
Share TSDB locker code with agent (#9623)
* share tsdb db locker code with agent

Closes #9616

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* add flag to disable lockfile for agent

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* use agentOnlySetting instead of PreAction

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb: address review feedback

1. Rename Locker to DirLocker
2. Move DirLocker to tsdb/tsdbutil
3. Name metric using fmt.Sprintf
4. Refine error checking in DirLocker test

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb: create test utilities to assert expected DirLocker behavior

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb/tsdbutil: fix lint errors

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb/agent: fix windows test failure

Use new DB variable instead of overriding the old one.

Signed-off-by: Robert Fratto <robertfratto@gmail.com>
2021-11-11 11:45:25 -05:00
beorn7 f1065e44a4 model: String method for histogram.Histogram
This includes a regular bucket iterator and a string method for
histogram.Bucket.

Signed-off-by: beorn7 <beorn@grafana.com>
2021-11-11 17:29:22 +01:00
Peter Štibraný 422e7839d4
Add more size checks when writing individual sections in the index. (#9710)
* Add more size checks when writing individual sections in the index.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Use uint and add comment about it.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
2021-11-11 15:44:28 +05:30
Mateusz Gozdek 83086aee00 tsdb/agent: use unique registry per tests
So tests can run in parallel.

Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
2021-11-11 01:37:24 +01:00
Mateusz Gozdek 2f312ff4c5 tsdb: mark TestTombstoneCleanRetentionLimitsRace test as slow
It takes over 100 seconds to execute this test, so I'd consider it as
slow.

Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
2021-11-11 01:37:24 +01:00
Levi Harrison 7400e07fa9
Close DB in Agent tests (#9630)
* Close agent db in tests

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Close first DB before opening second

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Use separate variables for different DBs?

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Close remote storage

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>

* Fix closing of stuff

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Remove the build flags after a rebase

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix closing of stuff 2

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

Co-authored-by: Julien Pivotto <roidelapluie@inuits.eu>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-11-10 20:05:54 +05:30
Mateusz Gozdek f4650c27e7 tsdb/wal: fix flaky TestReaderFuzz* tests
It seems that sometimes you can get an error like:

                Error:          Not equal:
                                expected: []byte(nil)
                                actual  : []byte{}

                                Diff:
                                --- Expected
                                +++ Actual
                                @@ -1,2 +1,3 @@
                                -([]uint8) <nil>
                                +([]uint8) {
                                +}

This commit does what bytes.Equal does to silence those differences. I'm
not sure if this is a correct solution or just covering up the actual bug.

Closes #9574

Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
2021-11-09 14:32:20 +01:00
Björn Rabenstein 4c56a193c5
Merge pull request #9478 from prometheus/beorn7/pkg-deprecation
Move packages out of deprecated pkg directory
2021-11-09 11:09:16 +01:00
曹明 a0d31c28fc tsdb: Add windows arm64 support.
Signed-off-by: 曹明 <caoming1@kingsoft.com>
2021-11-09 11:07:27 +01:00
Mateusz Gozdek b319b14431
tsdb/chunks: preallocate at least some space on non-Windows systems (#9581)
To avoid a potential corrupted chunk read. I am not sure why it is
happening.

Closes #9561.

Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
2021-11-09 13:47:00 +05:30
beorn7 c954cd9d1d Move packages out of deprecated pkg directory
This creates a new `model` directory and moves all data-model related
packages over there:
  exemplar labels relabel rulefmt textparse timestamp value

All the others are more or less utilities and have been moved to `util`:
  gate logging modetimevfs pool runtime

Signed-off-by: beorn7 <beorn@grafana.com>
2021-11-09 08:03:10 +01:00
beorn7 a1e595edac Fix two trivial lint warnings
Not sure why those show up for me locally but not if run by the CI.

Signed-off-by: beorn7 <beorn@grafana.com>
2021-11-08 22:32:13 +01:00
beorn7 8f92c90897 Add TODOs and some minor tweaks
Signed-off-by: beorn7 <beorn@grafana.com>
2021-11-07 17:12:04 +01:00
Dieter Plaetinck cda025b5b5
TSDB: demystify SeriesRefs and ChunkRefs (#9536)
* TSDB: demystify seriesRefs and ChunkRefs

The TSDB package contains many types of series and chunk references,
all shrouded in uint types.  Often the same uint value may
actually mean one of several different types, in non-obvious ways.

This PR aims to clarify the code and help navigating to relevant docs,
usage, etc much quicker.

Concretely:

* Use appropriately named types and document their semantics and
  relations.
* Make multiplexing and demuxing of types explicit
  (on the boundaries between concrete implementations and generic
  interfaces).
* Casting between different types should be free.  None of the changes
  should have any impact on how the code runs.

TODO: Implement BlockSeriesRef where appropriate (for a future PR)
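
A minimal sketch of the approach, using hypothetical type and function names chosen for illustration (they are not taken from the actual change): named types over `uint64` let the compiler reject accidental mix-ups, and the explicit conversion at the boundary has no runtime cost because both types share the same underlying representation.

```go
package main

import "fmt"

// Hypothetical named types layered over uint64, for illustration only.
type (
	SeriesRef     uint64 // generic reference used at interface boundaries
	HeadSeriesRef uint64 // reference specific to one concrete implementation
)

// lookupHead only accepts the implementation-specific type, so passing a
// plain uint64 variable or a SeriesRef without a conversion does not compile.
func lookupHead(ref HeadSeriesRef) string {
	return fmt.Sprintf("head series %d", uint64(ref))
}

// lookup is the generic entry point. The conversion makes the demuxing
// explicit and compiles to nothing, since the underlying type is identical.
func lookup(ref SeriesRef) string {
	return lookupHead(HeadSeriesRef(ref))
}

func main() {
	fmt.Println(lookup(SeriesRef(42)))
}
```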

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* feedback

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* agent: demystify seriesRefs and ChunkRefs

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
2021-11-06 15:40:04 +05:30
johncming b882d2b7c7
tsdb/wal: Avoid writing closed channel. (#9566)
Signed-off-by: johncming <johncming@yahoo.com>
2021-11-06 15:11:06 +05:30
chenlujjj d18e42c650
refine comments of Checkpoint function (#9655)
Signed-off-by: chenlujjj <953546398@qq.com>
2021-11-06 15:09:16 +05:30
Marco Pracucci 309b094b92
Optimized MemPostings.EnsureOrder() (#9673)
* Optimizes MemPostings.EnsureOrder()

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Ignore linter warning

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2021-11-05 10:01:23 +00:00
Ganesh Vernekar c8b267efd6
Get histograms from TSDB to the rate() function implementation
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-11-03 19:04:18 +05:30
Marco Pracucci 9f5ff5b269
Allow to disable trimming when querying TSDB (#9647)
* Allow to disable trimming when querying TSDB

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Addressed review comments

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added unit test

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Renamed TrimDisabled to DisableTrimming

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2021-11-03 15:38:34 +05:30
Marco Pracucci edd05d7010
Add Head.AppendableMinValidTime() (#9643)
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2021-11-03 13:09:54 +05:30
Mateusz Gozdek b7bdf6fab2 Fix imports formatting
According to
2829908806 (r58457095).

Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
2021-11-02 19:52:34 +01:00
Mateusz Gozdek 1a6c2283a3 Format Go source files using 'gofumpt -w -s -extra'
Part of #9557

Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
2021-11-02 19:52:34 +01:00
Julien Pivotto 6e1d6edb33
Exclude agent from windows tests (#9645)
We are aware of the issue, but while we are working on it,
having main tests broken is an annoyance.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2021-11-02 13:58:51 +01:00
chenlujjj 660329d5b3
add tombstoneFormatVersionSize & tombstonesCRCSize constants (#9625)
Signed-off-by: chenlujjj <953546398@qq.com>
2021-11-01 16:05:19 +05:30
Praveen Ghuge 64d9b41998
Use testing.T.TempDir() instead of ioutil.TempDir() in tsdb/wal unit tests (#9602)
Signed-off-by: Praveen Ghuge <praveen.ghuge@outlook.com>
2021-11-01 12:28:18 +05:30
Robert Fratto bc72a718c4
Initial draft of prometheus-agent (#8785)
* Initial draft of prometheus-agent

This commit introduces a new binary, prometheus-agent, based on the
Grafana Agent code. It runs a WAL-only version of prometheus without the
TSDB, alerting, or rule evaluations. It is intended to be used to
remote_write to Prometheus or another remote_write receiver.

By default, prometheus-agent will listen on port 9095 to not collide
with the prometheus default of 9090.

Truncation of the WAL cooperates with Remote Write on a best-effort
basis. Every time the WAL is truncated, the minimum timestamp of data to
truncate is determined by the lowest sent timestamp of all samples
across all remote_write endpoints. This gives loose guarantees that data
will not be removed from the WAL until the maximum sample lifetime
passes or remote_write starts functioning.
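
The rule described above could be sketched like this. The function name, the parameters, and the handling of the no-endpoint case are assumptions made for illustration; the actual agent code may structure this differently.

```go
package main

import "fmt"

// truncationCutoff returns the timestamp before which WAL data may be
// removed. lastSent holds the most recently sent sample timestamp of each
// remote_write endpoint; now and maxAge use the same time unit.
//
// Assumptions for this sketch: with no endpoints, nothing is waiting to be
// sent, and data older than maxAge is removed even if an endpoint lags.
func truncationCutoff(lastSent []int64, now, maxAge int64) int64 {
	cutoff := now
	for _, ts := range lastSent {
		if ts < cutoff {
			cutoff = ts // the slowest endpoint holds truncation back
		}
	}
	// Never keep data past the maximum sample lifetime.
	if oldest := now - maxAge; cutoff < oldest {
		cutoff = oldest
	}
	return cutoff
}

func main() {
	// Both endpoints are reasonably up to date: the slower one decides.
	fmt.Println(truncationCutoff([]int64{900, 950}, 1000, 500))
	// One endpoint is stalled: the maximum lifetime bounds the cutoff.
	fmt.Println(truncationCutoff([]int64{100, 950}, 1000, 500))
}
```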

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* add tests for Prometheus agent (#22)

* add tests for Prometheus agent

* rearranged tests as per the review comments

* update tests for Agent

* changes as per code review comments

Signed-off-by: SriKrishna Paparaju <paparaju@gmail.com>

* incremental changes to prometheus agent

Signed-off-by: SriKrishna Paparaju <paparaju@gmail.com>

* changes as per code review comments

Signed-off-by: SriKrishna Paparaju <paparaju@gmail.com>

* Commit feedback from code review

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* Port over some comments from grafana/agent

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* Rename agent.Storage to agent.DB for tsdb consistency

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* Consolidate agentMode ifs in cmd/prometheus/main.go

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* Document PreAction usage requirements better for agent mode flags

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* remove unnecessary defaultListenAddr

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* `go fmt ./tsdb/agent` and fix lint errors

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

Co-authored-by: SriKrishna Paparaju <paparaju@gmail.com>
2021-10-29 16:25:05 +01:00
Xiaochao Dong c2d1c85857
close tsdb.head in test case (#9580)
Signed-off-by: Xiaochao Dong (@damnever) <the.xcdong@gmail.com>
2021-10-26 11:36:25 +05:30
Furkan Türkal 0c07663b70
fix: possible race on shared variables in test (#9470)
Fixes #9433

Signed-off-by: Furkan <furkan.turkal@trendyol.com>
2021-10-25 18:44:40 +05:30
Dieter Plaetinck d5bfbe3114
improve bstream comments and doc (#9560)
* improve bstream comments and doc

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* feedback

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
2021-10-25 18:44:15 +05:30
Julien Pivotto 73255e15f6 Address golint failures from revive
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2021-10-23 00:53:11 +02:00
Serge Catudal 8c3eca84db
Fix remote write receiver endpoint for exemplars (#9414)
Signed-off-by: Serge Catudal <serge.catudal@gmail.com>
2021-10-21 22:58:40 +02:00
beorn7 a9008f5423 Merge branch 'main' into sparsehistogram 2021-10-19 17:14:23 +02:00
beorn7 4998b9750f chunkenc: Bugfix and naming tweaks
Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-19 15:38:32 +02:00
beorn7 78ef9c6359 chunkenc: make xor reading more DRY
Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-19 15:28:33 +02:00
beorn7 4a1b84f8b2 chunkenc: make xor writing more DRY
Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-19 15:28:33 +02:00
Björn Rabenstein 3704c6c20a
Merge pull request #9533 from prometheus/beorn7/sparsehistogram
tsdb: Complete chunk format documentation
2021-10-19 13:51:46 +02:00
beorn7 1a4e54cfbb tsdb: Complete chunk format documentation
This also tweaks and fixes a few things done previously.

Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-19 13:51:30 +02:00
beorn7 0876d57aea chunkenc: Add test for chunk layout encoding
And fix a bug exposed by it...

Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-18 19:37:24 +02:00
beorn7 ad9b4c2b68 Fix typos
Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-18 15:44:13 +02:00
beorn7 fe50d6fc14 Update chunk layout documentation
Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-15 23:18:41 +02:00
beorn7 ed33aea392 Avoid redundant varint decoding in chunk appender construction
Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-15 20:33:14 +02:00
beorn7 d31bb75dc4 Use VarbitUint rather than VarbitInt to encode len(spans)
Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-15 15:27:32 +02:00
beorn7 3179215a59 Encode zero threshold first
This guarantees that the zero threshold is byte-aligned. Not sure if
that helps in any way, but at least it won't harm.

Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-14 14:55:21 +02:00
beorn7 c5522677bf Improve encoding of zero threshold
Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-14 14:47:26 +02:00