Commit graph

123 commits

Author SHA1 Message Date
Callum Styan 788dfc8cb4 fix minor lint issue + use labels Range function since it looks like
the tests fail to do `range labels.Labels` on CI

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2023-11-23 11:55:56 -08:00
Callum Styan a3c6904243 more cleanup, mostly linting fixes
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2023-11-20 17:28:05 -08:00
Callum Styan c2c45d12cb More cleanup
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2023-11-20 16:37:03 -08:00
Callum Styan 1842c99f2f Add bytes slice (instead of slice of 32bit vars) format for testing
Co-authored-by: Nicolás Pazos <npazosmendez@gmail.com>
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2023-11-13 21:53:44 -08:00
Callum Styan ed0c3b25c7 refactor new version flag to make it easier to pick a specific format
instead of having multiple flags, plus add new formats for testing

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2023-11-13 15:41:01 -08:00
Nicolás Pazos 4bdb865ae3 minimally-tested exemplar support for rw 1.1
Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com>
2023-11-09 17:00:16 -03:00
Nicolás Pazos 34f3f11ee2 remove all code from previous interning approach
the 'minimized' version is now the only v1.1 version

Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com>
2023-11-09 17:00:16 -03:00
Nicolás Pazos e2a5ea5397 Use unsafe []byte->string cast to reuse buffer
Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com>
2023-11-09 15:08:49 -03:00
Callum Styan 2935eab409 update tests
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2023-11-09 10:35:43 -03:00
Callum Styan 3e5facc0c0 add functionality for new minimized remote write request format
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2023-11-09 10:35:43 -03:00
Nicolás Pazos 46b84ab3fb lint 2023-11-09 10:18:12 -03:00
Nicolás Pazos 6f21272ca7 fix NewWriteClient and change new flags wording 2023-11-09 10:18:12 -03:00
Nicolás Pazos c1593fd35a Improve sender benchmarks and some allocations 2023-11-09 10:18:12 -03:00
Nicolás Pazos 3be59f0ca6 add sender-side tests and fix failing ones 2023-11-09 10:18:12 -03:00
Nicolás Pazos ed34405d68 remove some comented code 2023-11-09 10:18:12 -03:00
Callum Styan f1992591b2 Implement code paths for new proto format
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2023-11-09 10:18:12 -03:00
Callum Styan 7a862d265f replace snappy encoding library
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2023-11-09 10:18:12 -03:00
Paschalis Tsilias c173cd57c9
Add a header to count retried remote write requests (#12729)
Header name is `Retry-Attempt`, only set when >0.

Signed-off-by: Marc Tuduri <marctc@protonmail.com>
Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>
2023-09-20 11:11:03 +01:00
Jeanette Tan e9a1e26ab7 Perform integer/float histogram type checking on conversions, and use a consistent method for determining integer vs float histogram
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2023-04-22 02:27:15 +08:00
beorn7 c3c7d44d84 lint: Adjust to the lint warnings raised by current versions of golint-ci
We haven't updated golint-ci in our CI yet, but this commit prepares
for that.

There are a lot of new warnings, and it is mostly because the "revive"
linter got updated. I agree with most of the new warnings, mostly
around not naming unused function parameters (although it is justified
in some cases for documentation purposes – while things like mocks are
a good example where not naming the parameter is clearer).

I'm pretty upset about the "empty block" warning to include `for`
loops. It's such a common pattern to do something in the head of the
`for` loop and then have an empty block. There is still an open issue
about this: https://github.com/mgechev/revive/issues/810 I have
disabled "revive" altogether in files where empty blocks are used
excessively, and I have made the effort to add individual
`// nolint:revive` where empty blocks are used just once or twice.
It's borderline noisy, though, but let's go with it for now.

I should mention that none of the "empty block" warnings for `for`
loop bodies were legitimate.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-04-19 17:10:10 +02:00
Marc Tudurí 721f33dbb0
histograms: Add remote-write support for Float Histograms (#11817)
* adapt code.go and write_handler.go to support float histograms
* adapt watcher.go to support float histograms
* wip adapt queue_manager.go to support float histograms
* address comments for metrics in queue_manager.go
* set test cases for queue manager
* use same counts for histograms and float histograms
* refactor createHistograms tests
* fix float histograms ref in watcher_test.go
* address PR comments

Signed-off-by: Marc Tuduri <marctc@protonmail.com>
2023-01-13 16:39:20 +05:30
Bryan Boreham 047585360b Update package storage/remote tests for new labels.Labels type
Use ScratchBuilder to create labels.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-12-19 15:22:09 +00:00
Ganesh Vernekar 648be89822
Merge remote-tracking branch 'upstream/main' into fix-conflict
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2022-10-12 14:20:02 +05:30
Jesus Vazquez 775d90d5f8
TSDB: Rename wal package to wlog (#11352)
The wlog.WL type can now be used to create a Write Ahead Log or a Write
Behind Log.

Before the prefix for wbl metrics was
'prometheus_tsdb_out_of_order_wal_' and has been replaced with
'prometheus_tsdb_out_of_order_wbl_'.

Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
Signed-off-by: Jesus Vazquez <jesusvazquez@users.noreply.github.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
2022-10-10 20:38:46 +05:30
Jesus Vazquez e934d0f011 Merge 'main' into sparsehistogram
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
2022-10-05 22:14:49 +02:00
Bryan Boreham 3029320ce6 storage/remote: in tests use labels.FromStrings
And a few cases of `EmptyLabels()`.
Replacing code which assumes the internal structure of `Labels`.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-09-09 13:34:49 +02:00
Ganesh Vernekar f540c1dbd3
Add support for histograms in WAL checkpointing (#11210)
* Add support for histograms in WAL checkpointing

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix tests

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2022-08-29 17:38:36 +05:30
Levi Harrison 0db6b072bc
Export histogramToHistogramProto() (#11046)
Signed-off-by: Levi Harrison <git@leviharrison.dev>
2022-07-21 10:12:50 -04:00
Levi Harrison 08f3ddb864
Sparse histogram remote-write support (#11001) 2022-07-14 09:13:12 -04:00
Leonardo Zamariola 3326df42bb
Removing global state modification on unit tests (fix #10033 #10034) (#10935)
* Removing global state modification on unit tests (fix #10033 #10034)

The config.DefaultRemoteReadConfig and config.DefaultRemoteWriteConfig
instances hold global state. Unit tests were changing their url.URL reference
globally causing false positives when tests were ran through package.
Two helper functions were created to copy those global values instead of changing
them in place to fix null point when running unit tests by method instead of
by package.

Signed-off-by: Leonardo Zamariola <leonardo.zamariola@gmail.com>

* Fixing pull request suggestions

Copying by value from default config

Signed-off-by: Leonardo Zamariola <leonardo.zamariola@gmail.com>
2022-06-30 10:20:16 -06:00
Matthieu MOREL e2ede285a2
refactor: move from io/ioutil to io and os packages (#10528)
* refactor: move from io/ioutil to io and os packages
* use fs.DirEntry instead of os.FileInfo after os.ReadDir

Signed-off-by: MOREL Matthieu <matthieu.morel@cnp.fr>
2022-04-27 11:24:36 +02:00
Chris Marchbanks a11e73edda
Fix a deadlock between Batch and FlushAndShutdown (#10608)
If FlushAndShutdown is called with a full batchQueue, and then Batch is
called rather than the normal path of reading from a queue a deadlock
might be encountered. Rather than having FlushAndShutdown having
blocking code while holding a lock retry sending the batch every second.

Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
2022-04-20 20:50:41 +02:00
beorn7 79376c1e94 Merge branch 'release-2.33' into beorn7/release 2022-03-08 17:42:49 +01:00
Chris Marchbanks afdc1decac
Write a test that reproduces the deadlock
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
2022-03-07 17:15:51 -07:00
Julien Pivotto b0d70557b7
Merge pull request #10285 from prometheus/release-2.33 2022-02-12 00:02:24 +01:00
Chris Marchbanks bfb1500a38
Fix deadlock when stopping a shard (#10279)
If a queue is stopped and one of its shards happens to hit the
batch_send_deadline at the same time a deadlock can occur where stop
holds the mutex and will not release it until the send is finished, but
the send needs the mutex to retrieve the most recent batch. This is
fixed by using a second mutex just for writing.

In addition, the test I wrote exposed a case where during shutdown a
batch could be sent twice due to concurrent calls to queue.Batch() and
queue.FlushAndShutdown(). Protect these with a mutex as well.

Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
2022-02-11 07:07:41 -07:00
Eng Zer Jun 3e67654d37
refactor: use T.TempDir() and B.TempDir to create temporary directory
The directory created by `T.TempDir()` and `B.TempDir()` is
automatically removed when the test and all its subtests complete.

Reference: https://pkg.go.dev/testing#T.TempDir
Reference: https://pkg.go.dev/testing#B.TempDir
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2022-01-22 18:57:30 +08:00
Bryan Boreham 954c0e8020 remote_write: round desired shards up before check
Previously we would reject an increase from 2 to 2.5 as being
within 30%; by rounding up first we see this as an increase from 2 to 3.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-01-10 09:57:37 +00:00
Bryan Boreham 6d01ce8c4d remote_write: shard up more when backlogged
Change the coefficient from 1% to 5%, so instead of targetting to clear
the backlog in 100s we target 20s.

Update unit test to reflect the new behaviour.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-01-10 09:57:37 +00:00
Bryan Boreham d588b14d9c remote_write: detailed test for shard calculation
Drive the input parameters to `calculateDesiredShards()` very precisely,
to illustrate some questionable behaviour marked with `?!`.

See https://github.com/prometheus/prometheus/issues/9178,
https://github.com/prometheus/prometheus/issues/9207,

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-01-10 09:57:37 +00:00
Shihao Xia c3e7bfb813 add proper exit for loop
Signed-off-by: Shihao Xia <charlesxsh@hotmail.com>
2021-12-29 23:48:11 -05:00
Bryan Boreham bd6436605d Review feedback
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2021-12-09 14:40:44 +00:00
Bryan Boreham c478d6477a remote-write: benchmark just sending, on 20 shards
Previously BenchmarkSampleDelivery spent a lot of effort checking each
sample had arrived, so was largely showing the performance of test-only
code.

Increase the number of shards to be more realistic for a large
workload.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2021-12-03 14:02:10 +00:00
Matheus Alcantara e673805d67
storage/remote: use t.TempDir instead of ioutil.TempDir on tests (#9811)
Signed-off-by: Matheus Alcantara <matheusssilv97@gmail.com>
2021-11-19 15:21:45 -05:00
beorn7 c954cd9d1d Move packages out of deprecated pkg directory
This creates a new `model` directory and moves all data-model related
packages over there:
  exemplar labels relabel rulefmt textparse timestamp value

All the others are more or less utilities and have been moved to `util`:
  gate logging modetimevfs pool runtime

Signed-off-by: beorn7 <beorn@grafana.com>
2021-11-09 08:03:10 +01:00
Dieter Plaetinck cda025b5b5
TSDB: demistify SeriesRefs and ChunkRefs (#9536)
* TSDB: demistify seriesRefs and ChunkRefs

The TSDB package contains many types of series and chunk references,
all shrouded in uint types.  Often the same uint value may
actually mean one of different types, in non-obvious ways.

This PR aims to clarify the code and help navigating to relevant docs,
usage, etc much quicker.

Concretely:

* Use appropriately named types and document their semantics and
  relations.
* Make multiplexing and demuxing of types explicit
  (on the boundaries between concrete implementations and generic
  interfaces).
* Casting between different types should be free.  None of the changes
  should have any impact on how the code runs.

TODO: Implement BlockSeriesRef where appropriate (for a future PR)

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* feedback

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* agent: demistify seriesRefs and ChunkRefs

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
2021-11-06 15:40:04 +05:30
Mateusz Gozdek 1a6c2283a3 Format Go source files using 'gofumpt -w -s -extra'
Part of #9557

Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
2021-11-02 19:52:34 +01:00
lzhfromustc d42be7be76
test:Fix two potential goroutine leaks (#8964)
Signed-off-by: lzhfromustc <lzhfromustc@gmail.com>
2021-10-29 15:44:32 -07:00
jinglina ed24e51e7c
remove redundant type conversion (#9126)
Signed-off-by: jinglina <jinglinax@163.com>
2021-07-28 13:33:46 +05:30
Bryan Boreham 60804c5a09
remote_write: reduce blocking from garbage-collect of series (#9109)
* Refactor: pass segment-reading function as param

To allow a different implementation to be used when garbage-collecting.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* remote_write: reduce blocking from GC of series

Add a method `UpdateSeriesSegment()` which is used together with
`SeriesReset()` to garbage-collect old series. This allows us to
split the lock around queueManager series data and avoid blocking
`Append()` while reading series from the last checkpoint.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Cosmetic: review feedback on comments

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* remote-write benchmark: include GC of series

Reduce the total number of samples per iteration from 5000*5000
(25 million) which is too big for my laptop, to 1*10000.

Extend `createTimeseries()` to add additional labels, so that the
queue manager is doing more realistic work.

Move the Append() call to a background goroutine - this works because
TestWriteClient uses a WaitGroup to signal completion.

Call `StoreSeries()` and `SeriesReset()` while adding samples, to
simulate the garbage-collection that wal.Watcher does.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Change BenchmarkSampleDelivery to call UpdateSeriesSegment

This matches what Watcher.garbageCollectSeries() is doing now

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2021-07-27 13:21:48 -07:00