If a queue is stopped and one of its shards happens to hit the
batch_send_deadline at the same time a deadlock can occur where stop
holds the mutex and will not release it until the send is finished, but
the send needs the mutex to retrieve the most recent batch. This is
fixed by using a second mutex just for writing.
In addition, the test I wrote exposed a case where during shutdown a
batch could be sent twice due to concurrent calls to queue.Batch() and
queue.FlushAndShutdown(). Protect these with a mutex as well.
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
Previously we would reject an increase from 2 to 2.5 as being
within 30%; by rounding up first we see this as an increase from 2 to 3.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Change the coefficient from 1% to 5%, so instead of targetting to clear
the backlog in 100s we target 20s.
Update unit test to reflect the new behaviour.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Previously BenchmarkSampleDelivery spent a lot of effort checking each
sample had arrived, so was largely showing the performance of test-only
code.
Increase the number of shards to be more realistic for a large
workload.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
This creates a new `model` directory and moves all data-model related
packages over there:
exemplar labels relabel rulefmt textparse timestamp value
All the others are more or less utilities and have been moved to `util`:
gate logging modetimevfs pool runtime
Signed-off-by: beorn7 <beorn@grafana.com>
* TSDB: demistify seriesRefs and ChunkRefs
The TSDB package contains many types of series and chunk references,
all shrouded in uint types. Often the same uint value may
actually mean one of different types, in non-obvious ways.
This PR aims to clarify the code and help navigating to relevant docs,
usage, etc much quicker.
Concretely:
* Use appropriately named types and document their semantics and
relations.
* Make multiplexing and demuxing of types explicit
(on the boundaries between concrete implementations and generic
interfaces).
* Casting between different types should be free. None of the changes
should have any impact on how the code runs.
TODO: Implement BlockSeriesRef where appropriate (for a future PR)
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
* feedback
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
* agent: demistify seriesRefs and ChunkRefs
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
* Refactor: pass segment-reading function as param
To allow a different implementation to be used when garbage-collecting.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* remote_write: reduce blocking from GC of series
Add a method `UpdateSeriesSegment()` which is used together with
`SeriesReset()` to garbage-collect old series. This allows us to
split the lock around queueManager series data and avoid blocking
`Append()` while reading series from the last checkpoint.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Cosmetic: review feedback on comments
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* remote-write benchmark: include GC of series
Reduce the total number of samples per iteration from 5000*5000
(25 million) which is too big for my laptop, to 1*10000.
Extend `createTimeseries()` to add additional labels, so that the
queue manager is doing more realistic work.
Move the Append() call to a background goroutine - this works because
TestWriteClient uses a WaitGroup to signal completion.
Call `StoreSeries()` and `SeriesReset()` while adding samples, to
simulate the garbage-collection that wal.Watcher does.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Change BenchmarkSampleDelivery to call UpdateSeriesSegment
This matches what Watcher.garbageCollectSeries() is doing now
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Added MaxSamplesPerSend
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Added tests
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Fixed order of require
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Added docs
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* writes -> writesReceived
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Improved send loop
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Write exemplars to the WAL and send them over remote write.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Update example for exemplars, print data in a more obvious format.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Add metrics for remote write of exemplars.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Fix incorrect slices passed to send in remote write.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* We need to unregister the new metrics.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Address review comments
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Order of exemplar append vs write exemplar to WAL needs to change.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Several fixes to prevent sending uninitialized or incorrect samples with an exemplar. Fix dropping exemplar for missing series. Add tests for queue_manager sending exemplars
Signed-off-by: Martin Disibio <mdisibio@gmail.com>
* Store both samples and exemplars in the same timeseries buffer to remove the alloc when building final request, keep sub-slices in separate buffers for re-use
Signed-off-by: Martin Disibio <mdisibio@gmail.com>
* Condense sample/exemplar delivery tests to parameterized sub-tests
Signed-off-by: Martin Disibio <mdisibio@gmail.com>
* Rename test methods for clarity now that they also handle exemplars
Signed-off-by: Martin Disibio <mdisibio@gmail.com>
* Rename counter variable. Fix instances where metrics were not updated correctly
Signed-off-by: Martin Disibio <mdisibio@gmail.com>
* Add exemplars to LoadWAL benchmark
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* last exemplars timestamp metric needs to convert value to seconds with
ms precision
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Process exemplar records in a separate go routine when loading the WAL.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Address review comments related to clarifying comments and variable
names. Also refactor sample/exemplar to enqueue prompb types.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Regenerate types proto with comments, update protoc version again.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Put remote write of exemplars behind a feature flag.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Address some of Ganesh's review comments.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Move exemplar remote write feature flag to a config file field.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Address Bartek's review comments.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Don't allocate exemplar buffers in queue_manager if we're not going to
send exemplars over remote write.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Add ValidateExemplar function, validate exemplars when appending to head
and log them all to WAL before adding them to exemplar storage.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Address more reivew comments from Ganesh.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Add exemplar total label length check.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Address a few last review comments
Signed-off-by: Callum Styan <callumstyan@gmail.com>
Co-authored-by: Martin Disibio <mdisibio@gmail.com>
* Testify: move to require
Moving testify to require to fail tests early in case of errors.
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
* More moves
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
* Refactor test assertions
This pull request gets rid of assert.True where possible to use
fine-grained assertions.
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
* Refactor global vars to avoid failure with run test more than once.
Signed-off-by: Harkishen-Singh <harkishensingh@hotmail.com>
* Register highestRecvTimestamp metric.
Signed-off-by: Harkishen-Singh <harkishensingh@hotmail.com>
* Use local interner vars.
Signed-off-by: Harkishen-Singh <harkishensingh@hotmail.com>
* Declare interner in write storage.
Signed-off-by: Harkishen-Singh <harkishensingh@hotmail.com>
* storage: Replace usage of sync/atomic with uber-go/atomic
Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
* tsdb: Replace usage of sync/atomic with uber-go/atomic
Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
* web: Replace usage of sync/atomic with uber-go/atomic
Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
* notifier: Replace usage of sync/atomic with uber-go/atomic
Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
* cmd: Replace usage of sync/atomic with uber-go/atomic
Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
* scripts: Verify that we are not using restricted packages
It checks that we are not directly importing 'sync/atomic'.
Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
* Reorganise imports in blocks
Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
* notifier/test: Apply PR suggestions
Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
* storage/remote: avoid storing references on newEntry
Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
* Revert "scripts: Verify that we are not using restricted packages"
This reverts commit 278d32748e.
Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
* web: Group imports accordingly
Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
* Fixed nits introduced by https://github.com/prometheus/prometheus/pull/7334
* Added ChunkQueryable implementation to fanout and readyStorage.
* Added more comments.
* Changed NewVerticalChunkSeriesMerger to CompactingChunkSeriesMerger, removed tiny interface by reusing VerticalSeriesMergeFunc for overlapping algorithm for
both chunks and series, for both querying and compacting (!) + made sure duplicates are merged.
* Added ErrChunkSeriesSet
* Added Samples interface for seamless []promb.Sample to []tsdbutil.Sample conversion.
* Deprecating non chunks serieset based StreamChunkedReadResponses, added chunk one.
* Improved tests.
* Split remote client into Write (old storage) and read.
* Queryable client is now SampleAndChunkQueryable. Since we cannot use nice QueryableFunc I moved
all config based options to sampleAndChunkQueryableClient to aboid boilerplate.
In next commit: Changes for TSDB.
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
Right now any new metrics added for remote write need to be added to
both the QueueManager struct, and the queueManagerMetrics struct.
Instead, use the queueManagerMetrics struct directly from QueueManager.
The newQueueManagerMetrics constructor will now create the metrics for a
specific queue with name and endpoint pre-populated, and a new copy of
the struct will be created specifically for each queue.
This also fixes a bug where prometheus_remote_storage_sent_bytes_total
is not being unregistered after a queue is changed.
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
* Fix bug with WAL watcher and Live Reader metrics usage.
Calling NewXMetrics when creating a Watcher or LiveReader results in a
registration error, which we're ignoring, and as a result other than the
first Watcher/Reader created, we had no metrics for either. So we would
only have metrics like Watcher Records Read for the first remote write
config in a users config file.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Change createTimeseries to take values for number of series and number
of samples per series.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Take num of samples to expect in expectSampleCount instead of array of
samples.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Add field to TestStorageClient to ignore samples sent waitgroup for
potential tests where we don't care about delivery of all samples.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Fix up tests a little bit.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
The integral accumulator in the remote write sharding code is just a
second way of keeping track of the number of samples pending. Remove
integralAccumulator and use the samplesPending value we already
calculate to calculate the number of shards.
This has the added benefit of fixing a bug where the integralAccumulator
was not being initialized correctly due to not taking into account the
number of ticks being counted, causing the integralAccumulator initial
value to be off by an order of magnitude in some cases.
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
* Track remote write queues via a map so we don't care about index.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Support a job name for remote write/read so we can differentiate between
them using the name.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Remote write/read has Name to not confuse the meaning of the field with
scrape job names.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Split queue/client label into remote_name and url labels.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Don't allow for duplicate remote write/read configs.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Ensure we restart remote write queues if the hash of their config has
not changed, but the remote name has changed.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Include name in remote read/write config hashes, simplify duplicates
check, update test accordingly.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Refactor calculateDesiredShards + don't reshard if we're having issues
sending samples.
* Track lastSendTimestamp via an int64 with atomic add/load, add a test
for reshard calculation.
* Simplify conditional for skipping resharding, add samplesIn/Out to shard
testcase struct.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
The desired shards calculation now properly keeps track of the rate of
pending samples, and uses the previously unused integralAccumulator to
adjust for missing information in the desired shards calculation.
Also, configure more capacity for each shard. The default 10 capacity
causes shards to block on each other while
sending remote requests. Default to a 500 sample capacity and explain in
the documentation that having more capacity will help throughput.
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>