It took a `Labels` where the memory could be re-used, but in practice
this hardly ever helped, especially after converting `relabel.Process`
to `relabel.ProcessBuilder`.
Comparing the parameter to `nil` was a bug; `EmptyLabels` is not `nil`
so the slice was reallocated multiple times by `append`.
Lastly `Builder.Labels()` now estimates that the final size will depend
on labels added and deleted.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
During remote write, we call url.String() twice:
- to add the Endpoint() to the span
- to actually know where we should send the request
This value does not change over time, and it's not really that
lightweight to calculate. I wrote this simple benchmark:
    func BenchmarkURLString(b *testing.B) {
        u, err := url.Parse("https://remote.write.com/api/v1")
        require.NoError(b, err)
        b.Run("string", func(b *testing.B) {
            count := 0
            for i := 0; i < b.N; i++ {
                count += len(u.String())
            }
        })
    }
And the results are ~200ns/op, 80B/op, 3 allocs/op.
Yes, we're going to go to the network here, which is a huge amount of
resources compared to this, but still, on agents that send 500 requests
per second, that is 1500 wasteful allocations per second.
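A minimal sketch of the idea, assuming the URL is fixed once the client has been created. The `client` type, its `endpointStr` field and `newClient` are hypothetical names for illustration, not the actual remote-write client:

```go
package main

import (
	"fmt"
	"net/url"
)

// client is a hypothetical stand-in for a remote-write client.
type client struct {
	endpoint    *url.URL
	endpointStr string // serialised once; assumes the URL never changes afterwards
}

// newClient parses the URL and caches its string form up front.
func newClient(raw string) (*client, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return nil, err
	}
	return &client{endpoint: u, endpointStr: u.String()}, nil
}

// Endpoint returns the cached string instead of calling url.String() again.
func (c *client) Endpoint() string { return c.endpointStr }

func main() {
	c, err := newClient("https://remote.write.com/api/v1")
	if err != nil {
		panic(err)
	}
	// Both the span attribute and the request can use the same cached value.
	fmt.Println(c.Endpoint())
}
```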
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
In storage/remote, try converting to RecoverableError using errors.As,
instead of through direct casting.
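To illustrate the difference, here is a small sketch. The `RecoverableError` type below is a local stand-in defined only for this example; the point is that a type assertion misses a wrapped error while `errors.As` finds it.

```go
package main

import (
	"errors"
	"fmt"
)

// RecoverableError is a local stand-in type for this example only.
type RecoverableError struct {
	error
}

// viaAssertion only matches when err itself is a RecoverableError.
func viaAssertion(err error) bool {
	_, ok := err.(RecoverableError)
	return ok
}

// viaErrorsAs also matches a RecoverableError further down the wrap chain.
func viaErrorsAs(err error) bool {
	var re RecoverableError
	return errors.As(err, &re)
}

// exampleErrors returns a bare RecoverableError and the same error wrapped with %w.
func exampleErrors() (bare, wrapped error) {
	bare = RecoverableError{errors.New("server returned HTTP status 503")}
	wrapped = fmt.Errorf("sending batch: %w", bare)
	return bare, wrapped
}

func main() {
	bare, wrapped := exampleErrors()
	fmt.Println(viaAssertion(bare), viaErrorsAs(bare))
	fmt.Println(viaAssertion(wrapped), viaErrorsAs(wrapped))
}
```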
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
As far as I understand it, we'd never expect to receive a nil span,
and remote.spansProtoToSpans would panic if we received a nil span.
Marking the fields as non-nullable also means the generated Golang
code doesn't use pointers for these fields, reducing allocations.
Signed-off-by: Charles Korn <charles.korn@grafana.com>
This is an optimization on the existing append in OOOChunk.
What we've been doing so far is find the place inside the out-of-order
slice where the new sample should go in and then place it there and move
any samples to the right if necessary. This is OK but requires a binary
search every time the slice is bigger than 0.
The optimization is opinionated: although out-of-order samples can be
out of order amongst themselves, they will probably arrive in order, so
we optimistically try to append at the end and only fall back to the
binary search if that fails.
OOOChunks are capped to 30 samples by default so this is a small
optimization, but everything adds up, especially if you handle many active
timeseries with out-of-order samples.
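The approach can be sketched roughly as follows. The `sample` type and `insert` function are simplified stand-ins, and rejecting duplicate timestamps is an assumption of this sketch rather than a statement about the real OOOChunk:

```go
package main

import (
	"fmt"
	"sort"
)

// sample is a simplified stand-in for an out-of-order sample.
type sample struct {
	t int64
	v float64
}

// insert keeps samples sorted by timestamp. It returns false for a
// duplicate timestamp (an assumption made for this sketch).
func insert(samples []sample, t int64, v float64) ([]sample, bool) {
	// Fast path: the new sample is newer than everything stored so far.
	if n := len(samples); n == 0 || samples[n-1].t < t {
		return append(samples, sample{t, v}), true
	}
	// Slow path: binary search for the insertion point. The last sample
	// has a timestamp >= t here, so i is always a valid index.
	i := sort.Search(len(samples), func(i int) bool { return samples[i].t >= t })
	if samples[i].t == t {
		return samples, false
	}
	// Grow by one and shift everything from i one slot to the right.
	samples = append(samples, sample{})
	copy(samples[i+1:], samples[i:])
	samples[i] = sample{t, v}
	return samples, true
}

func main() {
	var s []sample
	for _, t := range []int64{10, 20, 15, 5} {
		s, _ = insert(s, t, float64(t))
	}
	fmt.Println(s)
}
```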
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
Signed-off-by: Jesus Vazquez <jesusvazquez@users.noreply.github.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
* adapt code.go and write_handler.go to support float histograms
* adapt watcher.go to support float histograms
* wip adapt queue_manager.go to support float histograms
* address comments for metrics in queue_manager.go
* set test cases for queue manager
* use same counts for histograms and float histograms
* refactor createHistograms tests
* fix float histograms ref in watcher_test.go
* address PR comments
Signed-off-by: Marc Tuduri <marctc@protonmail.com>
Extends the Appender.AppendHistogram function to accept a FloatHistogram. TSDB supports appending, querying, and WAL replay for this new type of histogram.
Signed-off-by: Marc Tudurí <marctc@protonmail.com>
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
`QueueManager.externalLabels` becomes a slice rather than a `Labels` so
we can index into it when doing the merge operation.
Note we avoid calling `Labels.Len()` in `labelProtosToLabels()`.
It isn't necessary - `append()` will enlarge the buffer and we're
expecting to re-use it many times.
Also, we now validate protobuf input before converting to Labels.
This way we can detect errors first, and we don't place unnecessary
requirements on the Labels structure.
Re-do seriesFilter using labels.Builder (albeit N^2).
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Re-use previous memory if it is already of the correct type.
In `NewListSeries` we hoist the conversion to an interface value out
so it only allocates once.
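A rough sketch of what "re-use if it is already of the correct type" can look like. The `Iterator` interface, `sliceIterator` and `series` types here are simplified, hypothetical stand-ins and not the real storage types:

```go
package main

import "fmt"

// Iterator is a simplified stand-in for a sample iterator interface.
type Iterator interface {
	Next() bool
	At() int64
}

// sliceIterator iterates over a slice of values.
type sliceIterator struct {
	vals []int64
	idx  int
}

func (it *sliceIterator) Next() bool { it.idx++; return it.idx < len(it.vals) }
func (it *sliceIterator) At() int64  { return it.vals[it.idx] }

// series is a hypothetical series backed by a slice.
type series struct{ vals []int64 }

// Iterator resets and returns old when it is a *sliceIterator,
// and only allocates a new iterator otherwise.
func (s series) Iterator(old Iterator) Iterator {
	if it, ok := old.(*sliceIterator); ok {
		it.vals, it.idx = s.vals, -1
		return it
	}
	return &sliceIterator{vals: s.vals, idx: -1}
}

func main() {
	var it Iterator
	for _, s := range []series{{[]int64{1, 2}}, {[]int64{3}}} {
		it = s.Iterator(it) // the second call re-uses the first iterator
		for it.Next() {
			fmt.Println(it.At())
		}
	}
}
```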
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Patterned after `Chunk.Iterator()`: pass the old iterator in so it
can be re-used to avoid allocating a new object.
(This commit does not do any re-use; it is just changing all the method
signatures so re-use is possible in later commits.)
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
errors.Unwrap() actually dangerously returns nil if the error does not have an
Unwrap() method, which is the case in at least one of these places where I
noticed that no error was being logged at all when it should have.
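The pitfall is easy to reproduce. In the sketch below, `unwrapOrSelf` is a hypothetical helper shown only to make the nil case visible; it is not necessarily how the affected call sites were changed:

```go
package main

import (
	"errors"
	"fmt"
)

// unwrapOrSelf is a hypothetical helper: it unwraps when possible but
// never turns a real error into nil.
func unwrapOrSelf(err error) error {
	if inner := errors.Unwrap(err); inner != nil {
		return inner
	}
	return err
}

// sampleErrors returns an error without an Unwrap() method and the same
// error wrapped with %w.
func sampleErrors() (plain, wrapped error) {
	plain = errors.New("connection refused")
	wrapped = fmt.Errorf("remote read: %w", plain)
	return plain, wrapped
}

func main() {
	plain, wrapped := sampleErrors()
	fmt.Println(errors.Unwrap(wrapped))      // the inner error
	fmt.Println(errors.Unwrap(plain) == nil) // nil: plain has no Unwrap() method
	fmt.Println(unwrapOrSelf(plain))         // still the original error
}
```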
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* remote/read_handler: pool input to Marshal()
Use a sync.Pool to reuse byte slices between calls to Marshal() in the
remote read handler.
Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
* remote: add microbenchmark for remote read handler
Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
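A minimal sketch of pooling byte slices with sync.Pool. The `encode` function is a hypothetical stand-in for a Marshal-style call that appends into a caller-supplied buffer, and the 1024-byte starting capacity is an arbitrary choice for the example:

```go
package main

import (
	"fmt"
	"sync"
)

// bufPool hands out pointers to byte slices so that Put does not
// allocate when storing the slice header in an interface.
var bufPool = sync.Pool{
	New: func() any {
		b := make([]byte, 0, 1024)
		return &b
	},
}

// encode is a hypothetical stand-in for marshalling into dst.
func encode(dst []byte, msg string) []byte {
	return append(dst, msg...)
}

// handle encodes msg into a pooled buffer and returns the encoded size.
func handle(msg string) int {
	bp := bufPool.Get().(*[]byte)
	buf := encode((*bp)[:0], msg)
	n := len(buf) // stand-in for writing buf to the response
	*bp = buf     // keep any capacity gained by append for the next user
	bufPool.Put(bp)
	return n
}

func main() {
	fmt.Println(handle("hello"), handle("hi"))
}
```

The buffer must not be referenced after it has been returned to the pool, which is why the sketch only keeps the length.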
The wlog.WL type can now be used to create a Write Ahead Log or a Write
Behind Log.
Previously the prefix for WBL metrics was
'prometheus_tsdb_out_of_order_wal_'; it has been replaced with
'prometheus_tsdb_out_of_order_wbl_'.
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
Signed-off-by: Jesus Vazquez <jesusvazquez@users.noreply.github.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
And a few cases of `EmptyLabels()`.
Replacing code which assumes the internal structure of `Labels`.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Append metadata to the WAL
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Remove extra whitespace; Reword some docstrings and comments
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Use RLock() for hasNewMetadata check
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Use single byte for metric type in RefMetadata
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Update proposed WAL format for single-byte type metadata
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Implement MetadataAppender interface for the Agent
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Address first round of review comments
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Amend description of metadata in wal.md
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Correct key used to retrieve metadata from cache
When we're setting metadata entries in the scrapeCache, we're using the
p.Help(), p.Unit(), p.Type() helpers, which retrieve the series name and
use it as the cache key. When checking for cache entries though, we used
p.Series() as the key, which included the metric name _with_ its labels.
That meant that we were never actually hitting the cache. We're fixing
this by utilizing the __name__ internal label for correctly getting the
cache entries after they've been set by setHelp(), setType() or
setUnit().
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Put feature behind a feature flag
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Fix AppendMetadata docstring
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Reorder WAL format document
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Change error message of AppendMetadata; Fix access of s.meta in AppendMetadata
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Reuse temporary buffer in Metadata encoder
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Only keep latest metadata for each refID during checkpointing
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Fix test that's referencing decoding metadata
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Avoid creating metadata block if no new metadata are present
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Add tests for corrupt metadata block and relevant record type
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Fix CR comments
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Extract logic about changing metadata in an anonymous function
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Implement new proposed WAL format and amend relevant tests
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Use 'const' for metadata field names
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Apply metadata to head memSeries in Commit, not in AppendMetadata
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Add docstring and rename extracted helper in scrape.go
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Add tests for tsdb-related cases
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Fix linter issues vol1
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Fix linter issues vol2
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Fix Windows test by closing WAL reader files
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Use switch instead of two if statements in metadata decoding
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Fix review comments around TestMetadata* tests
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Add code for replaying WAL; test correctness of in-memory data after a replay
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Remove scrape-loop related code from PR
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Address first round of comments
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Simplify tests by sorting slices before comparison
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Fix test to use separate transactions
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Empty out buffer and record slices after encoding latest metadata
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Fix linting issue
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Update calculation for DroppedMetadata metric
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Rename MetadataAppender interface and AppendMetadata method to MetadataUpdater/UpdateMetadata
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Reuse buffer when encoding latest metadata for each series
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Fix review comments; Check all returned error values using two helpers
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Simplify use of helpers
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
* Satisfy linter
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
Note: This is deliberately an incompatible change. Since we have never
used histograms in remote read/write yet, there is no point in keeping
compatibility. This _is_, however, compatible with the state in the main
branch.
This commit flattens the bucket message into top-level fields. This
has the disadvantage of now having two triples of fields prefixed with
`negative_...` or `positive_...`. However, with this structure, we
save one tag on the wire. And, perhaps more importantly, we mirror the
structure of the `histogram.Histogram` Go type.
This commit also adjusts `repeated` fields to use names in the plural
form, as it is also the case for the fields that already existed.
This also adds a doc comment to `HistogramProtoToHistogram` and
changes its return type to a pointer (which is more convenient and
probably more efficient).
Signed-off-by: beorn7 <beorn@grafana.com>
* Removing global state modification on unit tests (fix #10033 #10034)
The config.DefaultRemoteReadConfig and config.DefaultRemoteWriteConfig
instances hold global state. Unit tests were changing their url.URL reference
globally, causing false positives when tests were run by package.
Two helper functions were created to copy those global values instead of changing
them in place, fixing the nil pointer error when running unit tests by method instead of
by package.
Signed-off-by: Leonardo Zamariola <leonardo.zamariola@gmail.com>
* Fixing pull request suggestions
Copying by value from default config
Signed-off-by: Leonardo Zamariola <leonardo.zamariola@gmail.com>
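The copy-by-value approach can be sketched like this. The `RemoteWriteConfig` struct is a cut-down stand-in for the real config type and `baseRemoteWriteConfig` is an illustrative helper name, both assumptions of this example:

```go
package main

import (
	"fmt"
	"net/url"
)

// RemoteWriteConfig is a cut-down stand-in for the real config struct.
type RemoteWriteConfig struct {
	URL  *url.URL
	Name string
}

// DefaultRemoteWriteConfig plays the role of the package-level default.
var DefaultRemoteWriteConfig = RemoteWriteConfig{Name: "default"}

// baseRemoteWriteConfig copies the default by value and sets the URL on
// the copy, so the global is never modified by a test.
func baseRemoteWriteConfig(host string) (*RemoteWriteConfig, error) {
	u, err := url.Parse(host)
	if err != nil {
		return nil, err
	}
	cfg := DefaultRemoteWriteConfig // struct copy, not a pointer to the global
	cfg.URL = u
	return &cfg, nil
}

func main() {
	cfg, err := baseRemoteWriteConfig("http://localhost:9090")
	if err != nil {
		panic(err)
	}
	fmt.Println(cfg.URL, DefaultRemoteWriteConfig.URL == nil)
}
```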
* Rename walDir parameter to dir
Signed-off-by: Matej Gera <matejgera@gmail.com>
* Improve NewQueueManager comment
Signed-off-by: Matej Gera <matejgera@gmail.com>
"Labels is a sorted set of labels. Order has to be guaranteed upon
instantiation." says the comment, so fix all the tests that break this
rule.
For `BenchmarkLabelValuesWithMatchers()` and
`BenchmarkHeadLabelValuesWithMatchers()` the amount of work done changes
significantly if you put the labels in order, because all series refs
get neatly partitioned by the `tens` label, so I renamed the labels
to maintain the previous behaviour.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* refactor: move from io/ioutil to io and os packages
* use fs.DirEntry instead of os.FileInfo after os.ReadDir
Signed-off-by: MOREL Matthieu <matthieu.morel@cnp.fr>
If FlushAndShutdown is called with a full batchQueue, and Batch is then
called rather than the normal path of reading from the queue, a deadlock
might be encountered. Rather than having FlushAndShutdown block while
holding a lock, retry sending the batch every second.
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
Do not block when trying to write a batch to the queue. This can cause
appends to lock forever if the only thing reading from the queue needs
the mutex to write. Instead, if batchQueue is full pop the sample that
was just added from the partial batch and return false. The code doing
the appending already handles retries with backoff.
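A simplified sketch of the non-blocking write, with locking left out for brevity. The `queue` type here is a hypothetical, cut-down version that stores plain ints instead of samples:

```go
package main

import "fmt"

// queue is a cut-down, single-goroutine stand-in for the shard queue.
type queue struct {
	batch      []int
	batchSize  int
	batchQueue chan []int
}

// Append adds s to the partial batch. When the batch is full it tries to
// hand it off without blocking; if batchQueue is full it removes s again
// and returns false so the caller can retry with backoff.
func (q *queue) Append(s int) bool {
	q.batch = append(q.batch, s)
	if len(q.batch) < q.batchSize {
		return true
	}
	select {
	case q.batchQueue <- q.batch:
		q.batch = make([]int, 0, q.batchSize)
		return true
	default:
		// Pop the sample that was just added; the batch stays partial.
		q.batch = q.batch[:len(q.batch)-1]
		return false
	}
}

func main() {
	q := &queue{batchSize: 2, batchQueue: make(chan []int, 1)}
	for i := 1; i <= 4; i++ {
		fmt.Println(i, q.Append(i))
	}
}
```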
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
If a queue is stopped and one of its shards happens to hit the
batch_send_deadline at the same time a deadlock can occur where stop
holds the mutex and will not release it until the send is finished, but
the send needs the mutex to retrieve the most recent batch. This is
fixed by using a second mutex just for writing.
In addition, the test I wrote exposed a case where during shutdown a
batch could be sent twice due to concurrent calls to queue.Batch() and
queue.FlushAndShutdown(). Protect these with a mutex as well.
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
Previously we would reject an increase from 2 to 2.5 as being
within 30%; by rounding up first we see this as an increase from 2 to 3.
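The effect of rounding first can be checked with a small sketch. `shouldReshard` is a hypothetical helper written for this example; only the 30% tolerance and the 2 to 2.5 case come from the description above:

```go
package main

import (
	"fmt"
	"math"
)

// shardToleranceFraction is the 30% band mentioned above.
const shardToleranceFraction = 0.3

// shouldReshard reports whether desired falls outside the tolerance band
// around current. With roundFirst set, desired is rounded up before the
// comparison.
func shouldReshard(current int, desired float64, roundFirst bool) bool {
	if roundFirst {
		desired = math.Ceil(desired)
	}
	lower := float64(current) * (1 - shardToleranceFraction)
	upper := float64(current) * (1 + shardToleranceFraction)
	return desired < lower || desired > upper
}

func main() {
	// The band around 2 shards is roughly 1.4 to 2.6.
	fmt.Println(shouldReshard(2, 2.5, false)) // 2.5 is inside the band
	fmt.Println(shouldReshard(2, 2.5, true))  // 3 is outside the band
}
```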
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Change the coefficient from 1% to 5%, so instead of targeting to clear
the backlog in 100s we target 20s.
Update unit test to reflect the new behaviour.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
We have an alert that fires when prometheus_remote_storage_highest_timestamp_in_seconds - prometheus_remote_storage_queue_highest_sent_timestamp_seconds
becomes too high. But we have an agent for which this alert fires whenever the remote rate-limits the user.
This is because prometheus_remote_storage_queue_highest_sent_timestamp_seconds doesn't get updated
when the remote sends a 429.
I think we should update the metric in that case. If a request fails
because of connectivity issues, etc., we never exit the `sendWriteRequestWithBackoff` function. It only
exits when there is a non-recoverable error, like a bad status code, and in that case I think
the metric needs to be updated.
Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
* storage: expose bug in iterators #10027
Signed-off-by: beorn7 <beorn@grafana.com>
* storage: fix bug #10027 in iterators' Seek method
Signed-off-by: beorn7 <beorn@grafana.com>
* Append reporting metrics without limit
If reporting metrics fails due to reaching the limit, this makes the
target appear as UP in the UI, but the metrics are missing.
This commit bypasses that limit for report metrics.
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
* Remove check against cfg so interval/ timeout are always set (#10023) (#10031)
Signed-off-by: Nicholas Blott <blottn@tcd.ie>
Co-authored-by: Nicholas Blott <blottn@tcd.ie>
* Cut v2.32.1
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Apply suggestions from code review
Signed-off-by: Julius Volz <julius.volz@gmail.com>
Co-authored-by: Levi Harrison <git@leviharrison.dev>
Co-authored-by: Julien Pivotto <roidelapluie@inuits.eu>
Co-authored-by: Nicholas Blott <blottn@tcd.ie>
Co-authored-by: Julius Volz <julius.volz@gmail.com>
Co-authored-by: Levi Harrison <git@leviharrison.dev>
Previously BenchmarkSampleDelivery spent a lot of effort checking each
sample had arrived, so was largely showing the performance of test-only
code.
Increase the number of shards to be more realistic for a large
workload.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Right now the values for enqueuedSamples and enqueuedExemplars are never
decremented, leading to inflated values for failedSamples/failedExemplars
when a hard shutdown of a shard occurs.
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
Channels can cause bottlenecks and tons of context switches when reading
hundreds of thousands of samples per second from a single queue.
Instead, pre-batch the samples to amortize the cost of the concurrency
overhead.
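As a rough illustration of pre-batching, the sketch below sends slices over the channel instead of individual values, so one channel operation covers many samples. `sendBatched` and the use of plain ints are assumptions made for the example:

```go
package main

import "fmt"

// sendBatched groups samples into slices of up to batchSize and sends
// each slice with a single channel operation, then closes out.
func sendBatched(samples []int, batchSize int, out chan<- []int) {
	batch := make([]int, 0, batchSize)
	for _, s := range samples {
		batch = append(batch, s)
		if len(batch) == batchSize {
			out <- batch
			batch = make([]int, 0, batchSize)
		}
	}
	if len(batch) > 0 {
		out <- batch // flush the final partial batch
	}
	close(out)
}

func main() {
	out := make(chan []int, 8)
	sendBatched([]int{1, 2, 3, 4, 5}, 2, out)
	for b := range out {
		fmt.Println(b)
	}
}
```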
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
There was a subtle and nasty bug in listSeriesIterator.Seek.
In addition, the Seek call is defined to be a no-op if the current
position of the iterator is already pointing to a suitable
sample. This commit adds fast paths for this case to several
potentially expensive Seek calls.
Another bug was in concreteSeriesIterator.Seek. It always searched the
whole series and not from the current position of the iterator.
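A sketch of a Seek with both properties: a fast path when the current sample is already suitable, and a search that starts from the current position. The `listIterator` type is a simplified stand-in, not the actual iterator code:

```go
package main

import (
	"fmt"
	"sort"
)

type sample struct {
	t int64
	v float64
}

// listIterator is a simplified stand-in; cur is -1 before the first call.
type listIterator struct {
	samples []sample
	cur     int
}

func newListIterator(s []sample) *listIterator {
	return &listIterator{samples: s, cur: -1}
}

// Seek advances to the first sample with timestamp >= t. It never moves
// backwards and is a no-op if the current sample already qualifies.
func (it *listIterator) Seek(t int64) bool {
	if it.cur == -1 {
		it.cur = 0
	}
	if it.cur >= len(it.samples) {
		return false
	}
	// Fast path: the current sample is already suitable.
	if it.samples[it.cur].t >= t {
		return true
	}
	// Search only the part of the series after the current position.
	n := sort.Search(len(it.samples)-it.cur, func(i int) bool {
		return it.samples[it.cur+i].t >= t
	})
	it.cur += n
	return it.cur < len(it.samples)
}

// At returns the current sample; only valid after Seek returned true.
func (it *listIterator) At() (int64, float64) {
	s := it.samples[it.cur]
	return s.t, s.v
}

func main() {
	it := newListIterator([]sample{{10, 1}, {20, 2}, {30, 3}, {40, 4}})
	fmt.Println(it.Seek(25))
	fmt.Println(it.At())
}
```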
Signed-off-by: beorn7 <beorn@grafana.com>
- Pick At... method via return value of Next/Seek.
- Do not clobber returned buckets.
- Add partial FloatHistogram support.
Note that the promql package is now _only_ dealing with
FloatHistograms, following the idea that PromQL only knows float
values.
As a byproduct, I have removed the histogramSeries metric. In my
understanding, series can have both float and histogram samples, so
that metric doesn't make sense anymore.
As another byproduct, I have converged the sampleBuf and the
histogramSampleBuf in memSeries into one. The sample type stored in
the sampleBuf has been extended to also contain histograms even before
this commit.
Signed-off-by: beorn7 <beorn@grafana.com>
This is to avoid copying the many fields of a histogram.Histogram all
the time.
This also fixes a bunch of formerly broken tests.
Signed-off-by: beorn7 <beorn@grafana.com>
This creates a new `model` directory and moves all data-model related
packages over there:
exemplar labels relabel rulefmt textparse timestamp value
All the others are more or less utilities and have been moved to `util`:
gate logging modetimevfs pool runtime
Signed-off-by: beorn7 <beorn@grafana.com>
* TSDB: demystify seriesRefs and ChunkRefs
The TSDB package contains many types of series and chunk references,
all shrouded in uint types. Often the same uint value may
actually mean one of different types, in non-obvious ways.
This PR aims to clarify the code and help navigating to relevant docs,
usage, etc much quicker.
Concretely:
* Use appropriately named types and document their semantics and
relations.
* Make multiplexing and demuxing of types explicit
(on the boundaries between concrete implementations and generic
interfaces).
* Casting between different types should be free. None of the changes
should have any impact on how the code runs.
TODO: Implement BlockSeriesRef where appropriate (for a future PR)
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
* feedback
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
* agent: demystify seriesRefs and ChunkRefs
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
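To illustrate named reference types with explicit packing and unpacking, here is a minimal sketch. The type names and the 40/24 bit split are assumptions chosen for the example and should not be read as the exact layout used in the TSDB package:

```go
package main

import "fmt"

// Named types make it explicit which kind of reference a uint64 holds.
// Converting between them is free at runtime.
type (
	HeadSeriesRef uint64 // identifies a series in the head
	HeadChunkID   uint64 // identifies a chunk within one series
	HeadChunkRef  uint64 // a series ref and a chunk ID packed together
)

// chunkIDBits is the number of low bits reserved for the chunk ID
// (an assumption of this sketch).
const chunkIDBits = 24

// NewHeadChunkRef packs a series ref and a chunk ID into one value.
func NewHeadChunkRef(s HeadSeriesRef, c HeadChunkID) HeadChunkRef {
	return HeadChunkRef(uint64(s)<<chunkIDBits | uint64(c))
}

// Unpack splits the packed value back into its two parts.
func (r HeadChunkRef) Unpack() (HeadSeriesRef, HeadChunkID) {
	return HeadSeriesRef(r >> chunkIDBits), HeadChunkID(uint64(r) & (1<<chunkIDBits - 1))
}

func main() {
	ref := NewHeadChunkRef(7, 3)
	s, c := ref.Unpack()
	fmt.Println(uint64(ref), s, c)
}
```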
By holding a `proto.Buffer` per shard and passing it down to where
marshalling is done, we avoid creating a lot of garbage.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Initial draft of prometheus-agent
This commit introduces a new binary, prometheus-agent, based on the
Grafana Agent code. It runs a WAL-only version of prometheus without the
TSDB, alerting, or rule evaluations. It is intended to be used to
remote_write to Prometheus or another remote_write receiver.
By default, prometheus-agent will listen on port 9095 to not collide
with the prometheus default of 9090.
Truncation of the WAL cooperates on a best-effort case with Remote
Write. Every time the WAL is truncated, the minimum timestamp of data to
truncate is determined by the lowest sent timestamp of all samples
across all remote_write endpoints. This gives loose guarantees that data
from the WAL will not try to be removed until the maximum sample
lifetime passes or remote_write starts functioning.
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
* add tests for Prometheus agent (#22)
* add tests for Prometheus agent
* add tests for Prometheus agent
* rearranged tests as per the review comments
* update tests for Agent
* changes as per code review comments
Signed-off-by: SriKrishna Paparaju <paparaju@gmail.com>
* incremental changes to prometheus agent
Signed-off-by: SriKrishna Paparaju <paparaju@gmail.com>
* changes as per code review comments
Signed-off-by: SriKrishna Paparaju <paparaju@gmail.com>
* Commit feedback from code review
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
* Port over some comments from grafana/agent
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
* Rename agent.Storage to agent.DB for tsdb consistency
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
* Consolidate agentMode ifs in cmd/prometheus/main.go
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
* Document PreAction usage requirements better for agent mode flags
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
* remove unnecessary defaultListenAddr
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
* `go fmt ./tsdb/agent` and fix lint errors
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
Co-authored-by: SriKrishna Paparaju <paparaju@gmail.com>
A lot of this code was hacked together, literally during a
hackathon. This commit intends not to change the code substantially,
but just make the code obey the usual style practices.
A (possibly incomplete) list of areas:
* Generally address linter warnings.
* The `pkg` directory is deprecated as per dev-summit. No new packages should
be added to it. I moved the new `pkg/histogram` package to `model`
anticipating what's proposed in #9478.
* Make the naming of the Sparse Histogram more consistent. Including
abbreviations, there were just too many names for it: SparseHistogram,
Histogram, Histo, hist, his, shs, h. The idea is to call it "Histogram" in
general. Only add "Sparse" if it is needed to avoid confusion with
conventional Histograms (which is rare because the TSDB really has no notion
of conventional Histograms). Use abbreviations only in local scope, and then
really abbreviate (not just removing three out of seven letters like in
"Histo"). This is in the spirit of
https://github.com/golang/go/wiki/CodeReviewComments#variable-names
* Several other minor name changes.
* A lot of formatting of doc comments. For one, following
https://github.com/golang/go/wiki/CodeReviewComments#comment-sentences
, but also layout questions, anticipating how things will look
when rendered by `godoc` (even where `godoc` doesn't render them
right now because they are for unexported types or not a doc comment
at all but just a normal code comment - consistency is queen!).
* Re-enabled `TestQueryLog` and `TestEndpoints` (they pass now,
leaving them disabled was presumably an oversight).
* Bucket iterator for histogram.Histogram is now created with a
method.
* HistogramChunk.iterator now allows iterator recycling. (I think
@dieterbe only commented it out because he was confused by the
question in the comment.)
* HistogramAppender.Append panics now because we decided to treat
staleness marker differently.
Signed-off-by: beorn7 <beorn@grafana.com>
We are re-enabling HTTP/2 again. There have been a few bugfixes upstream
in Go, and we have also enabled ReadIdleTimeout.
Fix #7588
Fix #9068
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>