* Add exemplar record case
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* recordType() -> Type.String()
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Rename walDir parameter to dir
Signed-off-by: Matej Gera <matejgera@gmail.com>
* Improve NewQueueManager comment
Signed-off-by: Matej Gera <matejgera@gmail.com>
* refactor: move from io/ioutil to io and os packages
* use fs.DirEntry instead of os.FileInfo after os.ReadDir
Signed-off-by: MOREL Matthieu <matthieu.morel@cnp.fr>
* Run gofumpt on all files
Getting golangci-lint errors when building on my laptop, possibly because I have newer version of gofumpt then what it was formatted with.
Run gofumpt -w -extra on all files as it will be needed in the future anyway.
* Update golangci-lint to v1.44.2
v1.44.0 upgraded gofumpt so bumping version in CI will help keep formatting correct for everyone
* Address golangci-lint error
Getting 'error-strings: error strings should not be capitalized or end with punctuation or a newline' from revive here.
Drop new line.
Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
This creates a new `model` directory and moves all data-model related
packages over there:
exemplar labels relabel rulefmt textparse timestamp value
All the others are more or less utilities and have been moved to `util`:
gate logging modetimevfs pool runtime
Signed-off-by: beorn7 <beorn@grafana.com>
* Refactor: pass segment-reading function as param
To allow a different implementation to be used when garbage-collecting.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* remote_write: reduce blocking from GC of series
Add a method `UpdateSeriesSegment()` which is used together with
`SeriesReset()` to garbage-collect old series. This allows us to
split the lock around queueManager series data and avoid blocking
`Append()` while reading series from the last checkpoint.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Cosmetic: review feedback on comments
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* remote-write benchmark: include GC of series
Reduce the total number of samples per iteration from 5000*5000
(25 million) which is too big for my laptop, to 1*10000.
Extend `createTimeseries()` to add additional labels, so that the
queue manager is doing more realistic work.
Move the Append() call to a background goroutine - this works because
TestWriteClient uses a WaitGroup to signal completion.
Call `StoreSeries()` and `SeriesReset()` while adding samples, to
simulate the garbage-collection that wal.Watcher does.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Change BenchmarkSampleDelivery to call UpdateSeriesSegment
This matches what Watcher.garbageCollectSeries() is doing now
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Write exemplars to the WAL and send them over remote write.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Update example for exemplars, print data in a more obvious format.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Add metrics for remote write of exemplars.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Fix incorrect slices passed to send in remote write.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* We need to unregister the new metrics.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Address review comments
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Order of exemplar append vs write exemplar to WAL needs to change.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Several fixes to prevent sending uninitialized or incorrect samples with an exemplar. Fix dropping exemplar for missing series. Add tests for queue_manager sending exemplars
Signed-off-by: Martin Disibio <mdisibio@gmail.com>
* Store both samples and exemplars in the same timeseries buffer to remove the alloc when building final request, keep sub-slices in separate buffers for re-use
Signed-off-by: Martin Disibio <mdisibio@gmail.com>
* Condense sample/exemplar delivery tests to parameterized sub-tests
Signed-off-by: Martin Disibio <mdisibio@gmail.com>
* Rename test methods for clarity now that they also handle exemplars
Signed-off-by: Martin Disibio <mdisibio@gmail.com>
* Rename counter variable. Fix instances where metrics were not updated correctly
Signed-off-by: Martin Disibio <mdisibio@gmail.com>
* Add exemplars to LoadWAL benchmark
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* last exemplars timestamp metric needs to convert value to seconds with
ms precision
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Process exemplar records in a separate go routine when loading the WAL.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Address review comments related to clarifying comments and variable
names. Also refactor sample/exemplar to enqueue prompb types.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Regenerate types proto with comments, update protoc version again.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Put remote write of exemplars behind a feature flag.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Address some of Ganesh's review comments.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Move exemplar remote write feature flag to a config file field.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Address Bartek's review comments.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Don't allocate exemplar buffers in queue_manager if we're not going to
send exemplars over remote write.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Add ValidateExemplar function, validate exemplars when appending to head
and log them all to WAL before adding them to exemplar storage.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Address more reivew comments from Ganesh.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Add exemplar total label length check.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Address a few last review comments
Signed-off-by: Callum Styan <callumstyan@gmail.com>
Co-authored-by: Martin Disibio <mdisibio@gmail.com>
As we're looking to expand what's in the WAL,
having old Prometheus servers ignore the new record types
rather than treating them as corruption allows for better
upgrade/downgrade paths.
Adjust some tests accordingly, so they're still testing what they're
meant to test.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Chore: Log segment number when segment read failed
To manually fix the WAL files, it is good to know where the corrupt
happened so we should log the segment number when the read failed.
Related Issue #7506
Signed-off-by: gaston.qiu <gaston.qiu@umbocv.com>
* Fix bug with WAL watcher and Live Reader metrics usage.
Calling NewXMetrics when creating a Watcher or LiveReader results in a
registration error, which we're ignoring, and as a result other than the
first Watcher/Reader created, we had no metrics for either. So we would
only have metrics like Watcher Records Read for the first remote write
config in a users config file.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
The WAL Watcher replays a checkpoint after it is created in order to
garbage collect series that no longer exist in the WAL. Currently the
garbage collection process is done serially with reading from the tip of
the WAL which can cause large delays in writing samples to remote
storage just after compaction occurs.
This also fixes a memory leak where dropped series are not cleaned up as
part of the SeriesReset process.
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>