* Removing global state modification on unit tests (fix#10033#10034)
The config.DefaultRemoteReadConfig and config.DefaultRemoteWriteConfig
instances hold global state. Unit tests were changing their url.URL reference
globally, causing false positives when tests were run by package.
Two helper functions were created to copy those global values instead of
changing them in place. This fixes the nil pointer dereference that occurred
when running unit tests by method instead of by package.
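A minimal sketch of the copy-by-value idea, not the helpers added in this change. The type remoteWriteConfig, the variable defaultRemoteWriteConfig and the function newTestRemoteWriteConfig are hypothetical stand-ins for illustration only:

```go
package main

import (
	"fmt"
	"net/url"
)

// remoteWriteConfig is a hypothetical stand-in for a config struct that
// holds a URL by reference.
type remoteWriteConfig struct {
	URL  *url.URL
	Name string
}

// defaultRemoteWriteConfig plays the role of a package-level default.
var defaultRemoteWriteConfig = remoteWriteConfig{Name: "default"}

// newTestRemoteWriteConfig returns a copy of the default with its own URL,
// so the package-level value is never modified by a test.
func newTestRemoteWriteConfig(rawURL string) (remoteWriteConfig, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return remoteWriteConfig{}, err
	}
	cfg := defaultRemoteWriteConfig // struct assignment copies by value
	cfg.URL = u                     // only the copy points at the new URL
	return cfg, nil
}

func main() {
	cfg, err := newTestRemoteWriteConfig("http://localhost:9090/api/v1/write")
	if err != nil {
		panic(err)
	}
	fmt.Println(cfg.URL.Host)                        // localhost:9090
	fmt.Println(defaultRemoteWriteConfig.URL == nil) // true: default untouched
}
```

Because each test gets its own copy, the result no longer depends on which other tests ran earlier in the same package.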
Signed-off-by: Leonardo Zamariola <leonardo.zamariola@gmail.com>
* Fixing pull request suggestions
Copying by value from default config
Signed-off-by: Leonardo Zamariola <leonardo.zamariola@gmail.com>
Relates to @bboreham's optimization in https://github.com/prometheus/prometheus/pull/10859
Bryan reduced the sleep time, which improved the deltas on the benchmark by
quite a lot. However, I've been working on a similar implementation for
out-of-order and I noticed that we actually get into this method
thousands of times.
@ywwg had the brilliant idea of not always sleeping before the select
but instead making the sleep a case in the select, so we only sleep if
we need to.
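A rough sketch of the pattern, not the code touched by this change. The function drainUntilEmpty, the channel names and pollInterval are hypothetical; the point is that the timeout is one case of the select, so the loop waits only when nothing is ready on the output side:

```go
package main

import (
	"fmt"
	"time"
)

// pollInterval is a hypothetical wait used only when nothing can be drained.
const pollInterval = 10 * time.Microsecond

// drainUntilEmpty loops until input has no queued items. While waiting it
// drains output so the consumer never blocks. The timeout is a select case,
// so no time is spent sleeping when output already has a value ready.
func drainUntilEmpty(input, output chan int) int {
	drained := 0
	for len(input) != 0 {
		select {
		case <-output:
			drained++
		case <-time.After(pollInterval):
			// Nothing to drain yet; loop around and re-check input.
		}
	}
	return drained
}

func main() {
	input := make(chan int, 4)
	output := make(chan int)
	// The consumer forwards every input item to output.
	go func() {
		for v := range input {
			output <- v
		}
	}()
	for i := 0; i < 4; i++ {
		input <- i
	}
	n := drainUntilEmpty(input, output)
	fmt.Println("input is empty, items drained:", n)
}
```

With an unconditional sleep before the select, every iteration pays the sleep even when output already has a value waiting, and that cost adds up when the method is entered thousands of times.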
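The benchmark deltas are amazing: most cases are 50% to 89% faster, and
only the exemplarsPerSeries=360 case shows no significant change.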
```
❯ benchstat old_implementation.txt new_implementation_using_time_after.txt
name old time/op new time/op delta
LoadWAL/batches=10,seriesPerBatch=100,samplesPerSeries=7200,exemplarsPerSeries=0,mmappedChunkT=0-8 521ms ±25% 253ms ± 6% -51.47% (p=0.008 n=5+5)
LoadWAL/batches=10,seriesPerBatch=100,samplesPerSeries=7200,exemplarsPerSeries=36,mmappedChunkT=0-8 773ms ± 3% 369ms ±31% -52.23% (p=0.008 n=5+5)
LoadWAL/batches=10,seriesPerBatch=100,samplesPerSeries=7200,exemplarsPerSeries=72,mmappedChunkT=0-8 592ms ±28% 297ms ±28% -49.80% (p=0.008 n=5+5)
LoadWAL/batches=10,seriesPerBatch=100,samplesPerSeries=7200,exemplarsPerSeries=360,mmappedChunkT=0-8 547ms ± 2% 999ms ±187% ~ (p=0.690 n=5+5)
LoadWAL/batches=10,seriesPerBatch=10000,samplesPerSeries=50,exemplarsPerSeries=0,mmappedChunkT=0-8 11.3s ± 4% 1.3s ±44% -88.48% (p=0.008 n=5+5)
LoadWAL/batches=10,seriesPerBatch=10000,samplesPerSeries=50,exemplarsPerSeries=2,mmappedChunkT=0-8 11.1s ± 1% 1.2s ±20% -89.08% (p=0.008 n=5+5)
LoadWAL/batches=10,seriesPerBatch=1000,samplesPerSeries=480,exemplarsPerSeries=0,mmappedChunkT=0-8 1.24s ± 3% 0.18s ± 7% -85.76% (p=0.008 n=5+5)
LoadWAL/batches=10,seriesPerBatch=1000,samplesPerSeries=480,exemplarsPerSeries=2,mmappedChunkT=0-8 1.24s ± 2% 0.18s ± 5% -85.24% (p=0.008 n=5+5)
LoadWAL/batches=10,seriesPerBatch=1000,samplesPerSeries=480,exemplarsPerSeries=5,mmappedChunkT=0-8 1.23s ± 5% 0.27s ±33% -77.73% (p=0.008 n=5+5)
LoadWAL/batches=10,seriesPerBatch=1000,samplesPerSeries=480,exemplarsPerSeries=24,mmappedChunkT=0-8 1.28s ± 1% 0.36s ± 7% -71.51% (p=0.008 n=5+5)
LoadWAL/batches=100,seriesPerBatch=1000,samplesPerSeries=480,exemplarsPerSeries=0,mmappedChunkT=3800-8 12.1s ± 1% 3.1s ± 6% -74.33% (p=0.008 n=5+5)
LoadWAL/batches=100,seriesPerBatch=1000,samplesPerSeries=480,exemplarsPerSeries=2,mmappedChunkT=3800-8 12.1s ± 1% 3.4s ± 4% -71.94% (p=0.008 n=5+5)
LoadWAL/batches=100,seriesPerBatch=1000,samplesPerSeries=480,exemplarsPerSeries=5,mmappedChunkT=3800-8 12.1s ± 1% 3.8s ±17% -68.35% (p=0.008 n=5+5)
LoadWAL/batches=100,seriesPerBatch=1000,samplesPerSeries=480,exemplarsPerSeries=24,mmappedChunkT=3800-8 12.4s ± 1% 4.0s ±18% -67.71% (p=0.008 n=5+5)
```
Benchmarked on Linux
```
goos: linux
goarch: amd64
pkg: github.com/prometheus/prometheus/tsdb
cpu: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
```
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
* add description for __meta_kubernetes_endpoints_label_* and __meta_kubernetes_endpoints_labelpresent_*
Signed-off-by: renzheng.wang <wangrzneu@gmail.com>
* enable ui module publication
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* use main changelog of Prometheus to reflect the changes of the packages
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* ignore changelog and license in the libs
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* replace perses references
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
* Job queue
This PR reimplements chan chunkWriteJob with a custom buffered queue that should use less memory, because it doesn't preallocate the entire buffer for the maximum queue size at once. Instead it allocates individual "segments" of a smaller size.
As elements are added to the queue, they fill individual segments. When elements are removed from the queue (and segments), empty segments can be thrown away. This doesn't change the memory usage of the queue when it's full, but should decrease its memory footprint when it's empty (the queue will keep at most 1 segment in that case).
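The segment idea can be sketched as follows. This is a simplified illustration, not the queue added in this PR: segmentedQueue, segment and segmentSize are hypothetical names, the elements are plain ints, and the sketch leaves out locking, blocking and the maximum queue size.

```go
package main

import "fmt"

// segmentSize is a hypothetical, deliberately tiny segment capacity.
const segmentSize = 4

// segment is a fixed-size chunk of the queue.
type segment struct {
	items      [segmentSize]int
	head, tail int // next index to read, next index to write
	next       *segment
}

// segmentedQueue is a FIFO built from a linked list of segments.
type segmentedQueue struct {
	first, last *segment
	size        int // number of queued elements
	segments    int // number of allocated segments
}

// push appends v, allocating a new segment only when the last one is full.
func (q *segmentedQueue) push(v int) {
	if q.last == nil || q.last.tail == segmentSize {
		s := &segment{}
		if q.last == nil {
			q.first = s
		} else {
			q.last.next = s
		}
		q.last = s
		q.segments++
	}
	q.last.items[q.last.tail] = v
	q.last.tail++
	q.size++
}

// pop removes the oldest element. A fully consumed segment is dropped,
// except the final one, which is reset and reused.
func (q *segmentedQueue) pop() (int, bool) {
	if q.size == 0 {
		return 0, false
	}
	s := q.first
	v := s.items[s.head]
	s.head++
	q.size--
	if s.head == segmentSize {
		if s.next != nil {
			q.first = s.next // the old segment becomes garbage
			q.segments--
		} else {
			s.head, s.tail = 0, 0 // keep a single segment around
		}
	}
	return v, true
}

func main() {
	q := &segmentedQueue{}
	for i := 0; i < 10; i++ {
		q.push(i)
	}
	fmt.Println(q.segments) // 3
	for q.size > 0 {
		q.pop()
	}
	fmt.Println(q.segments) // 1
}
```

A buffered channel, by contrast, allocates its whole buffer when it is created, whether or not it ever fills up.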
Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
* Modify test to work with low resolution timer.
Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
* Improve comments.
Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
This follows a simple function-based approach to access the count and
sum fields of a native Histogram. It might be more elegant to
implement “accessors” via the dot operator, as considered in the
brainstorming doc [1]. However, that would require the introduction of
a whole new concept in PromQL. For the PoC, we should be fine with the
function-based approach. Even the obvious inefficiencies (rate'ing a
whole histogram twice when we only want to rate the count and the
sum once each) could be optimized behind the scenes.
Note that the function-based approach elegantly solves the problem of
detecting counter resets in the sum of observations in the case of
negative observations. (Since the whole native Histogram is rate'd,
the counter reset is detected for the Histogram as a whole.)
We will decide later if an “accessor” approach is really needed. It
would change the example expression for average duration in
functions.md from
histogram_sum(rate(http_request_duration_seconds[10m]))
/
histogram_count(rate(http_request_duration_seconds[10m]))
to
rate(http_request_duration_seconds.sum[10m])
/
rate(http_request_duration_seconds.count[10m])
[1]: https://docs.google.com/document/d/1ch6ru8GKg03N02jRjYriurt-CZqUVY09evPg6yKTA1s/edit
Signed-off-by: beorn7 <beorn@grafana.com>
* Avoid gaps in in-order data after restart with out-of-order enabled
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Fix tests, do the temporary patch only if OOO is enabled
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Avoid Peter's confusion
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Use latest OutOfOrderTimeWindow
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
This implementation is based on this design doc:
https://docs.google.com/document/d/1Kppm7qL9C-BJB1j6yb6-9ObG3AbdZnFUBYPNNWwDBYM/edit?usp=sharing
This commit adds support for accepting out-of-order ("OOO") samples into the
TSDB up to a configurable time allowance. If OOO is enabled, overlapping
queries are automatically enabled.
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Co-authored-by: Jesus Vazquez <jesus.vazquez@grafana.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
Co-authored-by: Dieter Plaetinck <dieter@grafana.com>
I'd like to unwrap errors returned from rulefmt, but both the Error and WrappedError types are missing an Unwrap() method.
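As a general illustration of what an Unwrap() method enables, here is a minimal sketch. It uses a hypothetical ruleError type rather than the rulefmt types, whose fields are not shown here; errBadExpr and isBadExpr are likewise made up for the example:

```go
package main

import (
	"errors"
	"fmt"
)

// errBadExpr is a hypothetical sentinel error used as the underlying cause.
var errBadExpr = errors.New("bad expression")

// ruleError is a hypothetical wrapper type that adds context to a cause.
type ruleError struct {
	Group string
	Err   error
}

// Error implements the error interface.
func (e *ruleError) Error() string {
	return fmt.Sprintf("group %q: %v", e.Group, e.Err)
}

// Unwrap exposes the cause so errors.Is and errors.As can see through
// the wrapper.
func (e *ruleError) Unwrap() error {
	return e.Err
}

// isBadExpr reports whether err is, or wraps, errBadExpr.
func isBadExpr(err error) bool {
	return errors.Is(err, errBadExpr)
}

func main() {
	var err error = &ruleError{Group: "example", Err: errBadExpr}
	fmt.Println(isBadExpr(err)) // true

	var re *ruleError
	if errors.As(err, &re) {
		fmt.Println(re.Group) // example
	}
}
```

Without Unwrap(), errors.Is stops at the wrapper and isBadExpr would return false for the same value.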
Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>