prometheus/tsdb/querier.go
Nicolás Pazos aa3513fc89
remote write 2.0: sync with main branch (#13510)
* consoles: exclude iowait and steal from CPU Utilisation

'iowait' and 'steal' indicate specific idle/wait states, which shouldn't
be counted into CPU Utilisation. Also see
https://github.com/prometheus-operator/kube-prometheus/pull/796 and
https://github.com/kubernetes-monitoring/kubernetes-mixin/pull/667.

Per the iostat man page:

%idle
    Show the percentage of time that the CPU or CPUs were idle and the
    system did not have an outstanding disk I/O request.

%iowait
     Show the percentage of time that the CPU or CPUs were idle during
     which the system had an outstanding disk I/O request.

%steal
     Show the percentage of time spent in involuntary wait by the
     virtual CPU or CPUs while the hypervisor was servicing another
     virtual processor.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>

* tsdb: shrink txRing with smaller integers

4 billion active transactions ought to be enough for anyone.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* tsdb: create isolation transaction slice on demand

When Prometheus restarts it creates every series read in from the WAL,
but many of those series will be finished, and never receive any more
samples. By deferring allocation of the txRing slice to when it is first
needed, we save 32 bytes per stale series.
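
A minimal sketch of the on-demand allocation described above. The `series` and `txRing` types here are simplified stand-ins, and `addTx` is a hypothetical method name; the real tsdb types carry more fields.

```go
package main

import "fmt"

// txRing is a simplified stand-in for the isolation ring buffer.
type txRing struct {
	txIDs []uint64
}

// series is a hypothetical, pared-down series record.
type series struct {
	txs *txRing // nil until the first append needs it
}

// addTx allocates the ring lazily, so a series that never receives
// another sample after a restart never pays for the slice.
func (s *series) addTx(id uint64) {
	if s.txs == nil {
		s.txs = &txRing{txIDs: make([]uint64, 0, 4)}
	}
	s.txs.txIDs = append(s.txs.txIDs, id)
}

func main() {
	stale, active := &series{}, &series{}
	active.addTx(42)
	fmt.Println(stale.txs == nil, len(active.txs.txIDs))
}
```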

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* add cluster variable to Overview dashboard

Signed-off-by: Erik Sommer <ersotech@posteo.de>

* promql: simplify Native Histogram arithmetics

Signed-off-by: Linas Medziunas <linas.medziunas@gmail.com>

* Cut 2.49.0-rc.0 (#13270)

* Cut 2.49.0-rc.0

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Removed the duplicate.

Signed-off-by: bwplotka <bwplotka@gmail.com>

---------

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Add unit protobuf parser

Signed-off-by: Arianna Vespri <arianna.vespri@yahoo.it>

* Go on adding protobuf parsing for unit

Signed-off-by: Arianna Vespri <arianna.vespri@yahoo.it>

* ui: create a reproduction for https://github.com/prometheus/prometheus/issues/13292

Signed-off-by: machine424 <ayoubmrini424@gmail.com>

* Get conditional right

Signed-off-by: Arianna Vespri <arianna.vespri@yahoo.it>

* Get VM Scale Set NIC (#13283)

Calling `*armnetwork.InterfacesClient.Get()` doesn't work for Scale Set
VM NICs, because these use a different Resource ID format.

Use `*armnetwork.InterfacesClient.GetVirtualMachineScaleSetNetworkInterface()`
instead.  This needs both the scale set name and the instance ID, so
add an `InstanceID` field to the `virtualMachine` struct.  `InstanceID`
is empty for a VM that isn't a ScaleSetVM.

Signed-off-by: Daniel Nicholls <daniel.nicholls@resdiary.com>

* Cut v2.49.0-rc.1

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Delete debugging lines, amend error message for unit

Signed-off-by: Arianna Vespri <arianna.vespri@yahoo.it>

* Correct order in error message

Signed-off-by: Arianna Vespri <arianna.vespri@yahoo.it>

* Consider storage.ErrTooOldSample as non-retryable

Signed-off-by: Daniel Kerbel <nmdanny@gmail.com>

* scrape_test.go: Increase scrape interval in TestScrapeLoopCache to reduce potential flakiness

Signed-off-by: machine424 <ayoubmrini424@gmail.com>

* Avoid creating string for suffix, consider counters without _total suffix

Signed-off-by: Arianna Vespri <arianna.vespri@yahoo.it>

* build(deps): bump github.com/prometheus/client_golang

Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.17.0 to 1.18.0.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md)
- [Commits](https://github.com/prometheus/client_golang/compare/v1.17.0...v1.18.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* build(deps): bump actions/setup-node from 3.8.1 to 4.0.1

Bumps [actions/setup-node](https://github.com/actions/setup-node) from 3.8.1 to 4.0.1.
- [Release notes](https://github.com/actions/setup-node/releases)
- [Commits](5e21ff4d9b...b39b52d121)

---
updated-dependencies:
- dependency-name: actions/setup-node
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

* scripts: sort file list in embed directive

Otherwise the resulting string depends on find, which afaict depends on
the underlying filesystem. A stable file list makes it easier to detect
UI changes in downstreams that need to track UI assets.

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>

* Fix DataTableProps['data'] for resultType string

Signed-off-by: Kevin Mingtarja <kevin.mingtarja@gmail.com>

* Fix handling of scalar and string in isHeatmapData

Signed-off-by: Kevin Mingtarja <kevin.mingtarja@gmail.com>

* build(deps): bump github.com/influxdata/influxdb

Bumps [github.com/influxdata/influxdb](https://github.com/influxdata/influxdb) from 1.11.2 to 1.11.4.
- [Release notes](https://github.com/influxdata/influxdb/releases)
- [Commits](https://github.com/influxdata/influxdb/compare/v1.11.2...v1.11.4)

---
updated-dependencies:
- dependency-name: github.com/influxdata/influxdb
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* build(deps): bump github.com/prometheus/prometheus

Bumps [github.com/prometheus/prometheus](https://github.com/prometheus/prometheus) from 0.48.0 to 0.48.1.
- [Release notes](https://github.com/prometheus/prometheus/releases)
- [Changelog](https://github.com/prometheus/prometheus/blob/main/CHANGELOG.md)
- [Commits](https://github.com/prometheus/prometheus/compare/v0.48.0...v0.48.1)

---
updated-dependencies:
- dependency-name: github.com/prometheus/prometheus
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* Bump client_golang to v1.18.0 (#13373)

Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>

* Drop old inmemory samples (#13002)

* Drop old inmemory samples

Co-authored-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>
Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>
Signed-off-by: Marc Tuduri <marctc@protonmail.com>

* Avoid copying timeseries when the feature is disabled

Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>
Signed-off-by: Marc Tuduri <marctc@protonmail.com>

* Run gofmt

Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>
Signed-off-by: Marc Tuduri <marctc@protonmail.com>

* Clarify docs

Signed-off-by: Marc Tuduri <marctc@protonmail.com>

* Add more logging info

Signed-off-by: Marc Tuduri <marctc@protonmail.com>

* Remove loggers

Signed-off-by: Marc Tuduri <marctc@protonmail.com>

* optimize function and add tests

Signed-off-by: Marc Tuduri <marctc@protonmail.com>

* Simplify filter

Signed-off-by: Marc Tuduri <marctc@protonmail.com>

* rename var

Signed-off-by: Marc Tuduri <marctc@protonmail.com>

* Update help info from metrics

Signed-off-by: Marc Tuduri <marctc@protonmail.com>

* use metrics to keep track of drop elements during buildWriteRequest

Signed-off-by: Marc Tuduri <marctc@protonmail.com>

* rename var in tests

Signed-off-by: Marc Tuduri <marctc@protonmail.com>

* pass time.Now as parameter

Signed-off-by: Marc Tuduri <marctc@protonmail.com>

* Change buildwriterequest during retries

Signed-off-by: Marc Tuduri <marctc@protonmail.com>

* Revert "Remove loggers"

This reverts commit 54f91dfcae20488944162335ab4ad8be459df1ab.

Signed-off-by: Marc Tuduri <marctc@protonmail.com>

* use log level debug for loggers

Signed-off-by: Marc Tuduri <marctc@protonmail.com>

* Fix linter

Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>

* Remove noisy debug-level logs; add 'reason' label to drop metrics

Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>

* Remove accidentally committed files

Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>

* Propagate logger to buildWriteRequest to log dropped data

Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>

* Fix docs comment

Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>

* Make drop reason more specific

Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>

* Remove unnecessary pass of logger

Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>

* Use snake_case for reason label

Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>

* Fix dropped samples metric

Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>

---------

Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>
Signed-off-by: Marc Tuduri <marctc@protonmail.com>
Signed-off-by: Paschalis Tsilias <tpaschalis@users.noreply.github.com>
Co-authored-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>
Co-authored-by: Paschalis Tsilias <tpaschalis@users.noreply.github.com>

* fix(discovery): allow requireUpdate util to timeout in discovery/file/file_test.go.

The loop ran indefinitely if the condition wasn't met.

Before, each iteration created a new timer channel which was always outpaced by
the other timer channel with smaller duration.

minor detail: There was a memory leak: resources of the ~10 previous timers were
constantly kept. With the fix, we may keep the resources of one timer around for defaultWait
but this isn't worth the changes to make it right.
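
A sketch of the pattern behind the fix, not the test helper itself: `waitFor`, `demoTick` and `demoTimeout` are hypothetical names. The timeout channel is created once, before the loop, so the shorter tick can no longer keep resetting it.

```go
package main

import (
	"fmt"
	"time"
)

// Illustrative durations only.
const (
	demoTick    = time.Millisecond
	demoTimeout = 20 * time.Millisecond
)

// waitFor polls cond on every tick until it returns true or the timeout
// elapses. Calling time.After(timeout) inside the select instead would
// start a fresh countdown on each iteration, so the shorter tick would
// always fire first and the loop would never exit.
func waitFor(cond func() bool, tick, timeout time.Duration) bool {
	deadline := time.After(timeout)
	ticker := time.NewTicker(tick)
	defer ticker.Stop()
	for {
		select {
		case <-deadline:
			return false
		case <-ticker.C:
			if cond() {
				return true
			}
		}
	}
}

func main() {
	never := func() bool { return false }
	fmt.Println(waitFor(never, demoTick, demoTimeout))
}
```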

Signed-off-by: machine424 <ayoubmrini424@gmail.com>

* Merge pull request #13371 from kevinmingtarja/fix-isHeatmapData

ui: fix handling of scalar and string in isHeatmapData

* tsdb/{index,compact}: allow using custom postings encoding format (#13242)

* tsdb/{index,compact}: allow using custom postings encoding format

We would like to experiment with a different postings encoding format in
Thanos so in this change I am proposing adding another argument to
`NewWriter` which would allow users to change the format if needed.
Also, wire the leveled compactor so that it would be possible to change
the format there too.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* tsdb/compact: use a struct for leveled compactor options

As discussed on Slack, let's use a struct for the options in leveled
compactor.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* tsdb: make changes after Bryan's review

- Make changes less intrusive
- Turn the postings encoder type into a function
- Add NewWriterWithEncoder()

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

---------

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* Cut 2.49.0-rc.2

Signed-off-by: bwplotka <bwplotka@gmail.com>

* build(deps): bump actions/setup-go from 3.5.0 to 5.0.0 in /scripts (#13362)

Bumps [actions/setup-go](https://github.com/actions/setup-go) from 3.5.0 to 5.0.0.
- [Release notes](https://github.com/actions/setup-go/releases)
- [Commits](6edd4406fa...0c52d547c9)

---
updated-dependencies:
- dependency-name: actions/setup-go
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* build(deps): bump github/codeql-action from 2.22.8 to 3.22.12 (#13358)

Bumps [github/codeql-action](https://github.com/github/codeql-action) from 2.22.8 to 3.22.12.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](407ffafae6...012739e508)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* put @nexucis as a release shepherd (#13383)

Signed-off-by: Augustin Husson <augustin.husson@amadeus.com>

* Add analyze histograms command to promtool (#12331)

Add `query analyze` command to promtool

This command analyzes the buckets of classic and native histograms,
based on data queried from the Prometheus query API, i.e. it
doesn't require direct access to the TSDB files.

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>

---------

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>

* included instance in all necessary descriptions

Signed-off-by: Erik Sommer <ersotech@posteo.de>

* tsdb/compact: fix passing merge func

Fixing a very small logical problem I've introduced :(.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* tsdb: add enable overlapping compaction

This functionality is needed in downstream projects because they have a
separate component that does compaction.

Upstreaming
7c8e9a2a76/tsdb/compact.go (L323-L325).

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* Cut 2.49.0

Signed-off-by: bwplotka <bwplotka@gmail.com>

* promtool: allow setting multiple matchers to "promtool tsdb dump" command. (#13296)

Conditions are ANDed inside the same matcher but matchers are ORed
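
The AND/OR semantics can be sketched as follows. This is an illustration with equality-only conditions; `cond` and `matchesAny` are hypothetical names and not the promtool code, which works with full label matchers.

```go
package main

import "fmt"

// cond is a simplified equality condition; real selectors also
// support !=, =~ and !~.
type cond struct{ name, value string }

// matchesAny reports whether lbls satisfies at least one selector (OR),
// where a selector is satisfied only if all of its conditions hold (AND).
func matchesAny(lbls map[string]string, selectors [][]cond) bool {
	for _, sel := range selectors {
		all := true
		for _, c := range sel {
			if lbls[c.name] != c.value {
				all = false
				break
			}
		}
		if all {
			return true
		}
	}
	return false
}

func main() {
	lbls := map[string]string{"job": "api", "env": "prod"}
	sels := [][]cond{
		{{"job", "api"}, {"env", "dev"}}, // fails: env differs
		{{"env", "prod"}},                // matches
	}
	fmt.Println(matchesAny(lbls, sels))
}
```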

Including unit tests for "promtool tsdb dump".

Refactor some matchers scraping utils.

Signed-off-by: machine424 <ayoubmrini424@gmail.com>

* Fixed changelog

Signed-off-by: bwplotka <bwplotka@gmail.com>

* tsdb/main: wire "EnableOverlappingCompaction" to tsdb.Options (#13398)

This added the https://github.com/prometheus/prometheus/pull/13393
"EnableOverlappingCompaction" parameter to the compactor code but not to
the tsdb.Options. I forgot about that. Add it to `tsdb.Options` too and
set it to `true` in Prometheus.

Copy/paste the description from
https://github.com/prometheus/prometheus/pull/13393#issuecomment-1891787986

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* Issue #13268: fix quality value in accept header

Signed-off-by: Kumar Kalpadiptya Roy <kalpadiptya.roy@outlook.com>

* Cut 2.49.1 with scrape q= bugfix.

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Cut 2.49.1 web package.

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Restore more efficient version of NewPossibleNonCounterInfo annotation (#13022)

Restore more efficient version of NewPossibleNonCounterInfo annotation

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>

---------

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>

* Fix regressions introduced by #13242

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* fix slice copy in 1.20 (#13389)

The slices package is added to the standard library in Go 1.21;
we need to import from the exp area to maintain compatibility with Go 1.20.

Signed-off-by: tyltr <tylitianrui@126.com>

* Docs: Query Basics: link to rate (#10538)

Co-authored-by: Julien Pivotto <roidelapluie@o11y.eu>

* chore(kubernetes): check preconditions earlier and avoid unnecessary checks or iterations

Signed-off-by: machine424 <ayoubmrini424@gmail.com>

* Examples: link to `rate` for new users (#10535)

* Examples: link to `rate` for new users

Signed-off-by: Ted Robertson <10043369+tredondo@users.noreply.github.com>
Co-authored-by: Bryan Boreham <bjboreham@gmail.com>

* promql: use natural sort in sort_by_label and sort_by_label_desc (#13411)

These functions are intended for humans, as robots can already sort the results
however they please. Humans like things sorted "naturally":

* https://blog.codinghorror.com/sorting-for-humans-natural-sort-order/

A similar thing has been done to Grafana, which is also used by humans:

* https://github.com/grafana/grafana/pull/78024
* https://github.com/grafana/grafana/pull/78494
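
For readers unfamiliar with natural ordering, here is a minimal sketch of one way to compare strings so that digit runs are ordered by numeric value. `naturalLess` is a hypothetical helper written for this illustration, not the comparison function this change uses.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

func isDigit(c byte) bool { return c >= '0' && c <= '9' }

// naturalLess compares a and b byte by byte, except that runs of
// digits are compared as numbers (ignoring leading zeros).
func naturalLess(a, b string) bool {
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		if isDigit(a[i]) && isDigit(b[j]) {
			si, sj := i, j
			for i < len(a) && isDigit(a[i]) {
				i++
			}
			for j < len(b) && isDigit(b[j]) {
				j++
			}
			na := strings.TrimLeft(a[si:i], "0")
			nb := strings.TrimLeft(b[sj:j], "0")
			if len(na) != len(nb) {
				return len(na) < len(nb) // more digits means a larger number
			}
			if na != nb {
				return na < nb
			}
			continue
		}
		if a[i] != b[j] {
			return a[i] < b[j]
		}
		i++
		j++
	}
	// The string with nothing left sorts first.
	return len(a)-i < len(b)-j
}

func main() {
	vals := []string{"node10", "node2", "node1"}
	sort.Slice(vals, func(x, y int) bool { return naturalLess(vals[x], vals[y]) })
	fmt.Println(vals) // [node1 node2 node10]
}
```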

Signed-off-by: Ivan Babrou <github@ivan.computer>

* TestLabelValuesWithMatchers: Add test case

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* remove  obsolete build tag

Signed-off-by: tyltr <tylitianrui@126.com>

* Upgrade some golang dependencies for resty 2.11

Signed-off-by: Israel Blancas <iblancasa@gmail.com>

* Native Histograms: support `native_histogram_min_bucket_factor` in scrape_config (#13222)

Native Histograms: support native_histogram_min_bucket_factor in scrape_config

---------

Signed-off-by: Ziqi Zhao <zhaoziqi9146@gmail.com>
Signed-off-by: Björn Rabenstein <github@rabenste.in>
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Co-authored-by: Björn Rabenstein <github@rabenste.in>

* Add warnings for histogramRate applied with isCounter not matching counter/gauge histogram (#13392)

Add warnings for histogramRate applied with isCounter not matching counter/gauge histogram

---------

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>

* Minor fixes to otlp vendor update script

Signed-off-by: Goutham <gouthamve@gmail.com>

* build(deps): bump github.com/hetznercloud/hcloud-go/v2

Bumps [github.com/hetznercloud/hcloud-go/v2](https://github.com/hetznercloud/hcloud-go) from 2.4.0 to 2.6.0.
- [Release notes](https://github.com/hetznercloud/hcloud-go/releases)
- [Changelog](https://github.com/hetznercloud/hcloud-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/hetznercloud/hcloud-go/compare/v2.4.0...v2.6.0)

---
updated-dependencies:
- dependency-name: github.com/hetznercloud/hcloud-go/v2
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Enhanced visibility for `promtool test rules` with JSON colored formatting (#13342)

* Added diff flag for unit test to improve readability & debugging

Signed-off-by: Rewanth Tammana <22347290+rewanthtammana@users.noreply.github.com>

* Removed blank spaces

Signed-off-by: Rewanth Tammana <22347290+rewanthtammana@users.noreply.github.com>

* Fixed linting error

Signed-off-by: Rewanth Tammana <22347290+rewanthtammana@users.noreply.github.com>

* Added cli flags to documentation

Signed-off-by: Rewanth Tammana <22347290+rewanthtammana@users.noreply.github.com>

* Revert unrelated linting fixes

Signed-off-by: Rewanth Tammana <22347290+rewanthtammana@users.noreply.github.com>

* Fixed review suggestions

Signed-off-by: Rewanth Tammana <22347290+rewanthtammana@users.noreply.github.com>

* Cleanup

Signed-off-by: Rewanth Tammana <22347290+rewanthtammana@users.noreply.github.com>

* Updated flag description

Signed-off-by: Rewanth Tammana <22347290+rewanthtammana@users.noreply.github.com>

* Updated flag description

Signed-off-by: Rewanth Tammana <22347290+rewanthtammana@users.noreply.github.com>

---------

Signed-off-by: Rewanth Tammana <22347290+rewanthtammana@users.noreply.github.com>

* storage: skip merging when no remote storage configured

Prometheus is hard-coded to use a fanout storage between TSDB and
a remote storage which by default is empty.
This change detects the empty storage and skips merging between
result sets, which would make `Select()` sort results.

Bottom line: we skip a sort unless there really is some remote storage
configured.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Remove csmarchbanks from remote write owners (#13432)

I have not had the time to keep up with remote write and have no plans
to work on it in the near future so I am withdrawing my maintainership
of that part of the codebase. I continue to focus on client_python.

Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>

* add more context cancellation check at evaluation time

Signed-off-by: Ben Ye <benye@amazon.com>

* Optimize label values with matchers by taking shortcuts (#13426)

Don't calculate postings beforehand: we may not need them. If all
matchers are for the requested label, we can just filter its values.

Also, if there are no values at all, no need to run any kind of
logic.
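
A rough sketch of the two shortcuts, under simplifying assumptions: `matcher`, `labelValues` and the `viaPostings` callback are hypothetical stand-ins for the real tsdb types and the postings-based path.

```go
package main

import (
	"fmt"
	"strings"
)

// matcher is a simplified label matcher: a label name plus a predicate.
type matcher struct {
	name    string
	matches func(value string) bool
}

// labelValues returns the values of label name that satisfy ms.
// viaPostings stands in for the expensive postings-based path and is
// only invoked when some matcher targets a different label.
func labelValues(name string, values []string, ms []matcher, viaPostings func() []string) []string {
	if len(values) == 0 {
		return nil // nothing to filter, skip all further work
	}
	for _, m := range ms {
		if m.name != name {
			return viaPostings()
		}
	}
	// Every matcher is on the requested label: filter its values directly.
	out := make([]string, 0, len(values))
	for _, v := range values {
		keep := true
		for _, m := range ms {
			if !m.matches(v) {
				keep = false
				break
			}
		}
		if keep {
			out = append(out, v)
		}
	}
	return out
}

func main() {
	ms := []matcher{{name: "job", matches: func(v string) bool { return strings.HasPrefix(v, "api") }}}
	slow := func() []string { panic("postings path should not be needed here") }
	fmt.Println(labelValues("job", []string{"api-a", "db", "api-b"}, ms, slow))
}
```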

Also add more labelValuesWithMatchers benchmarks

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Add automatic memory limit handling

Enable automatic detection of memory limits and configure GOMEMLIMIT to
match.
* Also includes a flag to allow controlling the reserved ratio.
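
A sketch of the idea only: `limitFor` is a hypothetical helper, the 4 GiB figure is made up, and treating the ratio as the fraction of detected memory handed to the Go runtime is an assumption. Only `debug.SetMemoryLimit` is a real standard library call (Go 1.19+); detecting the limit itself is left out.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// limitFor derives a soft memory limit from a detected container or
// system limit. ratio is assumed to be the fraction given to the runtime.
func limitFor(detected uint64, ratio float64) int64 {
	return int64(float64(detected) * ratio)
}

func main() {
	detected := uint64(4 << 30) // pretend a 4 GiB limit was detected
	limit := limitFor(detected, 0.9)
	// Equivalent in effect to starting the process with GOMEMLIMIT set.
	debug.SetMemoryLimit(limit)
	fmt.Println("soft memory limit set to", limit, "bytes")
}
```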

Signed-off-by: SuperQ <superq@gmail.com>

* Update OSSF badge link (#13433)

Provide a more user friendly interface

Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>

* SD Managers taking over responsibility for registration of debug metrics (#13375)

SD Managers take over responsibility for SD metrics registration

---------

Signed-off-by: Paulin Todev <paulin.todev@gmail.com>
Signed-off-by: Björn Rabenstein <github@rabenste.in>
Co-authored-by: Björn Rabenstein <github@rabenste.in>

* Optimize histogram iterators (#13340)

Optimize histogram iterators

Histogram iterators allocate new objects in the AtHistogram and
AtFloatHistogram methods, which makes calculating rates over long
ranges expensive.

In #13215 we allowed an existing object to be reused
when converting an integer histogram to a float histogram. This commit follows
the same idea and allows injecting an existing object in the AtHistogram and
AtFloatHistogram methods. When the injected value is nil, iterators allocate
new histograms, otherwise they populate and return the injected object.

The commit also adds a CopyTo method to Histogram and FloatHistogram which
is used in the BufferedIterator to overwrite items in the ring instead of making
new copies.

Note that a specialized HPoint pool is needed for all of this to work 
(`matrixSelectorHPool`).
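
The nil-or-reuse convention described above can be sketched like this. `histogram`, `iterator` and `atHistogram` are simplified, hypothetical stand-ins; the real methods also return a timestamp and operate on the full histogram types.

```go
package main

import "fmt"

// histogram is a pared-down stand-in for a native histogram sample.
type histogram struct {
	count   uint64
	buckets []int64
}

// iterator holds the sample at the current position.
type iterator struct{ cur histogram }

// atHistogram returns the current sample. If dst is nil a new histogram
// is allocated; otherwise dst is overwritten in place and returned, so a
// caller looping over many samples can reuse a single object.
func (it *iterator) atHistogram(dst *histogram) *histogram {
	if dst == nil {
		dst = &histogram{}
	}
	dst.count = it.cur.count
	// Reuse dst's bucket capacity instead of allocating a new slice.
	dst.buckets = append(dst.buckets[:0], it.cur.buckets...)
	return dst
}

func main() {
	it := &iterator{cur: histogram{count: 3, buckets: []int64{1, 2}}}
	reused := &histogram{}
	h := it.atHistogram(reused)
	fmt.Println(h == reused, h.count, h.buckets)
}
```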

---------

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>

* doc: Mark `mad_over_time` as experimental (#13440)

We forgot to do that in
https://github.com/prometheus/prometheus/pull/13059

Signed-off-by: beorn7 <beorn@grafana.com>

* Change metric label for Puppetdb from 'http' to 'puppetdb'

Signed-off-by: Paulin Todev <paulin.todev@gmail.com>

* mirror metrics.proto change & generate code

Signed-off-by: Ziqi Zhao <zhaoziqi9146@gmail.com>

* TestHeadLabelValuesWithMatchers: Add test case (#13414)

Add test case to TestHeadLabelValuesWithMatchers, while fixing a couple
of typos in other test cases. Also enclosing some implicit sub-tests in a
`t.Run` call to make them explicitly sub-tests.

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* update all go dependencies (#13438)

Signed-off-by: Augustin Husson <husson.augustin@gmail.com>

* build(deps): bump the k8s-io group with 2 updates (#13454)

Bumps the k8s-io group with 2 updates: [k8s.io/api](https://github.com/kubernetes/api) and [k8s.io/client-go](https://github.com/kubernetes/client-go).


Updates `k8s.io/api` from 0.28.4 to 0.29.1
- [Commits](https://github.com/kubernetes/api/compare/v0.28.4...v0.29.1)

Updates `k8s.io/client-go` from 0.28.4 to 0.29.1
- [Changelog](https://github.com/kubernetes/client-go/blob/master/CHANGELOG.md)
- [Commits](https://github.com/kubernetes/client-go/compare/v0.28.4...v0.29.1)

---
updated-dependencies:
- dependency-name: k8s.io/api
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: k8s-io
- dependency-name: k8s.io/client-go
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: k8s-io
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* build(deps): bump the go-opentelemetry-io group with 1 update (#13453)

Bumps the go-opentelemetry-io group with 1 update: [go.opentelemetry.io/collector/semconv](https://github.com/open-telemetry/opentelemetry-collector).


Updates `go.opentelemetry.io/collector/semconv` from 0.92.0 to 0.93.0
- [Release notes](https://github.com/open-telemetry/opentelemetry-collector/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-collector/blob/main/CHANGELOG-API.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-collector/compare/v0.92.0...v0.93.0)

---
updated-dependencies:
- dependency-name: go.opentelemetry.io/collector/semconv
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: go-opentelemetry-io
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* build(deps): bump actions/upload-artifact from 3.1.3 to 4.0.0 (#13355)

Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 3.1.3 to 4.0.0.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](a8a3f3ad30...c7d193f32e)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* build(deps): bump bufbuild/buf-push-action (#13357)

Bumps [bufbuild/buf-push-action](https://github.com/bufbuild/buf-push-action) from 342fc4cdcf29115a01cf12a2c6dd6aac68dc51e1 to a654ff18effe4641ebea4a4ce242c49800728459.
- [Release notes](https://github.com/bufbuild/buf-push-action/releases)
- [Commits](342fc4cdcf...a654ff18ef)

---
updated-dependencies:
- dependency-name: bufbuild/buf-push-action
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Labels: Add DropMetricName function, used in PromQL (#13446)

This function is called very frequently when executing PromQL functions,
and we can do it much more efficiently inside Labels.

In the common case that `__name__` comes first in the labels, we simply
re-point to start at the next label, which is nearly free.

`DropMetricName` is now so cheap I removed the cache - benchmarks show
everything still goes faster.
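
To illustrate why the common case is nearly free, here is a sketch over a plain slice of labels. The `label` type and `dropMetricName` function are hypothetical; the real `Labels` type has different internal representations depending on build tags.

```go
package main

import "fmt"

// label is a simplified name/value pair.
type label struct{ name, value string }

// dropMetricName returns ls without the __name__ label. When __name__
// is the first label, the result is just a re-slice of the input and
// nothing is copied.
func dropMetricName(ls []label) []label {
	if len(ls) > 0 && ls[0].name == "__name__" {
		return ls[1:]
	}
	for i, l := range ls {
		if l.name == "__name__" {
			out := make([]label, 0, len(ls)-1)
			out = append(out, ls[:i]...)
			return append(out, ls[i+1:]...)
		}
	}
	return ls
}

func main() {
	ls := []label{{"__name__", "up"}, {"job", "api"}}
	fmt.Println(dropMetricName(ls))
}
```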

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* tsdb: simplify internal series delete function (#13261)

Lifting an optimisation from Agent code, `seriesHashmap.del` can use
the unique series reference, doesn't need to check Labels.
Also streamline the logic for deleting from `unique` and `conflicts` maps,
and add some comments to help the next person.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* otlptranslator/update-copy.sh: Fix sed command lines

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Rollback k8s.io requirements (#13462)

Rollback k8s.io Go modules to v0.28.6 to avoid forcing upgrade of Go to
1.21. This allows us to keep compatibility with the currently supported
upstream Go releases.

Signed-off-by: SuperQ <superq@gmail.com>

* Make update-copy.sh work for both OSX and GNU sed

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Name @beorn7 and @krajorama as maintainers for native histograms

I have been the de-facto maintainer for native histograms from the
beginning. So let's put this into MAINTAINERS.md.

In addition, I hereby propose George Krajcsovits AKA Krajo as a
co-maintainer. He has contributed a lot of native histogram code, but
more importantly, he has contributed substantially to reviewing other
contributors' native histogram code, up to a point where I was merely
rubberstamping the PRs he had already reviewed. I'm confident that he
is ready to be granted commit rights as outlined in the
"Maintainers" section of the governance:
https://prometheus.io/governance/#maintainers

According to the same section of the governance, I will announce the
proposed change on the developers mailing list and will give some time
for lazy consensus before merging this PR.

Signed-off-by: beorn7 <beorn@grafana.com>

* ui/fix: correct url handling for stacked graphs (#13460)

Signed-off-by: Yury Moladau <yurymolodov@gmail.com>

* tsdb: use cheaper Mutex on series

Mutex is 8 bytes; RWMutex is 24 bytes and much more complicated. Since
`RLock` is only used in two places, `UpdateMetadata` and `Delete`,
neither of which are hotspots, we should use the cheaper one.
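
The sizes can be checked with a few lines of Go. `mutexSizes` is a hypothetical helper, and the exact numbers depend on the Go version and platform, so no output is asserted here.

```go
package main

import (
	"fmt"
	"sync"
	"unsafe"
)

// mutexSizes reports the in-memory size of sync.Mutex and sync.RWMutex
// for the current platform and Go version.
func mutexSizes() (mu, rw uintptr) {
	var m sync.Mutex
	var r sync.RWMutex
	return unsafe.Sizeof(m), unsafe.Sizeof(r)
}

func main() {
	mu, rw := mutexSizes()
	fmt.Printf("Mutex: %d bytes, RWMutex: %d bytes\n", mu, rw)
}
```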

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Fix last_over_time for native histograms

The last_over_time retains a histogram sample without making a copy.
This sample is now coming from the buffered iterator used for windowing functions,
and can be reused for reading subsequent samples as the iterator progresses.

I would propose copying the sample in the last_over_time function, similar to
how it is done for rate, sum_over_time and others.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Implementation

NOTE:
Rebased from main after refactor in #13014

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>

* Add feature flag

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>

* Refactor concurrency control

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>

* Optimising dependencies/dependents funcs to not produce new slices each request

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>

* Refactoring

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>

* Rename flag

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>

* Refactoring for performance, and to allow controller to be overridden

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>

* Block until all rules, both sync & async, have completed evaluating
Updated & added tests
Review feedback nits
Return empty map if not indeterminate
Use highWatermark to track inflight requests counter
Appease the linter
Clarify feature flag

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>

* Fix typo in CLI flag description

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed auto-generated doc

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Improve doc

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Simplify the design to update concurrency controller once the rule evaluation has done

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Add more test cases to TestDependenciesEdgeCases

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added more test cases to TestDependenciesEdgeCases

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Improved RuleConcurrencyController interface doc

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Introduced sequentialRuleEvalController

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Remove superfluous nil check in Group.metrics

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* api: Serialize discovered and target labels into JSON directly (#13469)

Converted maps into labels.Labels to avoid a lot of copying of data which leads to very high memory consumption while opening the /service-discovery endpoint in the Prometheus UI

Signed-off-by: Leegin <114397475+Leegin-darknight@users.noreply.github.com>

* api: Serialize discovered labels into JSON directly in dropped targets (#13484)

Converted maps into labels.Labels to avoid a lot of copying of data which leads to very high memory consumption while opening the /service-discovery endpoint in the Prometheus UI

Signed-off-by: Leegin <114397475+Leegin-darknight@users.noreply.github.com>

* Add ShardedPostings() support to TSDB (#10421)

This PR is a reference implementation of the proposal described in #10420.

In addition to what is described in #10420, in this PR I've introduced labels.StableHash(). The idea is to offer a hashing function which doesn't change over time, and that's used by query sharding in order to get stable behaviour over time. The implementation of labels.StableHash() is the hashing function used by Prometheus before stringlabels, and what's used by Grafana Mimir for query sharding (because it was built before stringlabels was a thing).
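
A sketch of how a stable hash can partition series into shards. Everything here is illustrative: `stableHash` uses FNV-1a from the standard library as a stand-in and is not the algorithm behind labels.StableHash(), `inShard` is a hypothetical helper, and the modulo rule is an assumption about how shard membership is decided.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// stableHash is a stand-in: any hash whose output never changes across
// releases would do for sharding purposes.
func stableHash(series string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(series))
	return h.Sum64()
}

// inShard reports whether series belongs to shardIndex out of shardCount.
// Because the hash is stable, a series always lands in the same shard.
func inShard(series string, shardIndex, shardCount uint64) bool {
	return stableHash(series)%shardCount == shardIndex
}

func main() {
	for _, s := range []string{`up{job="api"}`, `up{job="db"}`} {
		for i := uint64(0); i < 4; i++ {
			if inShard(s, i, 4) {
				fmt.Printf("%s -> shard %d of 4\n", s, i)
			}
		}
	}
}
```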

Follow up work
As mentioned in #10420, if this PR is accepted I'm also open to uploading another fundamental piece used by Grafana Mimir query sharding to accelerate the query execution: an optional, configurable and fast in-memory cache for the series hashes.

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* storage/remote: document why two benchmarks are skipped

One was silently doing nothing; one was doing something but the work
didn't go up linearly with iteration count.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Pod status changes not discovered by Kube Endpoints SD (#13337)

* fix(discovery/kubernetes/endpoints): react to changes on Pods because some modifications can occur on them without triggering an update on the related Endpoints (The Pod phase changing from Pending to Running e.g.).

---------

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
Co-authored-by: Guillermo Sanchez Gavier <gsanchez@newrelic.com>

* Small improvements, add const, remove copypasta (#8106)

Signed-off-by: Mikhail Fesenko <proggga@gmail.com>
Signed-off-by: Jesus Vazquez <jesusvzpg@gmail.com>

* Proposal to improve FPointSlice and HPointSlice allocation. (#13448)

* Reusing points slice from previous series when the slice is under utilized
* Adding comments on the bench test

Signed-off-by: Alan Protasio <alanprot@gmail.com>

* lint

Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com>

* go mod tidy

Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com>

---------

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Signed-off-by: Erik Sommer <ersotech@posteo.de>
Signed-off-by: Linas Medziunas <linas.medziunas@gmail.com>
Signed-off-by: bwplotka <bwplotka@gmail.com>
Signed-off-by: Arianna Vespri <arianna.vespri@yahoo.it>
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
Signed-off-by: Daniel Nicholls <daniel.nicholls@resdiary.com>
Signed-off-by: Daniel Kerbel <nmdanny@gmail.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
Signed-off-by: Kevin Mingtarja <kevin.mingtarja@gmail.com>
Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>
Signed-off-by: Marc Tuduri <marctc@protonmail.com>
Signed-off-by: Paschalis Tsilias <tpaschalis@users.noreply.github.com>
Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
Signed-off-by: Augustin Husson <augustin.husson@amadeus.com>
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Kumar Kalpadiptya Roy <kalpadiptya.roy@outlook.com>
Signed-off-by: Marco Pracucci <marco@pracucci.com>
Signed-off-by: tyltr <tylitianrui@126.com>
Signed-off-by: Ted Robertson <10043369+tredondo@users.noreply.github.com>
Signed-off-by: Ivan Babrou <github@ivan.computer>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Signed-off-by: Israel Blancas <iblancasa@gmail.com>
Signed-off-by: Ziqi Zhao <zhaoziqi9146@gmail.com>
Signed-off-by: Björn Rabenstein <github@rabenste.in>
Signed-off-by: Goutham <gouthamve@gmail.com>
Signed-off-by: Rewanth Tammana <22347290+rewanthtammana@users.noreply.github.com>
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
Signed-off-by: Ben Ye <benye@amazon.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: SuperQ <superq@gmail.com>
Signed-off-by: Ben Kochie <superq@gmail.com>
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
Signed-off-by: Paulin Todev <paulin.todev@gmail.com>
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
Signed-off-by: beorn7 <beorn@grafana.com>
Signed-off-by: Augustin Husson <husson.augustin@gmail.com>
Signed-off-by: Yury Moladau <yurymolodov@gmail.com>
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Signed-off-by: Leegin <114397475+Leegin-darknight@users.noreply.github.com>
Signed-off-by: Mikhail Fesenko <proggga@gmail.com>
Signed-off-by: Jesus Vazquez <jesusvzpg@gmail.com>
Signed-off-by: Alan Protasio <alanprot@gmail.com>
Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com>
Co-authored-by: Julian Wiedmann <jwi@linux.ibm.com>
Co-authored-by: Bryan Boreham <bjboreham@gmail.com>
Co-authored-by: Erik Sommer <ersotech@posteo.de>
Co-authored-by: Linas Medziunas <linas.medziunas@gmail.com>
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Co-authored-by: Arianna Vespri <arianna.vespri@yahoo.it>
Co-authored-by: machine424 <ayoubmrini424@gmail.com>
Co-authored-by: daniel-resdiary <109083091+daniel-resdiary@users.noreply.github.com>
Co-authored-by: Daniel Kerbel <nmdanny@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jan Fajerski <jfajersk@redhat.com>
Co-authored-by: Kevin Mingtarja <kevin.mingtarja@gmail.com>
Co-authored-by: Paschalis Tsilias <tpaschalis@users.noreply.github.com>
Co-authored-by: Marc Tudurí <marctc@protonmail.com>
Co-authored-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>
Co-authored-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
Co-authored-by: Augustin Husson <husson.augustin@gmail.com>
Co-authored-by: Björn Rabenstein <beorn@grafana.com>
Co-authored-by: zenador <zenador@users.noreply.github.com>
Co-authored-by: gotjosh <josue.abreu@gmail.com>
Co-authored-by: Ben Kochie <superq@gmail.com>
Co-authored-by: Kumar Kalpadiptya Roy <kalpadiptya.roy@outlook.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>
Co-authored-by: tyltr <tylitianrui@126.com>
Co-authored-by: Ted Robertson <10043369+tredondo@users.noreply.github.com>
Co-authored-by: Julien Pivotto <roidelapluie@o11y.eu>
Co-authored-by: Matthias Loibl <mail@matthiasloibl.com>
Co-authored-by: Ivan Babrou <github@ivan.computer>
Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Israel Blancas <iblancasa@gmail.com>
Co-authored-by: Ziqi Zhao <zhaoziqi9146@gmail.com>
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Co-authored-by: Björn Rabenstein <github@rabenste.in>
Co-authored-by: Goutham <gouthamve@gmail.com>
Co-authored-by: Rewanth Tammana <22347290+rewanthtammana@users.noreply.github.com>
Co-authored-by: Chris Marchbanks <csmarchbanks@gmail.com>
Co-authored-by: Ben Ye <benye@amazon.com>
Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com>
Co-authored-by: Matthieu MOREL <matthieu.morel35@gmail.com>
Co-authored-by: Paulin Todev <paulin.todev@gmail.com>
Co-authored-by: Filip Petkovski <filip.petkovsky@gmail.com>
Co-authored-by: Yury Molodov <yurymolodov@gmail.com>
Co-authored-by: Danny Kopping <danny.kopping@grafana.com>
Co-authored-by: Leegin <114397475+Leegin-darknight@users.noreply.github.com>
Co-authored-by: Guillermo Sanchez Gavier <gsanchez@newrelic.com>
Co-authored-by: Mikhail Fesenko <proggga@gmail.com>
Co-authored-by: Alan Protasio <alanprot@gmail.com>
2024-02-02 10:38:50 -08:00


// Copyright 2017 The Prometheus Authors
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package tsdb
import (
"context"
"errors"
"fmt"
"math"
"strings"
"unicode/utf8"
"github.com/oklog/ulid"
"golang.org/x/exp/slices"
"github.com/prometheus/prometheus/model/histogram"
"github.com/prometheus/prometheus/model/labels"
"github.com/prometheus/prometheus/storage"
"github.com/prometheus/prometheus/tsdb/chunkenc"
"github.com/prometheus/prometheus/tsdb/chunks"
tsdb_errors "github.com/prometheus/prometheus/tsdb/errors"
"github.com/prometheus/prometheus/tsdb/index"
"github.com/prometheus/prometheus/tsdb/tombstones"
"github.com/prometheus/prometheus/util/annotations"
)
// Bitmap used by func isRegexMetaCharacter to check whether a character needs to be escaped.
var regexMetaCharacterBytes [16]byte
// isRegexMetaCharacter reports whether byte b needs to be escaped.
func isRegexMetaCharacter(b byte) bool {
return b < utf8.RuneSelf && regexMetaCharacterBytes[b%16]&(1<<(b/16)) != 0
}
func init() {
for _, b := range []byte(`.+*?()|[]{}^$`) {
regexMetaCharacterBytes[b%16] |= 1 << (b / 16)
}
}
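The bitmap lookup above can be tried in isolation. The following is a minimal standalone sketch of the same technique; the names `metaBits` and `isMeta` are illustrative and not part of the package. Each ASCII byte `b` maps to row `b%16` and bit `b/16`, which is unique for every byte below 128.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// metaBits is a 16x8 bitmap covering the ASCII range:
// row b%16, bit b/16 is set when b is a regex metacharacter.
var metaBits [16]byte

func init() {
	for _, b := range []byte(`.+*?()|[]{}^$`) {
		metaBits[b%16] |= 1 << (b / 16)
	}
}

// isMeta reports whether b is one of the metacharacters registered in init.
// Bytes >= utf8.RuneSelf (non-ASCII) are never metacharacters.
func isMeta(b byte) bool {
	return b < utf8.RuneSelf && metaBits[b%16]&(1<<(b/16)) != 0
}

func main() {
	for _, b := range []byte(`a.|\`) {
		fmt.Printf("%q -> %v\n", b, isMeta(b))
	}
}
```

Note that the backslash is not in the set, which is why findSetMatches checks for it separately.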
type blockBaseQuerier struct {
blockID ulid.ULID
index IndexReader
chunks ChunkReader
tombstones tombstones.Reader
closed bool
mint, maxt int64
}
func newBlockBaseQuerier(b BlockReader, mint, maxt int64) (*blockBaseQuerier, error) {
indexr, err := b.Index()
if err != nil {
return nil, fmt.Errorf("open index reader: %w", err)
}
chunkr, err := b.Chunks()
if err != nil {
indexr.Close()
return nil, fmt.Errorf("open chunk reader: %w", err)
}
tombsr, err := b.Tombstones()
if err != nil {
indexr.Close()
chunkr.Close()
return nil, fmt.Errorf("open tombstone reader: %w", err)
}
if tombsr == nil {
tombsr = tombstones.NewMemTombstones()
}
return &blockBaseQuerier{
blockID: b.Meta().ULID,
mint: mint,
maxt: maxt,
index: indexr,
chunks: chunkr,
tombstones: tombsr,
}, nil
}
func (q *blockBaseQuerier) LabelValues(ctx context.Context, name string, matchers ...*labels.Matcher) ([]string, annotations.Annotations, error) {
res, err := q.index.SortedLabelValues(ctx, name, matchers...)
return res, nil, err
}
func (q *blockBaseQuerier) LabelNames(ctx context.Context, matchers ...*labels.Matcher) ([]string, annotations.Annotations, error) {
res, err := q.index.LabelNames(ctx, matchers...)
return res, nil, err
}
func (q *blockBaseQuerier) Close() error {
if q.closed {
return errors.New("block querier already closed")
}
errs := tsdb_errors.NewMulti(
q.index.Close(),
q.chunks.Close(),
q.tombstones.Close(),
)
q.closed = true
return errs.Err()
}
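The close-once pattern used by Close can be sketched without the TSDB types. This example assumes the standard library's `errors.Join` (Go 1.20+) as a stand-in for `tsdb_errors.NewMulti`; `multiCloser`, `errBoom`, and `failing` are hypothetical names used only for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

var errBoom = errors.New("boom")

// failing is a closer that always reports an error.
func failing() error { return errBoom }

// multiCloser closes all of its parts exactly once and reports every failure.
type multiCloser struct {
	closers []func() error
	closed  bool
}

func (m *multiCloser) Close() error {
	if m.closed {
		return errors.New("already closed")
	}
	m.closed = true
	var errs []error
	for _, c := range m.closers {
		// Keep closing the remaining parts even if one of them fails.
		errs = append(errs, c())
	}
	// errors.Join drops nil entries and returns nil if nothing failed.
	return errors.Join(errs...)
}

func main() {
	m := &multiCloser{closers: []func() error{failing, func() error { return nil }}}
	fmt.Println("first close:", m.Close())
	fmt.Println("second close:", m.Close())
}
```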
type blockQuerier struct {
*blockBaseQuerier
}
// NewBlockQuerier returns a querier against the block reader and requested min and max time range.
func NewBlockQuerier(b BlockReader, mint, maxt int64) (storage.Querier, error) {
q, err := newBlockBaseQuerier(b, mint, maxt)
if err != nil {
return nil, err
}
return &blockQuerier{blockBaseQuerier: q}, nil
}
func (q *blockQuerier) Select(ctx context.Context, sortSeries bool, hints *storage.SelectHints, ms ...*labels.Matcher) storage.SeriesSet {
mint := q.mint
maxt := q.maxt
disableTrimming := false
sharded := hints != nil && hints.ShardCount > 0
p, err := PostingsForMatchers(ctx, q.index, ms...)
if err != nil {
return storage.ErrSeriesSet(err)
}
if sharded {
p = q.index.ShardedPostings(p, hints.ShardIndex, hints.ShardCount)
}
if sortSeries {
p = q.index.SortedPostings(p)
}
if hints != nil {
mint = hints.Start
maxt = hints.End
disableTrimming = hints.DisableTrimming
if hints.Func == "series" {
// When you're only looking up metadata (for example series API), you don't need to load any chunks.
return newBlockSeriesSet(q.index, newNopChunkReader(), q.tombstones, p, mint, maxt, disableTrimming)
}
}
return newBlockSeriesSet(q.index, q.chunks, q.tombstones, p, mint, maxt, disableTrimming)
}
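The sharding step in Select keeps only the series that belong to shard `hints.ShardIndex` out of `hints.ShardCount`. A rough sketch of the idea follows, assuming series are assigned by hash modulo shard count. FNV-1a from the standard library is only a stand-in for the stable label hash, and `shardOf` and `filterShard` are hypothetical helpers, not the index reader's API.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardOf maps a series identity to a shard number in [0, shardCount).
// The hash must not change over time, otherwise series would move between shards.
func shardOf(seriesKey string, shardCount uint64) uint64 {
	h := fnv.New64a()
	h.Write([]byte(seriesKey))
	return h.Sum64() % shardCount
}

// filterShard keeps the series whose hash falls into shardIndex.
func filterShard(series []string, shardIndex, shardCount uint64) []string {
	var out []string
	for _, s := range series {
		if shardOf(s, shardCount) == shardIndex {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	series := []string{`up{job="a"}`, `up{job="b"}`, `up{job="c"}`, `up{job="d"}`}
	for i := uint64(0); i < 2; i++ {
		fmt.Printf("shard %d: %v\n", i, filterShard(series, i, 2))
	}
}
```

Every series lands in exactly one shard, so running the same query once per shard index covers the full result without overlap.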
// blockChunkQuerier provides chunk querying access to a single block database.
type blockChunkQuerier struct {
*blockBaseQuerier
}
// NewBlockChunkQuerier returns a chunk querier against the block reader and requested min and max time range.
func NewBlockChunkQuerier(b BlockReader, mint, maxt int64) (storage.ChunkQuerier, error) {
q, err := newBlockBaseQuerier(b, mint, maxt)
if err != nil {
return nil, err
}
return &blockChunkQuerier{blockBaseQuerier: q}, nil
}
func (q *blockChunkQuerier) Select(ctx context.Context, sortSeries bool, hints *storage.SelectHints, ms ...*labels.Matcher) storage.ChunkSeriesSet {
mint := q.mint
maxt := q.maxt
disableTrimming := false
sharded := hints != nil && hints.ShardCount > 0
if hints != nil {
mint = hints.Start
maxt = hints.End
disableTrimming = hints.DisableTrimming
}
p, err := PostingsForMatchers(ctx, q.index, ms...)
if err != nil {
return storage.ErrChunkSeriesSet(err)
}
if sharded {
p = q.index.ShardedPostings(p, hints.ShardIndex, hints.ShardCount)
}
if sortSeries {
p = q.index.SortedPostings(p)
}
return NewBlockChunkSeriesSet(q.blockID, q.index, q.chunks, q.tombstones, p, mint, maxt, disableTrimming)
}
func findSetMatches(pattern string) []string {
// Return empty matches if the wrapper from Prometheus is missing.
if len(pattern) < 6 || pattern[:4] != "^(?:" || pattern[len(pattern)-2:] != ")$" {
return nil
}
escaped := false
sets := []*strings.Builder{{}}
init := 4
end := len(pattern) - 2
// If the regex is wrapped in a group we can remove the first and last parentheses
if pattern[init] == '(' && pattern[end-1] == ')' {
init++
end--
}
for i := init; i < end; i++ {
if escaped {
switch {
case isRegexMetaCharacter(pattern[i]):
sets[len(sets)-1].WriteByte(pattern[i])
case pattern[i] == '\\':
sets[len(sets)-1].WriteByte('\\')
default:
return nil
}
escaped = false
} else {
switch {
case isRegexMetaCharacter(pattern[i]):
if pattern[i] == '|' {
sets = append(sets, &strings.Builder{})
} else {
return nil
}
case pattern[i] == '\\':
escaped = true
default:
sets[len(sets)-1].WriteByte(pattern[i])
}
}
}
matches := make([]string, 0, len(sets))
for _, s := range sets {
if s.Len() > 0 {
matches = append(matches, s.String())
}
}
return matches
}
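To make the behaviour of findSetMatches easier to see, here is a simplified standalone restatement of the idea. It is a sketch rather than a copy: `setMatches` is a hypothetical name, it does not strip an extra wrapping group, and it returns nil instead of an empty slice when no alternative is found.

```go
package main

import (
	"fmt"
	"strings"
)

const metaChars = `.+*?()|[]{}^$`

// setMatches returns the literal alternatives of a pattern of the form
// ^(?:a|b|c)$, or nil if the pattern is anything other than a plain alternation.
func setMatches(pattern string) []string {
	if len(pattern) < 6 || !strings.HasPrefix(pattern, "^(?:") || !strings.HasSuffix(pattern, ")$") {
		return nil
	}
	body := pattern[4 : len(pattern)-2]
	var (
		out     []string
		cur     strings.Builder
		escaped bool
	)
	flush := func() {
		if cur.Len() > 0 {
			out = append(out, cur.String())
		}
		cur.Reset()
	}
	for i := 0; i < len(body); i++ {
		c := body[i]
		isMeta := strings.IndexByte(metaChars, c) >= 0
		switch {
		case escaped:
			if !isMeta && c != '\\' {
				return nil // Escapes such as \d are character classes, not literals.
			}
			cur.WriteByte(c)
			escaped = false
		case c == '\\':
			escaped = true
		case c == '|':
			flush()
		case isMeta:
			return nil // Any other unescaped metacharacter disqualifies the pattern.
		default:
			cur.WriteByte(c)
		}
	}
	flush()
	return out
}

func main() {
	fmt.Println(setMatches("^(?:foo|bar)$"))
	fmt.Println(setMatches(`^(?:a\.b|c)$`))
	fmt.Println(setMatches("^(?:fo.o)$"))
}
```

The payoff is that a matcher such as `job=~"foo|bar"` can be answered with direct postings lookups for two values instead of testing the regex against every label value.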
// PostingsForMatchers assembles a single postings iterator against the index reader
// based on the given matchers. The resulting postings are not ordered by series.
func PostingsForMatchers(ctx context.Context, ix IndexReader, ms ...*labels.Matcher) (index.Postings, error) {
var its, notIts []index.Postings
// See which label must be non-empty.
// Optimization for cases like {l=~".", l!="1"}.
labelMustBeSet := make(map[string]bool, len(ms))
for _, m := range ms {
if !m.Matches("") {
labelMustBeSet[m.Name] = true
}
}
isSubtractingMatcher := func(m *labels.Matcher) bool {
if !labelMustBeSet[m.Name] {
return true
}
return (m.Type == labels.MatchNotEqual || m.Type == labels.MatchNotRegexp) && m.Matches("")
}
hasSubtractingMatchers, hasIntersectingMatchers := false, false
for _, m := range ms {
if isSubtractingMatcher(m) {
hasSubtractingMatchers = true
} else {
hasIntersectingMatchers = true
}
}
if hasSubtractingMatchers && !hasIntersectingMatchers {
// If there's nothing to subtract from, add in everything and remove the notIts later.
// We prefer to get AllPostings so that the base of subtraction (i.e. allPostings)
// doesn't include series that may be added to the index reader during this function call.
k, v := index.AllPostingsKey()
allPostings, err := ix.Postings(ctx, k, v)
if err != nil {
return nil, err
}
its = append(its, allPostings)
}
// Sort matchers to have the intersecting matchers first.
// This way the base for subtraction is smaller and
// there is no chance that the set we subtract from
// contains postings of series that didn't exist when
// we constructed the set we subtract by.
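slices.SortStableFunc(ms, func(i, j *labels.Matcher) int {
// Matchers of the same kind compare equal, so the stable sort keeps their order.
si, sj := isSubtractingMatcher(i), isSubtractingMatcher(j)
switch {
case !si && sj:
return -1
case si && !sj:
return +1
default:
return 0
}
})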
for _, m := range ms {
if ctx.Err() != nil {
return nil, ctx.Err()
}
switch {
case m.Name == "" && m.Value == "": // Special-case for AllPostings, used in tests at least.
k, v := index.AllPostingsKey()
allPostings, err := ix.Postings(ctx, k, v)
if err != nil {
return nil, err
}
its = append(its, allPostings)
case labelMustBeSet[m.Name]:
// If this matcher must be non-empty, we can be smarter.
matchesEmpty := m.Matches("")
isNot := m.Type == labels.MatchNotEqual || m.Type == labels.MatchNotRegexp
switch {
case isNot && matchesEmpty: // l!="foo"
// If the label can't be empty and is a Not and the inner matcher
// doesn't match empty, then subtract it out at the end.
inverse, err := m.Inverse()
if err != nil {
return nil, err
}
it, err := postingsForMatcher(ctx, ix, inverse)
if err != nil {
return nil, err
}
notIts = append(notIts, it)
case isNot && !matchesEmpty: // l!=""
// If the label can't be empty and is a Not, but the inner matcher can
// be empty we need to use inversePostingsForMatcher.
inverse, err := m.Inverse()
if err != nil {
return nil, err
}
it, err := inversePostingsForMatcher(ctx, ix, inverse)
if err != nil {
return nil, err
}
if index.IsEmptyPostingsType(it) {
return index.EmptyPostings(), nil
}
its = append(its, it)
default: // l="a"
// Non-Not matcher, use normal postingsForMatcher.
it, err := postingsForMatcher(ctx, ix, m)
if err != nil {
return nil, err
}
if index.IsEmptyPostingsType(it) {
return index.EmptyPostings(), nil
}
its = append(its, it)
}
default: // l=""
// If a matcher for a label name selects an empty value, it also selects all
// the series which don't have the label name set. See:
// https://github.com/prometheus/prometheus/issues/3575 and
// https://github.com/prometheus/prometheus/pull/3578#issuecomment-351653555
it, err := inversePostingsForMatcher(ctx, ix, m)
if err != nil {
return nil, err
}
notIts = append(notIts, it)
}
}
it := index.Intersect(its...)
for _, n := range notIts {
it = index.Without(it, n)
}
return it, nil
}
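The final step of PostingsForMatchers, intersecting the positive sets and then subtracting the negative ones, can be illustrated on plain sorted slices. This is a sketch under the assumption that postings are sorted lists of series references; `intersect` and `without` are hypothetical helpers that mirror the roles of `index.Intersect` and `index.Without`, not their implementations.

```go
package main

import "fmt"

// intersect returns the refs present in both sorted inputs.
func intersect(a, b []uint64) []uint64 {
	var out []uint64
	for i, j := 0, 0; i < len(a) && j < len(b); {
		switch {
		case a[i] < b[j]:
			i++
		case a[i] > b[j]:
			j++
		default:
			out = append(out, a[i])
			i++
			j++
		}
	}
	return out
}

// without returns the refs of full that are absent from drop (both sorted).
func without(full, drop []uint64) []uint64 {
	var out []uint64
	j := 0
	for _, ref := range full {
		for j < len(drop) && drop[j] < ref {
			j++
		}
		if j < len(drop) && drop[j] == ref {
			continue
		}
		out = append(out, ref)
	}
	return out
}

func main() {
	job := []uint64{1, 2, 3, 5, 8} // series with job="api"
	env := []uint64{2, 3, 5, 7}    // series with env="prod"
	code := []uint64{3, 9}         // series with code="500"

	// {job="api", env="prod", code!="500"}
	fmt.Println(without(intersect(job, env), code)) // [2 5]
}
```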
func postingsForMatcher(ctx context.Context, ix IndexReader, m *labels.Matcher) (index.Postings, error) {
// This method will not return postings for missing labels.
// Fast-path for equal matching.
if m.Type == labels.MatchEqual {
return ix.Postings(ctx, m.Name, m.Value)
}
// Fast-path for set matching.
if m.Type == labels.MatchRegexp {
setMatches := findSetMatches(m.GetRegexString())
if len(setMatches) > 0 {
return ix.Postings(ctx, m.Name, setMatches...)
}
}
vals, err := ix.LabelValues(ctx, m.Name)
if err != nil {
return nil, err
}
var res []string
for _, val := range vals {
if m.Matches(val) {
res = append(res, val)
}
}
if len(res) == 0 {
return index.EmptyPostings(), nil
}
return ix.Postings(ctx, m.Name, res...)
}
// inversePostingsForMatcher returns the postings for the series with the label name set but not matching the matcher.
func inversePostingsForMatcher(ctx context.Context, ix IndexReader, m *labels.Matcher) (index.Postings, error) {
// Fast-path for MatchNotRegexp matching.
// Inverse of a MatchNotRegexp is MatchRegexp (double negation).
// Fast-path for set matching.
if m.Type == labels.MatchNotRegexp {
setMatches := findSetMatches(m.GetRegexString())
if len(setMatches) > 0 {
return ix.Postings(ctx, m.Name, setMatches...)
}
}
// Fast-path for MatchNotEqual matching.
// Inverse of a MatchNotEqual is MatchEqual (double negation).
if m.Type == labels.MatchNotEqual {
return ix.Postings(ctx, m.Name, m.Value)
}
vals, err := ix.LabelValues(ctx, m.Name)
if err != nil {
return nil, err
}
var res []string
// If the inverse match is ="", we just want all the values.
if m.Type == labels.MatchEqual && m.Value == "" {
res = vals
} else {
for _, val := range vals {
if !m.Matches(val) {
res = append(res, val)
}
}
}
return ix.Postings(ctx, m.Name, res...)
}
func labelValuesWithMatchers(ctx context.Context, r IndexReader, name string, matchers ...*labels.Matcher) ([]string, error) {
allValues, err := r.LabelValues(ctx, name)
if err != nil {
return nil, fmt.Errorf("fetching values of label %s: %w", name, err)
}
// If we have a matcher for the label name, we can filter out values that don't match
// before we fetch postings. This is especially useful for labels with many values.
// e.g. __name__ with a selector like {__name__="xyz"}
hasMatchersForOtherLabels := false
for _, m := range matchers {
if m.Name != name {
hasMatchersForOtherLabels = true
continue
}
// re-use the allValues slice to avoid allocations
// this is safe because the iteration is always ahead of the append
filteredValues := allValues[:0]
for _, v := range allValues {
if m.Matches(v) {
filteredValues = append(filteredValues, v)
}
}
allValues = filteredValues
}
if len(allValues) == 0 {
return nil, nil
}
// If we don't have any matchers for other labels, then we're done.
if !hasMatchersForOtherLabels {
return allValues, nil
}
p, err := PostingsForMatchers(ctx, r, matchers...)
if err != nil {
return nil, fmt.Errorf("fetching postings for matchers: %w", err)
}
valuesPostings := make([]index.Postings, len(allValues))
for i, value := range allValues {
valuesPostings[i], err = r.Postings(ctx, name, value)
if err != nil {
return nil, fmt.Errorf("fetching postings for %s=%q: %w", name, value, err)
}
}
indexes, err := index.FindIntersectingPostings(p, valuesPostings)
if err != nil {
return nil, fmt.Errorf("intersecting postings: %w", err)
}
values := make([]string, 0, len(indexes))
for _, idx := range indexes {
values = append(values, allValues[idx])
}
return values, nil
}
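The value filtering in labelValuesWithMatchers reuses the backing array of the input slice. A minimal sketch of that idiom follows; `filterInPlace` is a hypothetical helper. It is safe because the write position never overtakes the read position, but the caller must not use the original slice afterwards.

```go
package main

import (
	"fmt"
	"strings"
)

// filterInPlace keeps the values accepted by keep, reusing the backing
// array of values instead of allocating a new slice.
func filterInPlace(values []string, keep func(string) bool) []string {
	filtered := values[:0]
	for _, v := range values {
		if keep(v) {
			filtered = append(filtered, v)
		}
	}
	return filtered
}

func main() {
	names := []string{"http_requests_total", "up", "http_errors_total", "go_goroutines"}
	httpOnly := filterInPlace(names, func(s string) bool {
		return strings.HasPrefix(s, "http_")
	})
	fmt.Println(httpOnly)
}
```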
func labelNamesWithMatchers(ctx context.Context, r IndexReader, matchers ...*labels.Matcher) ([]string, error) {
p, err := PostingsForMatchers(ctx, r, matchers...)
if err != nil {
return nil, err
}
var postings []storage.SeriesRef
for p.Next() {
postings = append(postings, p.At())
}
if err := p.Err(); err != nil {
return nil, fmt.Errorf("postings for label names with matchers: %w", err)
}
return r.LabelNamesFor(ctx, postings...)
}
// seriesData, used inside other iterators, are updated when we move from one series to another.
type seriesData struct {
chks []chunks.Meta
intervals tombstones.Intervals
labels labels.Labels
}
// Labels implements part of storage.Series and storage.ChunkSeries.
func (s *seriesData) Labels() labels.Labels { return s.labels }
// blockBaseSeriesSet allows iterating over all series in a single block.
// Iterated series are trimmed with given min and max time as well as tombstones.
// See newBlockSeriesSet and NewBlockChunkSeriesSet to use it for either sample or chunk iterating.
type blockBaseSeriesSet struct {
blockID ulid.ULID
p index.Postings
index IndexReader
chunks ChunkReader
tombstones tombstones.Reader
mint, maxt int64
disableTrimming bool
curr seriesData
bufChks []chunks.Meta
builder labels.ScratchBuilder
err error
}
func (b *blockBaseSeriesSet) Next() bool {
for b.p.Next() {
if err := b.index.Series(b.p.At(), &b.builder, &b.bufChks); err != nil {
// Postings may be stale. Skip if no underlying series exists.
if errors.Is(err, storage.ErrNotFound) {
continue
}
b.err = fmt.Errorf("get series %d: %w", b.p.At(), err)
return false
}
if len(b.bufChks) == 0 {
continue
}
intervals, err := b.tombstones.Get(b.p.At())
if err != nil {
b.err = fmt.Errorf("get tombstones: %w", err)
return false
}
// NOTE:
// * block time range is half-open: [meta.MinTime, meta.MaxTime).
// * chunks are both closed: [chk.MinTime, chk.MaxTime].
// * requested time ranges are closed: [req.Start, req.End].
var trimFront, trimBack bool
// Copy chunks as iterables are reusable.
// Count those in range to size allocation (roughly - ignoring tombstones).
nChks := 0
for _, chk := range b.bufChks {
if !(chk.MaxTime < b.mint || chk.MinTime > b.maxt) {
nChks++
}
}
chks := make([]chunks.Meta, 0, nChks)
// Prefilter chunks and pick those which are not entirely deleted or totally outside of the requested range.
for _, chk := range b.bufChks {
if chk.MaxTime < b.mint {
continue
}
if chk.MinTime > b.maxt {
continue
}
if (tombstones.Interval{Mint: chk.MinTime, Maxt: chk.MaxTime}.IsSubrange(intervals)) {
continue
}
chks = append(chks, chk)
// If still not entirely deleted, check if trim is needed based on requested time range.
if !b.disableTrimming {
if chk.MinTime < b.mint {
trimFront = true
}
if chk.MaxTime > b.maxt {
trimBack = true
}
}
}
if len(chks) == 0 {
continue
}
if trimFront {
intervals = intervals.Add(tombstones.Interval{Mint: math.MinInt64, Maxt: b.mint - 1})
}
if trimBack {
intervals = intervals.Add(tombstones.Interval{Mint: b.maxt + 1, Maxt: math.MaxInt64})
}
b.curr.labels = b.builder.Labels()
b.curr.chks = chks
b.curr.intervals = intervals
return true
}
return false
}
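The range handling in Next can be reduced to a small standalone sketch. It ignores tombstones and the disableTrimming flag, and `chunkMeta` and `selectChunks` are hypothetical names. Chunk ranges and the requested range are both treated as closed intervals, as the NOTE in the function states.

```go
package main

import "fmt"

type chunkMeta struct{ MinTime, MaxTime int64 }

// selectChunks keeps the chunks overlapping the closed range [mint, maxt] and
// reports whether any kept chunk extends before mint or after maxt, in which
// case its samples outside the range still have to be trimmed.
func selectChunks(chks []chunkMeta, mint, maxt int64) (kept []chunkMeta, trimFront, trimBack bool) {
	for _, c := range chks {
		if c.MaxTime < mint || c.MinTime > maxt {
			continue // Entirely outside the requested range.
		}
		kept = append(kept, c)
		if c.MinTime < mint {
			trimFront = true
		}
		if c.MaxTime > maxt {
			trimBack = true
		}
	}
	return kept, trimFront, trimBack
}

func main() {
	chks := []chunkMeta{{0, 9}, {10, 19}, {20, 29}, {30, 39}}
	kept, front, back := selectChunks(chks, 15, 20)
	fmt.Println(kept, front, back)
}
```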
func (b *blockBaseSeriesSet) Err() error {
if b.err != nil {
return b.err
}
return b.p.Err()
}
func (b *blockBaseSeriesSet) Warnings() annotations.Annotations { return nil }
// populateWithDelGenericSeriesIterator allows iterating over the given chunk
// metas. On each iteration it ensures that chunks are trimmed based on the
// given tombstone intervals, if any.
//
// populateWithDelGenericSeriesIterator assumes that chunks that would be fully
// removed by intervals are filtered out in a previous phase.
//
// On each iteration currMeta is available. If currDelIter is not nil, it
// means that the chunk in currMeta is invalid and a chunk rewrite is needed,
// for which currDelIter should be used.
type populateWithDelGenericSeriesIterator struct {
blockID ulid.ULID
cr ChunkReader
// metas are expected to be sorted by minTime and should be related to
// the same, single series.
// It's possible for a single chunks.Meta to refer to multiple chunks.
// cr.ChunkOrIterable() would return an iterable and a nil chunk in this
// case.
metas []chunks.Meta
i int // Index into metas; -1 if not started yet.
err error
bufIter DeletedIterator // Retained for memory re-use. currDelIter may point here.
intervals tombstones.Intervals
currDelIter chunkenc.Iterator
// currMeta is the current chunks.Meta from metas. currMeta.Chunk is set to
// the chunk returned from cr.ChunkOrIterable(). As that can return a nil
// chunk, currMeta.Chunk is not always guaranteed to be set.
currMeta chunks.Meta
}
func (p *populateWithDelGenericSeriesIterator) reset(blockID ulid.ULID, cr ChunkReader, chks []chunks.Meta, intervals tombstones.Intervals) {
p.blockID = blockID
p.cr = cr
p.metas = chks
p.i = -1
p.err = nil
// Note we don't touch p.bufIter.Iter; it is holding on to an iterator we might reuse in next().
p.bufIter.Intervals = p.bufIter.Intervals[:0]
p.intervals = intervals
p.currDelIter = nil
p.currMeta = chunks.Meta{}
}
// If copyHeadChunk is true, then the head chunk (i.e. the in-memory chunk of the TSDB)
// is deep copied to avoid races between reads and copying chunk bytes.
// However, if the deletion intervals overlap with the head chunk, then the head chunk is
// not copied irrespective of copyHeadChunk because it will be re-encoded later anyway.
func (p *populateWithDelGenericSeriesIterator) next(copyHeadChunk bool) bool {
if p.err != nil || p.i >= len(p.metas)-1 {
return false
}
p.i++
p.currMeta = p.metas[p.i]
p.bufIter.Intervals = p.bufIter.Intervals[:0]
for _, interval := range p.intervals {
if p.currMeta.OverlapsClosedInterval(interval.Mint, interval.Maxt) {
p.bufIter.Intervals = p.bufIter.Intervals.Add(interval)
}
}
hcr, ok := p.cr.(*headChunkReader)
var iterable chunkenc.Iterable
if ok && copyHeadChunk && len(p.bufIter.Intervals) == 0 {
// ChunkWithCopy will copy the head chunk.
var maxt int64
p.currMeta.Chunk, maxt, p.err = hcr.ChunkWithCopy(p.currMeta)
// For the in-memory head chunk the index reader sets maxt as MaxInt64. We fix it here.
p.currMeta.MaxTime = maxt
} else {
p.currMeta.Chunk, iterable, p.err = p.cr.ChunkOrIterable(p.currMeta)
}
if p.err != nil {
p.err = fmt.Errorf("cannot populate chunk %d from block %s: %w", p.currMeta.Ref, p.blockID.String(), p.err)
return false
}
// Use the single chunk if possible.
if p.currMeta.Chunk != nil {
if len(p.bufIter.Intervals) == 0 {
// If there is no overlap with deletion intervals and a single chunk is
// returned, we can take chunk as it is.
p.currDelIter = nil
return true
}
// Otherwise we need to iterate over the samples in the single chunk
// and create new chunks.
p.bufIter.Iter = p.currMeta.Chunk.Iterator(p.bufIter.Iter)
p.currDelIter = &p.bufIter
return true
}
// Otherwise, use the iterable to create an iterator.
p.bufIter.Iter = iterable.Iterator(p.bufIter.Iter)
p.currDelIter = &p.bufIter
return true
}
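When deletion intervals overlap a chunk, its samples are read through a filtering iterator instead of being used as-is. The effect can be sketched on plain timestamps, assuming closed intervals. Range trimming is expressed the same way as in the series set above, as two extra intervals covering everything before mint and after maxt. `interval`, `trimIntervals`, and `dropDeleted` are hypothetical names.

```go
package main

import (
	"fmt"
	"math"
)

// interval is a closed time range [Mint, Maxt].
type interval struct{ Mint, Maxt int64 }

// trimIntervals expresses "keep only [mint, maxt]" as two deletion intervals.
func trimIntervals(mint, maxt int64) []interval {
	return []interval{
		{Mint: math.MinInt64, Maxt: mint - 1},
		{Mint: maxt + 1, Maxt: math.MaxInt64},
	}
}

// dropDeleted returns the timestamps that fall into none of the intervals.
func dropDeleted(ts []int64, dels []interval) []int64 {
	var out []int64
	for _, t := range ts {
		deleted := false
		for _, d := range dels {
			if t >= d.Mint && t <= d.Maxt {
				deleted = true
				break
			}
		}
		if !deleted {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	ts := []int64{1, 2, 3, 4, 5, 6}
	dels := append(trimIntervals(2, 5), interval{Mint: 4, Maxt: 4})
	fmt.Println(dropDeleted(ts, dels)) // [2 3 5]
}
```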
func (p *populateWithDelGenericSeriesIterator) Err() error { return p.err }
type blockSeriesEntry struct {
chunks ChunkReader
blockID ulid.ULID
seriesData
}
func (s *blockSeriesEntry) Iterator(it chunkenc.Iterator) chunkenc.Iterator {
pi, ok := it.(*populateWithDelSeriesIterator)
if !ok {
pi = &populateWithDelSeriesIterator{}
}
pi.reset(s.blockID, s.chunks, s.chks, s.intervals)
return pi
}
type chunkSeriesEntry struct {
chunks ChunkReader
blockID ulid.ULID
seriesData
}
func (s *chunkSeriesEntry) Iterator(it chunks.Iterator) chunks.Iterator {
pi, ok := it.(*populateWithDelChunkSeriesIterator)
if !ok {
pi = &populateWithDelChunkSeriesIterator{}
}
pi.reset(s.blockID, s.chunks, s.chks, s.intervals)
return pi
}
// populateWithDelSeriesIterator allows iterating over the samples of a single series.
type populateWithDelSeriesIterator struct {
populateWithDelGenericSeriesIterator
curr chunkenc.Iterator
}
func (p *populateWithDelSeriesIterator) reset(blockID ulid.ULID, cr ChunkReader, chks []chunks.Meta, intervals tombstones.Intervals) {
p.populateWithDelGenericSeriesIterator.reset(blockID, cr, chks, intervals)
p.curr = nil
}
func (p *populateWithDelSeriesIterator) Next() chunkenc.ValueType {
if p.curr != nil {
if valueType := p.curr.Next(); valueType != chunkenc.ValNone {
return valueType
}
}
for p.next(false) {
if p.currDelIter != nil {
p.curr = p.currDelIter
} else {
p.curr = p.currMeta.Chunk.Iterator(p.curr)
}
if valueType := p.curr.Next(); valueType != chunkenc.ValNone {
return valueType
}
}
return chunkenc.ValNone
}
func (p *populateWithDelSeriesIterator) Seek(t int64) chunkenc.ValueType {
if p.curr != nil {
if valueType := p.curr.Seek(t); valueType != chunkenc.ValNone {
return valueType
}
}
for p.Next() != chunkenc.ValNone {
if valueType := p.curr.Seek(t); valueType != chunkenc.ValNone {
return valueType
}
}
return chunkenc.ValNone
}
func (p *populateWithDelSeriesIterator) At() (int64, float64) {
return p.curr.At()
}
func (p *populateWithDelSeriesIterator) AtHistogram(h *histogram.Histogram) (int64, *histogram.Histogram) {
return p.curr.AtHistogram(h)
}
func (p *populateWithDelSeriesIterator) AtFloatHistogram(fh *histogram.FloatHistogram) (int64, *histogram.FloatHistogram) {
return p.curr.AtFloatHistogram(fh)
}
func (p *populateWithDelSeriesIterator) AtT() int64 {
return p.curr.AtT()
}
func (p *populateWithDelSeriesIterator) Err() error {
if err := p.populateWithDelGenericSeriesIterator.Err(); err != nil {
return err
}
if p.curr != nil {
return p.curr.Err()
}
return nil
}
type populateWithDelChunkSeriesIterator struct {
populateWithDelGenericSeriesIterator
// currMetaWithChunk is current meta with its chunk field set. This meta
// is guaranteed to map to a single chunk. This differs from
// populateWithDelGenericSeriesIterator.currMeta as that
// could refer to multiple chunks.
currMetaWithChunk chunks.Meta
// chunksFromIterable stores the chunks created from iterating through
// the iterable returned by cr.ChunkOrIterable() (with deleted samples
// removed).
chunksFromIterable []chunks.Meta
chunksFromIterableIdx int
}
func (p *populateWithDelChunkSeriesIterator) reset(blockID ulid.ULID, cr ChunkReader, chks []chunks.Meta, intervals tombstones.Intervals) {
p.populateWithDelGenericSeriesIterator.reset(blockID, cr, chks, intervals)
p.currMetaWithChunk = chunks.Meta{}
p.chunksFromIterable = p.chunksFromIterable[:0]
p.chunksFromIterableIdx = -1
}
func (p *populateWithDelChunkSeriesIterator) Next() bool {
if p.currMeta.Chunk == nil {
// If we've been creating chunks from the iterable, check if there are
// any more chunks to iterate through.
if p.chunksFromIterableIdx < len(p.chunksFromIterable)-1 {
p.chunksFromIterableIdx++
p.currMetaWithChunk = p.chunksFromIterable[p.chunksFromIterableIdx]
return true
}
}
// Move to the next chunk/deletion iterator.
// This is a for loop as if the current p.currDelIter returns no samples
// (which means a chunk won't be created), there still might be more
// samples/chunks from the rest of p.metas.
for p.next(true) {
if p.currDelIter == nil {
p.currMetaWithChunk = p.currMeta
return true
}
if p.currMeta.Chunk != nil {
// If ChunkOrIterable() returned a non-nil chunk, the samples in
// p.currDelIter will only form one chunk, as the only change
// p.currDelIter might make is deleting some samples.
if p.populateCurrForSingleChunk() {
return true
}
} else {
// If ChunkOrIterable() returned an iterable, multiple chunks may be
// created from the samples in p.currDelIter.
if p.populateChunksFromIterable() {
return true
}
}
}
return false
}
// populateCurrForSingleChunk sets the fields within p.currMetaWithChunk. This
// should be called if the samples in p.currDelIter only form one chunk.
func (p *populateWithDelChunkSeriesIterator) populateCurrForSingleChunk() bool {
valueType := p.currDelIter.Next()
if valueType == chunkenc.ValNone {
if err := p.currDelIter.Err(); err != nil {
p.err = fmt.Errorf("iterate chunk while re-encoding: %w", err)
}
return false
}
p.currMetaWithChunk.MinTime = p.currDelIter.AtT()
// Re-encode the chunk if an iterator is provided. This means that it has
// some samples to be deleted or that the chunk is still open.
var (
newChunk chunkenc.Chunk
app chunkenc.Appender
t int64
err error
)
switch valueType {
case chunkenc.ValHistogram:
newChunk = chunkenc.NewHistogramChunk()
if app, err = newChunk.Appender(); err != nil {
break
}
for vt := valueType; vt != chunkenc.ValNone; vt = p.currDelIter.Next() {
if vt != chunkenc.ValHistogram {
err = fmt.Errorf("found value type %v in histogram chunk", vt)
break
}
var h *histogram.Histogram
t, h = p.currDelIter.AtHistogram(nil)
_, _, app, err = app.AppendHistogram(nil, t, h, true)
if err != nil {
break
}
}
case chunkenc.ValFloat:
newChunk = chunkenc.NewXORChunk()
if app, err = newChunk.Appender(); err != nil {
break
}
for vt := valueType; vt != chunkenc.ValNone; vt = p.currDelIter.Next() {
if vt != chunkenc.ValFloat {
err = fmt.Errorf("found value type %v in float chunk", vt)
break
}
var v float64
t, v = p.currDelIter.At()
app.Append(t, v)
}
case chunkenc.ValFloatHistogram:
newChunk = chunkenc.NewFloatHistogramChunk()
if app, err = newChunk.Appender(); err != nil {
break
}
for vt := valueType; vt != chunkenc.ValNone; vt = p.currDelIter.Next() {
if vt != chunkenc.ValFloatHistogram {
err = fmt.Errorf("found value type %v in float histogram chunk", vt)
break
}
var h *histogram.FloatHistogram
t, h = p.currDelIter.AtFloatHistogram(nil)
_, _, app, err = app.AppendFloatHistogram(nil, t, h, true)
if err != nil {
break
}
}
default:
err = fmt.Errorf("populateCurrForSingleChunk: value type %v unsupported", valueType)
}
if err != nil {
p.err = fmt.Errorf("iterate chunk while re-encoding: %w", err)
return false
}
if err := p.currDelIter.Err(); err != nil {
p.err = fmt.Errorf("iterate chunk while re-encoding: %w", err)
return false
}
p.currMetaWithChunk.Chunk = newChunk
p.currMetaWithChunk.MaxTime = t
return true
}
// populateChunksFromIterable reads the samples from currDelIter to create
// chunks for chunksFromIterable. It also sets p.currMetaWithChunk to the first
// chunk.
func (p *populateWithDelChunkSeriesIterator) populateChunksFromIterable() bool {
p.chunksFromIterable = p.chunksFromIterable[:0]
p.chunksFromIterableIdx = -1
firstValueType := p.currDelIter.Next()
if firstValueType == chunkenc.ValNone {
if err := p.currDelIter.Err(); err != nil {
p.err = fmt.Errorf("populateChunksFromIterable: no samples could be read: %w", err)
}
return false
}
var (
// t is the timestamp for the current sample.
t int64
cmint int64
cmaxt int64
currentChunk chunkenc.Chunk
app chunkenc.Appender
newChunk chunkenc.Chunk
recoded bool
err error
)
prevValueType := chunkenc.ValNone
for currentValueType := firstValueType; currentValueType != chunkenc.ValNone; currentValueType = p.currDelIter.Next() {
// Check if the encoding has changed (i.e. we need to create a new
// chunk as chunks can't have multiple encoding types).
// For the first sample, the following condition will always be true as
// ValNone != ValFloat | ValHistogram | ValFloatHistogram.
if currentValueType != prevValueType {
if prevValueType != chunkenc.ValNone {
p.chunksFromIterable = append(p.chunksFromIterable, chunks.Meta{Chunk: currentChunk, MinTime: cmint, MaxTime: cmaxt})
}
cmint = p.currDelIter.AtT()
if currentChunk, err = currentValueType.NewChunk(); err != nil {
break
}
if app, err = currentChunk.Appender(); err != nil {
break
}
}
switch currentValueType {
case chunkenc.ValFloat:
{
var v float64
t, v = p.currDelIter.At()
app.Append(t, v)
}
case chunkenc.ValHistogram:
{
var v *histogram.Histogram
t, v = p.currDelIter.AtHistogram(nil)
// No need to set prevApp as AppendHistogram will set the
// counter reset header for the appender that's returned.
newChunk, recoded, app, err = app.AppendHistogram(nil, t, v, false)
}
case chunkenc.ValFloatHistogram:
{
var v *histogram.FloatHistogram
t, v = p.currDelIter.AtFloatHistogram(nil)
// No need to set prevApp as AppendFloatHistogram will set the
// counter reset header for the appender that's returned.
newChunk, recoded, app, err = app.AppendFloatHistogram(nil, t, v, false)
}
}
if err != nil {
break
}
if newChunk != nil {
if !recoded {
p.chunksFromIterable = append(p.chunksFromIterable, chunks.Meta{Chunk: currentChunk, MinTime: cmint, MaxTime: cmaxt})
}
currentChunk = newChunk
cmint = t
}
cmaxt = t
prevValueType = currentValueType
}
if err != nil {
p.err = fmt.Errorf("populateChunksFromIterable: error when writing new chunks: %w", err)
return false
}
if err = p.currDelIter.Err(); err != nil {
p.err = fmt.Errorf("populateChunksFromIterable: currDelIter error when writing new chunks: %w", err)
return false
}
if prevValueType != chunkenc.ValNone {
p.chunksFromIterable = append(p.chunksFromIterable, chunks.Meta{Chunk: currentChunk, MinTime: cmint, MaxTime: cmaxt})
}
if len(p.chunksFromIterable) == 0 {
return false
}
p.currMetaWithChunk = p.chunksFromIterable[0]
p.chunksFromIterableIdx = 0
return true
}
func (p *populateWithDelChunkSeriesIterator) At() chunks.Meta { return p.currMetaWithChunk }
// blockSeriesSet allows iterating over sorted, populated series with tombstones applied.
// Series with all deleted chunks are still present as Series with no samples.
// Samples from chunks are also trimmed to the requested min and max time.
type blockSeriesSet struct {
blockBaseSeriesSet
}
func newBlockSeriesSet(i IndexReader, c ChunkReader, t tombstones.Reader, p index.Postings, mint, maxt int64, disableTrimming bool) storage.SeriesSet {
return &blockSeriesSet{
blockBaseSeriesSet{
index: i,
chunks: c,
tombstones: t,
p: p,
mint: mint,
maxt: maxt,
disableTrimming: disableTrimming,
},
}
}
func (b *blockSeriesSet) At() storage.Series {
// At can be looped over before iterating, so save the current values locally.
return &blockSeriesEntry{
chunks: b.chunks,
blockID: b.blockID,
seriesData: b.curr,
}
}
// blockChunkSeriesSet allows iterating over sorted, populated series with tombstones applied.
// Series with all deleted chunks are still present as a labelled iterator with no chunks.
// Chunks are also trimmed to the requested [min, max] range (keeping samples with min and max timestamps).
type blockChunkSeriesSet struct {
blockBaseSeriesSet
}
func NewBlockChunkSeriesSet(id ulid.ULID, i IndexReader, c ChunkReader, t tombstones.Reader, p index.Postings, mint, maxt int64, disableTrimming bool) storage.ChunkSeriesSet {
return &blockChunkSeriesSet{
blockBaseSeriesSet{
blockID: id,
index: i,
chunks: c,
tombstones: t,
p: p,
mint: mint,
maxt: maxt,
disableTrimming: disableTrimming,
},
}
}
func (b *blockChunkSeriesSet) At() storage.ChunkSeries {
// At can be looped over before iterating, so save the current values locally.
return &chunkSeriesEntry{
chunks: b.chunks,
blockID: b.blockID,
seriesData: b.curr,
}
}
// NewMergedStringIter returns a string iterator that merges the symbols of a and b on demand and streams the result.
func NewMergedStringIter(a, b index.StringIter) index.StringIter {
return &mergedStringIter{a: a, b: b, aok: a.Next(), bok: b.Next()}
}
type mergedStringIter struct {
a index.StringIter
b index.StringIter
aok, bok bool
cur string
err error
}
func (m *mergedStringIter) Next() bool {
if (!m.aok && !m.bok) || (m.Err() != nil) {
return false
}
switch {
case !m.aok:
m.cur = m.b.At()
m.bok = m.b.Next()
m.err = m.b.Err()
case !m.bok:
m.cur = m.a.At()
m.aok = m.a.Next()
m.err = m.a.Err()
case m.b.At() > m.a.At():
m.cur = m.a.At()
m.aok = m.a.Next()
m.err = m.a.Err()
case m.a.At() > m.b.At():
m.cur = m.b.At()
m.bok = m.b.Next()
m.err = m.b.Err()
default: // Equal.
m.cur = m.b.At()
m.aok = m.a.Next()
m.err = m.a.Err()
m.bok = m.b.Next()
if m.err == nil {
m.err = m.b.Err()
}
}
return true
}
func (m *mergedStringIter) At() string { return m.cur }
func (m *mergedStringIter) Err() error {
return m.err
}
// DeletedIterator wraps a chunk Iterator and makes sure any deleted samples are not returned.
type DeletedIterator struct {
// Iter is an Iterator to be wrapped.
Iter chunkenc.Iterator
// Intervals are the deletion intervals.
Intervals tombstones.Intervals
}
func (it *DeletedIterator) At() (int64, float64) {
return it.Iter.At()
}
func (it *DeletedIterator) AtHistogram(h *histogram.Histogram) (int64, *histogram.Histogram) {
return it.Iter.AtHistogram(h)
}
func (it *DeletedIterator) AtFloatHistogram(fh *histogram.FloatHistogram) (int64, *histogram.FloatHistogram) {
return it.Iter.AtFloatHistogram(fh)
}
func (it *DeletedIterator) AtT() int64 {
return it.Iter.AtT()
}
func (it *DeletedIterator) Seek(t int64) chunkenc.ValueType {
if it.Iter.Err() != nil {
return chunkenc.ValNone
}
valueType := it.Iter.Seek(t)
if valueType == chunkenc.ValNone {
return chunkenc.ValNone
}
// Now double check if the entry falls into a deleted interval.
ts := it.AtT()
for _, itv := range it.Intervals {
if ts < itv.Mint {
return valueType
}
if ts > itv.Maxt {
it.Intervals = it.Intervals[1:]
continue
}
// We're in the middle of an interval, we can now call Next().
return it.Next()
}
// The timestamp is greater than all the deleted intervals.
return valueType
}
func (it *DeletedIterator) Next() chunkenc.ValueType {
Outer:
for valueType := it.Iter.Next(); valueType != chunkenc.ValNone; valueType = it.Iter.Next() {
ts := it.AtT()
for _, tr := range it.Intervals {
if tr.InBounds(ts) {
continue Outer
}
if ts <= tr.Maxt {
return valueType
}
it.Intervals = it.Intervals[1:]
}
return valueType
}
return chunkenc.ValNone
}
func (it *DeletedIterator) Err() error { return it.Iter.Err() }
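// nopChunkReader is a ChunkReader that returns the same empty XOR chunk for
// every chunk meta it is asked about.
type nopChunkReader struct {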
emptyChunk chunkenc.Chunk
}
func newNopChunkReader() ChunkReader {
return nopChunkReader{
emptyChunk: chunkenc.NewXORChunk(),
}
}
func (cr nopChunkReader) ChunkOrIterable(chunks.Meta) (chunkenc.Chunk, chunkenc.Iterable, error) {
return cr.emptyChunk, nil, nil
}
func (cr nopChunkReader) Close() error { return nil }