This is technically BREAKING CHANGE, but it was like this from the beginning: I just notice that we rely in
Prometheus on remote read being sorted. This is because we use selected data from remote reads in MergeSeriesSet
which rely on sorting.
I found during work on https://github.com/prometheus/prometheus/pull/5882 that
we do so many repetitions because of this, for not good reason. I think
I found a good balance between convenience and readability with just one method.
Smaller the interface = better.
Also I don't know what TestSelectSorted was testing, but now it's testing sorting.
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
I think the previous behavior is problematic as it will leave
`memSeries` around that still have `pendingCommit` set to `true`.
The only case where this can happen in this code path is a failure to
write to the WAL, in which case we are probably in trouble anyway. I
believe, however, we should still try to do the right thing and do the
full rollback. This will implicitly try to write to the WAL again, but
this time without samples, which may even succeed. (But we propagate
the previous error in any case.)
This also adds `a.head.putSeriesBuffer(a.sampleSeries)` to Rollback,
which was previously missing.
Signed-off-by: beorn7 <beorn@grafana.com>
This is part of https://github.com/prometheus/prometheus/pull/5882 that can be done to simplify things.
All todos I added will be fixed in follow up PRs.
* querier.Querier, querier.Appender, querier.SeriesSet, and querier.Series interfaces merged
with storage interface.go. All imports that.
* querier.SeriesIterator replaced by chunkenc.Iterator
* Added chunkenc.Iterator.Seek method and tests for xor implementation (?)
* Since we properly handle SelectParams for Select methods I adjusted min max
based on that. This should help in terms of performance for queries with functions like offset.
* added Seek to deletedIterator and test.
* storage/tsdb was removed as it was only a unnecessary glue with incompatible structs.
No logic was changed, only different source of abstractions, so no need for benchmarks.
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
* Add Start/End to SelectParams
* Make remote read use the new selectParams for start/end
This commit will continue sending the start/end time of the remote read
query as the overarching promql time and the specific range of data that
the query is intersted in receiving a response to is now part of the
ReadHints (upstream discussion in #4226).
* Remove unused vendored code
The genproto.sh script was updated, but the code wasn't regenerated.
This simply removes the vendored deps that are no longer part of the
codegen output.
Signed-off-by: Thomas Jackson <jacksontj.89@gmail.com>
This adds a parameter to the storage selection interface which allows
query engine(s) to pass information about the operations surrounding a
data selection.
This can for example be used by remote storage backends to infer the
correct downsampling aggregates that need to be provided.
Federation makes use of dedupedSeriesSet to merge SeriesSets for every
query into one output stream. If many match[] arguments are provided,
many dedupedSeriesSet objects will get chained. This has the downside of
causing a potential O(n*k) running time, where n is the number of series
and k the number of match[] arguments.
In the mean time, the storage package provides a mergeSeriesSet that
accomplishes the same with an O(n*log(k)) running time by making use of
a binary heap. Let's just get rid of dedupedSeriesSet and change all
existing callers to use mergeSeriesSet.
In order to compose different querier implementations more easily, this
change introduces a separate storage.Queryable interface grouping the
query (Querier) function of the storage.
Furthermore, it adds a QueryableFunc type to ease writing very simple
queryable implementations.
Currently all read queries are simply pushed to remote read clients.
This is fine, except for remote storage for wich it unefficient and
make query slower even if remote read is unnecessary.
So we need instead to compare the oldest timestamp in primary/local
storage with the query range lower boundary. If the oldest timestamp
is older than the mint parameter, then there is no need for remote read.
This is an optionnal behavior per remote read client.
Signed-off-by: Thibault Chataigner <t.chataigner@criteo.com>
* Re-add contexts to storage.Storage.Querier()
These are needed when replacing the storage by a multi-tenant
implementation where the tenant is stored in the context.
The 1.x query interfaces already had contexts, but they got lost in 2.x.
* Convert promql.Engine to use native contexts
With this change the scraping caches series references and only
allocates label sets if it has to retrieve a new reference.
pkg/textparse is used to do the conditional parsing and reduce
allocations from 900B/sample to 0 in the general case.
- Mostly docstring fixed/additions.
(Please review these carefully, since most of them were missing, I
had to guess them from an outsider's perspective. (Which on the
other hand proves how desperately required many of these docstrings
are.))
- Removed all uses of new(...) to meet our own style guide (draft).
- Fixed all other 'go vet' and 'golint' issues (except those that are
not fixable (i.e. caused by bugs in or by design of 'go vet' and
'golint')).
- Some trivial refactorings, like reorder functions, minor renames, ...
- Some slightly less trivial refactoring, mostly to reduce code
duplication by embedding types instead of writing many explicit
forwarders.
- Cleaned up the interface structure a bit. (Most significant probably
the removal of the View-like methods from MetricPersistenc. Now they
are only in View and not duplicated anymore.)
- Removed dead code. (Probably not all of it, but it's a first
step...)
- Fixed a leftover in storage/metric/end_to_end_test.go (that made
some parts of the code never execute (incidentally, those parts
were broken (and I fixed them, too))).
Change-Id: Ibcac069940d118a88f783314f5b4595dce6641d5
This used to work with Go 1.1, but only because of a compiler bug.
The bug is fixed in Go 1.2, so we have to fix our code now.
Change-Id: I5a9f3a15878afd750e848be33e90b05f3aa055e1
This commit introduces to Prometheus a batch database sample curator,
which corroborates the high watermarks for sample series against the
curation watermark table to see whether a curator of a given type
needs to be run.
The curator is an abstract executor, which runs various curation
strategies across the database. It remarks the progress for each
type of curation processor that runs for a given sample series.
A curation procesor is responsible for effectuating the underlying
batch changes that are request. In this commit, we introduce the
CompactionProcessor, which takes several bits of runtime metadata and
combine sparse sample entries in the database together to form larger
groups. For instance, for a given series it would be possible to
have the curator effectuate the following grouping:
- Samples Older than Two Weeks: Grouped into Bunches of 10000
- Samples Older than One Week: Grouped into Bunches of 1000
- Samples Older than One Day: Grouped into Bunches of 100
- Samples Older than One Hour: Grouped into Bunches of 10
The benefits hereof of such a compaction are 1. a smaller search
space in the database keyspace, 2. better employment of compression
for repetious values, and 3. reduced seek times.