Currently all read queries are simply pushed to remote read clients.
This is fine, except for remote storage for wich it unefficient and
make query slower even if remote read is unnecessary.
So we need instead to compare the oldest timestamp in primary/local
storage with the query range lower boundary. If the oldest timestamp
is older than the mint parameter, then there is no need for remote read.
This is an optionnal behavior per remote read client.
Signed-off-by: Thibault Chataigner <t.chataigner@criteo.com>
Instead, just make the anchoring part of the internal regex. This helps because
some users will want to read back the `Value` field and expect it to be the
same as the input value (e.g. some tests in Cortex), or use the value in
another context which is already expected to add its own anchoring, leading to
superfluous double anchoring (such as when we translate matchers into remote
read request matchers).
* Re-add contexts to storage.Storage.Querier()
These are needed when replacing the storage by a multi-tenant
implementation where the tenant is stored in the context.
The 1.x query interfaces already had contexts, but they got lost in 2.x.
* Convert promql.Engine to use native contexts
This can happen in the situation where the system scales up the number of shards massively (to deal with some backlog), then scales it down again as the number of samples sent during the time period is less than the number received.
* Fix error where we look into the future.
So currently we are adding values that are in the future for an older
timestamp. For example, if we have [(1, 1), (150, 2)] we will end up
showing [(1, 1), (2,2)].
Further it is not advisable to call .At() after Next() returns false.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
* Retuen early if done
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
* Handle Seek() where we reach the end of iterator
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
* Simplify code
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
This is in line with the v1.5 change in paradigm to not keep
chunk.Descs without chunks around after a series maintenance.
It's mainly motivated by avoiding excessive amounts of RAM usage
during crash recovery.
The code avoids to create memory time series with zero chunk.Descs as
that is prone to trigger weird effects. (Series maintenance would
archive series with zero chunk.Descs, but we cannot do that here
because the archive indices still have to be checked.)
The fpIter was kind of cumbersome to use and required a lock for each
iteration (which wasn't even needed for the iteration at startup after
loading the checkpoint).
The new implementation here has an obvious penalty in memory, but it's
only 8 byte per series, so 80MiB for a beefy server with 10M memory
time series (which would probably need ~100GiB RAM, so the memory
penalty is only 0.1% of the total memory need).
The big advantage is that now series maintenance happens in order,
which leads to the time between two maintenances of the same series
being less random. Ideally, after each maintenance, the next
maintenance would tackle the series with the largest number of
non-persisted chunks. That would be quite an effort to find out or
track, but with the approach here, the next maintenance will tackle
the series whose previous maintenance is longest ago, which is a good
approximation.
While this commit won't change the _average_ number of chunks
persisted per maintenance, it will reduce the mean time a given chunk
has to wait for its persistence and thus reduce the steady-state
number of chunks waiting for persistence.
Also, the map iteration in Go is non-deterministic but not truly
random. In practice, the iteration appears to be somewhat "bucketed".
You can often observe a bunch of series with similar duration since
their last maintenance, i.e. you see batches of series with similar
number of chunks persisted per maintenance. If that batch is
relatively young, a whole lot of series are maintained with very few
chunks to persist. (See screenshot in PR for a better explanation.)