This attempts to close#3973.
Handles cases where the length of the input vector to an aggregate topk
/ bottomk function is less than the K paramater. The change updates
Prometheus to allocate a result vector the same length as the input
vector in these cases.
Previously Prometheus would out-of-memory panic for large K values. This
change makes that unlikely unless the size of the input vector is
equally large.
Signed-off-by: David King <dave@davbo.org>
* parser test cleanup
- Test against the exported package functions instead of the private functions.
* Improves readability of TestParseSeries
- Moves package function closer to parser function
This adds a parameter to the storage selection interface which allows
query engine(s) to pass information about the operations surrounding a
data selection.
This can for example be used by remote storage backends to infer the
correct downsampling aggregates that need to be provided.
* TotalFoo suggested a comprehensive timing, but TotalEvalTime was part
of the Exec timings, together with Queue timings
* The other option was to rename ExecTotalTime to TotalExecTime, but
there was already ExecQueueTime, suggesting Exec to be some sort of
group
API consumers should be able to get insight into the query run times.
The UI currently measures total roundtrip times. This PR allows for more
fine grained metrics to be exposed.
* adds new timer for total execution time (queue + eval)
* expose new timer, queue timer, and eval timer in stats field of the
range query response:
```json
{
"status": "success",
"data": {
"resultType": "matrix",
"result": [],
"stats": {
"execQueueTimeNs": 4683,
"execTotalTimeNs": 2086587,
"totalEvalTimeNs": 2077851
}
}
}
```
* stats field is optional, only set when query parameter `stats` is not
empty
Try it via
```sh
curl 'http://localhost:9090/api/v1/query_range?query=up&start=1486480279&end=1486483879&step=14000&stats=true'
```
Review feedback
* moved query stats json generation to query_stats.go
* use seconds for all query timers
* expose all timers available
* Changed ExecTotalTime string representation from Exec queue total time to Exec total time
This means that if there is no stale marker, only the usual staleness
delta (5m) applies.
It has occured to me that there is an oddity in the heurestic. It works
fine as long as you have 2 points within the last 5m, but breaks down
when the time window advances to the point where you have just 1 point.
Consider you had points at t=0 and t=10. With the heurestic it goes stale
at t=51, up until t=300. However from t=301 until t=310 we only
see the t=10 point and the series comes back to life. That is not
desirable.
I don't see a way to keep this form of heurestic working given this
issue, so thus I'm removing it.
* Re-add contexts to storage.Storage.Querier()
These are needed when replacing the storage by a multi-tenant
implementation where the tenant is stored in the context.
The 1.x query interfaces already had contexts, but they got lost in 2.x.
* Convert promql.Engine to use native contexts
With the squaring of the timestamp, we run into the
limitations of the 53bit mantissa for a 64bit float.
By subtracting away a timestamp of one of the samples (which is how the
intercept is used) we avoid this issue in practice as it's unlikely
that it is used over a very long time range.
Fixes#2674
* Fix error where we look into the future.
So currently we are adding values that are in the future for an older
timestamp. For example, if we have [(1, 1), (150, 2)] we will end up
showing [(1, 1), (2,2)].
Further it is not advisable to call .At() after Next() returns false.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
* Retuen early if done
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
* Handle Seek() where we reach the end of iterator
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
* Simplify code
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
To cover the cases where stale markers may not be available,
we need to infer the interval and mark series stale based on that.
As we're lacking stale markers this is less accurate, however
it should be good enough for these cases.
We need 4 intervals as if say we had data at t=0 and t=10,
coming via federation. The next data point should be at t=20 however it
could take up to t=30 for it actually to be ingested, t=40 for it to be
scraped via federation and t=50 for it to be ingested.
We then add 10% on to that for slack, as we do elsewhere.
For instant vectors, if "stale" is the newest sample
ignore the timeseries.
For range vectors, filter out "stale" samples.
Make it possible to inject "stale" samples in promql tests.
Query and query_range should return the timestamp
at which an evaluation is performed, not the timestamp
of the data. This is as that's what query range asked
for, and we need to keep query consistent with that.
Query for a matrix remains unchanged, returning the literal
matrix.