This is technically a BREAKING CHANGE, but it was like this from the beginning: I just noticed that in
Prometheus we rely on remote read results being sorted. This is because we pass the data selected from remote reads to MergeSeriesSet,
which relies on sorted input.
While working on https://github.com/prometheus/prometheus/pull/5882 I found that
we repeat ourselves a lot because of this, for no good reason. I think
I found a good balance between convenience and readability with just one method.
The smaller the interface, the better.
Also, I don't know what TestSelectSorted was testing before, but now it tests sorting.
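To illustrate why sorting matters, here is a minimal sketch of a two-way
merge. It is not the real MergeSeriesSet: series are reduced to label
strings and mergeSorted is a hypothetical helper. Like any merge that only
compares the heads of its inputs, it is correct only if every input is
already sorted.

```go
package main

import "fmt"

// mergeSorted merges two label-sorted slices into one sorted, de-duplicated
// slice. It only ever compares the current head of each input, so the result
// is correct only if both inputs are sorted.
func mergeSorted(a, b []string) []string {
	out := make([]string, 0, len(a)+len(b))
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] < b[j]:
			out = append(out, a[i])
			i++
		case a[i] > b[j]:
			out = append(out, b[j])
			j++
		default: // Same series in both inputs: emit it once.
			out = append(out, a[i])
			i++
			j++
		}
	}
	out = append(out, a[i:]...)
	return append(out, b[j:]...)
}

func main() {
	local := []string{`{job="a"}`, `{job="c"}`}
	sortedRemote := []string{`{job="a"}`, `{job="b"}`}
	unsortedRemote := []string{`{job="b"}`, `{job="a"}`}

	fmt.Println(mergeSorted(local, sortedRemote))
	fmt.Println(mergeSorted(local, unsortedRemote))
}
```

With the sorted remote input the merge yields three distinct series in
order. With the unsorted one, `{job="a"}` is emitted twice and the output
is no longer sorted.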
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
* Fix bug with WAL watcher and Live Reader metrics usage.
Calling NewXMetrics when creating a Watcher or LiveReader results in a
registration error, which we were ignoring. As a result, only the first
Watcher/Reader created had metrics. So we would only have metrics like
Watcher Records Read for the first remote write config in a user's
config file.
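A minimal sketch of the failure mode and one way around it. The registry
below is a toy stand-in, not the Prometheus client library, and every name
in it (registry, watcherMetrics, newWatcherMetrics, watcher) is
hypothetical. The idea it shows: register the metrics once, check the
error, and hand the same instance to every watcher.

```go
package main

import (
	"errors"
	"fmt"
)

// counter and registry are toy stand-ins for a metric and a metrics
// registry that rejects a second collector under an existing name.
type counter struct{ n int }

type registry struct{ names map[string]*counter }

var errAlreadyRegistered = errors.New("already registered")

func (r *registry) register(name string, c *counter) error {
	if _, ok := r.names[name]; ok {
		return errAlreadyRegistered
	}
	r.names[name] = c
	return nil
}

// watcherMetrics is a hypothetical metrics bundle shared by all watchers.
type watcherMetrics struct{ recordsRead *counter }

// newWatcherMetrics registers the metrics and returns the registration
// error instead of dropping it.
func newWatcherMetrics(r *registry) (*watcherMetrics, error) {
	m := &watcherMetrics{recordsRead: &counter{}}
	if err := r.register("records_read_total", m.recordsRead); err != nil {
		return nil, err
	}
	return m, nil
}

// watcher receives already registered metrics instead of creating its own.
type watcher struct{ metrics *watcherMetrics }

func main() {
	reg := &registry{names: map[string]*counter{}}
	shared, err := newWatcherMetrics(reg)
	if err != nil {
		panic(err)
	}

	// Creating the metrics again fails; ignoring this error is what left
	// every watcher after the first one without metrics.
	_, err = newWatcherMetrics(reg)
	fmt.Println("second registration:", err)

	w1, w2 := watcher{shared}, watcher{shared}
	w1.metrics.recordsRead.n++
	w2.metrics.recordsRead.n++
	fmt.Println("records read:", shared.recordsRead.n)
}
```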
Signed-off-by: Callum Styan <callumstyan@gmail.com>
This fixes #6992, which was introduced by #6777. There was an
intermediate component which translated TSDB errors into storage errors,
but that component was deleted and this bug went unnoticed until we
looked at the Prombench results. Without this, the scrape will fail
instead of dropping samples or using "Add" when the series have been
garbage collected.
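A rough sketch of what such a translation layer does. All error values and
function names here are hypothetical and only mirror the behaviour
described above: without translation the caller sees an unknown error and
fails the scrape.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinel errors for the two layers.
var (
	errTSDBNotFound      = errors.New("tsdb: not found")
	errTSDBOutOfOrder    = errors.New("tsdb: out of order sample")
	errStorageNotFound   = errors.New("storage: not found")
	errStorageOutOfOrder = errors.New("storage: out of order sample")
)

// translate maps TSDB errors (possibly wrapped) to the storage errors the
// caller compares against. Unknown errors pass through unchanged.
func translate(err error) error {
	switch {
	case errors.Is(err, errTSDBNotFound):
		return errStorageNotFound
	case errors.Is(err, errTSDBOutOfOrder):
		return errStorageOutOfOrder
	}
	return err
}

// handle mimics a scrape loop deciding what to do with an append error.
func handle(err error) string {
	switch err {
	case nil:
		return "ok"
	case errStorageNotFound:
		return "fall back to Add"
	case errStorageOutOfOrder:
		return "drop sample"
	default:
		return "fail scrape"
	}
}

func main() {
	raw := fmt.Errorf("append: %w", errTSDBNotFound)
	fmt.Println("untranslated:", handle(raw))
	fmt.Println("translated:  ", handle(translate(raw)))
}
```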
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
This addresses fabxc's TODO.
More importantly, it now properly defers the
querier.Close(). Previously, if a panic happened after creation of the
querier within the populateSeries function, querier.Close() was never called.
The latter was responsible for #6977.
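As a small illustration (with a hypothetical querier type, not the real
storage interface), a deferred Close still runs when the function panics
after the querier has been created:

```go
package main

import "fmt"

// querier is a hypothetical stand-in for a storage querier that must be
// closed to release its resources.
type querier struct{ closed bool }

func (q *querier) Close() { q.closed = true }

// populate closes the querier via defer, so Close runs even if work panics.
// The recover is only here so the example can report the outcome.
func populate(q *querier, work func()) (err error) {
	defer q.Close()
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	work()
	return nil
}

func main() {
	q := &querier{}
	err := populate(q, func() { panic("boom") })
	fmt.Println("error:", err)
	fmt.Println("closed:", q.closed)
}
```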
Signed-off-by: beorn7 <beorn@grafana.com>
With defer having less of a performance penalty, there is no reason
not to do those crucial operations via defer.
Context: With isolation in place, if we forget to Commit/Rollback, the
low watermark will get stuck forever.
The current code should not have any bugs, but moving to defer helps
to avoid future bugs.
This is also moving the `closeAppend` in the `Commit` implementation
itself to defer. If logging to the WAL fails, we would have missed the
`closeAppend`.
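A simplified sketch of the `Commit` part, under the assumption that
isolation tracks open appends in a set. The types below are hypothetical
stand-ins, not the real head appender:

```go
package main

import (
	"errors"
	"fmt"
)

// isolation is a toy tracker of open appends. While an append ID stays in
// the open set, the low watermark cannot advance past it.
type isolation struct {
	open map[uint64]struct{}
	next uint64
}

func (i *isolation) newAppendID() uint64 {
	i.next++
	i.open[i.next] = struct{}{}
	return i.next
}

func (i *isolation) closeAppend(id uint64) { delete(i.open, id) }

// errWALWrite is a hypothetical error standing in for a failed WAL write.
var errWALWrite = errors.New("WAL write failed")

// appender is a hypothetical, stripped-down appender.
type appender struct {
	iso    *isolation
	id     uint64
	logWAL func() error
}

// Commit closes the append via defer, so the early return on a WAL error
// cannot skip it.
func (a *appender) Commit() error {
	defer a.iso.closeAppend(a.id)
	if err := a.logWAL(); err != nil {
		return fmt.Errorf("write to WAL: %w", err)
	}
	return nil
}

func main() {
	iso := &isolation{open: map[uint64]struct{}{}}
	app := &appender{
		iso:    iso,
		id:     iso.newAppendID(),
		logWAL: func() error { return errWALWrite },
	}
	fmt.Println("commit error:", app.Commit())
	fmt.Println("open appends:", len(iso.open))
}
```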
Signed-off-by: beorn7 <beorn@grafana.com>
Add extra meta labels, which are useful when Prometheus
discovers hypervisors.
Signed-off-by: pzqu <pzqu@qq.com>
Co-authored-by: pzqu <pzqu@example.com>
I think the previous behavior is problematic as it will leave
`memSeries` around that still have `pendingCommit` set to `true`.
The only case where this can happen in this code path is a failure to
write to the WAL, in which case we are probably in trouble anyway. I
believe, however, we should still try to do the right thing and do the
full rollback. This will implicitly try to write to the WAL again, but
this time without samples, which may even succeed. (But we propagate
the previous error in any case.)
This also adds `a.head.putSeriesBuffer(a.sampleSeries)` to Rollback,
which was previously missing.
Signed-off-by: beorn7 <beorn@grafana.com>
This PR fixes the regression tests for the issue fixed in #6931.
The reason for that is that all of the invalid queries that triggered the regression have become more or less valid syntax in #6933 (they might still fail typechecking).
Signed-off-by: Tobias Guggenmos <tobias.guggenmos@uni-ulm.de>
This is most likely due to an endpoint not producing valid
metrics output, which we should treat the same as a failed
scrape, and thus not spam the application logs with it.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
This is taken from #6918. Since we probably won't merge #6918 before
the release, we have to do this bit of it as it fixes an actual bug
(iso.closeAppend is not called if the append fails because of an error
logging to the WAL).
Signed-off-by: beorn7 <beorn@grafana.com>
* [comments] change word ‘wheter’ to ‘whether’
Signed-off-by: fuling <fuling.lgz@alibaba-inc.com>
We can assume that not all target groups are nil in normal scenarios,
so we can create targets[poolKey] outside the loop.
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
* tsdb: don't allow ingesting empty labelsets
When we ingest an empty labelset in the head, further blocks cannot be
compacted, with the error:
```
level=error ts=2020-02-27T21:26:58.379Z caller=db.go:659 component=tsdb
msg="compaction failed" err="persist head block: write compaction:
add series: out-of-order series added with label set \"{}\" / prev:
\"{}\""
```
We should therefore reject those invalid empty labelsets upfront.
This can be reproduced with the following:
```
cat << END > prometheus.yml
scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 1s
    basic_auth:
      username: test
      password: test
    metric_relabel_configs:
      - regex: ".*"
        action: labeldrop
    static_configs:
      - targets:
          - 127.0.1.1:9090
END
./prometheus --storage.tsdb.min-block-duration=1m
```
And wait a few minutes.
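The upfront rejection can be sketched roughly as follows. The head, label
and error names are hypothetical simplifications; the point is only that
the check happens at ingestion time, before anything reaches compaction.

```go
package main

import (
	"errors"
	"fmt"
)

// label and head are hypothetical, simplified stand-ins for the real types.
type label struct{ name, value string }

type head struct{ series [][]label }

// errEmptyLabelset is a hypothetical error for samples without any labels.
var errEmptyLabelset = errors.New("invalid sample: empty labelset")

// add rejects empty labelsets upfront so they never reach compaction.
func (h *head) add(lset []label) error {
	if len(lset) == 0 {
		return errEmptyLabelset
	}
	h.series = append(h.series, lset)
	return nil
}

func main() {
	h := &head{}
	// What is left after a labeldrop rule matching ".*".
	fmt.Println("empty labelset:", h.add(nil))
	fmt.Println("valid labelset:", h.add([]label{{"__name__", "up"}}))
	fmt.Println("series in head:", len(h.series))
}
```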
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
This fixes an issue where the /new/targets page will not load when there
are jobs with invalid CSS characters in them, such as the
namespace/service/0 form used by the Prometheus Operator.
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>