"Rushed mode" is formerly known as "degraded mode", which is changed
with this commit, too. The name "degraded" was very misleading.
Also, switch into rushed mode if we have too many chunks in memory and
an at least reasonable amount of chunks to persist so that speeding up
persisting chunks can help.
If only very few chunks are to be truncated from a very large series
file, the rewrite of the file is a lorge overhead. With this change, a
certain ratio of the file has to be dropped to make it happen. While
only causing disk overhead at about the same ratio (by default 10%),
it will cut down I/O by a lot in above scenario.
Perhaps it would be even better to still warn in case the sample value has
changed but the timestamps are equal, but we don't have efficient access
to the last value.
For the label matching index-based preselection phase, don't do an OR
between equality and non-equality matchers. Execute only one of the two
(with equality matchers preferred when present).
Fixes https://github.com/prometheus/prometheus/issues/924
If all samples in consecutive chunks have the same timestamp, the way
we used to load chunks will fail. With this change, the persist
watermark is used to load the right amount of chunkDescs from disk.
This bug is a possible reason for the rare storage corruption we have
observed.
Fixes https://github.com/prometheus/prometheus/issues/481
While doing so, clean up and fix a few other things:
- Fix `go vet` warnings (@fabxc to blame ;).
- Fix a racey problem with unarchiving: Whenever we unarchive a
series, we essentially want to do something with it. However, until
we have done something with it, it appears like a series that is
ready to be archived or even purged. So e.g. it would be ignored
during checkpointing. With this fix, we always load the chunkDescs
upon unarchiving. This is wasteful if we only want to add a new
sample to an archived time series, but the (presumably more common)
case where we access an archived time series in a query doesn't
become more expensive.
- The change above streamlined the getOrCreateSeries ond
newMemorySeries flow. Also, the modTime is now always set correctly.
- Fix the leveldb-backed implementation of KeyValueStore.Delete. It
had the wrong behavior of still returning true, nil if a
non-existing key has been passed in.
This commit adds the honor_labels and params arguments to the scrape
config. This allows to specify query parameters used by the scrapers
and handling scraped labels with precedence.
Change #704 introduced a regression that started reading the queue only
after potential crash recovery. When more than the queue capacity was
indexed, Prometheus deadlocked.
This change is conceptually very simple, although the diff is large. It
switches logging from "github.com/golang/glog" to
"github.com/prometheus/log", while not actually changing any log
messages. V(1)-style logging has been changed to be log.Debug*().
The one central sample ingestion channel has caused a variety of
trouble. This commit removes it. Targets and rule evaluation call an
Append method directly now. To incorporate multiple storage backends
(like OpenTSDB), storage.Tee forks the Append into two different
appenders.
Note that the tsdb queue manager had its own queue anyway. It was a
queue after a queue... Much queue, so overhead...
Targets have their own little buffer (implemented as a channel) to
avoid stalling during an http scrape. But a new scrape will only be
started once the old one is fully ingested.
The contraption of three pipelined ingesters was removed. A Target is
an ingester itself now. Despite more logic in Target, things should be
less confusing now.
Also, remove lint and vet warnings in ast.go.