Two cases:
- An unarchived metric must have at least one chunk desc loaded upon
unarchival. Otherwise, the file is gone or has size 0, which is an
inconsistency (because the series is still indexed in the archive
index). Hence, quarantining is triggered.
- If loading the chunk descs of a series with a known chunkDescsOffset
(i.e. != -1), the number of chunks loaded must be equal to
chunkDescsOffset. If not, there is a data corruption. An error is
returned, which leads to qurantining.
In any case, there is a guard added to not access the 1st element of
an empty chunkDescs slice. (That's what triggered the crashes in issue
2249.) A time series with unknown chunkDescsOffset and no chunks in
memory and no chunks on disk either could trigger that case. I would
assume such a "null series" doesn't exist, but it's not entirely
unthinkable and unreasonable to happen (perhaps in future uses of the
storage). (Create a series, and then something tries to preload chunks
before the first sample is added.)
This extracts Querier as an instantiateable and closeable object
rather than just defining extending methods of the storage interface.
This improves composability and allows abstracting query transactions,
which can be useful for transaction-level caches, consistent data views,
and encapsulating teardown.
When using the chunking code in other projects (both Weave Prism and
ChronixDB ingester), you sometimes want to know how well you are
utilizing your chunks when closing/storing them.
These more specific methods have replaced `metricForLabelMatchers`
in cases where its `map[fingerprint]metric` result type was
not necessary or was used as an intermediate step
Avoids duplicated calls to `seriesForRange` from
`QueryRange` and `QueryInstant` methods.
This is a followup to https://github.com/prometheus/prometheus/pull/2011.
This publishes more of the methods and other names of the chunk code and
moves the chunk code to its own package. There's some unavoidable
ugliness: the chunk and chunkDesc metrics are used by both packages, so
I had to move them to the chunk package. That isn't great, but I don't
see how to do it better without a larger redesign of everything. Same
for the evict requests and some other types.
This is based on https://github.com/prometheus/prometheus/pull/1997.
This adds contexts to the relevant Storage methods and already passes
PromQL's new per-query context into the storage's query methods.
The immediate motivation supporting multi-tenancy in Frankenstein, but
this could also be used by Prometheus's normal local storage to support
cancellations and timeouts at some point.
CPUs have to serialise write access to a single cache line
effectively reducing level of possible parallelism. Placing
mutexes on different cache lines avoids this problem.
Most gains will be seen on NUMA servers where CPU interconnect
traffic is especially expensive
Before:
go test . -run none -bench BenchmarkFingerprintLocker
BenchmarkFingerprintLockerParallel-4 2000000 932 ns/op
BenchmarkFingerprintLockerSerial-4 30000000 49.6 ns/op
After:
go test . -run none -bench BenchmarkFingerprintLocker
BenchmarkFingerprintLockerParallel-4 3000000 569 ns/op
BenchmarkFingerprintLockerSerial-4 30000000 51.0 ns/op
This adds a flag -storage.local.engine which allows turning off local
storage in Prometheus. Instead of adding if-conditions and nil checks to
all parts of Prometheus that deal with Prometheus's local storage
(including the web interface), disabling local storage simply means
replacing the normal local storage with a noop version that throws
samples away and returns empty query results. We also don't add the noop
storage to the fanout appender to decrease internal overhead.
Instead of returning empty results, an alternate behavior could be to
return errors on any query that point out that the local storage is
disabled. Not sure which one is more preferable, so I went with the
empty result option for now.
Turns out its valid to have an overall chunk which is smaller than the
full doubleDeltaHeaderBytes size -- if it has a single sample, it
doesn't fill the whole header. Updated unmarshalling check to respect
this.
This is (hopefully) a fix for #1653
Specifically, this makes it so that if the length for the stored
delta/doubleDelta is somehow corrupted to be too small, the attempt to
unmarshal will return an error.
The current (broken) behavior is to return a malformed chunk, which can
then lead to a panic when there is an attempt to read header values.
The referenced issue proposed creating chunks with a minimum length -- I
instead opted to just error on the attempt to unmarshal, since I'm not
clear on how it could be safe to proceed when the length is
incorrect/unknown.
The issue also talked about possibly "quarantining series", but I don't
know the surrounding code well enough to understand how to make that
happen.
See discussion in
https://groups.google.com/forum/#!topic/prometheus-developers/bkuGbVlvQ9g
The main idea is that the user of a storage shouldn't have to deal with
fingerprints anymore, and should not need to do an individual preload
call for each metric. The storage interface needs to be made more
high-level to not expose these details.
This also makes it easier to reuse the same storage interface for remote
storages later, as fewer roundtrips are required and the fingerprint
concept doesn't work well across the network.
NOTE: this deliberately gets rid of a small optimization in the old
query Analyzer, where we dedupe instants and ranges for the same series.
This should have a minor impact, as most queries do not have multiple
selectors loading the same series (and at the same offset).