This fixes part 1) of https://github.com/prometheus/prometheus/issues/367 (the
storing of samples with the wrong fingerprint into a compacted chunk, thus
corrupting it).
Change-Id: I4c36d0d2e508e37a0aba90b8ca2ecc78ee03e3f1
This commit addresses a criticism of the old storage API design:
input parameters were always raw bytes, never Protocol Buffer
messages that encapsulated the data, so every place that performed
a read or mutation had to do the translation manually. This is
taxing.
Change-Id: I4786938d0d207cefb7782bd2bd96a517eead186f
While a hack, this change should allow us to serve queries
expeditiously during a flush operation.
Change-Id: I9a483fd1dd2b0638ab24ace960df08773c4a5079
The background curation should be staggered to ensure that disk
I/O yields to user-interactive operations in a timely manner. The
lack of routine prioritization necessitates this.
Change-Id: I9b498a74ccd933ffb856e06fedc167430e521d86
Move the stream behind an interface, since a number of additional
changes around it are underway.
Conflicts:
storage/metric/memory.go
Change-Id: I4a5fc176f4a5274a64ebdb1cad52600954c463c3
AppendSample will be replaced with AppendSamples, which will take
advantage of bulk appends. This is a necessary step for decoupling
the indexing pipeline.
Change-Id: Ia83811a87bcc89973d3b64d64b85a28710253ebc
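As a rough illustration of the bulk-append shape described above, here
is a minimal sketch. Only the AppendSamples name comes from this
message; the Sample type, the Appender interface, and memoryAppender
are hypothetical stand-ins and not the actual storage code.

```go
package main

import "fmt"

// Sample is a hypothetical, simplified stand-in for a stored sample.
type Sample struct {
	Metric    string
	Timestamp int64
	Value     float64
}

// Appender is an assumed interface; only the method name AppendSamples
// is taken from the commit message.
type Appender interface {
	AppendSamples(samples []Sample) error
}

// memoryAppender groups samples by metric name in memory and counts how
// many batches it has received.
type memoryAppender struct {
	series  map[string][]Sample
	batches int
}

func newMemoryAppender() *memoryAppender {
	return &memoryAppender{series: make(map[string][]Sample)}
}

// AppendSamples handles the whole batch in one call, so any per-call
// overhead (locking, index updates) is paid once per batch.
func (m *memoryAppender) AppendSamples(samples []Sample) error {
	m.batches++
	for _, s := range samples {
		m.series[s.Metric] = append(m.series[s.Metric], s)
	}
	return nil
}

func main() {
	store := newMemoryAppender()
	var a Appender = store
	batch := []Sample{
		{Metric: "up", Timestamp: 1, Value: 1},
		{Metric: "up", Timestamp: 2, Value: 0},
		{Metric: "load", Timestamp: 1, Value: 0.5},
	}
	if err := a.AppendSamples(batch); err != nil {
		fmt.Println("append failed:", err)
		return
	}
	fmt.Println(len(store.series["up"]), len(store.series["load"]), store.batches)
}
```

A caller that previously looped over AppendSample would instead
collect samples into a slice and hand them over in one call.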
This commit is the first of several and should not be regarded as the
desired end state for these cleanups. What it does, however, is wrap
the query index writing behind an interface type that can be injected
into the storage stack and have its lifecycle managed separately as
needed. It also means we can easily swap out the underlying
implementation to support remote indexing, buffering, or no-op
indexing.
In the future, most of the individual index interface members in the
tiered storage will go away in favor of agents that can query and
resolve what they need from the datastore without the user knowing
how and why they work.
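To make the injection idea above concrete, here is a small sketch of
an index writer behind an interface with a no-op and an in-memory
implementation. All names (MetricIndexer, noopIndexer, memoryIndexer,
storage) are hypothetical and chosen for illustration only; the real
interface in the storage stack may look quite different.

```go
package main

import "fmt"

// MetricIndexer is an assumed interface for index writing. The storage
// only depends on this interface, not on a concrete implementation.
type MetricIndexer interface {
	IndexMetric(fingerprint string, labels map[string]string) error
	Close() error
}

// noopIndexer discards everything; useful when indexing is not wanted.
type noopIndexer struct{}

func (noopIndexer) IndexMetric(string, map[string]string) error { return nil }
func (noopIndexer) Close() error                                { return nil }

// memoryIndexer maps "name=value" label pairs to fingerprints.
type memoryIndexer struct {
	byLabelPair map[string][]string
}

func newMemoryIndexer() *memoryIndexer {
	return &memoryIndexer{byLabelPair: make(map[string][]string)}
}

func (m *memoryIndexer) IndexMetric(fingerprint string, labels map[string]string) error {
	for name, value := range labels {
		key := name + "=" + value
		m.byLabelPair[key] = append(m.byLabelPair[key], fingerprint)
	}
	return nil
}

func (m *memoryIndexer) Close() error { return nil }

// storage receives its indexer from the caller, so the indexer's
// lifecycle can be managed separately from the storage itself.
type storage struct {
	indexer MetricIndexer
}

func newStorage(indexer MetricIndexer) *storage {
	return &storage{indexer: indexer}
}

func (s *storage) appendSeries(fingerprint string, labels map[string]string) error {
	return s.indexer.IndexMetric(fingerprint, labels)
}

func main() {
	idx := newMemoryIndexer()
	defer idx.Close()

	indexed := newStorage(idx)
	if err := indexed.appendSeries("fp1", map[string]string{"job": "api"}); err != nil {
		fmt.Println("index failed:", err)
		return
	}
	fmt.Println(idx.byLabelPair["job=api"])

	// Swapping in the no-op implementation requires no storage changes.
	quiet := newStorage(noopIndexer{})
	fmt.Println(quiet.appendSeries("fp2", nil))
}
```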
Constructing a LevelDB storage instance takes too many parameters
for a plain constructor function, so I've opted for the idiomatic
approach of embedding them in a struct for easier mediation and
versioning.
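The options-struct approach mentioned above might look roughly like
the following sketch. The field names, the default cache size, and
the constructor name are assumptions made for illustration; the sketch
does not open a real LevelDB database.

```go
package main

import (
	"errors"
	"fmt"
)

// LevelDBOptions bundles construction parameters. The fields here are
// hypothetical; new ones can be added later without changing the
// constructor's signature.
type LevelDBOptions struct {
	Path           string // directory holding the database files
	Purpose        string // human-readable label for diagnostics
	CacheSizeBytes int    // zero means "use the default"
}

// defaultCacheSizeBytes is an assumed default, not a real setting.
const defaultCacheSizeBytes = 8 * 1024 * 1024

// levelDBPersistence is a placeholder that only records its options.
type levelDBPersistence struct {
	opts LevelDBOptions
}

// newLevelDBPersistence validates the options and fills in defaults.
func newLevelDBPersistence(o LevelDBOptions) (*levelDBPersistence, error) {
	if o.Path == "" {
		return nil, errors.New("leveldb: Path must be set")
	}
	if o.CacheSizeBytes <= 0 {
		o.CacheSizeBytes = defaultCacheSizeBytes
	}
	return &levelDBPersistence{opts: o}, nil
}

func main() {
	p, err := newLevelDBPersistence(LevelDBOptions{
		Path:    "/tmp/metrics/fingerprints",
		Purpose: "fingerprint index",
	})
	if err != nil {
		fmt.Println("construction failed:", err)
		return
	}
	fmt.Println(p.opts.Purpose, p.opts.CacheSizeBytes)
}
```

Call sites name each field explicitly, which keeps them readable as
the number of parameters grows.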
When samples get flushed to disk, they lose sub-second precision anyway. By
dropping sub-second precision up front, data fetched from memory vs. disk will
behave the same. Later, we should consider also storing a more compact
representation than time.Time in memory if we're not going to use its full
precision.
Current series always get watermarks written out upon append now. This
drops support for old series without any watermarks by always reporting
them as too old (stale) during queries.
This also short-circuits optimize() for now, since it is complex to implement
for the new operator, and ops generated by the query layer already fulfill the
needed invariants. We should still investigate later whether to completely
delete operator optimization code or extend it to support
getValueRangeAtIntervalOp operators.
A design question that was open for me in the beginning was whether
to serialize other types to disk, but Protocol Buffers quickly won
out, which allows us to drop support for other types. This is a good
start to cleaning up a lot of cruft in the storage stack and can let
us eventually decouple the various moving parts into separate
subsystems for easier reasoning.
This commit is not strictly required, but it is a start to making
the rest a lot more enjoyable to interact with.