- Parallelize AppendSamples as much as possible without breaking the
contract about temporal order.
- Allocate more fingerprint locker slots.
- Do not run early checkpoints if we are behind on chunk persistence.
- Increase fpMinWaitDuration to give the disk more time for more
important things.
Also, switch math.MaxInt64 and math.MinInt64 to the new constants.
Also, set a much higher default value.
Chunk persist requests can be quite spiky. If you collect a large
number of time series that are very similar, they will tend to finish
up a chunk at about the same time. There is no reason we need to back
up scraping just because of that. The rationale of the new default
value is "1/8 of the chunks in memory".
persistence.go is way too long anyway, and a lot of code is just crash
recovery, which is not important to understand the normal operation.
Also, remove unused `exists` function.
Previously, it would return an error instead. Now we can distinguish
the cases 'error while deleting known key' vs. 'key not in index'
without testing for leveldb-internal kinds of errors.
If queries are still running when the shutdown is initiated, they will
finish _during_ the shutdown. In that case, they might request chunk
eviction upon unpinning their pinned chunks. That might completely
fill the evict request queue _after_ draining it during storage
shutdown. If that ever happens (which is the case if there are _many_
queries still running during shutdown), the affected queries will be
stuck while keeping a fingerprint locked. The checkpointing can then
not process that fingerprint (or one that shares the same lock). And
then we are deadlocked.
- Move CONTRIBUTORS.md to the more common AUTHORS.
- Added the required NOTICE file.
- Changed "Prometheus Team" to "The Prometheus Authors".
- Reverted the erroneous changes to the Apache License.
This mimics the locking leveldb is performing anyway. Advantages of
doing it separately:
- Should we ever replace the leveldb implementation by one without
double-start protection, we are still good.
- In contrast to leveldb, the new code creates a meaningful error
message.
Usually, if you unarchive a series, it is to add something to it,
which will create a new head chunk. However, if a series in
unarchived, and before anything is added to it, it is handled by the
maintenance loop, it will be archived again. In that case, we have to
load the chunkDescs to know the lastTime of the series to be
archived. Usually, this case will happen only rarely (as a race, has
never happened so far, possibly because the locking around unarchiving
and the subsequent sample append is smart enough). However, during
crash recovery, we sometimes treat series as "freshly unarchived"
without directly appending a sample. We might add more cases of that
type later, so better deal with archiving properly and load chunkDescs
if required.
- Documented checkpoint file format.
- High-level description of series sanitation.
- Replace fp.LoadFromString panic with an error.
(Change in client_golang already submitted.)
- Introduced checks for series file size where appropriate.
- Removed two Law of Demeter violations.
Change-Id: I555d97a2c8f4769820c2fc8bf5d6f4e160222abc
- Delete unneeded file view_adapter.go.
- Assessed that we still need the fingerprints in nodes
(to create iterators).
- Turned numMemChunkDescs into a metric.
Change-Id: I29be963c795a075ec00c095f76bf26405535609d
Now only purge if there is something to purge.
Also, set savedFirstTime and archived time range appropriately.
(Which is needed for the optimization.)
Change-Id: Idcd33319a84def3ce0318d886f10c6800369e7f9
Fix the behavior if preload for non-existent series is requested.
Instead of returning an error (which triggers a panic further up),
simply count those incidents. They can happen regularly, we just want
to know if they happen too frequently because that would mean the
indexing is behind or broken.
Change-Id: I4b2d1b93c4146eeea897d188063cb9574a270f8b
The root cause was that after chunkDesc eviction, the offset between
memory representation of chunk layout (via chunkDescs in memory) was
shiftet against chunks as layed out on disk. Keeping the offset up to
date is by no means trivial, so this commit is pretty involved.
Also, found a race that for some reason didn't bite us so far:
Persisting chunks was completel unlocked, so if chunks were purged on
disk at the same time, disaster would strike. However, locking the
persisting of chunk revealed interesting dead locks. Basically, never
queue under the fp lock.
Change-Id: I1ea9e4e71024cabbc1f9601b28e74db0c5c55db8
Checkpointing interval is now a command line flag.
Along the way, several things were refactored.
- Restructure the way the storage is started and stopped..
- Number of series in checkpoint is now a uint64, not a varint.
(Breaks old checkpoints, needs wipe!)
- More consistent naming and order of methods.
Change-Id: I883d9170c9a608ee716bb0ab3d0ded8ca03760d9
Add gauge for chunks and chunkdescs in memory (backed by a global
variable to be used later not only for instrumentation but also for
memory management).
Refactored instrumentation code once more (instrumentation.go is back :).
Change-Id: Ife39947e22a48cac4982db7369c231947f446e17
- Staleness delta is no a proper function parameter and not replicated
from package ast.
- Named type 'chunks' replaced by explicit '[]chunk' to avoid confusion.
- For the same reason, replaced 'chunkDescs' by '[]*chunkDescs'.
- Verified that math.Modf is not a speed enhancement over conversion
(actually 5x slower).
- Renamed firstTimeField, lastTimeField into chunkFirstTime and
chunkLastTime.
- Verified unpin() is sufficiently goroutine-safe.
- Decided not to update archivedFingerprintToTimeRange upon series
truncation and added a rationale why.
Change-Id: I863b8d785e5ad9f71eb63e229845eacf1bed8534
- Head chunk persisting only happens in evictOlderThan, so do it
there. (With the previous code, it would never happen.)
- Raw accesses to chunkDesc.chunk are now done via isEvicted (with
locking).
Change-Id: I48b07b56dfea4899b50df159b4ea566954396fcd
Also, fix problems in shutdown.
Starting serving and shutdown still has to be cleaned up properly.
It's a mess.
Change-Id: I51061db12064e434066446e6fceac32741c4f84c
Some other improvements on the way, in particular codec -> codable
renaming and addition of LookupSet methods.
Change-Id: I978f8f3f84ca8e4d39a9d9f152ae0ad274bbf4e2
Most important, the heads file will now persist all the chunk descs,
too. Implicitly, it will serve as the persisted form of the
fp-to-series map.
Change-Id: Ic867e78f2714d54c3b5733939cc5aef43f7bd08d
BinaryMarshaler instead of encodable.
BinaryUnmarshaler instead of decodable.
Left 'codable' in place for lack of a better word.
Change-Id: I8a104be7d6db916e8dbc47ff95e6ff73b845ac22
Large delta values often imply a difference between a large base value
and the large delta value, potentially resulting in small numbers with
a huge precision error. Since large delta values need 8 bytes anyway,
we are not even saving memory.
As a solution, always save the absoluto value rather than a delta once
8 bytes would be needed for the delta. Timestamps are then saved as 8
byte integers, while values are always saved as float64 in that case.
Change-Id: I01100d600515e16df58ce508b50982ffd762cc49