Bjoern Rabenstein
ff24070a03
Fix embarrassing bug in crash recovery.
...
(And yes, we always knew we needed tests for that. I have added a TODO now.)
Change-Id: I9cf52bbf98e263e0b79404bda4c442beba9696a8
2014-12-17 17:18:04 +01:00
Julius Volz
c9618d11e8
Introduce copy-on-write for metrics in AST.
...
This depends on changes in:
https://github.com/prometheus/client_golang/tree/cow-metrics .
Change-Id: I80b94833a60ddf954c7cd92fd2cfbebd8dd46142
2014-12-12 20:34:55 +01:00
Bjoern Rabenstein
afd864e7f4
Adjust to the new version of goleveldb.
...
(And yes, we do want vendoring for that... This is just the quick fix.)
Change-Id: I9d347a64d96de6b3390a0e35c8d466f14bb83e4e
2014-12-10 18:04:29 +01:00
Bjoern Rabenstein
fee88a7a77
Remove the remaining races, new and old.
...
Also, resolve a few other TODOs.
Change-Id: Icb39b5a5e8ca22ebcb48771cd8951c5d9e112691
2014-12-03 18:07:23 +01:00
Bjoern Rabenstein
66c80b5ebd
Fix typo.
...
Change-Id: I72608c7841c00145458807d3c3ee29db7b5ac2bc
2014-11-28 12:50:19 +01:00
Bjoern Rabenstein
674624f1c8
Completed more TODOs.
...
- Documented checkpoint file format.
- High-level description of series sanitation.
- Replace fp.LoadFromString panic with an error.
(Change in client_golang already submitted.)
- Introduced checks for series file size where appropriate.
- Removed two Law of Demeter violations.
Change-Id: I555d97a2c8f4769820c2fc8bf5d6f4e160222abc
2014-11-27 20:46:45 +01:00
Bjoern Rabenstein
7d11019aa2
Squash a few trivial TODOs.
...
- Delete unneeded file view_adapter.go.
- Assessed that we still need the fingerprints in nodes
(to create iterators).
- Turned numMemChunkDescs into a metric.
Change-Id: I29be963c795a075ec00c095f76bf26405535609d
2014-11-27 18:26:06 +01:00
Bjoern Rabenstein
49683c0c20
Avoid test flags in normal binary.
...
Change-Id: If1fba813a73bf93ea5918dcda326e3ffa81a797d
2014-11-27 18:04:48 +01:00
Bjoern Rabenstein
9bc05052ad
Add line that has mysteriously disappeared after rebase.
...
Change-Id: I3612eb0b626e66e607b363e9801f187d2ba637a3
2014-11-25 17:15:56 +01:00
Bjoern Rabenstein
14bda4180c
Changes after pair code review.
...
Change-Id: Ib72d40f8e9027818cfbbd32a7a7201eebda07455
2014-11-25 17:12:59 +01:00
Bjoern Rabenstein
9ea808cd8b
Remove debug log line.
...
Change-Id: Icdd2351b89f2d37ac2b615f9cf872e054c694ad1
2014-11-25 17:10:39 +01:00
Bjoern Rabenstein
bb42cc2e2d
Evict based on memory pressure. Evict recently used chunks last.
...
Change-Id: Ie6168f0cdb3917bdc63b6fe15585dd70c1e42afe
2014-11-25 17:10:39 +01:00
Bjoern Rabenstein
e23ee0f7cc
Fix race in test.
...
Change-Id: I53e1a4c5a6b5f846acd76043166b6cb7bf7d5dc7
2014-11-25 17:10:39 +01:00
Bjoern Rabenstein
d73e851b14
Tweak timing in the maintenance loop.
...
Change-Id: I9801c4f9a22c3b3dc1ce1af81fdd9e992a4f4dd7
2014-11-25 17:10:39 +01:00
Bjoern Rabenstein
2672aa8ece
Instrument series maintenance.
...
Change-Id: Ie4269d07ad4d23d44230c95a523088b472718e54
2014-11-25 17:10:39 +01:00
Bjoern Rabenstein
74c143c4c9
Improve scraper shutdown time.
...
- Stop target pools in parallel.
- Stop individual scrapers in goroutines, too.
- Timing tweaks.
Change-Id: I9dff1ee18616694f14b04408eaf1625d0f989696
2014-11-25 17:10:39 +01:00
Bjoern Rabenstein
3f61d304ce
Reorganize maintenance loop.
...
Change-Id: Iac10f988ba3e93ffb188f49c30f92e0b6adce5a3
2014-11-25 17:10:30 +01:00
Bjoern Rabenstein
c087ee35f7
Remove archiveMtx.
...
Change-Id: Ie8019f860bbda68621f74380c90a4e57930d3d7a
2014-11-25 17:10:30 +01:00
Bjoern Rabenstein
7af42eda65
Optimize purging.
...
Now only purge if there is something to purge.
Also, set savedFirstTime and archived time range appropriately.
(Which is needed for the optimization.)
Change-Id: Idcd33319a84def3ce0318d886f10c6800369e7f9
2014-11-25 17:10:30 +01:00
Bjoern Rabenstein
33b959b898
Persist savedFirstTime in checkpoint.
...
Change-Id: Ibdfdea16fad0608ec104fbccc749e824a171f227
2014-11-25 17:10:30 +01:00
Bjoern Rabenstein
904acd43da
Add crash recovery.
...
Fix the behavior if a preload for a non-existent series is requested.
Instead of returning an error (which triggers a panic further up),
simply count those incidents. They can happen regularly; we just want
to know if they happen too frequently because that would mean the
indexing is behind or broken.
Change-Id: I4b2d1b93c4146eeea897d188063cb9574a270f8b
2014-11-25 17:09:43 +01:00
Bjoern Rabenstein
7a9efc9c59
Fix typo in test.
...
Change-Id: I3c2fd76bc5f50446c58f8ef693d9c6595197feaa
2014-11-25 17:09:43 +01:00
Bjoern Rabenstein
4efc60174b
Tweak and verify a few parameters.
...
Remove TODOs accordingly.
Change-Id: Ic062e13b6ae89a9135d3f14011114fe1cca1cef8
2014-11-25 17:09:43 +01:00
Bjoern Rabenstein
5f8e9617ef
Add more tests.
...
Add an end-to-end fuzz and race test.
Fix a race exposed by the above.
Change-Id: Ifaa39a90cefbde8d4c29bda197cc92592ded21bb
2014-11-25 17:09:17 +01:00
Bjoern Rabenstein
d215e013b7
Fix the weird chunkDesc shuffling bug.
...
The root cause was that after chunkDesc eviction, the in-memory
representation of the chunk layout (via chunkDescs in memory) was
shifted against the chunks as laid out on disk. Keeping the offset up to
date is by no means trivial, so this commit is pretty involved.
Also, found a race that for some reason hasn't bitten us so far:
Persisting chunks was completely unlocked, so if chunks were purged on
disk at the same time, disaster would strike. However, locking the
persisting of chunks revealed interesting deadlocks. Basically, never
queue under the fp lock.
Change-Id: I1ea9e4e71024cabbc1f9601b28e74db0c5c55db8
2014-11-25 17:09:17 +01:00
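Illustration for d215e013b7 ("never queue under the fp lock"): a minimal sketch of why the rule helps, not the storage code itself. The series type, the channel, and all function names are hypothetical. If a producer sent to a queue while holding a lock that the consumer also needs, a blocking send could deadlock; here the lock is released before the send.

```go
package main

import (
	"fmt"
	"sync"
)

// series is a hypothetical stand-in for per-fingerprint state.
type series struct {
	mtx       sync.Mutex
	pending   int
	persisted int
}

// enqueue updates the state under the lock, then releases the lock
// before the (possibly blocking) send to the queue.
func enqueue(s *series, queue chan<- int, chunk int) {
	s.mtx.Lock()
	s.pending++
	s.mtx.Unlock() // Release first. Sending while locked could deadlock.
	queue <- chunk
}

// persistLoop consumes the queue and needs the same lock as the producer.
func persistLoop(s *series, queue <-chan int, done chan<- struct{}) {
	for range queue {
		s.mtx.Lock()
		s.pending--
		s.persisted++
		s.mtx.Unlock()
	}
	close(done)
}

// run pushes n chunks through an unbuffered queue and returns how many
// were persisted.
func run(n int) int {
	s := &series{}
	queue := make(chan int) // Unbuffered: every send waits for the consumer.
	done := make(chan struct{})
	go persistLoop(s, queue, done)
	for i := 0; i < n; i++ {
		enqueue(s, queue, i)
	}
	close(queue)
	<-done
	return s.persisted
}

func main() {
	fmt.Println("persisted chunks:", run(100))
}
```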
Bjoern Rabenstein
a617269b12
Avoid unnecessary cloning of the head chunk.
...
Change-Id: I5da774515d5493166a197b5814d0a720628cfaff
2014-11-25 17:09:04 +01:00
Bjoern Rabenstein
f1de5b0c4e
Run checkpointing of in-memory metrics and head chunks periodically.
...
Checkpointing interval is now a command line flag.
Along the way, several things were refactored.
- Restructure the way the storage is started and stopped.
- Number of series in checkpoint is now a uint64, not a varint.
(Breaks old checkpoints, needs wipe!)
- More consistent naming and order of methods.
Change-Id: I883d9170c9a608ee716bb0ab3d0ded8ca03760d9
2014-11-25 17:09:04 +01:00
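Illustration for f1de5b0c4e (series count as uint64 instead of varint): a minimal sketch of why the two encodings are not interchangeable, which is why old checkpoints need a wipe. The helper names, the use of the signed varint flavor, and the big-endian byte order are assumptions made for this example; the commit message does not specify them.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeVarint encodes n as a variable-length signed varint.
func encodeVarint(n int64) []byte {
	buf := make([]byte, binary.MaxVarintLen64)
	return buf[:binary.PutVarint(buf, n)]
}

// encodeUint64 encodes n as a fixed-width uint64.
// Big-endian byte order is an assumption for this sketch.
func encodeUint64(n uint64) []byte {
	buf := make([]byte, 8)
	binary.BigEndian.PutUint64(buf, n)
	return buf
}

func main() {
	// The same number yields different byte sequences of different length,
	// so a reader expecting one format cannot parse the other.
	fmt.Printf("varint: % x\n", encodeVarint(300))
	fmt.Printf("uint64: % x\n", encodeUint64(300))
}
```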
Bjoern Rabenstein
74c9b34a5e
Improve storage instrumentation even more.
...
Add gauge for chunks and chunkdescs in memory (backed by a global
variable to be used later not only for instrumentation but also for
memory management).
Refactored instrumentation code once more (instrumentation.go is back :).
Change-Id: Ife39947e22a48cac4982db7369c231947f446e17
2014-11-25 17:09:04 +01:00
Julius Volz
c3fcea45e3
Support finer time resolutions than 1 second.
...
Change-Id: I4c5f1d6d2361e841999b23283d1961b1bd0c2859
2014-11-25 17:09:04 +01:00
Bjoern Rabenstein
443dd33805
Improve instrumentation in storage.
...
Also, fix some other minor bugs.
Change-Id: If72f1c058b0f47d3e378fdf80228d7e9a8db06c7
2014-11-25 17:09:04 +01:00
Bjoern Rabenstein
1936a40e75
Minor logging improvement.
...
Change-Id: I7875d1a58ef9c5ff149f18e36f65959a4712fea2
2014-11-25 17:09:04 +01:00
Bjoern Rabenstein
192bf52c41
Evict chunkDescs, too.
...
Change-Id: I8b70f22fbf1dfcbc49f9ec391985144649e6ce9c
2014-11-25 17:09:04 +01:00
Bjoern Rabenstein
95f392fb2c
Prevent an indexing death spiral.
...
Change-Id: I86b20cd0830d02f87b2f020767257e2d3fb2033c
2014-11-25 17:09:04 +01:00
Bjoern Rabenstein
40354eaa29
Reduce directory depth by one.
...
Change-Id: I7f89df61135ff19169ed97633a662685d414c448
2014-11-25 17:09:04 +01:00
Bjoern Rabenstein
096fa0f8b2
Squash a number of TODOs.
...
- Staleness delta is now a proper function parameter and not replicated
from package ast.
- Named type 'chunks' replaced by explicit '[]chunk' to avoid confusion.
- For the same reason, replaced 'chunkDescs' by '[]*chunkDesc'.
- Verified that math.Modf is not a speed enhancement over conversion
(actually 5x slower).
- Renamed firstTimeField, lastTimeField into chunkFirstTime and
chunkLastTime.
- Verified unpin() is sufficiently goroutine-safe.
- Decided not to update archivedFingerprintToTimeRange upon series
truncation and added a rationale why.
Change-Id: I863b8d785e5ad9f71eb63e229845eacf1bed8534
2014-11-25 17:09:04 +01:00
Bjoern Rabenstein
427c8d53a5
Fix handling of empty chunkDescs while preloading chunks.
...
Change-Id: I73ce89fe0ef90c6eda78218e5be2cbfa0207c364
2014-11-25 17:09:04 +01:00
Bjoern Rabenstein
ecee5d8281
Fix head chunk persisting and a chunkDesc race condition.
...
- Head chunk persisting only happens in evictOlderThan, so do it
there. (With the previous code, it would never happen.)
- Raw accesses to chunkDesc.chunk are now done via isEvicted (with
locking).
Change-Id: I48b07b56dfea4899b50df159b4ea566954396fcd
2014-11-25 17:09:04 +01:00
Bjoern Rabenstein
6b37e47f9e
Remove unused metrics.
...
Change-Id: Icf03ba4ce92a5e38daf12930f9661daba79c83bb
2014-11-25 17:09:03 +01:00
Bjoern Rabenstein
2b4ff620aa
Return a nop iterator for series that have been purged completely.
...
Change-Id: I6e92cac4472486feefdecba8593c17867e8c710d
2014-11-25 17:09:03 +01:00
Bjoern Rabenstein
6e3a366f91
Only archive a time series when none of its chunks is pinned.
...
Change-Id: I7e4b67c34b417b8980173bc5dc3b213bd7d698e5
2014-11-25 17:09:03 +01:00
Julius Volz
bfa64248b7
Deal with missing series in preloading.
...
Change-Id: Ibf3a57b329f40a3d5e0b98464a2f45d2f1bd07bf
2014-11-25 17:09:03 +01:00
Bjoern Rabenstein
ca42a22e20
Add safety panic to seriesMap.put.
...
Change-Id: I4d4d2e45cc0f908a33eb1ae6e3ee6796adfcbd1e
2014-11-25 17:09:03 +01:00
Bjoern Rabenstein
83b4fa868d
Fix GetBoundaryValues.
...
Change-Id: I8f8bbdb88e9b24e4c37ff869126ed9343f261ce2
2014-11-25 17:08:45 +01:00
Bjoern Rabenstein
b3ed9aa7a2
Clean up start-up and shut-down.
...
Change-Id: Idff4bbb0a15a9f879bfbb3da5b1025179cab5e2c
2014-11-25 17:08:45 +01:00
Bjoern Rabenstein
4447708c9f
Fix a race in target.go.
...
Also, fix problems in shutdown.
Start-up of serving and shutdown still have to be cleaned up properly.
It's a mess.
Change-Id: I51061db12064e434066446e6fceac32741c4f84c
2014-11-25 17:08:45 +01:00
Bjoern Rabenstein
fd6600850a
Fix race in chunkDesc.
...
Change-Id: Id7bae115d75886e10d44184a690a76777b1531fe
2014-11-25 17:08:45 +01:00
Bjoern Rabenstein
1c53c09558
Treat empty chunkDescs properly in preloadChunksForRange.
...
Change-Id: Ida1bd3fe1f9fb0ea2d5dbb9704be926f0824f873
2014-11-25 17:08:45 +01:00
Bjoern Rabenstein
38fc24d0ed
Fix targetpool_test.go and other tests.
...
Change-Id: I91a4dd1d39e01f174e1aaae653ce1ed7aecaa624
2014-11-25 17:08:26 +01:00
Julius Volz
7f5d3c2c29
Fix and improve the fp locker.
...
Benchmark:
$ go test -bench 'Fingerprint' -test.run 'Fingerprint' -test.cpu=1,2,4
OLD
BenchmarkFingerprintLockerParallel 500000 3618 ns/op
BenchmarkFingerprintLockerParallel-2 100000 12257 ns/op
BenchmarkFingerprintLockerParallel-4 500000 10164 ns/op
BenchmarkFingerprintLockerSerial 10000000 283 ns/op
BenchmarkFingerprintLockerSerial-2 10000000 284 ns/op
BenchmarkFingerprintLockerSerial-4 10000000 288 ns/op
NEW
BenchmarkFingerprintLockerParallel 1000000 1018 ns/op
BenchmarkFingerprintLockerParallel-2 1000000 1164 ns/op
BenchmarkFingerprintLockerParallel-4 2000000 910 ns/op
BenchmarkFingerprintLockerSerial 50000000 56.0 ns/op
BenchmarkFingerprintLockerSerial-2 50000000 47.9 ns/op
BenchmarkFingerprintLockerSerial-4 50000000 54.5 ns/op
Change-Id: I3c65a43822840e7e64c3c3cfe759e1de51272581
2014-11-25 17:07:45 +01:00
Bjoern Rabenstein
7ad55ef83c
Actually close the iterator channels.
...
Change-Id: I6f6a2aef5ff55c6b2d21ad91d02ae6b0ecba4ae8
2014-11-25 17:07:45 +01:00
Bjoern Rabenstein
8fba3302bc
Bold changes to concurrency.
...
(WIP. Probably doesn't work yet.)
Change-Id: Id1537dfcca53831a1d428078a5863ece7bdf4875
2014-11-25 17:07:45 +01:00
Bjoern Rabenstein
fcdf5a8ee7
Fix bugs in chunk evict code.
...
Also, simplify code by re-looking up metric in metric map.
Change-Id: Ib2092f9184374e5a543e87d3a9f4a74fda64b193
2014-11-25 17:07:45 +01:00
Bjoern Rabenstein
7e6a03fbf9
Fix a few concurrency issues before starting to use the new fp locker.
...
Change-Id: I8615e8816e79ef0882e123163ee590c739b79d12
2014-11-25 17:07:45 +01:00
Julius Volz
db92620163
Instrument eviction and purge durations.
...
Change-Id: Ia5b2319363ad2644674c9b7a94162a89bcc296fb
2014-11-25 17:07:45 +01:00
Julius Volz
e0ee7ec7ab
Add fingerprintLocker for locking individual fingerprints.
...
Change-Id: Id41ba555715229edf7d6543f56736b82f6eff1ef
2014-11-25 17:07:45 +01:00
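Illustration for e0ee7ec7ab (locking individual fingerprints): a minimal sketch of one common way to build such a locker, namely lock striping over a fixed set of mutexes. This is an assumption chosen for illustration and not necessarily how the commit implements fingerprintLocker. The Fingerprint type, stripedLocker, and incrementAll are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// Fingerprint stands in for the storage's fingerprint type
// (assumed here to be a 64-bit hash).
type Fingerprint uint64

// stripedLocker maps each fingerprint to one of a fixed number of mutexes.
// Two different fingerprints may share a mutex, but the same fingerprint
// always maps to the same one.
type stripedLocker struct {
	mtxs []sync.Mutex
}

func newStripedLocker(n int) *stripedLocker {
	return &stripedLocker{mtxs: make([]sync.Mutex, n)}
}

func (l *stripedLocker) Lock(fp Fingerprint) {
	l.mtxs[uint64(fp)%uint64(len(l.mtxs))].Lock()
}

func (l *stripedLocker) Unlock(fp Fingerprint) {
	l.mtxs[uint64(fp)%uint64(len(l.mtxs))].Unlock()
}

// incrementAll starts the given number of workers. Each worker increments
// the counter of every fingerprint once while holding that fingerprint's lock.
func incrementAll(l *stripedLocker, numSeries, workers int) []int {
	counts := make([]int, numSeries)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for fp := 0; fp < numSeries; fp++ {
				l.Lock(Fingerprint(fp))
				counts[fp]++
				l.Unlock(Fingerprint(fp))
			}
		}()
	}
	wg.Wait()
	return counts
}

func main() {
	fmt.Println(incrementAll(newStripedLocker(4), 16, 8))
}
```

Running this under the race detector (go run -race) is a quick way to check that the per-fingerprint locking actually protects the counters.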
Julius Volz
df1b2a2422
Fix indexing latency instrumentation.
...
Change-Id: I532c170121cd2996d1a378adbb1fd551cd5a4e38
2014-11-25 17:07:44 +01:00
Bjoern Rabenstein
01dd618a20
Fix a locking bug.
...
Change-Id: I183780785991d0b4165ce9186f53eb8201fb3ed5
2014-11-25 17:07:44 +01:00
Julius Volz
a746fbb8bc
Instrument indexing: queue length, batch sizes and latencies.
...
Change-Id: I60bcbd24b160e47d418a485d8cffa39344a257c6
2014-11-25 17:07:44 +01:00
Bjoern Rabenstein
aea32b0b4b
Avoid redundant fingerprint calculation.
...
Change-Id: Ief8a165dcfa5030226953346ec9dfe4a7787df1f
2014-11-25 17:07:44 +01:00
Bjoern Rabenstein
e9ff29c547
Comment/code cleanup.
...
Change-Id: I38736e3d0fec79759a2bafa35aecf914480ff810
2014-11-25 17:07:44 +01:00
Bjoern Rabenstein
0031a448e2
Add WaitForIndexing.
...
Change-Id: I5a5c975c4246632f937413322c855bbe63d00802
2014-11-25 17:07:44 +01:00
Bjoern Rabenstein
c7aad110fb
Add an indexing queue and batch the ops.
...
Some other improvements on the way, in particular codec -> codable
renaming and addition of LookupSet methods.
Change-Id: I978f8f3f84ca8e4d39a9d9f152ae0ad274bbf4e2
2014-11-25 17:07:44 +01:00
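Illustration for c7aad110fb (indexing queue with batched ops): a minimal sketch of batching items taken from a queue, not the indexer code from the commit. indexOp, processQueue, and batchSizes are hypothetical names. This simplified version flushes only when a batch is full or the queue is closed; a real indexer would presumably also flush after some timeout.

```go
package main

import "fmt"

// indexOp is a hypothetical stand-in for a single queued index operation.
type indexOp string

// processQueue collects ops from the queue and hands them to commit in
// batches of at most maxBatch ops. A final partial batch is committed
// once the queue has been closed.
func processQueue(queue <-chan indexOp, maxBatch int, commit func([]indexOp)) {
	batch := make([]indexOp, 0, maxBatch)
	for op := range queue {
		batch = append(batch, op)
		if len(batch) == maxBatch {
			commit(batch)
			batch = make([]indexOp, 0, maxBatch)
		}
	}
	if len(batch) > 0 {
		commit(batch)
	}
}

// batchSizes feeds n ops through processQueue and reports the size of
// each committed batch.
func batchSizes(n, maxBatch int) []int {
	queue := make(chan indexOp, n)
	for i := 0; i < n; i++ {
		queue <- indexOp(fmt.Sprintf("op-%d", i))
	}
	close(queue)
	var sizes []int
	processQueue(queue, maxBatch, func(b []indexOp) {
		sizes = append(sizes, len(b))
	})
	return sizes
}

func main() {
	fmt.Println(batchSizes(5, 2))
}
```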
Bjoern Rabenstein
71206dbc06
More code cleanups.
...
Add license text everywhere.
And others....
Change-Id: I11ccde267a2ef7eb366c4788ba7aeae14ba7545c
2014-11-25 17:07:44 +01:00
Julius Volz
f0d5d4bda3
Fix bug around index purging.
...
Change-Id: I8cea00e03f72bbeead2cbd2d26b34d986059ced0
2014-11-25 17:07:44 +01:00
Julius Volz
630b5a087a
Also consider on-disk fingerprints during purge.
...
This reintroduces LevelDB iterators so that we can iterate through all
the on-disk fingerprints.
Change-Id: I007ee4638d038d2a4461bbda27f30fcaad411474
2014-11-25 17:07:35 +01:00
Bjoern Rabenstein
f5f9f3514a
Major code cleanup.
...
- Make it go-vet and golint clean.
- Add comments, TODOs, etc.
Change-Id: If1392d96f3d5b4cdde597b10c8dff1769fcfabe2
2014-11-25 17:02:53 +01:00
Bjoern Rabenstein
3592dc2359
Implement series eviction.
...
Change-Id: I7a503e0ba78aae3761d032851b06f2807122b085
2014-11-25 17:02:52 +01:00
Bjoern Rabenstein
bbf49200ab
Implement methods in persistence.go.
...
Change-Id: I804cdd0b30420e171825fd86fe1281eca0d5e638
2014-11-25 17:02:23 +01:00
Bjoern Rabenstein
5a128a04a9
Major reorganization of the storage.
...
Most important, the heads file will now persist all the chunk descs,
too. Implicitly, it will serve as the persisted form of the
fp-to-series map.
Change-Id: Ic867e78f2714d54c3b5733939cc5aef43f7bd08d
2014-11-25 17:02:01 +01:00
Bjoern Rabenstein
e7cb9ddb9f
Use a sync.Pool for the staging buffer in codec.go.
...
Change-Id: I1aae6847f77b5a7c75582b07c199b1943cf90552
2014-11-25 17:02:01 +01:00
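Illustration for e7cb9ddb9f (sync.Pool for the staging buffer): a minimal sketch of reusing a small scratch buffer through sync.Pool, not the contents of codec.go. stagingPool and appendUvarint are hypothetical names, and encoding a uvarint is only an assumed example of what a staging buffer might be used for.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"sync"
)

// stagingPool hands out reusable scratch buffers. Pointers to slices are
// pooled to avoid an extra allocation on every Put.
var stagingPool = sync.Pool{
	New: func() interface{} {
		b := make([]byte, binary.MaxVarintLen64)
		return &b
	},
}

// appendUvarint encodes x into a pooled staging buffer and appends the
// encoded bytes to dst.
func appendUvarint(dst []byte, x uint64) []byte {
	bp := stagingPool.Get().(*[]byte)
	defer stagingPool.Put(bp)
	n := binary.PutUvarint(*bp, x)
	return append(dst, (*bp)[:n]...)
}

func main() {
	fmt.Printf("% x\n", appendUvarint(nil, 300))
}
```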
Bjoern Rabenstein
4770cf76a4
Make index package more self-contained.
...
Moved internals from diskPersistence into the indexer.
TotalIndexer is now called diskIndexer.
Change-Id: I6c8c62cb171f12bbd8a5474773af7786d71ba388
2014-11-25 17:02:01 +01:00
Bjoern Rabenstein
89f10e8eb2
Move to using the standard library interfaces for encoding/decoding.
...
BinaryMarshaler instead of encodable.
BinaryUnmarshaler instead of decodable.
Left 'codable' in place for lack of a better word.
Change-Id: I8a104be7d6db916e8dbc47ff95e6ff73b845ac22
2014-11-25 17:02:01 +01:00
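Illustration for 89f10e8eb2 (standard library encoding interfaces): a minimal sketch of a type that implements encoding.BinaryMarshaler and encoding.BinaryUnmarshaler. The type name codableFingerprint, the 8-byte big-endian layout, and roundTrip are assumptions for this example and are not taken from the commit.

```go
package main

import (
	"encoding"
	"encoding/binary"
	"errors"
	"fmt"
)

// codableFingerprint is a hypothetical type that can encode and decode
// itself via the standard library interfaces.
type codableFingerprint uint64

// MarshalBinary implements encoding.BinaryMarshaler.
func (fp codableFingerprint) MarshalBinary() ([]byte, error) {
	b := make([]byte, 8)
	binary.BigEndian.PutUint64(b, uint64(fp))
	return b, nil
}

// UnmarshalBinary implements encoding.BinaryUnmarshaler.
func (fp *codableFingerprint) UnmarshalBinary(b []byte) error {
	if len(b) != 8 {
		return errors.New("codableFingerprint: need exactly 8 bytes")
	}
	*fp = codableFingerprint(binary.BigEndian.Uint64(b))
	return nil
}

// Compile-time checks that both interfaces are satisfied.
var (
	_ encoding.BinaryMarshaler   = codableFingerprint(0)
	_ encoding.BinaryUnmarshaler = (*codableFingerprint)(nil)
)

// roundTrip marshals fp and unmarshals the result into a new value.
func roundTrip(fp codableFingerprint) (codableFingerprint, error) {
	b, err := fp.MarshalBinary()
	if err != nil {
		return 0, err
	}
	var out codableFingerprint
	err = out.UnmarshalBinary(b)
	return out, err
}

func main() {
	fmt.Println(roundTrip(12345))
}
```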
Bjoern Rabenstein
af77d5ef0b
Added a few missing implementations in index.go.
...
Also, added closing of persistence and mem storage.
Change-Id: Iacf0d22c3520dd2584d9546984c1f8a5ed6cd54e
2014-11-25 17:02:01 +01:00
Julius Volz
cca7ebe906
Some more cleanups / obsolete code removals.
...
Change-Id: I584144ceeeedafdb114266d8a6d2513e67b1d010
2014-11-25 17:02:00 +01:00
Julius Volz
7e85711df0
Beginnings of a tiered index implementation.
...
This reintroduces a LevelDB-based metrics index.
Change-Id: I4111540301c52255a07b2f570761707a32f72c05
2014-11-25 17:02:00 +01:00
Julius Volz
8dfaa5ecd2
Remove use of freelists for chunk bufs.
...
Change-Id: Ib887fdb61e1d96da0cd32545817b925ba88831c1
2014-11-25 17:02:00 +01:00
Julius Volz
7b35e0f0b8
Use constants from math package instead of literals.
...
Change-Id: I55427ba32c2cbb32ee42ec1e3153160965ab8b3c
2014-11-25 17:02:00 +01:00
Julius Volz
15929eece2
Unpin any already loaded chunks upon preloading error.
...
Change-Id: Ib451136e3ef21bce8b814c21b66eaab727ab341b
2014-11-25 17:02:00 +01:00
Julius Volz
fd01d07589
Check that chunk buffer length fits in 16 bits.
...
Change-Id: Id086a54aa8a1990c1979e747c1c02e53bed6d447
2014-11-25 17:02:00 +01:00
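Illustration for fd01d07589 (16-bit length check): a minimal sketch of guarding a length before storing it in a 16-bit field, using a constant from the math package rather than a literal. encodeLen is a hypothetical helper, not a function from the commit.

```go
package main

import (
	"fmt"
	"math"
)

// encodeLen converts a buffer length to uint16 and returns an error
// instead of silently truncating values that do not fit.
func encodeLen(n int) (uint16, error) {
	if n < 0 || n > math.MaxUint16 {
		return 0, fmt.Errorf("length %d does not fit in 16 bits", n)
	}
	return uint16(n), nil
}

func main() {
	fmt.Println(encodeLen(1024))
	fmt.Println(encodeLen(70000))
}
```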
Bjoern Rabenstein
1ca7f24137
Remove float diff tolerance altogether.
...
Change-Id: I9ea9683a4665d5800fca75560bb4b8a8b4406d55
2014-11-25 17:02:00 +01:00
Bjoern Rabenstein
d742edfe0d
Fix precision loss.
...
Large delta values often imply a difference between a large base value
and the large delta value, potentially resulting in small numbers with
a huge precision error. Since large delta values need 8 bytes anyway,
we are not even saving memory.
As a solution, always save the absolute value rather than a delta once
8 bytes would be needed for the delta. Timestamps are then saved as
8-byte integers, while values are always saved as float64 in that case.
Change-Id: I01100d600515e16df58ce508b50982ffd762cc49
2014-11-25 17:02:00 +01:00
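Illustration for d742edfe0d (precision loss with large deltas): a minimal sketch of the effect described in the commit message, not the chunk encoding itself. roundTrip is a hypothetical helper and the numbers are made up. With a large base value, the delta to a small value is rounded to float64 precision, and adding it back to the base does not recover the original value. Storing the absolute value avoids the problem.

```go
package main

import "fmt"

// roundTrip stores value as a float64 delta against base and then
// reconstructs it, as a delta encoding would.
func roundTrip(base, value float64) float64 {
	delta := value - base // Rounded to float64 precision.
	return base + delta
}

func main() {
	// Small base: the delta is exact and the value survives.
	fmt.Println(roundTrip(100, 3))

	// Large base: the delta is dominated by the base, so the small value
	// is lost to rounding when it is reconstructed.
	fmt.Println(roundTrip(1e17, 3))
}
```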
Bjoern Rabenstein
dc2e463a97
Improvements after review.
...
Change-Id: I484359282d4c7113518bbbb131f4f18383c08fdb
2014-11-25 17:02:00 +01:00
Bjoern Rabenstein
52c9dc43a3
Improve testing.
...
In particular, create a fuzz test for time series.
Change-Id: I523a17912405a0b6b46bd395c781d201dfe55036
2014-11-25 17:02:00 +01:00
Julius Volz
3b25867d61
Add chunk persistence tests, fix storage tests.
...
Change-Id: Id0b8f5382e99efa839cc0f826e92bbda985fe9a9
2014-11-25 17:02:00 +01:00
Bjoern Rabenstein
ecdf5ab14f
Index-persistence switched from gob to a hand-coded solution.
...
Change-Id: Ib4ec42535bd08df16d34d4774bb638e35c5a1841
2014-11-25 17:02:00 +01:00
Julius Volz
e7ed39c9a6
Initial experimental snapshot of next-gen storage.
...
Change-Id: Ifb8709960dbedd1d9f5efd88cdd359ee9fa9d26d
2014-11-25 17:02:00 +01:00