Commit graph

35 commits

Author SHA1 Message Date
Julius Volz 7a577b86b7 Fix interval op special case.
In the case that a getValuesAtIntervalOp's ExtractSamples() is called
with a current time after the last chunk time, we return without
extracting any further values beyond the last one in the chunk
(correct), but also without advancing the op's time (incorrect). This
leads to an infinite loop in renderView(), since the op is called
repeatedly without ever being advanced and consumed.

This adds handling for this special case. When detecting this case, we
immediately set the op to be consumed, since we would always get a value
after the current time passed in if there was one.

Change-Id: Id99149e07b5188d655331382b8b6a461b677005c
2014-03-26 13:29:03 +01:00
Julius Volz 9d5c367745 Fix incorrect interval op advancement.
This fixes a bug where an interval op might advance too far past the end
of the currently extracted chunk, effectively skipping over relevant
(to-be-extracted) values in the subsequent chunk. The result: missing
samples at chunk boundaries in the resulting view.

Change-Id: Iebf5d086293a277d330039c69f78e1eaf084b3c8
2014-03-18 16:22:50 +01:00
Bjoern Rabenstein c3b282bd14 Add regression tests for 'loop until op is consumed' bug.
- Most of this is the actual regression test in tiered_test.go.

- Working on that regression tests uncovered problems in
  tiered_test.go that are fixed in this commit.

- The 'op.consumed = false' line added to freelist.go was actually not
  fixing a bug. Instead, there was no bug at all. So this commit
  removes that line again, but adds a regression test to make sure
  that the assumed bug is indeed not there (cf. freelist_test.go).

- Removed more code duplication in operation.go (following the same
  approach as before, i.e. embedding op type A into op type B if
  everything in A is the same as in B with the exception of String()
  and ExtractSample()). (This change make struct literals for ops more
  clunky, but that only affects tests. No code change whatsoever was
  necessary in the actual code after this refactoring.)

- Fix another op leak in tiered.go.

Change-Id: Ia165c52e33290ad4f6aba9c83d92318d4f583517
2014-03-12 18:40:24 +01:00
Bjoern Rabenstein 9ea9189dd1 Remove the multi-op-per-fingerprint capability.
Currently, rendering a view is capable of handling multiple ops for
the same fingerprint efficiently. However, this capability requires a
lot of complexity in the code, which we are not using at all because
the way we assemble a viewRequest will never have more than one
operation per fingerprint.

This commit weeds out the said capability, along with all the code
needed for it. It is still possible to have more than one operation
for the same fingerprint, it will just be handled in a less efficient
way (as proven by the unit tests).

As a result, scanjob.go could be removed entirely.

This commit also contains a few related refactorings and removals of
dead code in operation.go, view,go, and freelist.go. Also, the
docstrings received some love.

Change-Id: I032b976e0880151c3f3fdb3234fb65e484f0e2e5
2014-03-04 16:29:56 +01:00
Bjoern Rabenstein 6bc083f38b Major code cleanup in storage.
- Mostly docstring fixed/additions.
  (Please review these carefully, since most of them were missing, I
  had to guess them from an outsider's perspective. (Which on the
  other hand proves how desperately required many of these docstrings
  are.))

- Removed all uses of new(...) to meet our own style guide (draft).

- Fixed all other 'go vet' and 'golint' issues (except those that are
  not fixable (i.e. caused by bugs in or by design of 'go vet' and
  'golint')).

- Some trivial refactorings, like reorder functions, minor renames, ...

- Some slightly less trivial refactoring, mostly to reduce code
  duplication by embedding types instead of writing many explicit
  forwarders.

- Cleaned up the interface structure a bit. (Most significant probably
  the removal of the View-like methods from MetricPersistenc. Now they
  are only in View and not duplicated anymore.)

- Removed dead code. (Probably not all of it, but it's a first
  step...)

- Fixed a leftover in storage/metric/end_to_end_test.go (that made
  some parts of the code never execute (incidentally, those parts
  were broken (and I fixed them, too))).

Change-Id: Ibcac069940d118a88f783314f5b4595dce6641d5
2014-02-27 15:22:37 +01:00
Julius Volz 740d448983 Use custom timestamp type for sample timestamps and related code.
So far we've been using Go's native time.Time for anything related to sample
timestamps. Since the range of time.Time is much bigger than what we need, this
has created two problems:

- there could be time.Time values which were out of the range/precision of the
  time type that we persist to disk, therefore causing incorrectly ordered keys.
  One bug caused by this was:

  https://github.com/prometheus/prometheus/issues/367

  It would be good to use a timestamp type that's more closely aligned with
  what the underlying storage supports.

- sizeof(time.Time) is 192, while Prometheus should be ok with a single 64-bit
  Unix timestamp (possibly even a 32-bit one). Since we store samples in large
  numbers, this seriously affects memory usage. Furthermore, copying/working
  with the data will be faster if it's smaller.

*MEMORY USAGE RESULTS*
Initial memory usage comparisons for a running Prometheus with 1 timeseries and
100,000 samples show roughly a 13% decrease in total (VIRT) memory usage. In my
tests, this advantage for some reason decreased a bit the more samples the
timeseries had (to 5-7% for millions of samples). This I can't fully explain,
but perhaps garbage collection issues were involved.

*WHEN TO USE THE NEW TIMESTAMP TYPE*
The new clientmodel.Timestamp type should be used whenever time
calculations are either directly or indirectly related to sample
timestamps.

For example:
- the timestamp of a sample itself
- all kinds of watermarks
- anything that may become or is compared to a sample timestamp (like the timestamp
  passed into Target.Scrape()).

When to still use time.Time:
- for measuring durations/times not related to sample timestamps, like duration
  telemetry exporting, timers that indicate how frequently to execute some
  action, etc.

*NOTE ON OPERATOR OPTIMIZATION TESTS*
We don't use operator optimization code anymore, but it still lives in
the code as dead code. It still has tests, but I couldn't get all of them to
pass with the new timestamp format. I commented out the failing cases for now,
but we should probably remove the dead code soon. I just didn't want to do that
in the same change as this.

Change-Id: I821787414b0debe85c9fffaeb57abd453727af0f
2013-12-03 09:11:28 +01:00
Julius Volz d2da21121c Implement getValueRangeAtIntervalOp for faster range queries.
This also short-circuits optimize() for now, since it is complex to implement
for the new operator, and ops generated by the query layer already fulfill the
needed invariants. We should still investigate later whether to completely
delete operator optimization code or extend it to support
getValueRangeAtIntervalOp operators.
2013-06-26 18:10:36 +02:00
Matt T. Proud 30b1cf80b5 WIP - Snapshot of Moving to Client Model. 2013-06-25 15:52:42 +02:00
Julius Volz f2b4067b7b Speedup and clean up operation optimization. 2013-06-20 03:01:13 +02:00
Julius Volz f2b48b8c4a Make getValuesAtIntervalOp consume all chunk data in one pass.
This is mainly a small performance improvement, since we skip past the last
extracted time immediately if it was also the last sample in the chunk, instead
of trying to extract non-existent values before the chunk end again and again
and only gradually approaching the end of the chunk.
2013-05-22 18:14:45 +02:00
Julius Volz 83d60bed89 extractValuesAroundTime() code simplification. 2013-05-22 18:14:45 +02:00
Julius Volz 71a3172abb Fix and optimize getValuesAtIntervalOp data extraction.
- only the data extracted in the last loop iteration of ExtractSamples() was
  emitted as output
- if e.g. op interval < sample interval, there were situations where the same
  sample was added multiple times to the output
2013-05-14 13:55:17 +02:00
Julius Volz 05afa970d2 Slice expression simplifications. 2013-05-07 13:22:29 +02:00
Julius Volz 99dcbe0f94 Integrate memory and disk layers in view rendering. 2013-04-19 16:01:27 +02:00
Julius Volz a33d2726bc Mark range op as consumed if it receives no data points in range. 2013-03-22 11:50:02 +01:00
Julius Volz becc278eb6 Fix two bugs in range op time advancement. 2013-03-21 18:15:52 +01:00
Matt T. Proud ceb6611957 Fix regression in subsequent range op. compactions.
We have an anomaly whereby subsequent range operations fail to be
compacted into one single range operation.  This fixes such
behavior.
2013-03-21 18:11:04 +01:00
Matt T. Proud bd8bb0edfd One additional reduction. 2013-03-21 18:11:03 +01:00
Matt T. Proud 73b463e814 Additional simplifications. 2013-03-21 18:11:03 +01:00
Matt T. Proud fd47ac570f Implied simplifications. 2013-03-21 18:11:03 +01:00
Matt T. Proud 51a0f21cf8 Interim documentation 2013-03-21 18:11:03 +01:00
Matt T. Proud b470f925b7 Extract rewriting of interval queries. 2013-03-21 18:11:03 +01:00
Matt T. Proud eb721fd220 Include note about greediest range. 2013-03-21 18:11:03 +01:00
Julius Volz e0dbc8c561 Fix edge cases in data extraction for point and interval ops. 2013-03-21 18:11:02 +01:00
Matt T. Proud 896e172463 Extract time group optimizations. 2013-03-21 18:08:48 +01:00
Matt T. Proud 5a71814778 Additional greediness. 2013-03-21 18:08:48 +01:00
Matt T. Proud b00ca7e422 Refactor some greediness computations. 2013-03-21 18:08:48 +01:00
Matt T. Proud 978acd4e96 Simplify time group optimizations.
The old code performed well according to the benchmarks, but the
new code shaves 1/6th of the time off the original and with less
code.
2013-03-21 18:08:48 +01:00
Matt T. Proud d7b534e624 Update documentation. 2013-03-21 18:08:48 +01:00
Matt T. Proud 615e6d13d7 Run `make format`. 2013-03-21 18:08:47 +01:00
Julius Volz caeb759ed7 Add tests for and fix getValuesAlongRangeOp value extraction. 2013-03-21 18:08:47 +01:00
Julius Volz e2fb497eba Add operator value extraction tests. 2013-03-21 18:08:47 +01:00
Julius Volz 12a8863582 Add data extraction methods to operator types. 2013-03-21 18:08:47 +01:00
Matt T. Proud d5380897c3 Cleanups and adds performance regression. 2013-03-21 18:06:51 +01:00
Matt T. Proud 41068c2e84 Checkpoint. 2013-03-21 18:06:51 +01:00