Commit graph

8 commits

Author SHA1 Message Date
Julius Volz 1eee448bc1 Store samples in custom binary encoding.
This has been shown to provide immense decoding speed benefits.

See also:

https://groups.google.com/forum/#!topic/prometheus-developers/FeGl_qzGrYs

Change-Id: I7d45b4650e44ddecaa91dad9d7fdb3cd0b9f15fe
2014-03-09 22:31:38 +01:00
Bjoern Rabenstein 6bc083f38b Major code cleanup in storage.
- Mostly docstring fixed/additions.
  (Please review these carefully, since most of them were missing, I
  had to guess them from an outsider's perspective. (Which on the
  other hand proves how desperately required many of these docstrings
  are.))

- Removed all uses of new(...) to meet our own style guide (draft).

- Fixed all other 'go vet' and 'golint' issues (except those that are
  not fixable (i.e. caused by bugs in or by design of 'go vet' and
  'golint')).

- Some trivial refactorings, like reorder functions, minor renames, ...

- Some slightly less trivial refactoring, mostly to reduce code
  duplication by embedding types instead of writing many explicit
  forwarders.

- Cleaned up the interface structure a bit. (Most significant probably
  the removal of the View-like methods from MetricPersistenc. Now they
  are only in View and not duplicated anymore.)

- Removed dead code. (Probably not all of it, but it's a first
  step...)

- Fixed a leftover in storage/metric/end_to_end_test.go (that made
  some parts of the code never execute (incidentally, those parts
  were broken (and I fixed them, too))).

Change-Id: Ibcac069940d118a88f783314f5b4595dce6641d5
2014-02-27 15:22:37 +01:00
Matt T. Proud 4a87c002e8 Update low-level i'faces to reflect wireformats.
This commit fixes a critique of the old storage API design, whereby
the input parameters were always as raw bytes and never Protocol
Buffer messages that encapsulated the data, meaning every place a
read or mutation was conducted needed to manually perform said
translations on its own.  This is taxing.

Change-Id: I4786938d0d207cefb7782bd2bd96a517eead186f
2013-09-04 17:13:58 +02:00
Matt T. Proud a73f061d3c Persist solely Protocol Buffers.
An design question was open for me in the beginning was whether to
serialize other types to disk, but Protocol Buffers quickly won out,
which allows us to drop support for other types.  This is a good
start to cleaning up a lot of cruft in the storage stack and
can let us eventually decouple the various moving parts into
separate subsystems for easier reasoning.

This commit is not strictly required, but it is a start to making
the rest a lot more enjoyable to interact with.
2013-06-08 11:02:35 +02:00
Matt T. Proud 4e0c932a4f Simplify Encoder's encoding signature.
The reality is that if we ever try to encode a Protocol Buffer and it
fails, it's likely that such an error is ultimately not a runtime error
and should be fixed forthwith.  Thusly, we should rename
``Encoder.Encode`` to ``Encoder.MustEncode`` and drop the error return
value.
2013-05-16 00:54:18 +03:00
Matt T. Proud b3e34c6658 Implement batch database sample curator.
This commit introduces to Prometheus a batch database sample curator,
which corroborates the high watermarks for sample series against the
curation watermark table to see whether a curator of a given type
needs to be run.

The curator is an abstract executor, which runs various curation
strategies across the database.  It remarks the progress for each
type of curation processor that runs for a given sample series.

A curation procesor is responsible for effectuating the underlying
batch changes that are request.  In this commit, we introduce the
CompactionProcessor, which takes several bits of runtime metadata and
combine sparse sample entries in the database together to form larger
groups.  For instance, for a given series it would be possible to
have the curator effectuate the following grouping:

- Samples Older than Two Weeks: Grouped into Bunches of 10000
- Samples Older than One Week: Grouped into Bunches of 1000
- Samples Older than One Day: Grouped into Bunches of 100
- Samples Older than One Hour: Grouped into Bunches of 10

The benefits hereof of such a compaction are 1. a smaller search
space in the database keyspace, 2. better employment of compression
for repetious values, and 3. reduced seek times.
2013-04-27 17:38:18 +02:00
Matt T. Proud a55602df4a Validate diskFrontier domain for series candidate.
It is the case with the benchmark tool that we thought that we
generated multiple series and saved them to the disk as such, when
in reality, we overwrote the fields of the outgoing metrics via
Go map reference behavior.  This was accidental.  In the course of
diagnosing this, a few errors were found:

1. ``newSeriesFrontier`` should check to see if the candidate fingerprint is within the given domain of the ``diskFrontier``.  If not, as the contract in the docstring stipulates, a ``nil`` ``seriesFrontier`` should be emitted.

2. In the interests of aiding debugging, the raw LevelDB ``levigoIterator`` type now includes a helpful forensics ``String()`` method.

This work produced additional cleanups:

1. ``Close() error`` with the storage stack is technically incorrect, since nowhere in the bowels of it does an error actually occur.  The interface has been simplified to remove this for now.
2013-04-09 11:47:16 +02:00
Matt T. Proud 8f6b55be71 Several interface cleanups.
- Kill Close in Persistent and document interface.
 - Extract batching behavior into interface.
 - Kill IteratorManager, which was used for unknown reasons.
2013-03-24 07:35:43 +01:00