A lot of this code was hacked together, literally during a
hackathon. This commit intends not to change the code substantially,
but just make the code obey the usual style practices.
A (possibly incomplete) list of areas:
* Generally address linter warnings.
* The `pgk` directory is deprecated as per dev-summit. No new packages should
be added to it. I moved the new `pkg/histogram` package to `model`
anticipating what's proposed in #9478.
* Make the naming of the Sparse Histogram more consistent. Including
abbreviations, there were just too many names for it: SparseHistogram,
Histogram, Histo, hist, his, shs, h. The idea is to call it "Histogram" in
general. Only add "Sparse" if it is needed to avoid confusion with
conventional Histograms (which is rare because the TSDB really has no notion
of conventional Histograms). Use abbreviations only in local scope, and then
really abbreviate (not just removing three out of seven letters like in
"Histo"). This is in the spirit of
https://github.com/golang/go/wiki/CodeReviewComments#variable-names
* Several other minor name changes.
* A lot of formatting of doc comments. For one, following
https://github.com/golang/go/wiki/CodeReviewComments#comment-sentences
, but also layout question, anticipating how things will look like
when rendered by `godoc` (even where `godoc` doesn't render them
right now because they are for unexported types or not a doc comment
at all but just a normal code comment - consistency is queen!).
* Re-enabled `TestQueryLog` and `TestEndopints` (they pass now,
leaving them disabled was presumably an oversight).
* Bucket iterator for histogram.Histogram is now created with a
method.
* HistogramChunk.iterator now allows iterator recycling. (I think
@dieterbe only commented it out because he was confused by the
question in the comment.)
* HistogramAppender.Append panics now because we decided to treat
staleness marker differently.
Signed-off-by: beorn7 <beorn@grafana.com>
It's a prefectly valid use case to have a sparse histogram with a zero
threshold of zero (i.e. only observations of exactly zero go into the
zero bucket). Even if the current PoC implementation of client_golang
doesn't allow that, such a case should be ingested properly.
However, there is now the edge case af a sparse histogram with a zero
threshold of zero and no observations yet. Such a histogram would look
the same if it was meant to be a conventional histogram. For now, we
ingest this case as a conventional histogram, but the final format
should have means to unambiguously express if a histogram is meant to
be ingested as a sparse histogram or as a conventional histogram.
Signed-off-by: beorn7 <beorn@grafana.com>
Parser now supports summaries and legacy histograms including
exemplars.
It also adds the option of specifying exemplars together with a sparse
histogram by simply using the legacy bucket section, too. The buckets
will be ignored, but the exemplars will be ingested.
Signed-off-by: beorn7 <beorn@grafana.com>
This "brings back" protobuf parsing, with the only goal to play with
the new sparse histograms.
The Prom-2.x style parser is highly adapted to the structure of the
Prometheus text format (and later OpenMetrics). Some jumping through
hoops is required to feed protobuf into it.
This is not meant to be a model for the final implementation. It
should just enable sparse histogram ingestion at a reasonable
efficiency.
Following known shortcomings and flaws:
- No tests yet.
- Summaries and legacy histograms, i.e. without sparse buckets, are
ignored.
- Staleness doesn't work (but this could be fixed in the appender, to
be discussed).
- No tricks have been tried that would be similar to the tricks the
text parsers do (like direct pointers into the HTTP response
body). That makes things weird here. Tricky optimizations only make
sense once the final format is specified, which will almost
certainly not be the old protobuf format. (Interestingly, I expect
this implementation to be in fact much more efficient than the
original protobuf ingestion in Prom-1.x.)
- This is using a proto3 version of metrics.proto (mostly to be
consistent with the other protobuf uses). However, proto3 sees no
difference between an unset field. We depend on that to distinguish
between an unset timestamp and the timestamp 0 (1970-01-01, 00:00:00
UTC). In this experimental code, we just assume that timestamp is
never specified and therefore a timestamp of 0 always is interpreted
as "not set".
Signed-off-by: beorn7 <beorn@grafana.com>