This adds a flag -storage.local.engine which allows turning off local
storage in Prometheus. Instead of adding if-conditions and nil checks to
all parts of Prometheus that deal with Prometheus's local storage
(including the web interface), disabling local storage simply means
replacing the normal local storage with a noop version that throws
samples away and returns empty query results. We also don't add the noop
storage to the fanout appender to decrease internal overhead.
Instead of returning empty results, an alternate behavior could be to
return errors on any query that point out that the local storage is
disabled. Not sure which one is more preferable, so I went with the
empty result option for now.
- fold metric name into labels
- return initialization errors back to main
- add snappy compression
- better context handling
- pre-allocation of labels
- remove generic naming
- other cleanups
This uses a new proto format, with scope for multiple samples per
timeseries in future. This will allow users to pump samples out to
whatever they like without having to change the core Prometheus code.
There's also an example receiver to save users figuring out the
boilerplate themselves.
In https://github.com/prometheus/prometheus/pull/1782 , we moved to a
custom flag set to avoid getting test flags into the main prometheus
binary. However, that removed the logging flags, too. This commit
updates the vendoring to a version of the log package that allows
adding the log flags to our flag set explicitly.
This commit extends the notifier to dispatch alert batches
to multiple Alertmanagers concurrently.
It changes the `-alertmanager.url` flag to accept a comma
separated list of URLs and/or to be set multiple times.
With a lot of series accessed in a short timeframe (by a query, a
large scrape, checkpointing, ...), there is actually quite a
significant amount of lock contention if something similar is running
at the same time.
In those cases, the number of locks needs to be increased.
On the same front, as our fingerprints don't have a lot of entropy, I
introduced some additional shuffling. With the current state, anly
changes in the least singificant bits of a FP would matter.
I got feedback from different sources about rules and targets being
too heavy in the status tab if their are lots of them.
This change also allows for more fine-granular locking.
This is not a verbatim implementation of the Gorilla encoding. First
of all, it could not, even if we wanted, because Prometheus has a
different chunking model (constant size, not constant time). Second,
this adds a number of changes that improve the encoding in general or
at least for the specific use case of Prometheus (and are partially
only possible in the context of Prometheus). See comments in the code
for details.
I needed this today for debugging. It can certainly be improved, but
it's already quite helpful.
I refactored the reading of heads.db files out of persistence, which
is an improvement, too.
I made minor changes to the cli package to allow outputting via the
io.Writer interface.
This gives up on the idea to communicate throuh the Append() call (by
either not returning as it is now or returning an error as
suggested/explored elsewhere). Here I have added a Throttled() call,
which has the advantage that it can be called before a whole _batch_
of Append()'s. Scrapes will happen completely or not at all. Same for
rule group evaluations. That's a highly desired behavior (as discussed
elsewhere). The code is even simpler now as the whole ingestion buffer
could be removed.
Logging of throttled mode has been streamlined and will create at most
one message per minute.
Since we are not overestimating the number of chunks to persist
anymore, this commit also adjusts the default value for
-storage.local.memory-chunks. Update of documentation will follow.
If only very few chunks are to be truncated from a very large series
file, the rewrite of the file is a lorge overhead. With this change, a
certain ratio of the file has to be dropped to make it happen. While
only causing disk overhead at about the same ratio (by default 10%),
it will cut down I/O by a lot in above scenario.
Allows to use graphite over tcp or udp. Metrics labels
and values are used to construct a valid Graphite path
in a way that will allow us to eventually read them back
and reconstruct the metrics.
For example, this metric:
model.Metric{
model.MetricNameLabel: "test:metric",
"testlabel": "test:value",
"testlabel2": "test:value",
)
Will become:
test:metric.testlabel=test:value.testlabel2=test:value
escape.go takes care of escaping values to match Graphite
character set, it basically uses percent-encoding as a fallback
wich will work pretty will in the graphite/grafana world.
The remote storage module also has an optional 'prefix' parameter
to prefix all metrics with a path (for example, 'prometheus.').
Graphite URLs are simply in the form tcp://host:port or
udp://host:port.
Because the InfluxDB client library currently pulls in multiple MBs of
unnecessary dependencies, I have modified and cut up the vendored
version to only pull in the few pieces that are actually needed.
On InfluxDB's side, this dependency issue is tracked in:
https://github.com/influxdb/influxdb/issues/3447
Hopefully, it will be resolved soon.
If a password is needed for InfluxDB, it may be supplied via the
INFLUXDB_PW environment variable.