Commit graph

7122 commits

Author SHA1 Message Date
Fabian Reinartz 22fd3ef24e Deal with zero-length segments
Signed-off-by: Fabian Reinartz <freinartz@google.com>
2018-07-19 07:34:18 -04:00
Fabian Reinartz 92e1b20957 Fix close handling
Signed-off-by: Fabian Reinartz <freinartz@google.com>
2018-07-19 07:34:18 -04:00
Fabian Reinartz 1a5573b4ce Migrate write ahead log
On startup, rewrite the old write ahead log into the new format once.

Signed-off-by: Fabian Reinartz <freinartz@google.com>
2018-07-19 07:34:18 -04:00
Fabian Reinartz 3e76f0163e Address comments
Signed-off-by: Fabian Reinartz <freinartz@google.com>
2018-07-19 07:25:30 -04:00
Fabian Reinartz 0ad2b8a349 docs: add new WAL format
Signed-off-by: Fabian Reinartz <freinartz@google.com>
2018-07-19 07:25:30 -04:00
Fabian Reinartz 3f538817f8 move WAL lock
Signed-off-by: Fabian Reinartz <freinartz@google.com>
2018-07-19 07:25:30 -04:00
Fabian Reinartz d951140ab8 wal: avoid heap allocation in WAL reader
The buffers we allocated were escaping to the heap, resulting in large
memory usage spikes during startup and checkpointing in Prometheus.
This attaches the buffer to the reader object to prevent this.

Signed-off-by: Fabian Reinartz <freinartz@google.com>
2018-07-19 07:25:30 -04:00
Fabian Reinartz 7841d417b3 Ensure blocks are time-ordered in memory
We assume in multiple places that the block list held by DB
has blocks sequential by time.
A regression caused us to hold them ordered by ULID, i.e. by creation
time instead.

Signed-off-by: Fabian Reinartz <freinartz@google.com>
2018-07-19 07:25:30 -04:00
Fabian Reinartz def912ce0e Integrate new WAL and checkpoints
Remove the old WAL and drop in the new one

Signed-off-by: Fabian Reinartz <freinartz@google.com>
2018-07-19 07:25:30 -04:00
Fabian Reinartz 008399a6e0 Add checkpointing of WAL segments
Create checkpoints from a sequence of WAL segments while filtering
out obsolete data. The checkpoint format is again a sequence of WAL
segments, which allows us to reuse the serialization format and
implementation.

Signed-off-by: Fabian Reinartz <freinartz@google.com>
2018-07-19 07:24:40 -04:00
Fabian Reinartz 449a2d0db7 wal: add segment type and repair procedure
Allow to repair the WAL based on the error returned by a reader
during a full scan over all records.

Signed-off-by: Fabian Reinartz <freinartz@google.com>
2018-07-19 07:24:40 -04:00
Fabian Reinartz 8e1f97fad4 wal: add write ahead log package
This adds a new WAL that's agnostic to the actual record contents.
It's much simpler and should be more resilient than the existing one.

Signed-off-by: Fabian Reinartz <freinartz@google.com>
2018-07-19 07:24:40 -04:00
Thomas Jackson 56daa1f28a Only add LookbackDelta to vector selectors (#4399)
Signed-off-by: Thomas Jackson <jacksontj.89@gmail.com>

Related to #4226
2018-07-19 06:16:05 +01:00
Julius Volz d8153ac5d5
Update internal architecture diagram (#4398)
Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-07-18 22:10:23 +02:00
Adam Shannon 566c80b47c web: add named anchors for each rule group (#4130)
* web: add named anchors for each rule group

Signed-off-by: Adam Shannon <adamkshannon@gmail.com>
2018-07-18 16:34:41 +01:00
Daisy T a3376e8f36 add query labels command to promtool (#4346)
Signed-off-by: Daisy T <daisyts@gmx.com>
2018-07-18 16:27:28 +02:00
Bryan Boreham afdb66dfac Expose Group.CopyState() (#4304)
This makes the `rules` package more useful to projects that use
Prometheus as a library.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2018-07-18 15:14:38 +02:00
Alin Sinpalean 372e7652b7 Reuse (copy) overlapping matrix samples between range evaluation steps (#4315)
* Reuse (copy) overlapping matrix samples between range evaluation steps.

Signed-off-by: Alin Sinpalean <alin.sinpalean@gmail.com>
2018-07-18 11:14:02 +01:00
Jannick Fahlbusch ฏ๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎ 0be25f92e2 EC2 Discovery: Allow to set a custom endpoint (#4333)
Allowing to set a custom endpoint makes it easy to monitor targets on non AWS providers with EC2 compliant APIs.

Signed-off-by: Jannick Fahlbusch <git@jf-projects.de>
2018-07-18 10:48:14 +01:00
Julius Volz 95dfb1b1dd
Add missing import to promtool, fix build (#4395)
Sorry, I used GitHub's web-based merge-conflict-resolution editor on
https://github.com/prometheus/prometheus/pull/4308 and it didn't show me
test errors afterwards, but maybe they didn't run again or I should have
waited or something.

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-07-18 10:26:45 +02:00
Shubheksha 125da3b812 promtool: add command for querying series (#4308)
Signed-off-by: Shubheksha Jalan <jshubheksha@gmail.com>
2018-07-18 10:15:58 +02:00
Julius Volz a215aed9b6
Document internal Prometheus server architecture (#4295)
* Document internal Prometheus server architecture

Signed-off-by: Julius Volz <julius.volz@gmail.com>

* Review fixups

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-07-18 10:06:41 +02:00
Julius Volz 03aa3a3de8
main: Improve / clean up error messages (#4286)
Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-07-18 09:58:40 +02:00
Chih-Hung Yeh 912d19fb85 Add 3 commands in promtool for getting debug information from prometheus server (#4247)
`debug all` - all information
`debug metrics` - metrics  information
`debug pprof` - profiling  information

the final result is compressed in a `tar.gz` file

Signed-off-by: chyeh <chyeh.taiwan@gmail.com>
2018-07-18 10:52:01 +03:00
Tony Lee bcdaf8e2d2 add unused pointslices to the pool (#4363)
Signed-off-by: Tony Lee <tl@hudson-trading.com>
2018-07-18 05:29:21 +01:00
Ivan Voronchihin 1c6f2a1b68 Update aws-sdk-go (#4153)
Signed-off-by: bege13mot <bege13mot@gmail.com>
2018-07-18 05:26:04 +01:00
Ivan Voronchihin 59d214d277 Update autorest vedoring (#4147)
Signed-off-by: bege13mot <bege13mot@gmail.com>
2018-07-18 05:24:15 +01:00
Goutham Veeramachaneni c28cc5076c Saner defaults and metrics for remote-write (#4279)
* Rename queueCapacity to shardCapacity
* Saner defaults for remote write
* Reduce allocs on retries

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2018-07-18 05:15:16 +01:00
Alin Sinpalean e3b775b78b Simplify BufferedSeriesIterator usage (#4294)
* Allow for BufferedSeriesIterator instances to be created without an underlying iterator, to simplify their usage.

Signed-off-by: Alin Sinpalean <alin.sinpalean@gmail.com>
2018-07-18 05:10:28 +01:00
Sneha Inguva 295a95329e Update vendoring of Prometheus Go client (#4283)
This is to pickup changes from
https://github.com/prometheus/client_golang/pull/414. It leads to
better error output in promtool.

Signed-off-by: Sneha Inguva <singuva@digitalocean.com>
2018-07-18 05:08:38 +01:00
Julius Volz 219e477272 Fix some (valid) lint errors (#4287)
Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-07-18 05:07:33 +01:00
Romain Baugue b41be4ef52 Discovery consul service meta (#4280)
* Upgrade Consul client
* Add ServiceMeta to the labels in ConsulSD

Signed-off-by: Romain Baugue <romain.baugue@elwinar.com>
2018-07-18 05:06:56 +01:00
Martin Lee d0f11a3cc6 Forbid rule-abiding robots from indexing. (#4266)
* Resolves github issue #4257

Signed-off-by: Martin Lee <martin@billforward.net>
2018-07-18 05:01:57 +01:00
Thomas Jackson 92c6f0c92e Add offset to selectParams (#4226)
* Add Start/End to SelectParams
* Make remote read use the new selectParams for start/end

This commit will continue sending the start/end time of the remote read
query as the overarching promql time and the specific range of data that
the query is intersted in receiving a response to is now part of the
ReadHints (upstream discussion in #4226).

* Remove unused vendored code

The genproto.sh script was updated, but the code wasn't regenerated.
This simply removes the vendored deps that are no longer part of the
codegen output.

Signed-off-by: Thomas Jackson <jacksontj.89@gmail.com>
2018-07-18 04:58:00 +01:00
Alin Sinpalean 96fb0b2155 Optimize PromQL aggregations (#4248)
* Compute hash of label subsets without creating a LabelSet first.

Signed-off-by: Alin Sinpalean <alin.sinpalean@gmail.com>
2018-07-18 04:56:27 +01:00
Julius Volz 9e3171f6e3 rules: Minor naming/comment cleanups (#4328)
Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-07-18 04:54:33 +01:00
Brian Brazil 3ee7a6a6c2
Merge pull request #4375 from prometheus/release-2.3
Merge 2.3.2 release back to master
2018-07-16 14:41:43 +01:00
Simon Pasquier f32acc0b7b discovery/openstack: remove unneeded assignment
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-07-15 12:37:57 +01:00
Simon Pasquier ed99af0b05 docs: fix OpenStack SD for the hypervisor role
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-07-15 12:37:57 +01:00
Tom Wilkie 3228814456 Don't forget to register query_duration_seconds{slice="queue_time"} (#4381)
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-07-15 12:24:37 +01:00
Tom Wilkie f83155b11e Review feedback.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-07-13 19:31:23 +01:00
Paul Gier cfb3f31538 add prefix "common-" to make target names
This allows rules to be overridden with warnings about conflicting
target names.

Signed-off-by: Paul Gier <pgier@redhat.com>
2018-07-12 16:53:34 -05:00
Peter Gallerani a9d5034add Fix missing 'msg' in remote storage adapter main.go .Log info message (#4377)
Signed-off-by: Peter Gallerani <peter.gallerani@gmail.com>
2018-07-12 20:54:21 +02:00
Brian Brazil 5b596b97bc
Merge branch 'master' into release-2.3 2018-07-12 16:44:11 +01:00
Julius Volz 05d6d6a2e5
k8s SD: Fix "schema" -> "scheme" typo (#4371)
Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-07-12 16:12:32 +02:00
Brian Brazil 71af5e29e8
Merge pull request #4370 from prometheus/232
Release 2.3.2
2018-07-12 15:00:12 +01:00
Brian Brazil ebe107b71b Release 2.3.2
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2018-07-12 14:44:46 +01:00
Brian Brazil fc2a9c986b Update vendoring for tsdb (#4369)
This pulls in tsdb PRs 330 344 348 353 354 356

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2018-07-11 15:55:39 +01:00
Brian Brazil 508662fb24 Reorder startup and shutdown to prevent panics. (#4321)
Start rule manager only after tsdb and config is loaded.
Stop rule manager before tsdb to avoid writing to closed storage.
Wait for any in-progress reloads to complete before shutting
down rule manager, so that rule manager doesn't get updated after
being shut down.

Remove incorrect comment around shutting down query enginge.
Log when config reload is completed.

Fixes #4133
Fixes #4262

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2018-07-11 15:55:30 +01:00
Michael Khalil 5e9056d2f3 return error exit status in prometheus cli (#4296)
Signed-off-by: mikeykhalil <mikeyfkhalil@gmail.com>
2018-07-11 15:55:15 +01:00